mistralai/Mistral-7B-Instruct-v0.1


Model Card for Mistral-7B-Instruct-v0.1

TL;DR

Mistral-7B-Instruct-v0.1 is an instruction-tuned model based on Mistral-7B-v0.1, fine-tuned for conversational AI and assistant tasks. It handles instruction-following prompts in multi-turn dialogue and is accessible through both Mistral's reference implementation and the Hugging Face Transformers library.
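
A minimal loading sketch with Hugging Face Transformers. It assumes the transformers, torch, and accelerate packages are installed; the dtype and device-map settings are illustrative choices, not requirements:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed under Model Details
    device_map="auto",           # requires the accelerate package
)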


Model Details

Model Information

  • Model Type: Instruction-tuned large language model
  • Model Size: 7.24B parameters
  • Tensor Type: BF16
  • Supported Context Length: Up to 8k tokens
  • License: Apache 2.0

Model Developers

Developed by Mistral AI, with contributions from a diverse team including Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, and others.


Intended Use

Use Cases

Mistral-7B-Instruct-v0.1 is suitable for:

  • Conversational AI: Structured, instruction-following dialogue generation (prompt format sketched after this list).
  • Educational Assistants: Providing brief explanations or answering questions.
  • Customer Support: Assisting users with general queries in a conversational format.
  • Knowledge Retrieval: Generating concise responses for various inquiries.

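The dialogue-oriented use cases above rely on the model's instruction format, which wraps each user turn in [INST] ... [/INST] tags. A minimal sketch using the tokenizer's built-in chat template (available in Transformers v4.34.0 and later); the messages are placeholder examples:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

messages = [
    {"role": "user", "content": "What is sliding-window attention?"},
    {"role": "assistant", "content": "It limits each token's attention to a fixed local window."},
    {"role": "user", "content": "Why does that help with long inputs?"},
]

# Render the conversation into the prompt string the model was trained on,
# e.g. "<s>[INST] ... [/INST] ... </s>[INST] ... [/INST]".
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
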
Out-of-Scope Use

This model lacks built-in moderation mechanisms and should not be deployed in scenarios that require strict content filtering.


Model Architecture

Mistral-7B-Instruct-v0.1 is based on the Mistral-7B architecture, with the following design features (key settings are inspectable in the sketch below):

  • Grouped-Query Attention: Shares key/value heads across groups of query heads, cutting memory traffic and speeding up inference.
  • Sliding-Window Attention: Restricts each layer's attention to a fixed 4,096-token window, lowering the cost of long input sequences.
  • Byte-fallback BPE tokenizer: Falls back to raw bytes for characters outside the vocabulary, so arbitrary text never maps to unknown tokens.
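
These settings can be read off the published model configuration. A sketch, with the values this checkpoint ships with noted in comments (field names follow Transformers' MistralConfig):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

print(config.num_attention_heads)  # 32 query heads
print(config.num_key_value_heads)  # 8 shared key/value heads -> grouped-query attention
print(config.sliding_window)       # 4096-token attention window
print(config.vocab_size)           # 32000-entry byte-fallback BPE vocabulary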

Training Details

Training Data

Mistral-7B-Instruct-v0.1 is fine-tuned using various publicly available datasets, curated to enhance the model’s instruction-following and dialogue generation capabilities. For comprehensive details, refer to the release paper and blog post.

Inference

Inference can be performed with both Mistral's open-source reference implementation and Hugging Face Transformers; the reference stack targets low-latency deployments. A minimal Transformers generation sketch follows.
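
A sketch of end-to-end generation with Transformers; the sampling parameters are illustrative defaults, not tuned recommendations:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize grouped-query attention in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

# Sample a completion; temperature and token budget are placeholder values.
output = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))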


Limitations

The model currently does not include moderation mechanisms. Users should exercise caution when deploying Mistral-7B-Instruct-v0.1 in applications that require strict content safety and moderation.


Community

Mistral AI invites community contributions, particularly for enhancing the alignment between Mistral’s tokenizer and Transformers. Contributions, including pull requests to refine the model, are encouraged.

Known Issues and Troubleshooting

  • Transformers Compatibility: Users may encounter a KeyError: 'mistral' when loading the model with older Transformers releases. Updating to transformers v4.34.0 or later, the release that added Mistral support, resolves this (a version check is sketched after this list).
  • Tokenizer Alignment: Contributions to improve tokenizer consistency between Mistral and Transformers are welcome.
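
A quick sketch for checking the installed version before filing an issue; it assumes the packaging package, which Transformers itself depends on:

import transformers
from packaging import version

# Mistral support landed in Transformers v4.34.0; older releases raise
# KeyError: 'mistral' when resolving the model type.
assert version.parse(transformers.__version__) >= version.parse("4.34.0"), (
    f"transformers {transformers.__version__} predates Mistral support; "
    "upgrade with: pip install -U transformers"
)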

Citation

If you use Mistral-7B-Instruct-v0.1 in your research, please cite:

@misc{mistralai2024mistral7b,
  author = {Mistral AI},
  title = {Mistral-7B-Instruct-v0.1: Fine-tuned Large Language Model for Instruction Following},
  year = {2024},
  url = {https://github.com/mistralai/mistral-models},
  publisher = {Mistral AI}
}