HuggingFaceTB/SmolLM-1.7B-Instruct

SmolLM-Instruct

Model creator: HuggingFaceTB
Original model: SmolLM Series (135M, 360M, 1.7B parameters)

Description

The SmolLM-Instruct series consists of small language models available in three sizes: 135M, 360M, and 1.7B parameters. The base models are pre-trained on SmolLM-Corpus, a curated collection of high-quality educational and synthetic data designed for training LLMs. The instruct variants are then fine-tuned on publicly available datasets and optimized for instruction-following tasks.
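
The snippet below is a minimal usage sketch with the transformers library, assuming the checkpoint name from this card; the prompt and generation settings (max_new_tokens, temperature, top_p) are illustrative choices, not values prescribed by the model authors.

```python
# Minimal sketch: run SmolLM-1.7B-Instruct with transformers.
# Generation settings below are illustrative assumptions, not card defaults.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Format the prompt with the model's chat template before generating.
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(
    inputs["input_ids"], max_new_tokens=50, temperature=0.2, top_p=0.9, do_sample=True
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```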

Changelog

  • v0.1: Initial release. Fine-tuned on the permissive subset of WebInstructSub combined with StarCoder2-Self-OSS-Instruct. Direct Preference Optimization (DPO) was then applied for one epoch on HelpSteer for the 135M and 1.7B models, and on argilla/dpo-mix-7k for the 360M model.
  • v0.2: Fine-tuning mix updated with datasets better suited for small models, including 2k everyday conversations generated with llama3.1-70B, Magpie-Pro-300K-Filtered, StarCoder2-Self-OSS-Instruct, and OpenHermes-2.5. This release improves coherence and responsiveness to standard prompts.

In v0.2, SmolLM-360M-Instruct achieves a 63.3% win rate over v0.1 on AlpacaEval.

Best Use Cases for SmolLM-Instruct Models

SmolLM-Instruct models are well-suited for the following applications:

  • Interactive Applications: Building efficient chatbots or virtual assistants.
  • Content Generation: Assisting in drafting articles, reports, and creative writing.
  • Customer Support: Automating responses to general inquiries.
  • Educational Tools: Providing tutoring and answering general knowledge questions.

These models are designed to handle instruction-following tasks effectively, especially in lightweight and resource-constrained environments.
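
As a sketch of resource-constrained deployment, the example below loads the 1.7B instruct model in bfloat16, which roughly halves weight memory versus float32; the dtype, device selection, and prompt are illustrative assumptions rather than recommendations from this card.

```python
# Hedged sketch: low-memory loading of SmolLM-1.7B-Instruct.
# bfloat16 halves weight memory vs. float32; choice is an assumption here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-1.7B-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16
).to(device)

# apply_chat_template can return token IDs directly as a tensor.
messages = [{"role": "user", "content": "Summarize this ticket: my order arrived damaged."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

On CPU-only machines without fast bfloat16 support, the float32 default is the safer choice.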

Model Architecture

SmolLM-Instruct models are decoder-only models built on an optimized transformer architecture. They are trained on a wide range of high-quality educational and synthetic datasets, enabling them to generate coherent, contextually relevant responses.
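
For readers who want to inspect the decoder-only configuration themselves, a small sketch using transformers' AutoConfig prints the main architectural hyperparameters; the attribute names are standard config fields, and the comments describe what each field means rather than asserting specific values.

```python
# Sketch: inspect the hosted model configuration with AutoConfig.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("HuggingFaceTB/SmolLM-1.7B-Instruct")
print(config.model_type)           # architecture family of the checkpoint
print(config.num_hidden_layers)    # number of transformer decoder layers
print(config.hidden_size)          # model (embedding) dimension
print(config.num_attention_heads)  # attention heads per layer
```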