Model creator: HuggingFaceTB
Original model: SmolLM Series (135M, 360M, 1.7B parameters)
The SmolLM-Instruct series consists of small language models available in three sizes: 135M, 360M, and 1.7B parameters. These models are pre-trained on SmolLM-Corpus, a curated collection of high-quality educational and synthetic data designed for training LLMs. SmolLM-Instruct models are then fine-tuned on publicly available datasets and optimized for instruction-following tasks.
| Version | Description |
|---|---|
| v0.1 | Initial release. Fine-tuned on the permissive subset of WebInstructSub combined with StarCoder2-Self-OSS-Instruct, followed by one epoch of Direct Preference Optimization (DPO) on HelpSteer for the 135M and 1.7B models and on argilla/dpo-mix-7k for the 360M model. |
| v0.2 | Fine-tuning mix updated with datasets better suited to small models, including 2k everyday conversations generated with Llama-3.1-70B, Magpie-Pro-300K-Filtered, StarCoder2-Self-OSS-Instruct, and OpenHermes-2.5. Improves coherence and responsiveness to standard prompts. |
SmolLM-360M-Instruct v0.2 achieves a 63.3% win rate over v0.1 on AlpacaEval.
SmolLM-Instruct models are designed to handle instruction-following tasks effectively, making them well-suited for lightweight and resource-constrained environments.
SmolLM-Instruct models are decoder-only language models based on an optimized transformer architecture. Their training data spans a wide range of high-quality educational and synthetic datasets, enabling the models to generate coherent, contextually relevant text responses.
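For reference, below is a minimal usage sketch with the Hugging Face `transformers` library. The checkpoint name, prompt, and generation parameters are illustrative assumptions, not values taken from this card.

```python
# Minimal sketch: load a SmolLM-Instruct checkpoint and generate a response
# to an instruction using the tokenizer's chat template.
# The checkpoint name and sampling settings below are examples only.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-360M-Instruct"  # 135M and 1.7B variants also exist
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Format the instruction with the model's chat template.
messages = [{"role": "user", "content": "List three uses of small language models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Generate a short response; sampling parameters are illustrative.
outputs = model.generate(
    input_ids, max_new_tokens=128, temperature=0.2, top_p=0.9, do_sample=True
)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Because the models are small, the same pattern runs comfortably on CPU or modest GPUs, which is what makes them practical for resource-constrained settings.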