Gemma is a lightweight, open large language model family by Google, designed for efficient deployment on limited-resource devices. The 1.1 version introduces improvements via Reinforcement Learning from Human Feedback (RLHF), enhancing quality, factuality, instruction following, and multi-turn conversation capabilities. The model is suitable for tasks such as question answering, summarization, and code generation.
Gemma 1.1 2B Instruct can be deployed in resource-constrained environments such as laptops, desktops, or private cloud infrastructures, making it widely accessible for practical applications.
Gemma 1.1 2B Instruct is well-suited for:
The model may not perform well on tasks beyond its training scope. Caution should be exercised for applications with high factual accuracy or ethical requirements.
Large language models like Gemma 1.1 2B Instruct may reflect socio-cultural biases present in the training data, potentially resulting in biased or inappropriate outputs. The model may also propagate misinformation if not used responsibly.
Gemma models were trained on a vast dataset comprising 6 trillion tokens from diverse sources:
Gemma models were evaluated on a range of tasks to assess performance across text generation, reasoning, and factuality.
Benchmark | Metric | Gemma 1.1 IT 2B | Gemma 1.1 IT 7B |
---|---|---|---|
MMLU | 5-shot, top-1 | 42.3 | 64.3 |
HellaSwag | 0-shot | 71.4 | 81.2 |
PIQA | 0-shot | 77.3 | 81.2 |
TriviaQA | 5-shot | 53.2 | 63.4 |
CommonsenseQA | 7-shot | 65.3 | 71.3 |
GSM8K | maj@1 | 17.7 | 46.4 |
Average | 45.0 | 56.9 |
Ethics and safety evaluations include structured tests, red-teaming, and human assessments. Key areas include content safety, representational harms, and memorization risks.
The Gemma model family meets internal standards on metrics like content safety, representational harms, and large-scale harm prevention. Key results:
Benchmark | Metric | Gemma 1.1 IT 2B | Gemma 1.1 IT 7B |
---|---|---|---|
RealToxicity | Average | 7.03 | 8.04 |
BBQ Ambig | 1-shot, top-1 | 58.97 | 86.06 |
Winogender | Top-1 | 50.14 | 57.64 |
TruthfulQA | Average | 44.24 | 45.34 |
Toxigen | Top-1 | 29.64 | 38.75 |
Gemma provides high-performance, open-access language models with responsible AI features, aiming to democratize AI technology for developers and researchers. The model's performance is competitive with other open models, making it a valuable tool for various applications.
If you use Gemma in your research, please cite:
```bibtex @misc{https://doi.org/10.48550/arxiv.2210.11416, doi = {10.48550/ARXIV.2210.11416}, url = {https://arxiv.org/abs/2210.11416}, author = {Google AI}, title = {Gemma 1.1: Instruction-Tuned Lightweight Language Models for Responsible AI}, publisher = {arXiv}, year = {2023}, keywords = {Machine Learning (cs.LG), Computation and Language (cs.CL)}, copyright = {Creative Commons Attribution 4.0 International} }