Gemma 7B Base is part of Google’s open model family designed for general-purpose text generation tasks. With 8.54 billion parameters, this model is built for versatility and can be deployed across various platforms, enabling powerful text generation capabilities on devices with constrained resources.
Gemma 7B Base can be fine-tuned using provided scripts and notebooks for Supervised Fine-Tuning (SFT) on datasets like UltraChat. Users can adapt these scripts to enhance the model's performance on domain-specific tasks.
Supports various deployment environments, including CPU-only and multi-GPU setups. Quantization options (e.g., 8-bit and 4-bit) are available for enhanced performance on resource-constrained devices.
Gemma 7B Base is ideal for:
While the model provides general-purpose language capabilities, it may inherit biases from training data, leading to unintended outputs. Potential misuse could lead to the generation of biased or harmful content.
Gemma models are trained on a diverse dataset totaling 6 trillion tokens, including:
Data preparation involved multiple filtering stages for quality and safety:
Gemma 7B Base has been evaluated on a variety of benchmarks to assess performance across text generation, comprehension, and reasoning tasks.
Benchmark | Metric | Gemma 2B Base | Gemma 7B Base |
---|---|---|---|
MMLU | 5-shot, top-1 | 42.3 | 64.3 |
HellaSwag | 0-shot | 71.4 | 81.2 |
PIQA | 0-shot | 77.3 | 81.2 |
TriviaQA | 5-shot | 53.2 | 63.4 |
CommonsenseQA | 7-shot | 65.3 | 71.3 |
GSM8K | maj@1 | 17.7 | 46.4 |
Average | 45.0 | 56.9 |
Gemma models were rigorously tested through structured evaluations, including human assessments on content safety, representational harms, and data memorization risks.
The 7B Base model met ethical standards, showing acceptable performance in established safety benchmarks.
Benchmark | Metric | Gemma 2B Base | Gemma 7B Base |
---|---|---|---|
RealToxicity | Average | 6.86 | 7.90 |
BBQ Ambig | 1-shot, top-1 | 62.58 | 92.54 |
Winogender | Top-1 | 51.25 | 54.17 |
TruthfulQA | Average | 44.84 | 31.81 |
Toxigen | Top-1 | 29.77 | 39.59 |
Gemma models are designed for a range of applications, such as:
Gemma models offer powerful, open-access language capabilities with a focus on responsible and democratized AI. Their high-performance metrics position them as competitive solutions among similarly sized models.
If you use Gemma in your research, please cite:
```bibtex @misc{https://doi.org/10.48550/arxiv.2210.11416, doi = {10.48550/ARXIV.2210.11416}, url = {https://arxiv.org/abs/2210.11416}, author = {Google AI}, title = {Gemma 7B: Base Model for General AI Tasks}, publisher = {arXiv}, year = {2023}, keywords = {Machine Learning (cs.LG), Computation and Language (cs.CL)}, copyright = {Creative Commons Attribution 4.0 International} }