google/gemma-2b-it


Model Card for Gemma 2B Instruct

TL;DR

Gemma is a family of open, state-of-the-art language models by Google. Built with the same technology as the Gemini models, Gemma models are optimized for a variety of text-to-text generation tasks, including question answering, summarization, and reasoning. They are lightweight, deployable in resource-constrained environments, and available with open weights, promoting democratized access to advanced AI.


Model Details

Model Information

  • Model Type: Text-to-text, decoder-only large language model
  • Language(s): English
  • License: Terms of Use available on the Gemma model page
  • Related Models: 2B base, 7B base, and 7B instruct models

Usage

Inputs and Outputs

  • Input: Text string (e.g., question, prompt, document to summarize)
  • Output: Generated English text (e.g., answer, summary)
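
A minimal generation sketch using the Hugging Face transformers library is shown below. The loading arguments (dtype, device placement) and the example prompt are illustrative assumptions to adjust for your hardware, not required settings.

```python
# Minimal sketch: load gemma-2b-it and generate a reply to a chat-style prompt.
# Assumes `pip install transformers accelerate torch` and access to the model repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # fall back to float32 if bfloat16 is unsupported
    device_map="auto",
)

# The instruct variant expects chat-formatted input; apply_chat_template wraps
# the message in Gemma's turn markers and appends the generation prompt.
messages = [{"role": "user", "content": "Summarize the benefits of open-weight models in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```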

Hardware Compatibility

Gemma 2B Instruct is optimized for use in environments with limited resources, making it deployable on devices such as laptops and desktops, or within custom cloud infrastructure.
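
As a hedged illustration of running on memory-limited hardware, the sketch below loads the model in 4-bit precision via bitsandbytes. The quantization settings are assumptions to tune for your device, not an official recommendation.

```python
# Sketch: 4-bit quantized loading for memory-constrained devices.
# Assumes `pip install transformers accelerate bitsandbytes` and a CUDA GPU;
# on CPU-only machines, load in float32 without a quantization_config instead.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-2b-it"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype is an assumption; float16 also works
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```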


Uses

Direct Use and Downstream Use

Gemma is suitable for:

  • Text Generation: Creating content such as poems, scripts, and email drafts.
  • Question Answering and Summarization: Providing concise answers or summaries of complex texts.
  • Conversational AI: Supporting chatbots and virtual assistants for customer service.
  • Code Generation: Understanding programming-related queries and generating code snippets.
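
To make the conversational and code-generation uses concrete, here is a hedged sketch of assembling a chat-style prompt by hand with Gemma's turn markers and asking for a code snippet. The example question and sampling settings are illustrative only.

```python
# Sketch: hand-built chat prompt using Gemma's <start_of_turn>/<end_of_turn> markers.
# Assumes `pip install transformers accelerate torch`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Each turn is wrapped in turn markers; the trailing "model" turn cues generation.
prompt = (
    "<start_of_turn>user\n"
    "Write a Python function that reverses a string.<end_of_turn>\n"
    "<start_of_turn>model\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,      # sampling settings are illustrative, not tuned values
    temperature=0.7,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```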

Bias, Risks, and Limitations

Ethical Considerations and Risks

Large language models like Gemma can generate inappropriate or biased content, reflecting socio-cultural biases in their training data. This model card outlines potential ethical concerns, including risks related to misinformation, bias, and harmful content.

Known Limitations

  • Training Data Biases: The model’s responses may reflect biases present in its training data.
  • Complex Task Handling: Gemma performs best with clear prompts; open-ended tasks might yield suboptimal responses.
  • Factual Inaccuracy: Responses are based on patterns in the training data, and may contain outdated or incorrect information.

Training Details

Training Dataset

Gemma models were trained on a dataset comprising 6 trillion tokens from various sources, including:

  • Web Documents: Diverse English-language content for broad topic coverage.
  • Code: Exposure to programming syntax for code generation tasks.
  • Mathematical Texts: Training on logic and symbolic reasoning for mathematical queries.

Data Preprocessing

  • CSAM Filtering: Rigorous filtering to exclude illegal content.
  • Sensitive Data Filtering: Exclusion of personal information and other sensitive data.
  • Content Quality Filtering: Additional filtering to ensure data quality and adherence to policy.

Hardware and Software

  • Hardware: Tensor Processing Units (TPUv5e), which provide high-bandwidth memory, scalability, and performance.
  • Software: JAX and ML Pathways, used for efficient model training and orchestration.

Evaluation

Benchmark Results

Gemma models were evaluated on various datasets covering aspects like commonsense reasoning, question answering, and code generation.

| Benchmark     | Metric        | 2B Params | 7B Params |
|---------------|---------------|-----------|-----------|
| MMLU          | 5-shot, top-1 | 42.3      | 64.3      |
| HellaSwag     | 0-shot        | 71.4      | 81.2      |
| PIQA          | 0-shot        | 77.3      | 81.2      |
| TriviaQA      | 5-shot        | 53.2      | 63.4      |
| CommonsenseQA | 7-shot        | 65.3      | 71.3      |
| GSM8K         | maj@1         | 17.7      | 46.4      |
| **Average**   |               | 45.0      | 56.9      |
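
To reproduce numbers of this kind yourself, one common route is EleutherAI's lm-evaluation-harness. The sketch below is a rough outline assuming its `simple_evaluate` entry point and `hf` model backend are available in your installed version; it is not the evaluation setup used to produce the table above.

```python
# Rough sketch: scoring gemma-2b-it on HellaSwag with lm-evaluation-harness.
# Assumes `pip install lm-eval` (v0.4+); task names and the API may differ by version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=google/gemma-2b-it,dtype=bfloat16",
    tasks=["hellaswag"],
    num_fewshot=0,   # the table above reports HellaSwag 0-shot
    batch_size=8,
)
print(results["results"]["hellaswag"])
```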

Ethics and Safety

Evaluation Approach

Gemma underwent structured evaluations and red-teaming tests for content safety, representational harms, and memorization risks.

Evaluation Results

Evaluation on datasets like BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA showed Gemma meets internal safety thresholds.

| Benchmark    | Metric  | 2B Params | 7B Params |
|--------------|---------|-----------|-----------|
| RealToxicity | Average | 6.86      | 7.90      |
| Winogender   | Top-1   | 51.25     | 54.17     |
| TruthfulQA   | Average | 44.84     | 31.81     |
| Toxigen      | Top-1   | 29.77     | 39.59     |

Intended Usage and Limitations

Intended Usage

  • Content Creation: Supports creative tasks, chatbots, summarization, and more.
  • Research and Education: Assists NLP researchers in developing new algorithms and educational tools.

Limitations

  • Data Bias: Biases in training data may influence model responses.
  • Complexity and Nuance: Model may struggle with open-ended tasks or nuanced language.
  • Factual Accuracy: Responses may contain outdated or incorrect information.

Benefits

Gemma provides high-performance, open-access language models, promoting responsible AI and democratized access to LLM technology. Compared to similarly sized models, Gemma excels in benchmarks, making it a competitive choice for developers and researchers.


Citation

If you use Gemma in your research, please cite:

```bibtex
@misc{https://doi.org/10.48550/arxiv.2210.11416,
  doi       = {10.48550/ARXIV.2210.11416},
  url       = {https://arxiv.org/abs/2210.11416},
  author    = {Google AI},
  title     = {Gemma: Lightweight Open Large Language Models for Responsible AI},
  publisher = {arXiv},
  year      = {2023},
  keywords  = {Machine Learning (cs.LG), Computation and Language (cs.CL)},
  copyright = {Creative Commons Attribution 4.0 International}
}
```