If you are already familiar with T5, FLAN-T5 is better across the board: for the same parameter count, these models have been fine-tuned on more than 1,000 additional tasks, covering more languages and achieving strong few-shot performance. FLAN-PaLM 540B, for example, reaches state-of-the-art results on several benchmarks, including 75.2% on five-shot MMLU. Overall, instruction finetuning has proven to be a general method for improving the performance and usability of pretrained language models.
FLAN-T5 XL is intended primarily for research on language models, including zero-shot and in-context few-shot learning NLP tasks such as reasoning and question answering, as well as research on the fairness, safety, and limitations of current large language models.
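The snippet below is a minimal sketch of loading FLAN-T5 XL for inference with the Hugging Face transformers library; the checkpoint identifier `google/flan-t5-xl`, the example prompt, and the generation settings are illustrative assumptions rather than an official recommendation.

```python
# Minimal inference sketch; checkpoint id and generation settings are assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")  # assumed checkpoint id
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")

prompt = "Translate English to German: How old are you?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```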
More information is needed on out-of-scope applications.
According to Rae et al. (2021), language models like FLAN-T5 can potentially be used for harmful language generation. FLAN-T5 should be assessed for safety and fairness concerns before deployment in sensitive applications.
FLAN-T5 has not been rigorously tested in real-world applications and may generate inappropriate or biased content based on the training data.
FLAN-T5 should not be applied to any unacceptable use case, such as the generation of abusive or harmful speech.
The model was fine-tuned on a diverse mixture of tasks, improving zero-shot and few-shot performance across many languages and NLP benchmarks; refer to the research paper for the complete task list.
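As a rough illustration of in-context few-shot use, the sketch below builds a small few-shot prompt and reuses the tokenizer and model loaded in the earlier snippet; the exemplars and their formatting are assumptions for illustration, not the exact templates used in the paper.

```python
# Illustrative few-shot prompt; exemplars and formatting are assumptions.
few_shot_prompt = (
    "Review: The food was cold and the service was slow. Sentiment: negative\n"
    "Review: Fantastic atmosphere and friendly staff. Sentiment: positive\n"
    "Review: The plot dragged, but the acting was superb. Sentiment:"
)

inputs = tokenizer(few_shot_prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```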
The model was trained on TPU v3 or TPU v4 pods using the t5x codebase together with jax.
The authors evaluated FLAN-T5 XL on 1,836 tasks across multiple languages. For more detailed quantitative evaluation, see the research paper’s Table 3.
FLAN-T5 XL achieves competitive results across these tasks, often surpassing the baseline T5 model of the same size.
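For intuition on how such results are scored, the sketch below computes a simple exact-match metric over a couple of hand-written prompts, again reusing the model and tokenizer from the first snippet; the examples and the metric are illustrative assumptions and do not reproduce the paper's evaluation protocol.

```python
# Toy exact-match scoring sketch; examples and metric are assumptions.
eval_set = [
    ("Answer yes or no: Is the sky blue on a clear day?", "yes"),
    ("What is 3 plus 4?", "7"),
]

correct = 0
for prompt, target in eval_set:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=10)
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).strip().lower()
    correct += int(prediction == target)

print(f"Exact match: {correct / len(eval_set):.2f}")
```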
Please cite FLAN-T5 XL as follows:
@misc{https://doi.org/10.48550/arxiv.2210.11416,
  doi       = {10.48550/ARXIV.2210.11416},
  url       = {https://arxiv.org/abs/2210.11416},
  author    = {Chung, Hyung Won and Hou, Le and Longpre, Shayne and Zoph, Barret and Tay, Yi and Fedus, William and Li, Eric and Wang, Xuezhi and Dehghani, Mostafa and Brahma, Siddhartha and Webson, Albert and Gu, Shixiang Shane and Dai, Zhuyun and Suzgun, Mirac and Chen, Xinyun and Chowdhery, Aakanksha and Narang, Sharan and Mishra, Gaurav and Yu, Adams and Zhao, Vincent and Huang, Yanping and Dai, Andrew and Yu, Hongkun and Petrov, Slav and Chi, Ed H. and Dean, Jeff and Devlin, Jacob and Roberts, Adam and Zhou, Denny and Le, Quoc V. and Wei, Jason},
  keywords  = {Machine Learning (cs.LG), Computation and Language (cs.CL), FOS: Computer and information sciences},
  title     = {Scaling Instruction-Finetuned Language Models},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
This model card was auto-generated, incorporating details from the official FLAN-T5 model card.