If you are already familiar with T5, FLAN-T5 offers better performance. For the same parameter count, these models have been fine-tuned on more than 1,000 additional tasks covering more languages, and they achieve strong few-shot performance. FLAN-PaLM 540B, for example, achieves state-of-the-art results on several benchmarks, such as 75.2% on five-shot MMLU. Instruction finetuning has proven effective for improving both the performance and the usability of pretrained language models across a wide range of tasks.
FLAN-T5 Base is intended primarily for research on language models, including zero-shot and in-context few-shot NLP tasks (such as reasoning and question answering), research on fairness and safety, and understanding the limitations of current large language models.
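As a minimal usage sketch, the model can be loaded with the Hugging Face transformers library and prompted with a plain-language instruction; the checkpoint name google/flan-t5-base, the example prompt, and the generation settings below are illustrative, not prescribed by the paper.

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")

# FLAN-T5 is instruction-tuned: tasks are phrased as natural-language instructions
input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate a short answer and decode it back to text
outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```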
According to Rae et al. (2021), language models like FLAN-T5 can potentially be used for harmful language generation. FLAN-T5 should be assessed for safety and fairness concerns before deployment in sensitive applications.
FLAN-T5 has not been rigorously tested in real-world applications and may generate inappropriate or biased content based on the training data.
FLAN-T5 should not be used for unacceptable use cases, such as the generation of abusive or harmful speech.
The model was fine-tuned on a diverse set of tasks, enhancing zero-shot and few-shot performance across multiple languages and NLP tasks. Refer to the research paper for a complete list of tasks.
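To illustrate what few-shot prompting looks like in practice, the sketch below prepends a couple of in-context demonstrations to the query; the sentiment examples are made up for illustration and are not drawn from the paper's task mixture.

```python
from transformers import pipeline

# The text2text-generation pipeline wraps tokenization, generation, and decoding
generator = pipeline("text2text-generation", model="google/flan-t5-base")

# A small few-shot prompt: two illustrative sentiment demonstrations followed by the query
prompt = (
    "Review: The movie was fantastic. Sentiment: positive\n"
    "Review: The plot made no sense at all. Sentiment: negative\n"
    "Review: I would happily watch it again. Sentiment:"
)
print(generator(prompt, max_new_tokens=5)[0]["generated_text"])
```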
The model was trained on TPU v3 or TPU v4 pods using the t5x codebase together with jax.
The authors evaluated FLAN-T5 Base on 1,836 tasks across multiple languages. For more detailed quantitative evaluation, see the research paper’s Table 3.
FLAN-T5 Base achieved competitive performance, including a benchmark score of 77.98, surpassing the baseline model google/t5-v1_1-base, which scores 68.82.
Please cite FLAN-T5 Base as follows:
@misc{https://doi.org/10.48550/arxiv.2210.11416,
  doi       = {10.48550/ARXIV.2210.11416},
  url       = {https://arxiv.org/abs/2210.11416},
  author    = {Chung, Hyung Won and Hou, Le and Longpre, Shayne and Zoph, Barret and Tay, Yi and Fedus, William and Li, Eric and Wang, Xuezhi and Dehghani, Mostafa and Brahma, Siddhartha and Webson, Albert and Gu, Shixiang Shane and Dai, Zhuyun and Suzgun, Mirac and Chen, Xinyun and Chowdhery, Aakanksha and Narang, Sharan and Mishra, Gaurav and Yu, Adams and Zhao, Vincent and Huang, Yanping and Dai, Andrew and Yu, Hongkun and Petrov, Slav and Chi, Ed H. and Dean, Jeff and Devlin, Jacob and Roberts, Adam and Zhou, Denny and Le, Quoc V. and Wei, Jason},
  keywords  = {Machine Learning (cs.LG), Computation and Language (cs.CL), FOS: Computer and information sciences},
  title     = {Scaling Instruction-Finetuned Language Models},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
This model card is auto-generated based on the original model card on Hugging Face.