Google Introduces Gemini 3.1 Flash-Lite

March 4, 2026 Natalia Ganeva

Google has unveiled Gemini 3.1 Flash-Lite, the ultra-fast, lowest-cost model in the Gemini 3 line. It is priced at $0.25 per million input tokens and $1.50 per million output tokens.
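To put the pricing in concrete terms, here is a minimal Python sketch that estimates the cost of a workload from the two per-million-token rates quoted above. The function name `estimate_cost` and the sample workload (10,000 requests of 2,000 input and 500 output tokens each) are illustrative assumptions, not figures from Google, and the calculation ignores any discounts, caching, or tiered pricing that may apply.

```python
# Rates quoted in the article, in USD per one million tokens.
INPUT_PRICE_PER_MILLION = 0.25
OUTPUT_PRICE_PER_MILLION = 1.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# Hypothetical workload (an assumption for illustration only):
# 10,000 requests, each with 2,000 input tokens and 500 output tokens.
requests = 10_000
cost = estimate_cost(requests * 2_000, requests * 500)
print(f"Estimated cost: ${cost:.2f}")
```

Because output tokens cost six times as much as input tokens at these rates, workloads with long responses are dominated by the output side of the bill.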

Compared with Gemini 2.5 Flash, Gemini 3.1 Flash-Lite delivers a 2.5x faster time to first token and generates output 45% faster, while maintaining similar or higher response quality. This low latency makes the new model attractive for developers building responsive, real-time applications.

The model achieved an Elo score of 1432 on Arena.ai and outperformed other models in its class in reasoning and multimodal processing. On the GPQA Diamond and MMMU Pro benchmarks it scored 86.9% and 76.8%, respectively, outperforming some larger Gemini models from previous generations, such as Gemini 2.5 Flash.

The model is available through Google AI Studio and the Vertex AI platform. Gemini 3.1 Flash-Lite is currently in preview, so developers can test the model and provide feedback before its full launch.
