# Gemini 3.1 Flash-Lite Brings DeepMind's AI to More Developers at Lower Cost

DeepMind is making its Gemini models more accessible. The company launched Gemini 3.1 Flash-Lite, a budget-friendly variant of its latest AI series, available in preview starting today.
Priced at $0.25 per million input tokens and $1.50 per million output tokens, Flash-Lite is the company's "fastest and most cost-efficient Gemini 3 series model," according to the announcement. It outperforms the previous 2.5 Flash model, delivering 2.5x faster time to first answer token and a 45% boost in output speed, according to the Artificial Analysis benchmark.
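At those rates, per-request cost is straightforward to estimate. A minimal sketch (the token counts in the example are illustrative, not from the announcement):

```python
# Preview pricing from the announcement, in USD per million tokens.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 1.50

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single Gemini 3.1 Flash-Lite call."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 10k-token prompt producing a 1k-token response.
print(f"${estimate_cost(10_000, 1_000):.4f}")  # → $0.0040
```

At these prices, a workload of a million such requests per day would run roughly $4,000, which is the kind of arithmetic the "high-volume" positioning is aimed at.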
The model achieves an Elo score of 1432 on the Arena.ai leaderboard and scores 86.9% on GPQA Diamond and 76.8% on MMMU Pro—benchmarks measuring reasoning and multimodal understanding. Notably, it beats larger Gemini models from prior generations on some metrics.
"Built for high-volume developer workloads at scale," Flash-Lite targets tasks like translation, content moderation, UI generation, and simulations. It comes with adjustable thinking levels, letting developers control how much computation the model spends on reasoning—a feature useful for managing high-frequency workflows.
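The adjustable thinking level is set per request. A sketch of what such a request body might look like, assuming the Gemini API's `generateContent` / `generationConfig` conventions; the exact field name (`thinkingLevel`) and the set of accepted levels are assumptions to check against the official API reference:

```python
def build_request(prompt: str, thinking_level: str = "low") -> dict:
    """Build a hypothetical generateContent request body that caps how much
    reasoning the model spends, via an assumed thinkingLevel field."""
    # Assumed accepted values; the real API may differ.
    if thinking_level not in ("low", "medium", "high"):
        raise ValueError(f"unsupported thinking level: {thinking_level}")
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingLevel": thinking_level},
        },
    }

# A high-frequency task like translation would typically pin the level low
# to minimize latency and cost.
req = build_request("Translate 'hello' to French.", thinking_level="low")
```

Keeping the level low for simple, high-frequency tasks and raising it only for harder inputs is the trade-off this knob exists to manage.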
Early adopters on AI Studio and Vertex AI include Latitude, Cartwheel, and Whering; early testers credited the model with handling complex inputs "with the precision of a larger-tier model."
The model is available in preview via the Gemini API in Google AI Studio and for enterprises through Vertex AI.
Sources
- deepmind.google (DeepMind Blog)
- artificialanalysis.ai (Artificial Analysis)