Quick Take
MiniMax-M3 and Google’s Gemini 3.5 Flash (medium) score almost identically on the aggregate indices but differ sharply on price and on how much speed data is published. MiniMax-M3, released on June 1, 2026, is listed at zero cost. Gemini 3.5 Flash (medium), released about two weeks earlier on May 19, 2026, is a paid model with published output speed and latency figures.
Benchmark Read
The two models are nearly tied on the aggregate indices. MiniMax-M3 has an Intelligence index of 54.7 and a Coding index of 43.4; Gemini 3.5 Flash (medium) edges ahead with 54.8 and 43.9.
On individual benchmarks, MiniMax-M3 scores higher than Gemini 3.5 Flash (medium) on the following (MiniMax-M3 score listed first):
- GPQA: 0.929 vs 0.921
- IFBench: 0.8286 vs 0.7456
- LCR: 0.74 vs 0.71
- TerminalBench Hard: 0.4242 vs 0.3939
Conversely, Gemini 3.5 Flash (medium) scores higher on the following (Gemini score listed first):
- HLE: 0.399 vs 0.371
- SciCode: 0.53 vs 0.454
- TAU2: 0.9561 vs 0.8889
No Math index is reported for either model.
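To see which gaps are largest, the benchmark figures above can be ranked by absolute difference. The snippet below is a minimal sketch: the `SCORES` table and the `gaps` helper are illustrative names, not part of any published tooling, and the scores are simply the ones quoted in this comparison.

```python
# Benchmark scores quoted in this comparison, stored as
# (MiniMax-M3, Gemini 3.5 Flash (medium)).
SCORES = {
    "GPQA": (0.929, 0.921),
    "IFBench": (0.8286, 0.7456),
    "LCR": (0.74, 0.71),
    "TerminalBench Hard": (0.4242, 0.3939),
    "HLE": (0.371, 0.399),
    "SciCode": (0.454, 0.53),
    "TAU2": (0.8889, 0.9561),
}


def gaps(scores):
    """Return (benchmark, leader, absolute gap) tuples, largest gap first."""
    rows = []
    for name, (minimax, gemini) in scores.items():
        leader = "MiniMax-M3" if minimax > gemini else "Gemini 3.5 Flash (medium)"
        rows.append((name, leader, round(abs(minimax - gemini), 4)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


for name, leader, gap in gaps(SCORES):
    print(f"{name:<20} {leader:<28} +{gap:.4f}")
```

On these numbers the widest gap is IFBench, in MiniMax-M3's favor, and the narrowest is GPQA. Each benchmark has its own scale and difficulty, so the gaps are best read as a rough ordering rather than directly comparable quantities.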
Cost and Speed
Cost is the primary differentiator. MiniMax-M3 is priced at $0.00 per 1M tokens for both input and output. In contrast, Gemini 3.5 Flash (medium) costs $1.50 per 1M input tokens and $9.00 per 1M output tokens. Its blended cost of $3.38 per 1M tokens is consistent with a 3:1 input-to-output weighting: (3 × $1.50 + 1 × $9.00) / 4 = $3.375.
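The blended figure can be reproduced from the two list prices. The sketch below assumes a 3:1 input-to-output weighting, which the article does not state but which matches the quoted $3.38. The function names and the example request size (2,000 input tokens, 500 output tokens) are hypothetical.

```python
def blended_price(input_price, output_price, input_weight=3.0, output_weight=1.0):
    """Weighted average price per 1M tokens for a given input:output mix.

    The default 3:1 weighting is an assumption, not a documented rule.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Dollar cost of one request, with prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Gemini 3.5 Flash (medium) list prices from this comparison.
GEMINI_INPUT, GEMINI_OUTPUT = 1.50, 9.00

print(f"blended: ${blended_price(GEMINI_INPUT, GEMINI_OUTPUT):.3f} per 1M tokens")

# Hypothetical request: 2,000 input tokens and 500 output tokens.
print(f"request: ${request_cost(2_000, 500, GEMINI_INPUT, GEMINI_OUTPUT):.4f}")
```

Because output tokens cost six times as much as input tokens, workloads that generate long responses will pay well above the blended rate, while prompt-heavy workloads will pay less.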
On speed, Gemini 3.5 Flash (medium) has published figures: an output speed of 206.494 tokens per second and a time to first token of 11.76 seconds. No equivalent speed or latency figures are available for MiniMax-M3.
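These two figures combine into a rough end-to-end estimate for a single response. The sketch below assumes the output speed stays constant once the first token arrives and that nothing else (queuing, network overhead) adds delay; `estimated_response_seconds` is an illustrative helper, and the 1,000-token response length is a hypothetical example.

```python
def estimated_response_seconds(output_tokens, ttft_seconds, tokens_per_second):
    """Approximate wall-clock time: wait for the first token, then stream the rest.

    Assumes a steady generation rate after the first token.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


# Gemini 3.5 Flash (medium) figures from this comparison.
TTFT_SECONDS = 11.76
TOKENS_PER_SECOND = 206.494

# Hypothetical 1,000-token response.
total = estimated_response_seconds(1_000, TTFT_SECONDS, TOKENS_PER_SECOND)
print(f"~{total:.1f} s for a 1,000-token response")
```

Under these assumptions a 1,000-token response takes roughly 16.6 seconds, most of it spent waiting for the first token, so the time to first token matters more than raw throughput for short replies.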
Best Fit
MiniMax-M3 is the better fit for users and developers who want to minimize model costs while keeping strong results on reasoning and instruction-following benchmarks such as GPQA and IFBench. Gemini 3.5 Flash (medium) is the better fit for enterprise applications that need published speed and latency figures and integration with Google’s development tools.