long-context

2026-06-16 zhipu

GLM-5.2 Ships Its Weights: Open Models Have Made the Frontier a Quarterly Refresh

Zhipu released the GLM-5.2 weights under the MIT license, with a 1M-token context window, a long-horizon focus, and a tunable thinking budget. Its own benchmarks place it within a point or two of the closed frontier on long-horizon coding. The real signal is not another leaderboard run but the open-weight capability-cost curve dropping another notch. Discount the vendor numbers, and test 1M-context usability and long-horizon reliability on your own tasks.

open-weights long-context frontier-models

2026-06-14 zhipu

GLM-5.2 Goes Fully Open: Zhipu Turns America's Ban Into a Selling Point

Zhipu released GLM-5.2 and declared it fully open the same week Anthropic's Fable was pulled. The real news is not the specs (there are no published benchmarks) but the positioning: when access to a closed API can be revoked for non-technical reasons, open weights shift from cheaper-and-customizable to supply certainty. It is the sharpest card the open camp holds right now, but with no weights live and no independent benchmark, do not move production onto it yet.

open-weights long-context coding-models

2026-06-10 deepseek

DeepSeek V4 Moves 1M Context Into the Cost-Structure Era

DeepSeek V4 matters because it turns 1M context from a capability demo into a cost, routing, and product-default problem for builders.

frontier-models frontier-progress ai-infra

2026-06-10 deepseek

DeepSeek V4's Open-Weight and API Strategy Is a Distribution Play

DeepSeek V4 pressures closed frontier models by pairing open weights with same-day API availability, compatibility, and a clear migration path.

frontier-models ai-infra inference

2026-06-10 minimax

MiniMax M3 Puts Long-Context Cost Into the Architecture Layer

MiniMax M3's real signal is not another 1M context window; it is MSA trying to lower long-context cost before serving tricks begin.

frontier-models frontier-progress long-context

2026-06-10 minimax

MiniMax M3: The Real Story Is Sparse Attention Making 1M Context Affordable, Not the 59% Leaderboard Line

M3's real signal is MSA cutting per-token compute at 1M context to 1/20 of the prior generation, with 15x faster decoding. A Chinese lab is pushing down the cost curve for long-context agents. But the weights were not open on launch day; the "open source in 10 days" promise is the sincerity test.

frontier-models long-context ai-infra

2026-06-10 minimax

MiniMax M3's Adoption Bottleneck Is the Serving Ecosystem

M3's hard part is not the model card; it is whether vLLM and the broader serving stack can support MSA's block-sparse attention efficiently.

frontier-models long-context ai-infra
