2026-06-10

Cohere North Mini Code: Open-Weight Coding Models Are Now Competing on Self-Hostability and License Cleanliness, Not Parameter Count

Cohere, a company known for closed enterprise models, ships its first developer-facing agentic coding model: a 30B MoE (3B active) under Apache 2.0 that runs on a single H100. The 33.4 Coding Index isn't the story — the bet on sovereign self-hosting is.

open-weight agents coding

Cohere North Mini Code: Open-Weight Coding Models Are Now Competing on Self-Hostability and License Cleanliness, Not Parameter Count — Photo / Unsplash

Summary

On 2026-06-09, Cohere released North Mini Code — its first model aimed at developers, and its first open-source one. The specs are deliberately modest: a 30B-total, 3B-active mixture-of-experts model, Apache 2.0 license, 256K context (up to 64K per generation), and a minimum hardware requirement of a single H100 at FP8. Cohere frames it as a “small, efficient agentic coding model, built for the sovereign developer ecosystem.”

Separate the signal from the noise first. The benchmark story is unremarkable: Cohere’s own framing is “competitive among similarly sized models,” which translates to a 33.4 on the Artificial Analysis Coding Index. That number shouldn’t be the headline — it claims no leaderboard wins and doesn’t touch the frontier closed models. What’s worth a builder’s attention is where the company chose to plant its flag: an open-weight coding model you can self-host on a single card, under a license clean enough to ship commercially without a legal review.

Put differently, the competition among open-weight coding models is quietly switching tracks. For the past year the contest was about parameter count and leaderboard rank. North Mini Code is going after a different axis — whether you can take a good-enough coding model, run it entirely inside your own private environment, and not have the license choke you. For a company that until now built almost exclusively closed enterprise models, that pivot says more than any score.

What happened

Cohere shipped four access paths at once: download the weights on Hugging Face, call the Cohere API, deploy to its managed Model Vault inference platform, or route through OpenRouter. The model was also trained specifically for compatibility with OpenCode, the open-source coding agent, though Cohere says it “works with most coding agents.”

A few numbers from the spec sheet are worth pulling out. The 30B-total, 3B-active MoE structure means only a small slice of experts actually computes at inference time, which is why Cohere can claim “strong software development performance without demanding extensive hardware” and set the floor at a single H100 (FP8). The 256K total context and 64K maximum generation give a 30B-class coding model a workable envelope — enough to take in mid-sized codebase context and emit chunked changes.

On capability, Cohere offers two sets of figures. First, quality: across agentic coding and terminal tasks — SWE-Bench Verified, SWE-Bench Pro, Terminal Bench v2, Terminal Bench Hard — it is “competitive against leading open-source models of a similar size,” netting a 33.4 Coding Index. Note Cohere’s own word is competitive, not leading — honest phrasing that builders shouldn’t inflate into “it beats the field.” Second, speed: under identical concurrency and hardware, North Mini Code reaches up to 2.8x the output throughput of Devstral Small 2 and a 30% advantage in inter-token latency — but on time-to-first-token (TTFT), Devstral Small 2 holds a slight edge.

Cohere explicitly calls this “the first — but certainly not the last” of a new generation of models, the opening move toward “a more open and sovereign developer ecosystem.” The subtext: this isn’t a one-off open-source PR stunt, it’s the start of a product line.

Why it matters

The real story here is Cohere’s pivot, and how it drags the competition among open-weight coding models away from “scale” and back toward “hostability plus license.”

Start with the license, because that’s the layer most easily glossed over by the word “open.” Apache 2.0 is a clean, commercially permissive license: you can use it commercially, ship closed-source derivatives, deploy privately — no “research only,” no “non-commercial,” no “renegotiate above some MAU threshold” riders. This has to be held apart from many other “open-weight” models on the market — a fair number of so-called open releases ship under research-only or custom licenses with commercial restrictions that an enterprise legal team will halt on sight. For a team that needs to put a model inside a commercial product and pass a compliance review, the gap between Apache 2.0 and a license that “looks open but is restricted in use” is the gap between “ship it” and “join the legal queue.” By choosing Apache 2.0, Cohere wrote “commercial-safe, no strings” directly into the value proposition.

Now the pivot. Cohere’s label has long been “closed enterprise-model vendor” — the Command series, enterprise retrieval, private deployments, customers who are large companies and governments. What it’s now betting on with 30B MoE plus Apache 2.0 is a “sovereign developer ecosystem”: letting customers own a full agentic coding stack inside their own data centers and compliance boundaries, beholden to no external API and no vendor lock. That line is actually consistent with Cohere’s DNA — its core customers (finance, public sector, healthcare, telecom) are precisely the buyers who care most about data never leaving the building, models being self-hostable, and supply chains being auditable. Handing that audience a coding model that fits in their own racks, carries a clean license, and can run agents fits them better than a higher-scoring model they can only reach through an API.

Don’t miss the competitive calculus either. Rather than fight the hyperscalers head-on for leaderboard rank on general coding models — a war of attrition Cohere’s size can’t win — it shifts to a niche the big players haven’t seriously worked and where Cohere already has a customer base and brand trust: sovereign, self-hosted, commercially licensed. This is positioning to its strengths, not a “we can build a general frontier model too” slugfest.

Builder impact

If you’re building a coding agent or need to deploy a coding model inside a private environment, North Mini Code belongs on the shortlist — but evaluate it with three judgments in hand, not on the hype of “Cohere went open.”

First, confirm your pain point is actually “self-hosting plus compliance.” If your constraints are that data can’t leave a private environment, weights must be auditable, and derivatives must be commercially usable, then the Apache 2.0 + single-H100 combination is aimed straight at you; it solves the exact wall you can’t route around. Conversely, if you have no hard self-hosting requirement and just want the strongest coding capability, the 33.4 Coding Index tells you this isn’t built for you — frontier closed APIs remain the lower-friction choice on raw capability. Getting its positioning straight up front saves a lot of mismatched evaluation time.

Second, treat the benchmarks as a “good-enough bar,” not a capability promise. Cohere only commits to “competitive in its size class,” not leading. The pragmatic move is to run an end-to-end evaluation on your own real codebase and agent workflow, measure success and rework rates on your tasks, then decide. Test the agentic scenarios in particular — Cohere explicitly trained it for workflows like understanding and orchestrating sub-agents, mapping systems architecture, and running code reviews, which is exactly where it should be stress-tested.

Third, be clear about what “open source” does and doesn’t solve. It solves control: the weights are in your hands, self-hostable, commercially usable, modifiable. It does not solve the capability ceiling — a 30B / 3B-active small model still trails the frontier on the hardest open-ended coding tasks, a function of size that openness can’t backfill. And don’t read the deep OpenCode tie as pure upside: Cohere trained it specifically for OpenCode compatibility, which means it’s smoothest there; other agent frameworks “mostly work,” but the full-strength form may be discounted.

A practical path: pull the weights from Hugging Face, stand up an instance on one H100, wire it into your existing agent framework, and run real tasks. Treat “single-card self-hosting plus a clean license” as its most certain value, and “is the capability enough” as a hypothesis you validate with your own data.

What to ignore

The misread to actively kill: “Cohere open-sourced a coding model too, so the open camp gained another leaderboard contender.” Wrong on both ends. On one end, the 33.4 Coding Index is right there — it never claimed leaderboard wins, Cohere’s word is competitive, and any press or community framing that elevates it to “beats the frontier” is doing the model’s bragging for it. On the other end, pinning this release’s value to its score gets the point exactly backward; North Mini Code’s value was never in the number, it’s in the one-two of “single-card hostable plus Apache 2.0 commercial-safe.” Stare at the leaderboard and you’ll miss what it’s actually selling.

Equally worth discounting is the reflex that “open source = free = cheaper.” Open source gives you control and compliance freedom, not “no cost.” You still pay for the H100, for the ops and monitoring of self-hosting, and for the engineering time to evaluate and integrate. For teams with no hard self-hosting need, calling a closed API is likely both cheaper and lower-effort. Open source answers “can you hold it in your own hands,” not “will it save you money” — conflating the two is the most common cognitive slip when evaluating open-weight models.

Finally, don’t get led by the “2.8x throughput” data point. It’s a vendor-run test against a single competitor, and it lost on TTFT. The speed advantage is real, but it comes with conditions and edges — measure it against your own concurrency and latency requirements rather than treating a favorable comparison as a universal conclusion.

Sources

Introducing North Mini Code: Cohere's first model for developers / official