2026-06-15

Rio's sovereign LLM falls apart: open weights make a lab capability lie mathematically falsifiable

Rio de Janeiro's city IT company shipped a 397B-parameter Brazilian sovereign model and claimed it was trained in-house and beat its peers. Nex-AGI used two independent lines of evidence, an identity test and weight collinearity, to show it is an element-wise merge of roughly 0.6 Nex and 0.4 Qwen. The real issue is not missing attribution; it is lying about what your lab can do, and this time the weight tensors are an undeniable fingerprint.

model-merging open-weights sovereign-ai model-provenance ai-governance

Summary

Rio de Janeiro’s municipal IT company, IplanRIO, released Rio-3.5-Open-397B with a tidy story: a 397B-parameter sovereign open model, trained by Brazilians, beating comparable open models on benchmarks. For a city IT department, that is a headline.

Then another lab, Nex-AGI, took it apart. The conclusion left no room for doubt: Rio’s weights are an element-wise merge of Nex-AGI’s own model, Nex, with the official Qwen3.5-397B-A17B base, roughly 0.6 Nex and 0.4 Qwen, with no trace of any training by IplanRIO itself. What makes it stick is the method. Nex-AGI offered two completely independent proofs, each sufficient on its own.

After the report landed, Rio quietly edited its Hugging Face model card to admit the thing was a merge of Nex-N2-Pro and Qwen3.5 plus a round of distillation, saying it had uploaded the wrong file earlier and apologizing. The mayor told a different story, publicly describing it as an open AI model trained in Rio with public funding over the past year, while a team member said on social media that no public money was used. The accounts do not line up.

The point here is not another academic-misconduct story. It is a shift that matters more: once a model ships as open weights, lying about what your lab can do becomes the kind of claim anyone can falsify with math.

The debate

On the surface the dispute is one question: did Rio actually train this model? Rio says yes, and that it tops benchmarks. Nex-AGI says no, it is two existing models added in fixed proportions.

A layer down, the dispute is about how to classify what happened. The first reaction on Hacker News was that this was just missing attribution to Qwen, and someone corrected it at once: attribution is not the relevant part, lying about your lab’s capabilities is. That correction names the real disagreement. Taking someone’s open model and forgetting to thank them is a courtesy problem. Telling the public you trained a model that beats its peers, when all you did was a weighted average, is a different kind of thing. One is rude. The other is false.

Further down sits the fight over money and motive. The mayor says public funds were used, the team says they were not, and both the issue thread and the Hacker News discussion burned through that argument. A Rio taxpayer spoke up directly: as someone whose money is at stake, he wants to know where it went, and if the work done was not the work that was commissioned, technical people have a duty to lay it out so a proper investigation can follow. At that layer the question leaves engineering and becomes one of public accountability.

There is one internal contradiction worth flagging. If the goal was the strongest possible model, the obvious move was 100 percent Nex-N2-Pro, with no reason to blend in a Qwen base that would only drag it down. The most plausible motive for that 0.4 of Qwen is wanting the result to look like something trained in-house and distinct from Nex. The presence of that Qwen fraction is itself circumstantial evidence of intent.

Who’s right

On evidence it is close to one-sided. Both of Nex-AGI’s proofs hold up technically.

The first is the identity test. Rio shipped with a hard-coded system prompt forcing the model to say it is Rio. That is already a tell: an original model does not need to be ordered to state its own name. Nex-AGI removed the prompt and asked the underlying model who it was, 120 times. With the mask off, the model identified as Nex 79.2 percent of the time (95 of 120), as being from Nex-AGI 73.3 percent of the time, and as Rio 0 percent of the time (0 of 120). It also recited, word for word, the bespoke backstory Nex-AGI had trained into its own model. A model named Rio that, with the prompt gone, calls itself Nex four times in five and never once calls itself Rio is not explained by coincidence.
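To put rough error bars on those counts, here is a minimal sketch that computes a 95 percent Wilson score interval for the two proportions the report gives as raw counts (95 of 120 and 0 of 120). The Wilson interval is my choice for illustration, not something the report says Nex-AGI used, and `wilson_interval` is a hypothetical helper name.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

# Counts taken from the article: 95/120 answers said "Nex", 0/120 said "Rio".
nex_low, nex_high = wilson_interval(95, 120)
rio_low, rio_high = wilson_interval(0, 120)

print(f"identifies as Nex: {nex_low:.3f} to {nex_high:.3f}")
print(f"identifies as Rio: {rio_low:.3f} to {rio_high:.3f}")
```

Even at the pessimistic end of the interval the model calls itself Nex in a clear majority of trials, while the plausible rate for "Rio" stays within a few percent of zero.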

The second is weight collinearity, and this one is decisive. A merge is a rigid mathematical relationship: if Rio = α Nex + (1−α) Qwen, then for every tensor, (Rio minus Qwen) must be exactly α times (Nex minus Qwen). Nex-AGI measured two numbers per tensor: the mixing weight α, and a collinearity cos_fit, asking whether Rio’s deviation from Qwen points in the same direction as Nex’s deviation from Qwen. For two unrelated models those directions are nearly orthogonal in a billion-dimensional space, so cos_fit is near 0. For a real merge it is near 1.
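The per-tensor check described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not Nex-AGI's actual script: α is estimated here by a least-squares projection, `merge_fit` is a hypothetical function name, and the arrays are small synthetic stand-ins rather than real model weights.

```python
import numpy as np

def merge_fit(rio, nex, qwen):
    """Estimate (alpha, cos_fit) for one tensor, assuming rio ~ alpha*nex + (1-alpha)*qwen."""
    d_rio = (rio - qwen).ravel().astype(np.float64)  # Rio's deviation from the base
    d_nex = (nex - qwen).ravel().astype(np.float64)  # Nex's deviation from the base
    dot = d_rio @ d_nex
    alpha = float(dot / (d_nex @ d_nex))             # least-squares mixing weight
    cos_fit = float(dot / (np.linalg.norm(d_rio) * np.linalg.norm(d_nex)))
    return alpha, cos_fit

# Synthetic stand-ins (assumption): a "base", a "fine-tune" of it, a 0.6/0.4 merge,
# and an independent model that drifted from the same base in its own direction.
rng = np.random.default_rng(0)
qwen = rng.standard_normal((256, 512))
nex = qwen + 0.05 * rng.standard_normal((256, 512))
merged = 0.6 * nex + 0.4 * qwen
unrelated = qwen + 0.05 * rng.standard_normal((256, 512))

alpha_m, cos_m = merge_fit(merged, nex, qwen)
alpha_u, cos_u = merge_fit(unrelated, nex, qwen)

print(f"merged:    alpha={alpha_m:.4f} cos_fit={cos_m:.4f}")
print(f"unrelated: alpha={alpha_u:.4f} cos_fit={cos_u:.4f}")
```

In this noise-free toy the merged tensor recovers the mixing weight and a cosine of essentially 1, while the independently perturbed tensor gives a cosine near 0. Running the same function over every tensor of three real checkpoints is the kind of recomputation the report invites.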

What they found spans all 60 layers and every component. The routed experts, the 387B-parameter bulk, show α of 0.571 (±0.0016) and collinearity of 0.993. Collinearity is 0.991 for the output head, about 0.986 for attention, and about 0.984 for the linear-attention projections. A collinearity of 0.98 to 0.99 is not vague language about high similarity. For a tensor with tens of millions to billions of parameters, the cosine between two unrelated directions lands within about ±0.0001 of zero by chance, so a measured 0.99 sits thousands to tens of thousands of standard deviations from random. That is not statistically suspicious; it is statistically impossible.
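The "thousands to tens of thousands of standard deviations" figure can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes the deviations of two unrelated models behave like independent random directions, in which case the cosine between them has a standard deviation of about 1/sqrt(n) for n parameters. That modelling assumption and the helper names are mine, not the report's.

```python
import math

def chance_sigma(n_params):
    """Approximate std of the cosine between two independent random directions in n dims."""
    return 1.0 / math.sqrt(n_params)

def z_score(cos_fit, n_params):
    """How many chance-level standard deviations a measured cosine sits from zero."""
    return cos_fit / chance_sigma(n_params)

# Tensor sizes spanning the article's "tens of millions to billions of parameters".
for n in (10_000_000, 1_000_000_000):
    print(f"n={n:>13,d} sigma={chance_sigma(n):.6f} z={z_score(0.99, n):,.0f}")
```

Under that assumption the chance-level spread is a few parts in ten thousand or smaller, so a measured 0.99 lands a few thousand sigma out for a ten-million-parameter tensor and tens of thousands for a billion-parameter one, in line with the range quoted above.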

The hardest blow is that Rio edited its own model card to admit the merge. When the defendant confesses, the argument is largely over.

Give Rio one benefit of the doubt. In its admission it also said it had uploaded the wrong file, the merged base rather than the final distilled model, and that it would re-upload the correct one. That claim cannot be verified yet, so leave it open. Even if true, it only adds the distillation step back on top of a merge. It does not move the core fact: the base is a merge, not in-house training.

Why it matters

The genuinely new signal is not one more fake. It is that the way fakes get caught has changed.

Proving an institution lied about its technical ability used to take leaked emails, internal memos, an insider on the record, all of it indirect and deniable, fixable by PR and faded by time. Not here. The evidence is the weights themselves. Anyone with copies of Nex, Qwen, and Rio plus a Python script can recompute that 0.99 collinearity. It is reproducible, independently verifiable, and dependent on no insider, which makes it among the hardest evidence to deny. A commenter on the issue put it well: open weights mean you never disappear, and they also mean you cannot hide theft, because the weights are a fingerprint, every model carries its parentage in its tensors, and you cannot launder a model the way you launder money because the math remembers.

For every government and company waving a homegrown sovereign AI flag, that is a direct warning. The political story of sovereign AI leans on "we trained it ourselves," because that signals autonomy and independence from foreign providers. But the moment you ship open weights, you hand over the right to verify alongside them: whatever you claim to have trained, the world can recompute tensor by tensor. Open weights and overstated claims are now mutually exclusive. Either do the work or do not open-source it, because the fingerprint will give you away.

There is a practical takeaway for builders too. Model provenance is becoming a measurable, auditable dimension. Collinearity analysis does not only catch fakes, it can also serve positive due diligence: before you procure or build on an open model, you can check what its lineage actually is. The math sits with researchers today, but it is the kind of thing that gets tooled and productized.

What to ignore

Kill the first misread: that this is just missing attribution to Qwen. It was the earliest reaction on Hacker News and the one most worth correcting. A missing citation is a community-courtesy matter you can apologize for and patch. What happened is a public body misstating its own R&D capability and using that claim for civic publicity. Downgrading a capability lie to a missing footnote lets the real problem walk.

Kill the second one too: that since the merge worked, you can just blend all open models into something stronger. People on HN did ask this hopefully, and practitioners poured cold water on it immediately. Merging only works between architecturally matched, related models, and it worked here precisely because Nex is itself a fine-tune of Qwen3.5, so this was a fine-tune of Qwen blended with the Qwen base, naturally collinear. Most merges help only on narrow, vibe-style benchmarks and tend to degrade on real long-chain reasoning. Treating a merge as free capability stacking is a technical mistake. In all likelihood the Qwen fraction here was not for performance but to make the result look like something other than pure Nex.

Finally, do not get pulled into the public-funds shouting match. The mayor says public money was used, the team says it was not, and the issue thread and HN spent half a screen on it, but that is a fact for local auditors to settle, not something engineering can rule on. The emotional venting about fixing favelas before training models, or about who pays these researchers’ salaries, changes nothing about the core finding. The technical part is settled: Rio’s weights were merged, not trained. The rest belongs to the investigation.

FAQ

How do you mathematically prove a model is a merge of two others?

Check whether the weight tensors are collinear. If Rio = α Nex + (1−α) Qwen, then for every tensor (Rio minus Qwen) must equal exactly α times (Nex minus Qwen). Nex-AGI measured two quantities per tensor: the mixing weight α, and a collinearity cos_fit, meaning whether Rio's deviation from Qwen points in the same direction as Nex's deviation from Qwen. For two unrelated models those directions are essentially orthogonal in a billion-dimensional space, so cos_fit is near 0. For a genuine merge it is near 1. They measured 0.98 to 0.99, which for a tensor of tens of millions to billions of parameters is thousands to tens of thousands of standard deviations away from chance.
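One way to write the two quantities down, per tensor, is shown here. The least-squares form of the estimate is an assumption on my part, since the article does not say exactly how Nex-AGI fitted α, and the symbols Δ_R and Δ_N are introduced only for this illustration.

```latex
% Deviations from the Qwen base, flattened to vectors (notation assumed for illustration)
\Delta_R = W_{\mathrm{Rio}} - W_{\mathrm{Qwen}}, \qquad
\Delta_N = W_{\mathrm{Nex}} - W_{\mathrm{Qwen}}

% Least-squares mixing weight and collinearity
\hat{\alpha} = \frac{\langle \Delta_R, \Delta_N \rangle}{\lVert \Delta_N \rVert^{2}}, \qquad
\mathrm{cos\_fit} = \frac{\langle \Delta_R, \Delta_N \rangle}{\lVert \Delta_R \rVert \, \lVert \Delta_N \rVert}

% If W_Rio = alpha W_Nex + (1 - alpha) W_Qwen with alpha > 0, then
\Delta_R = \alpha \, \Delta_N
\;\Longrightarrow\;
\hat{\alpha} = \alpha, \qquad \mathrm{cos\_fit} = 1
```

An exact merge therefore gives the same α on every tensor and a cosine of 1, which is why values of 0.98 to 0.99 across all components read as a merge rather than a coincidence.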

Was Rio-3.5-Open-397B trained from scratch at all?

No. Two independent lines of evidence point to an element-wise merge. With the hard-coded 'You are Rio' system prompt removed, the model identifies as Nex 79.2 percent of the time and as Rio 0 percent of the time, and even recites Nex-AGI's bespoke backstory word for word. At the weight level, all 60 layers and every component are a fixed 0.6 to 0.4 blend, with α around 0.571 and collinearity of 0.98 to 0.99. Rio later edited its own model card to admit a merge of Nex-N2-Pro and Qwen3.5 followed by distillation.

If the merge worked, can you just merge all open models to get something better?

Do not treat that as a general rule. Merging only works between architecturally matched, related models. It worked here because Nex is itself a fine-tune of Qwen3.5, so this is really a fine-tune of Qwen blended with the Qwen base. Practitioners on Hacker News also note that most merges help only on narrow, vibe-style benchmarks and tend to degrade on real long-chain reasoning. Treating a merge as free capability stacking is a misread.

Why is the real issue lying about capability rather than missing attribution?

Attribution is a courtesy; lying about capability is a public falsehood. A government body told the public it had trained a sovereign model that beats its peers, when it had only added two other open models in fixed proportions. The problem is not who it forgot to thank. It is claiming a training capability that does not exist and using it for civic publicity. That is a warning to every government and company waving a homegrown sovereign AI flag.