Arcee Trinity Mini: US-Trained Moe Model

halJordan · 2025-12-02T01:50:01 1764640201

Looks like a less good version of Qwen3 30B-A3B, which makes sense bc it is slightly smaller. If they can keep that efficiency going into the large one it'll be sick.

Trinity Large [will be] a 420B parameter model with 13B active parameters. Just perfect for a large RAM pool @ Q4.
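For a rough sense of what "a large RAM pool @ Q4" means, here is a back-of-the-envelope sketch. The 420B and 13B figures come from the quote above; the bits-per-weight values are assumptions (4.0 is an idealized 4-bit quant, 4.5 is a guessed allowance for quantization scales), and the estimate covers weights only, not KV cache or runtime overhead.

```python
def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 420e9   # total parameters, from the quoted announcement
ACTIVE_PARAMS = 13e9   # active parameters per token, from the same quote

# Assumed bits per weight: 4.0 (idealized) and 4.5 (hypothetical overhead for scales).
for bits in (4.0, 4.5):
    total_gb = weight_memory_gb(TOTAL_PARAMS, bits)
    active_gb = weight_memory_gb(ACTIVE_PARAMS, bits)
    print(f"{bits} bits/weight: ~{total_gb:.0f} GB resident, "
          f"~{active_gb:.1f} GB of weights touched per token")
```

The gap between the two numbers is the appeal: all of the weights have to sit in memory, but only the active slice is read for each token, so capacity matters more than bandwidth than it would for a dense model of the same size.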

davidsainez · 2025-12-02T04:46:13 1764650773

Excited to put this through its paces. It seems most directly comparable to GPT-OSS-20B. Comparing their numbers on the Together API: Trinity Mini is slightly less expensive ($0.045/$0.15 vs. $0.05/$0.20) and seems to have better latency and throughput numbers.
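To put the price gap in concrete terms, here is a small sketch. It assumes the quoted pairs are USD per million input tokens / per million output tokens, which the comment does not state, and the workload size is made up for illustration.

```python
def cost_usd(input_tokens: int, output_tokens: int,
             in_price: float, out_price: float) -> float:
    """Cost of a workload, assuming prices are USD per 1M tokens."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price


# (input price, output price) as quoted in the comment above.
PRICES = {
    "Trinity Mini": (0.045, 0.15),
    "GPT-OSS-20B": (0.05, 0.20),
}

# Hypothetical workload: 10M input tokens, 2M output tokens.
INPUT_TOKENS, OUTPUT_TOKENS = 10_000_000, 2_000_000

for model, (in_price, out_price) in PRICES.items():
    total = cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, in_price, out_price)
    print(f"{model}: ${total:.2f}")
```

The output-token price differs more than the input price (25% vs. 10%), so the savings grow with how generation-heavy the workload is.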

htrp · 2025-12-02T01:49:23 1764640163

Trinity Nano Preview: 6B parameter MoE (1B active, ~800M non-embedding), 56 layers, 128 experts with 8 active per token

Trinity Mini: 26B parameter MoE (3B active), fully post-trained reasoning model

They did the pretraining on their own and are still training the large version on 2048 B300 GPUs.
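For anyone unfamiliar with what "128 experts with 8 active per token" means in practice, here is a minimal sketch of generic top-k routing. This is the textbook softmax-over-top-k scheme, not necessarily what Arcee's router does; the random logits are a stand-in for the output of a learned router layer.

```python
import math
import random

NUM_EXPERTS = 128  # from the Trinity Nano description above
TOP_K = 8          # experts active per token, from the same description


def route(logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)            # subtract max for numerical stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical router output for one token (a real model computes this from the hidden state).
rng = random.Random(0)
router_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

chosen = route(router_logits)
for expert, weight in chosen:
    print(f"expert {expert:3d}  weight {weight:.3f}")
```

Only the 8 chosen experts run for that token and their outputs are combined using the weights, which is why the active parameter count is so much smaller than the total.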

Balinares · 2025-12-02T14:09:21 1764684561

Interesting. Always glad to see more open weight models.

I do appreciate that they openly acknowledge the areas where they followed DeepSeek's research. I wouldn't consider that a given for a US company.

Anyone tried these as a coding model yet?

bitwize · 2025-12-02T01:39:01 1764639541

A moe model you say? How kawaii is it? uwu

ghc · 2025-12-02T02:13:04 1764641584

Capitalization makes a surprising amount of difference here...

donw · 2025-12-02T03:07:36 1764644856

Meccha at present, but it may reach sugoi levels with fine-tuning.

noxa · 2025-12-02T01:40:55 1764639655

I hate that I laughed at this. Thanks ;)

ksynwa · 2025-12-02T04:37:06 1764650226

> Trinity Large is currently training on 2048 B300 GPUs and will arrive in January 2026.

How long does the training take?

arthurcolle · 2025-12-02T06:10:09 1764655809

A couple of days or weeks, usually. No one is doing 9-month training runs.

trvz · 2025-12-02T06:20:08 1764656408

Moe ≠ MoE

cachius · 2025-12-02T06:31:26 1764657086

azinman2 · 2025-12-02T06:42:34 1764657754

The HN title uses incorrect capitalization.

rbanffy · 2025-12-02T07:56:55 1764662215

I was eagerly waiting for the Larry and Curly models.

m4rtink · 2025-12-02T09:21:50 1764667310