Nice of them to make even a 0.3B model; too bad it was the only one that wasn't MoE. I've been wanting more small MoEs since Qwen3 30B A3B.
On a random note, I'd really love to see this approach explored more. It would be really handy to have models that can learn and evolve over time through usage: https://github.com/babycommando/neuralgraffiti
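For anyone curious what that repo is getting at: the rough idea (as I understand it) is to bolt a small liquid-neural-network-style memory layer onto a frozen transformer, so a persistent state vector drifts toward whatever the model has recently seen and gets mixed back into its hidden states. Here's a minimal sketch of that concept; the `SprayLayer` name comes from the repo's terminology, but the specific update rule, decay constant, and shapes here are my assumptions, not its actual code:

```python
import torch
import torch.nn as nn

class SprayLayer(nn.Module):
    """Sketch of a liquid-style memory layer (assumed details, not the repo's code).

    Keeps a persistent state vector that drifts toward a projection of each
    new hidden state, then adds that state back into the representation,
    so the model's behaviour shifts gradually with usage.
    """
    def __init__(self, dim: int, decay: float = 0.1):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.decay = decay  # how fast the memory drifts toward new inputs
        # Persistent memory, carried across calls rather than reset per forward pass.
        self.register_buffer("state", torch.zeros(dim))

    @torch.no_grad()
    def update(self, hidden: torch.Tensor) -> None:
        # Liquid-network-flavoured update: state += decay * (proj(hidden) - state)
        target = self.proj(hidden)
        self.state += self.decay * (target - self.state)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        self.update(hidden)          # evolve the memory at inference time
        return hidden + self.state   # inject the memory trace back in

layer = SprayLayer(dim=768)
h = torch.randn(768)   # e.g. the final hidden state for the last token
out = layer(h)         # output now carries the slowly evolving memory
```

The appeal is that nothing in the base model gets retrained: only this tiny state evolves, at inference time, which is what would let a model "learn" from usage without fine-tuning.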