large language models

MiniMax M2.7: A Self‑Evolving Chinese LLM Aiming to Automate Reinforcement Learning Research

Cary Huang
March 21, 2026
News, Technology

MiniMax’s new M2.7 large language model is not just another entrant in the crowded frontier-model race. The Shanghai-based startup is using M2.7 and its predecessors as active participants in their own training loop, automating 30–50% of the reinforcement learning (RL)…

Inside Xiaomi’s MiMo‑V2‑Pro: A 1T‑Parameter Agentic LLM Challenging GPT‑5.2 on Cost and Capability

Cary Huang
March 20, 2026
News, Technology

Xiaomi has moved from being a hardware-first player in smartphones and electric vehicles into the center of the frontier AI race. With MiMo‑V2‑Pro, a 1-trillion-parameter foundation model designed explicitly for autonomous agents, the company is now competing with OpenAI…

Mamba‑3: Open Source State Space Models Challenge Transformers on Speed, Cost and Reasoning

Cary Huang
March 20, 2026
News, Technology

Since Google’s 2017 “Attention Is All You Need” paper, the Transformer architecture has been the default foundation for large language models (LLMs). It powers systems like ChatGPT and Gemini, but its quadratic compute and heavy memory footprint make large-scale inference…

Inside Nvidia’s Nemotron 3 Super: A 120B-Parameter Hybrid Model Built for Multi‑Agent Workloads

Cary Huang
March 12, 2026
News, Technology

Multi-agent AI systems are moving from research demos into production software engineering, cybersecurity, and operations workflows. But they come with a hard economic constraint: they can generate up to 15x the token volume of standard chat use cases, putting serious…

MiniMax M2.5: China’s Cut-Rate Frontier Model Aiming to Turn AI into a Full-Time Worker

Cary Huang
February 14, 2026
AI, News, Technology

MiniMax, a Shanghai-based AI startup, is pushing aggressively into the frontier-model tier with its new M2.5 family, positioning it not just as another chatbot but as a cheap, always-on digital worker. With performance that approaches top offerings from Anthropic and Google…

Inside Arcee’s Trinity Large: A 400B-Parameter U.S. Open Source MoE With a Rare Raw Checkpoint

Cary Huang
February 2, 2026
AI, Machine Learning, News, Technology

Arcee, a San Francisco–based AI lab, has released what it positions as a new U.S.-made, frontier-scale open model: Trinity Large, a 400-billion-parameter mixture-of-experts (MoE) language model, alongside a rare raw 10T-token checkpoint, Trinity-Large-TrueBase. For AI researchers, ML engineers, and…

Inside LinkedIn’s Next-Gen Recommender: Why Prompting Failed and Small, Distilled Models Won

Cary Huang
January 24, 2026
Infrastructure, News

LinkedIn has spent more than 15 years building large-scale AI-powered recommendation systems for jobs, people, and content. As the company moved to design a “next-gen” recommendation stack, it confronted a question that many machine learning leaders are asking: should you…

MIT’s Recursive Language Models: A Systems Approach to 10M-Token Contexts

Cary Huang
January 23, 2026
News, Orchestration

MIT CSAIL researchers are proposing a different answer to the long-context problem in large language models: don’t keep stretching the context window; change how the system uses it. Their new Recursive Language Models (RLMs) framework treats long prompts as an external…

NeurIPS 2025: Five Papers That Show Why AI Progress Is Now Systems-Limited

Cary Huang
January 20, 2026
Infrastructure, Machine Learning, News, Orchestration

NeurIPS has long been the place where new architectures, training tricks, and evaluation benchmarks quietly change how real systems are built. The 2025 edition continued that pattern, but with a sharper message for anyone working on LLMs, agentic systems…

MiroThinker 1.5: How a 30B Open-Weight Model Challenges Trillion-Parameter AI Agents

Cary Huang
January 9, 2026
News, Technology

MiroMind’s MiroThinker 1.5 arrives at a moment when many technical leaders are reassessing how much model size really buys them in production. With just 30 billion parameters, the new open-weight model is positioned as a direct challenger to trillion-parameter agentic…