inclusionAI
Browse models from inclusionAI
Models · 5
Ring-mini-2.0 is a Mixture-of-Experts (MoE) model built on the Ling 2.0 architecture and extensively optimized for high-throughput inference. It uses 16B total parameters with approximately 1.4B activated per token and is reported to deliver overall reasoning performance comparable to sub-10B dense LLMs. The model shows strong results on logical reasoning, code generation, and mathematical tasks, supports a 128K context window, and is reported to generate at over 300 tokens per second.
Context: 128K
Input: $0.07/M tokens
Output: $0.70/M tokens
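The reported throughput and 128K context are easiest to picture with a request sketch. The snippet below is a minimal example assuming the model is served behind an OpenAI-compatible chat-completions endpoint; the base URL, API key, and model slug are placeholders, not values confirmed by this page.

```python
# Minimal sketch: calling Ring-mini-2.0 through a hypothetical
# OpenAI-compatible endpoint. Base URL and model slug are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="inclusionai/ring-mini-2.0",  # placeholder slug
    messages=[{"role": "user", "content": "Prove that the sum of two even integers is even."}],
    max_tokens=512,
    stream=True,  # streaming makes the reported 300+ tok/s generation visible
)

for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```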
Ling-mini-2.0 is an open-source Mixture-of-Experts (MoE) large language model designed to balance strong task performance with high inference efficiency. It has 16B total parameters, with approximately 1.4B activated per token (about 789M non-embedding). Trained on over 20T tokens and refined via multi-stage supervised fine-tuning and reinforcement learning, it is reported to deliver strong results in complex reasoning and instruction following while keeping computational costs low. According to the upstream release, it reaches top-tier performance among sub-10B dense LLMs and in some cases matches or surpasses larger MoE models.
Context: 128K
Input: $0.07/M tokens
Output: $0.28/M tokens
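As a rough cost check against the per-million-token rates listed above, the helper below simply multiplies token counts by the published Ling-mini-2.0 prices; the token counts in the example are made-up illustrative values.

```python
# Rough cost estimate from the listed Ling-mini-2.0 rates:
# $0.07 per 1M input tokens, $0.28 per 1M output tokens.
INPUT_RATE_PER_M = 0.07
OUTPUT_RATE_PER_M = 0.28

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_RATE_PER_M

# Example: a 4,000-token prompt with a 1,000-token completion
# costs about $0.00028 + $0.00028 = $0.00056.
print(f"${estimate_cost(4_000, 1_000):.5f}")
```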
Ling-flash-2.0 is an open-source Mixture-of-Experts (MoE) language model built on the Ling 2.0 architecture. It has 100B total parameters, with 6.1B activated during inference (4.8B non-embedding). Trained on over 20 trillion tokens and refined with supervised fine-tuning and multi-stage reinforcement learning, it is reported to perform competitively with dense models of up to 40B parameters. It excels at complex reasoning, code generation, and frontend development.
Context: 128K
Input: $0.28/M tokens
Output: $2.80/M tokens
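All of the models on this page advertise a 128K context window; a quick pre-flight check can flag prompts that are unlikely to fit before a request is sent. The sketch below uses a crude characters-per-token heuristic rather than the model's actual tokenizer, so treat its result as approximate.

```python
# Crude pre-flight check against the advertised 128K context window.
# The 4-characters-per-token ratio is an assumed average for English
# text, not the model's real tokenizer; use it only as a sanity check.
CONTEXT_LIMIT = 128_000
CHARS_PER_TOKEN = 4

def fits_in_context(prompt: str, reserved_output_tokens: int = 4_096) -> bool:
    """Return True if the prompt plus a reserved output budget likely fits."""
    approx_prompt_tokens = len(prompt) // CHARS_PER_TOKEN
    return approx_prompt_tokens + reserved_output_tokens <= CONTEXT_LIMIT

long_doc = "..." * 100_000  # stand-in for a large codebase or document
print(fits_in_context(long_doc))
```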
Ling-1T is a trillion-parameter sparse Mixture-of-Experts (MoE) model developed by inclusionAI and optimized for efficient, scalable reasoning. With approximately 50B active parameters per token, it is pre-trained on over 20 trillion reasoning-dense tokens, supports a 128K context length, and uses an Evolutionary Chain-of-Thought (Evo-CoT) process to deepen its reasoning. The model is reported to achieve state-of-the-art results on complex benchmarks, with strong capabilities in code generation, software development, and advanced mathematics. Beyond core reasoning, Ling-1T offers specialized front-end code generation that combines semantic understanding with visual aesthetics, and it exhibits emergent agentic capabilities such as proficient tool use with minimal instruction tuning. Its primary use cases span software engineering, professional mathematics, complex logical reasoning, and agent-based workflows that demand both high performance and efficiency.
Context: 128K
Input: $0.56/M tokens
Output: $2.24/M tokens
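The agentic tool use mentioned above is typically exercised through a function-calling interface. The sketch below assumes an OpenAI-compatible endpoint with standard tool-calling support; the endpoint, model slug, and the get_weather tool are illustrative placeholders rather than details confirmed by this page.

```python
# Sketch of a single tool-use turn with Ling-1T, assuming an
# OpenAI-compatible endpoint that supports function calling.
# Endpoint, model slug, and the get_weather tool are placeholders.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="inclusionai/ling-1t",  # placeholder slug
    messages=[{"role": "user", "content": "What's the weather in Hangzhou right now?"}],
    tools=tools,
)

# If the model decides to call the tool, inspect the structured arguments.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```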