Tencent Open Sources Hy3 Preview to Scale Efficient Agentic Reasoning

The release mirrors the pattern seen in Alibaba's high-efficiency models, prioritizing massive knowledge capacity while maintaining low inference costs. By focusing on reasoning and agentic capabilities, Hy3 preview provides a cost-effective alternative for developers building autonomous systems that require high-performance logic without the overhead of traditional monolithic models.
You can deploy the model for complex agentic workflows and multi-step reasoning tasks that require high cost-efficiency. The weights are available on Hugging Face for self-hosting, and the model is currently live on OpenRouter with a free usage tier available through May 8th, 2026.
Frequently asked questions
- What is Tencent Hy3 preview?
- Hy3 preview is a large language model from Tencent designed for reasoning and agentic tasks. It uses a Mixture-of-Experts (MoE) architecture: the model holds 295 billion parameters in total but activates only 21 billion of them for each token it processes. This sparse design lets the model handle complex logic while remaining efficient to run.
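The sparse-activation idea behind MoE can be illustrated with a toy router. This is a minimal sketch of standard top-k expert routing, not Tencent's actual implementation; the expert count and gating details of Hy3 preview are not public, and the 295B/21B figures are simply the totals quoted above.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of gate logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_top_k(gate_logits, k=2):
    """Pick the k experts with the highest gate scores for one token."""
    probs = softmax(gate_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k]

random.seed(0)
num_experts = 16  # toy value, not Hy3's real expert count
logits = [random.gauss(0, 1) for _ in range(num_experts)]
chosen = route_top_k(logits, k=2)
print("experts run for this token:", chosen)

# Sparse activation is why total capacity (295B) can far exceed the
# parameters actually used per token (21B):
print(f"active fraction: {21 / 295:.1%}")
```

Because only the chosen experts execute, compute per token scales with the active parameters rather than the total, which is what keeps inference cost low relative to a dense model of the same capacity.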
- Is Tencent Hy3 preview open source?
- Yes, Tencent has open-sourced the weights for the Hy3 preview model. Developers can download and host the model themselves by accessing the repository on Hugging Face. This open-weight release allows for private deployment and fine-tuning for specific enterprise or research use cases without relying on a proprietary API provider.
- How can I use Tencent Hy3 preview?
- You can access Hy3 preview by downloading the weights from Hugging Face for local or cloud hosting. Alternatively, the model is available through the OpenRouter API. Tencent is offering a free usage tier on OpenRouter until May 8th, 2026, allowing developers to test its reasoning and coding capabilities before committing to a paid plan.
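OpenRouter exposes an OpenAI-compatible chat-completions endpoint, so a request can be sketched as below. The model slug `tencent/hy3-preview` is an assumption for illustration (check OpenRouter's model listing for the current identifier); the endpoint URL and payload shape follow OpenRouter's standard format.

```python
import json

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt, api_key, model="tencent/hy3-preview"):
    """Return (url, headers, payload) for an OpenAI-style chat call.

    The model slug is hypothetical; substitute the slug shown on
    OpenRouter's model page.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return OPENROUTER_URL, headers, payload

url, headers, payload = build_request(
    "Outline a three-step plan for a research agent.", "YOUR_API_KEY"
)
print(json.dumps(payload, indent=2))
# Send with, e.g., requests.post(url, headers=headers, json=payload)
```

During the free tier, the same request shape works without a paid plan; only the API key and (possibly) rate limits differ.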
- What is the context window for Hy3 preview?
- Hy3 preview supports a context window of 256,000 tokens. This large window allows the model to process and reason across extensive documents, entire codebases, or long conversation histories in a single interaction. This capability is particularly useful for building retrieval-augmented generation systems and autonomous agents that need to maintain deep context over multiple steps.
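When packing long documents into a 256,000-token window, a pre-flight length check avoids truncation errors. This is a rough sketch: the four-characters-per-token heuristic is a crude approximation for English text, not Hy3's real tokenizer, and the output budget is an arbitrary example value.

```python
CONTEXT_WINDOW = 256_000   # Hy3 preview's stated window
CHARS_PER_TOKEN = 4        # rough heuristic; use the real tokenizer in practice

def estimate_tokens(text):
    """Crude token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN + 1

def fits_in_context(documents, reserved_for_output=4_000):
    """Check whether a batch of documents plus an output budget fits."""
    used = sum(estimate_tokens(d) for d in documents)
    return used + reserved_for_output <= CONTEXT_WINDOW

docs = ["chapter text ..." * 1000 for _ in range(5)]
print(fits_in_context(docs))
```

For a RAG system or multi-step agent, running a check like this before each call makes it explicit how much of the window is left for retrieved chunks versus accumulated conversation history.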
