Alibaba Qianwen Launches Full-Modal Large Model Qwen3.5-Omni

MetaMuskRat · 2026-04-01T19:28:00+00:00

Alibaba Qianwen has released Qwen3.5-Omni, a new large model offered as Instruct versions in three sizes, with a 256k context window and support for long audio and audio-video input. The model was pretrained on large-scale multimodal data and shows strong perception and generation capabilities. Its multilingual support has also been expanded: it can recognize speech in 113 languages and dialects.

People’s Finance News, March 30: Alibaba Qianwen announced the launch of its full-modal large model Qwen3.5-Omni. The series includes Instruct versions in three sizes: Plus, Flash, and Light. The model supports a 256k context window and accepts more than 10 hours of audio input as well as more than 400 seconds of 720P (1 FPS) audio-video input. It was natively pretrained as a multimodal model on massive text and visual data and more than 100 million hours of audio-video data, and it shows strong full-modal perception and generation capabilities. Compared with Qwen3-Omni, Qwen3.5-Omni’s multilingual capabilities are greatly improved: it supports speech recognition in 113 languages and dialects and speech generation in 36 languages and dialects.
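
For a rough sense of scale, the quoted limits can be turned into simple arithmetic. The sketch below is only a back-of-envelope illustration, not anything published by Alibaba. It assumes that "256k" means 256,000 tokens and that an input must fit entirely inside the context window; neither assumption is stated in the announcement, and the function names are hypothetical.

```python
# Back-of-envelope arithmetic on the figures quoted in the announcement.
# Assumptions (not stated in the source): "256k" means 256,000 tokens,
# and an input must fit entirely inside the context window.
CONTEXT_TOKENS = 256_000

AUDIO_HOURS = 10       # "more than 10 hours of audio input"
VIDEO_SECONDS = 400    # "more than 400 seconds of 720P (1 FPS)" input
VIDEO_FPS = 1


def seconds_from_hours(hours: float) -> float:
    """Convert a duration in hours to seconds."""
    return hours * 3600


def max_tokens_per_second(context_tokens: int, seconds: float) -> float:
    """Upper bound on the average token cost of one second of input."""
    return context_tokens / seconds


audio_seconds = seconds_from_hours(AUDIO_HOURS)
video_frames = VIDEO_SECONDS * VIDEO_FPS
audio_rate = max_tokens_per_second(CONTEXT_TOKENS, audio_seconds)
video_rate = max_tokens_per_second(CONTEXT_TOKENS, VIDEO_SECONDS)

print(f"audio seconds: {audio_seconds:.0f}")
print(f"video frames sampled: {video_frames}")
print(f"audio: at most ~{audio_rate:.1f} tokens per second")
print(f"audio-video: at most ~{video_rate:.0f} tokens per second")
```

Under these assumptions, 10 hours of audio is 36,000 seconds, so audio could cost at most about 7 tokens per second on average, while a 400-second clip sampled at 1 FPS yields 400 frames and could cost at most 640 tokens per second of combined audio and video. These are upper bounds implied by the stated limits, not measured tokenization rates.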
