
Alibaba Launches Multimodal AI Qwen2.5-Omni-7B Model
Alibaba introduced Qwen2.5-Omni-7B, a multimodal AI capable of processing text, images, audio, and video efficiently on smartphones and laptops, outperforming Google’s Gemini model in benchmarks.

Google’s Gemini 2.5 Pro introduces advanced reasoning and multimodal processing, achieving top scores in coding, mathematics, and science benchmarks, and supporting a 1 million token context window.

China’s DeepSeek V3-0324 runs at 20 tokens/sec on Apple’s M3 Ultra and outperforms Claude 3.5 Sonnet. The model is open-source under the MIT license.

Launched in March 2025, Tencent’s Hunyuan T1 AI model combines a Transformer-Mamba architecture for 2x speed at one-tenth of OpenAI’s cost. It scores 87.2 on knowledge tests but faces real-world limits.

Claude 3.7 Sonnet now offers web search for U.S. paid users, citing sources like Reuters. Despite use cases in finance and sales, energy costs and Google’s 90% search dominance loom.

Tencent’s Hunyuan3D-2.0 generates 3D assets in 30 seconds using multi-modal inputs and a two-stage pipeline, targeting gaming, VR, and industrial design with open-source tools.

Roblox has released Cube 3D, an open-source foundational model for generative AI, enabling developers to create 3D objects and scenes from text prompts. The beta mesh generation API is now available in Roblox Studio.

Baidu launches Ernie 4.5 at 1% of GPT-4.5’s cost, along with a free AI chatbot, using PaddlePaddle and Kunlun chips to challenge global rivals.

Cohere launched Command A, an AI model requiring only 2 GPUs that matches GPT-4o in enterprise tasks while processing 23 languages and costing 50% less for private deployments.