Alibaba's Qwen3.7-Plus combines visual AI with autonomous agent capabilities for coding and app navigation

Jonathan Kemper

17d ago · 3 min read · en · News

Summary

Alibaba's Qwen team has released Qwen3.7-Plus, a proprietary multimodal AI model that combines visual perception with agent capabilities such as coding, tool use, and GUI navigation. Built on the text-only Qwen3.7, it functions as a "multimodal interactive hybrid agent" that can recognize real-world scenes, read screens, operate interfaces, write code from visual templates, and navigate mobile apps. In a demo, an agent built on the model autonomously developed a vocabulary learning app over eleven hours, producing over 10,000 lines of code across 1,000 agent calls. In Qwen's benchmarks, the model leads on screen understanding but shows mixed performance overall. It is priced well below Western frontier models and does not have open weights.

Source

Alibaba's Qwen3.7-Plus combines visual AI with autonomous agent capabilities for coding and app navigation · the-decoder.com

Key quotes

Billed as a 'multimodal interactive hybrid agent,' the model is designed to recognize real-world scenes, read screen content, operate graphical interfaces, write code from visual templates, and navigate mobile apps end to end.

Using Qwen3.7-Plus, the team had a hybrid agent system build...

Qwen3.7-Plus is a proprietary offering with no open weights, priced well below Western frontier models.

Snippet from the RSS feed

Alibaba's Qwen team has released Qwen3.7-Plus, a multimodal agent model that combines visual perception, GUI operation, and coding in a single agent loop. In a demo, an agent built on the model autonomously developed a vocabulary learning app, producing o

