figure-1: Qwen3.6 running in llama.cpp

May 10, 2026

Run Qwen3.6-35B-A3B on 6GB VRAM Using Llama.cpp (~30 tps)

In 2026, the latest release of the Qwen3.6-35B-A3B AI model, combined with recent updates to Llama.cpp, marks a significant improvement…

Minyang Chen

7 min read