Li Auto’s Mach M100 Dataflow Chip Hits 1280 TOPS, Outperforms Nvidia ThorU

Release date: 2026-06-16

At Li Auto's Family Tech Day, CEO Li Xiang officially launched the Mach M100, which the company bills as the world's first dataflow-architecture automotive AI chip. CTO Xie Yan detailed the 5nm SoC's architecture and performance and explained why automakers need custom silicon.

With Moore's Law slowing and AI compute demand exploding, traditional von Neumann architectures waste cycles on scheduling overhead. Li Auto's full-stack approach, spanning chip, OS, autonomous driving model, and vehicle hardware, aims to solve problems that off-the-shelf chips cannot.

The Mach M100 uses a grid-plus-ring bus interconnect that lets data flow directly to the compute units. It packs 24 Arm Cortex-A78AE cores (2.3GHz) for safety and scheduling, plus a dedicated NPU that occupies half the die and contains 56 compute clusters. Peak performance is 1280 TOPS, with utilization above 82%. Memory is 8-channel LPDDR5x with 273GB/s of bandwidth.
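As a rough back-of-the-envelope reading of these figures, the sketch below derives the sustained throughput implied by the utilization number and the roofline ridge point implied by the memory bandwidth. It assumes the 82% utilization applies to the full 1280 TOPS peak and that the 273GB/s figure is the only memory limit; the function names are illustrative and do not come from Li Auto.

```python
# Figures quoted in the article; everything derived from them is an estimate.
PEAK_TOPS = 1280.0      # peak throughput, tera-operations per second
UTILIZATION = 0.82      # article says ">82%", so this is a lower bound
MEM_BW_GBPS = 273.0     # LPDDR5x memory bandwidth, gigabytes per second


def sustained_tops(peak_tops: float, utilization: float) -> float:
    """Throughput actually delivered if the stated utilization holds."""
    return peak_tops * utilization


def ridge_point_ops_per_byte(peak_tops: float, mem_bw_gbps: float) -> float:
    """Arithmetic intensity (ops per byte of DRAM traffic) above which a
    workload is compute-bound rather than bandwidth-bound (roofline model)."""
    return (peak_tops * 1e12) / (mem_bw_gbps * 1e9)


def attainable_tops(intensity: float, peak_tops: float, mem_bw_gbps: float) -> float:
    """Roofline bound: the lower of the compute peak and the bandwidth limit."""
    bandwidth_bound_tops = intensity * mem_bw_gbps * 1e9 / 1e12
    return min(peak_tops, bandwidth_bound_tops)


if __name__ == "__main__":
    print(f"sustained: {sustained_tops(PEAK_TOPS, UTILIZATION):.1f} TOPS")
    print(f"ridge point: {ridge_point_ops_per_byte(PEAK_TOPS, MEM_BW_GBPS):.0f} ops/byte")
    print(f"at 100 ops/byte: {attainable_tops(100, PEAK_TOPS, MEM_BW_GBPS):.1f} TOPS")
```

Under these assumptions the chip sustains roughly 1050 TOPS, and a workload needs on the order of 4,700 operations per byte fetched from DRAM before compute, not memory bandwidth, becomes the limit. That gap is one way to see why keeping data on-chip and flowing directly between compute units matters for this class of design.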

In live benchmarks against Nvidia ThorU, the Mach M100 delivered multi-fold performance gains across CNN, UniAD, and Li Auto's VLA model workloads. In LLM inference on a 35B-parameter model, it reached 2.7x the prefill performance and 1.5x the decode performance of Nvidia DGX Spark. The architecture paper has been accepted at ISCA 2026, making Li Auto the first automaker to present at the top conference.

With dual Mach M100 chips, Li Auto's VLA training scales up by 50% in imitation learning and by 10x in reinforcement learning (RL), model size, and compute. End-to-end latency is 0.28 seconds, which the company presents as faster than human reaction, enabling the system to handle complex scenarios such as unmarked roads, traffic-police gestures, and low obstacles.
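To put the 0.28-second figure in physical terms, the following minimal sketch computes how far a vehicle travels at constant speed before the system can respond. The speeds are illustrative assumptions, not values from Li Auto, and the calculation ignores braking distance and actuator delay.

```python
LATENCY_S = 0.28  # end-to-end latency quoted in the article


def distance_during_latency_m(speed_kmh: float, latency_s: float = LATENCY_S) -> float:
    """Distance in meters covered at constant speed during the latency window."""
    speed_ms = speed_kmh / 3.6  # convert km/h to m/s
    return speed_ms * latency_s


if __name__ == "__main__":
    for speed_kmh in (30, 60, 100, 120):  # hypothetical example speeds
        print(f"{speed_kmh:>3} km/h: {distance_during_latency_m(speed_kmh):.2f} m")
```

At 100 km/h, for example, the vehicle covers a little under 8 meters within the latency window.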

Beyond autonomous driving, Mach M100 also runs onboard local LLMs for other AI functions, forming the compute core of Li Auto’s embodied AI system.

ICgoodFind: Li Auto’s dataflow chip redefines automotive AI compute – besting ThorU in real tests and proving vertical integration matters.
