Huawei Cloud is redefining the AI‑cloud race by shifting its focus from cheap token rates to the underlying infrastructure that powers its services.

After announcing that its three large‑language‑model (LLM) APIs on the ModelArts platform are priced at a fraction of what Western rivals charge, Huawei highlighted the role of domestically produced silicon in driving the discount. Public pricing tables show Huawei’s token rates for input and output are roughly 77 % lower than those of Microsoft’s Azure OpenAI Service or Amazon Web Services’ Bedrock.

Token pricing remains the standard billing model for AI services. The cost of a request is the number of input tokens multiplied by the input rate, plus the number of output tokens multiplied by the output rate. Huawei’s ModelArts platform serves three LLMs that generate an average of 30 tokens per second. The pricing page lists the input rate for its largest model at $0.00002 per token and the output rate at $0.00003 per token, compared with $0.00010 and $0.00015, respectively, on Azure.
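To make the billing arithmetic concrete, the short Python sketch below applies the per‑token rates quoted above to a single request. The request size (1,000 input tokens and 500 output tokens) and all names in the code are illustrative assumptions, not figures or identifiers from Huawei or Microsoft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """Per-token prices in US dollars (hypothetical helper type)."""
    input_per_token: float
    output_per_token: float


def request_cost(rates: TokenRates, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: input and output tokens are billed at separate rates."""
    return (input_tokens * rates.input_per_token
            + output_tokens * rates.output_per_token)


def discount_pct(cheaper: float, pricier: float) -> float:
    """Percentage by which `cheaper` undercuts `pricier`."""
    return (1 - cheaper / pricier) * 100


# Rates as quoted in the article for Huawei's largest model and for Azure.
HUAWEI_LARGEST = TokenRates(input_per_token=0.00002, output_per_token=0.00003)
AZURE = TokenRates(input_per_token=0.00010, output_per_token=0.00015)

# Assumed request size, chosen only for illustration.
INPUT_TOKENS, OUTPUT_TOKENS = 1_000, 500

huawei_cost = request_cost(HUAWEI_LARGEST, INPUT_TOKENS, OUTPUT_TOKENS)
azure_cost = request_cost(AZURE, INPUT_TOKENS, OUTPUT_TOKENS)
saving = discount_pct(huawei_cost, azure_cost)

print(f"Huawei: ${huawei_cost:.4f}  Azure: ${azure_cost:.4f}  saving: {saving:.0f}%")
```

At these quoted rates the illustrative request costs about $0.035 on Huawei and $0.175 on Azure. Because both the input and output rates are one fifth of Azure’s, the saving for this particular model is 80 % regardless of the mix of input and output tokens.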

The cost advantage is largely the result of Huawei’s investment in Chinese‑made chips. Its Ascend AI processors, built on the company’s Da Vinci architecture, are designed for high‑throughput inference and are integrated into the cloud’s data‑center hardware. A LinkedIn post by a Huawei engineer noted that these chips reduce the energy and cooling requirements of large‑model inference, allowing Huawei to pass the savings on to customers.

Beyond pricing, Huawei is positioning its cloud services around a broader ecosystem of AI‑native infrastructure. The company’s GaussDB database, launched in 2019, is a distributed relational system that supports high‑volume AI workloads. In 2021, Huawei introduced the PanGu family of large models, which has since grown to include PanGu‑Σ and PanGu‑π; the models are trained on Chinese‑language corpora and can be accessed through ModelArts. The company also promotes EulerOS, a Linux‑based operating system optimized for AI workloads, whose community edition, openEuler, is an open‑source project hosted by the OpenAtom Foundation.

Industry analysts say that token pricing alone does not capture the full value proposition of an AI cloud provider. A recent article in 36Kr highlighted that the competition in the AI‑cloud market is just beginning, with providers differentiating on factors such as latency, model availability, and integration with existing data pipelines. Huawei’s emphasis on its domestic silicon supply chain, coupled with its AI‑native database and operating system, is intended to address those broader concerns.

The shift also reflects regulatory and geopolitical dynamics. Western sanctions on Huawei have restricted the company’s access to certain high‑performance GPUs, prompting a pivot toward in‑house silicon. Meanwhile, the Chinese government’s 2025 data‑security guidelines require AI companies to limit the use of “unsafe” data, a policy that Huawei claims is compatible with its domestic data‑center architecture.

For enterprises, the lower token rates could translate into significant cost savings for large‑scale inference workloads, especially in regions where Huawei’s data centers are located. However, the company’s cloud service availability remains limited outside of China, and some customers have expressed concerns about data residency and compliance with local regulations.

In the coming months, Huawei is expected to expand its ModelArts offering with additional models and to open new data‑center regions in Southeast Asia. The company has also announced plans to integrate its AI services with its Huawei Mobile Services (HMS) ecosystem, potentially enabling AI features that work across Huawei devices.

Overall, Huawei’s strategy signals a broader trend in the AI‑cloud market: providers are moving beyond token pricing to emphasize hardware, infrastructure, and ecosystem integration as the key drivers of competitive advantage.