What are the capabilities of Huawei’s chips and software systems?
Follow me in understanding why Huawei's chips are such a big deal
Hi all, this is a delayed update on Huawei, which was first published in AI Supremacy as a guest post. It features a special guest appearance by Ray Wang, a Washington, D.C.-based semiconductor and AI analyst.
The Substack community really is incredible. When I messaged Mary Clare McMahon that her recent Huawei piece on CUDA vs. CANN was very informative, she said she had referenced my old piece on Huawei for hers. It all comes full circle.
See original primer piece published in Jan 2025: Is Huawei like China’s Nvidia plus Google?
I updated some details throughout the piece since its May publication, as so much has happened in the last few weeks, most notably:
WSJ reported in late April that “Huawei has approached some Chinese tech companies about testing the technical feasibility of the new chip, called the Ascend 910D, people familiar with the matter said.” (WSJ)
Trump looks to scrap Biden-era rule that aimed to limit exports of AI chips, exploring semi tariffs (FT)
Nvidia looking to “downgrade” H20 again, Jensen Huang says next chip after H20 for China won't be from Hopper series (Reuters)
You’ve likely heard of Huawei from the infamous Meng Wanzhou case in Canada, where she was placed under house arrest over U.S. charges that she had violated sanctions on Iran. I still recall her father, founder Ren Zhengfei, giving an exclusive interview to CNBC while I was working there; despite having kept a pretty low profile until then, he was pulling every lever to swing the court of public opinion in his favor and win his eldest daughter's release from detainment. And despite being on the U.S. trade blacklist for nearly six years, Huawei appears to have kept innovating, releasing more high-end smartphones and tablets over the years.
However, since her return to headquarters, the company has remained relatively low-key, focusing primarily on its Harmony OS capabilities and R&D efforts. That changed when it was thrust back into the spotlight recently by the Nvidia H20 chip ban on China, first suggested under President Joe Biden and now formally imposed by President Donald Trump. Before the trade war and the battle over semiconductor chips, it was just another tech firm in China.
Huawei is no mysterious company in China. Many know it for its sprawling castle- and villa-style campus and for outbidding rivals for top talent with eye-watering pay. It also runs its own scholarship program to promote STEM and hires many domestically trained graduates straight into what are essentially management trainee programs.
In the West, however, it has been deemed a mysterious, untouchable black box that churns out products that seem to shock Western innovators every time. Today, let's try to demystify it a little.
So, fast forward to the present day: the U.S. government has imposed new curbs on Huawei. Under new guidance issued by the U.S. Commerce Department last month, the use of Huawei's Ascend chips “anywhere in the world” by any company could be interpreted as a violation of American export controls.
Chips - Ascend Series
“And so you know, Huawei is a formidable technology company.” — Jensen Huang, Nvidia
In early May, Nvidia CEO Jensen Huang raised concerns to U.S. lawmakers about Huawei’s growing AI capabilities, according to media reports. The H20 ban has led to speculation (and some confirmation) that demand for Huawei chips has shot up, as there is no better option left for Chinese AI companies. Shortly after the ban was announced, Huawei in turn announced its new Ascend 920 chip, to be rolled out later this year, as it steps up its game to close the gap in technological capabilities. And shortly after that series of news, Jensen Huang said that the U.S. semiconductor export controls on China have been “a failure” and are causing more harm to American businesses than to China.
After Nvidia's most recent earnings report, Jensen Huang went on Bloomberg Technology, where he was asked whether the increase in demand for Blackwell to serve reasoning workloads had made up for the loss of the China market in the current period's outlook.
Jensen said, “I guess so, but you cannot underestimate the importance of the China market. This is the second-largest AI market. This is the home of the world’s largest population of AI researchers, and we want all of the world’s researchers and the world’s developers building on American stacks.”
However, the Chinese might think differently, as the risk of being cut off keeps becoming more of a reality in advanced technology. And as we know, within the Hopper architecture, the H20 is about as low-spec as the chips can get.
“The limitations (from the U.S. government) are quite stringent, quite limited, if you will. H20 is as far down as we can take a Hopper. We don’t know how to make it even less, and that’s really the limit,” said Huang.
“Whatever we make, (it) ultimately has to add value to the market. It’s a really tight rope, because the Chinese competitors have evolved and advanced greatly over the last year. Like everybody else, they’re doubling, quadrupling their capabilities every year, and the volume is increasing substantially. And remember, these are data center chips. They don’t have to be small and can be quite large.”
And highlighting his biggest concern: “Without American technology, the availability of Chinese technology will fill the market. Whatever we offer has to be competitive and add value to the market.”
To better understand the nuances, I spoke to Ray Wang, a Washington, D.C.-based semiconductor analyst, who broke down the technicalities in the most accessible way for me. [See our 20-minute interview here, recorded on May 9, 9 pm HKT]
Topics we covered:
What are the actual capabilities of Huawei's Ascend 910C and 920 chips, and how do they compare to Nvidia's H20s?
How may the lack of access to H20s affect Chinese AI firms?
What does it mean when people say that, despite Huawei's chips being less advanced, its systems can still work very well for AI training and inference?
How does the software and hardware integration process work? How do CANN and CUDA compare?
Who are the major players in China in chip design and manufacturing? And what is the relationship between Huawei and SMIC?
What is the Ascend 920?
It is a chip built at the advanced 6-nanometer node and expected to begin mass production later this year, which could make it a proper, credible alternative to Nvidia’s H20 chips.
The last generation of Huawei AI chips, the Ascend 910C GPUs, has already been used in inference use cases by Chinese AI labs, such as for DeepSeek’s R1. According to Chinese media outlet AGI Hunt, the Ascend 910C achieves about 60% of the Nvidia H100’s performance, delivering strong inference results. The chips are built on SMIC’s 7-nanometer N+2 process with 53 billion transistors (SMIC being China’s answer to TSMC, albeit well behind it). Some even say that on a per-GPU basis its performance matches Nvidia’s H100 (see a more detailed specs comparison here), while others argue that even though the Ascend 910B and 910C continue to improve, they still lag roughly five years behind the H100 in performance per watt and interconnect density. But as of now, they’re good enough, or as good a replacement as Chinese AI companies can get their hands on.
wrote that this gap won’t be closed in one or two product cycles; it’s a function of fab capacity, export enforcement, and ecosystem readiness, which should keep the U.S. in the lead in chips over the long run, even as open-weight parity inches closer.
How did Huawei get here?
In many ways, industry experts say the U.S. trade restrictions on Huawei since 2019 have pushed the private company to focus on R&D and laid the foundation for its AI chip capabilities today. Huawei originally relied on chip imports from American vendors such as Intel and Qualcomm, and the ban effectively severed this supply chain, presumably in hopes of hindering Huawei's ability to manufacture new laptops and phones at scale. But much as with DeepSeek, “necessity is the mother of invention.”
Refer to the Deep dive into Huawei piece that looked into the history of Huawei’s cloud build up and its six-year grind.
But as I wrote in the article highlighted above, Huawei did not take the same approach as other big tech companies, whether in China or the U.S. It didn't go cloud-first and try to sell to enterprises. Instead, it homed in on privacy and security, which often won the hearts of Chinese SOEs and government entities.
It largely embraced what
described as an “AI in a box” approach, which essentially serves clients a full-stack, one-stop-shop service (一条龙服务): it starts from the hardware foundations, including “packaging GPUs, networking equipment, server racks, databases, container services,” and then layers on top, almost as a cherry-on-top service, the software side of the AI model, Pangu, which clients can then tweak and modify. At its core, the company has always been a hardware equipment business.
In an exclusive conversation with AI Proem, Nuo Jiang, founder of GroundAI and former Chief Digital Officer at Huawei in Singapore, said Huawei’s unique strengths lie in its R&D and hardware capacity.
“First, the company is the world’s second-largest company by its investment in R&D, after Alphabet, which gives strategic investment and endurance to long-term chip design. Second, the company’s DNA is in hardware design and manufacturing. Third, owning the leading Chinese semiconductor company HiSilicon, gives it a massive advantage over its peers. Lastly, its end-to-end capability, from material science, chip design, manufacture, and cloud stack to its mobile operating system, e.g., Harmony OS, is unparalleled globally. All together, they create a compelling ecosystem.”
But what would an Nvidia H20 ban truly mean for China’s semiconductor sector? Ray Wang said Huawei is using this opportunity to build out its own semiconductor foundry, which would be a positive push for China’s AI chip-making capabilities:
It would add competition for SMIC, which currently monopolizes China’s chip-making space; despite not being able to match TSMC’s capabilities, SMIC remains the only fab handling all chip orders in China.
It would open up the possibility of China making advanced chips (below 7 nanometers).
It could mean Huawei becomes more self-sufficient and increases overall production capacity.
[By the way, Kyle Chan did a great background piece on Huawei; see it here.]
CUDA vs. CANN
In my original Huawei piece, I included a section on the software but removed it in the end. So today, I want to spend some proper time delving deeper into this.
Nvidia has a CUDA system, and Huawei has a toolset abbreviated as CANN. Both CUDA and CANN are tools that help programmers make their applications run faster by using the power of graphics cards. CUDA is more general-purpose and widely used in various fields, while CANN is designed to focus more on making AI applications work better on Huawei's devices.
According to Nvidia’s data, over 2 million registered developers utilize CUDA, and on average there are 438,000 downloads of CUDA each month. In comparison, there is no publicly available information on the adoption rate of CANN. However, some have said that a training or inference run of the same size and type is often slower on Huawei’s CANN than on CUDA, and the number of developers using CANN is probably around one-tenth of CUDA’s developer base.
Now, CUDA has largely been popular because of its free access and developer-friendliness. Huawei has taken note of that and launched two platforms of its own: MindSpore and CANN. For a more technical breakdown, check out Mary Clare McMahon’s coverage here. What she accurately points out is that Huawei has realized that to be truly competitive, and to be considered an alternative to Nvidia by AI clients, it has to undermine Nvidia’s software moat; that means more than just reaching parity in GPU hardware performance, it also means upping its own software capabilities. But first, what is Nvidia’s CUDA?
CUDA stands for “Compute Unified Device Architecture.” It is a technology created by Nvidia that allows programmers to use the power of Nvidia graphics cards (GPUs) to perform complex calculations much faster than regular computers (CPUs).
Parallel Computing: CUDA lets computers use many small processing units in the GPU to work on different parts of a problem simultaneously.
Programming: With CUDA, programmers write code that tells the GPU what to do. They can write this code in languages like C or Python, which are easier to understand than machine language (see the short sketch after this list).
Applications: CUDA is used in many fields, such as video games, scientific research, and artificial intelligence (AI), because it can handle large amounts of data quickly.
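To make the parallel-computing idea above concrete, here is a minimal sketch of the kind of code CUDA enables. It is my own illustration rather than anything from the article or Nvidia's documentation: it assumes an Nvidia GPU and the Numba library's CUDA support, and it launches many GPU threads that each add one pair of array elements at the same time.

```python
import numpy as np
from numba import cuda  # Numba's CUDA bindings (assumed installed; requires an Nvidia GPU)

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)          # this thread's global index
    if i < out.size:          # guard against threads past the end of the array
        out[i] = x[i] + y[i]  # each thread handles one element, all in parallel

n = 1_000_000
x = np.arange(n, dtype=np.float32)
y = 2 * x
out = np.zeros_like(x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
add_kernel[blocks, threads_per_block](x, y, out)  # Numba copies the arrays to the GPU and back
```

The same pattern, written against CANN-based tooling instead of CUDA, is roughly what Huawei needs developers to find just as painless.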
What is Huawei’s CANN? And MindSpore?
CANN stands for Compute Architecture for Neural Networks, which Huawei developed. It’s similar to CUDA but specifically designed for AI and machine learning tasks.
Focus on AI: While CUDA can be used for many kinds of calculation, CANN is tailored for neural network tasks. Neural networks are models that help computers learn from data (much as we learn from experience).
Efficiency: CANN allows developers to create AI applications that run efficiently on Huawei’s chips. This means they can build intelligent applications that recognize images, understand speech, or make predictions based on data.
Compatibility: Just like CUDA works with Nvidia GPUs, CANN works with Huawei's hardware, helping developers take advantage of the specific features of Huawei’s chips.
Together, CANN and MindSpore work as a comprehensive AI software stack designed to optimize AI model development and deployment on Huawei’s Ascend AI processors.
CANN provides the tools and runtime needed to execute high-performance machine learning models on Huawei hardware. Meanwhile, MindSpore is designed for ease of development, automatic differentiation, and seamless model training and deployment. It seamlessly integrates with CANN to leverage Ascend hardware efficiently.
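As a rough sketch of how the two layers fit together, the snippet below shows what targeting Ascend from MindSpore might look like. It assumes MindSpore is installed and an Ascend device is available, and the exact calls may differ by version; the developer writes against MindSpore, while CANN underneath compiles and runs the operators on the chip.

```python
import numpy as np
import mindspore as ms
from mindspore import nn, Tensor

# Ask MindSpore to target an Ascend NPU; CANN underneath handles
# compiling and executing the operators on the chip.
ms.set_context(device_target="Ascend")    # "GPU" or "CPU" on other hardware

net = nn.Dense(4, 2)                       # a tiny fully connected layer
x = Tensor(np.random.randn(3, 4), ms.float32)
print(net(x).shape)                        # (3, 2), computed on the Ascend device
```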
Beyond enhancing the user experience and deploying engineers to China’s leading AI firms to better support CANN integration, Huawei is prioritizing compatibility with open-source machine learning frameworks such as PyTorch. As Mary wrote, “the challenge for Huawei, then, is not only to make PyTorch run on Ascend hardware but also to make it run well enough that developers don’t notice they’ve switched ecosystems.”
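From a developer's point of view, "making PyTorch run on Ascend" would ideally look something like the hedged sketch below, where the only visible change from CUDA code is the device string. It assumes Huawei's torch_npu adapter for PyTorch; the package name and availability check reflect my understanding of that adapter and may differ by version.

```python
import torch
import torch_npu  # Huawei's Ascend adapter (assumed installed); registers an "npu" device with PyTorch

device = "npu" if torch.npu.is_available() else "cpu"  # mirrors torch.cuda.is_available()
x = torch.randn(1024, 1024, device=device)
w = torch.randn(1024, 1024, device=device)
y = x @ w   # the matmul is dispatched to CANN kernels instead of CUDA ones
print(y.device)
```

If code like this runs at full speed without further changes, developers "don't notice they've switched ecosystems," which is exactly the bar Huawei is chasing.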
CloudMatrix System
SemiAnalysis, the leading semiconductor research firm, recently wrote that the CloudMatrix 384 is “China’s newest and most powerful domestic solution,” directly competing with the GB200 NVL72, and “in some metrics is more advanced than Nvidia’s rack scale solution.”
The importance of the system is that although Huawei’s chips are still technically a generation behind the leading Nvidia chips, its scale-up solution is said to be a generation ahead of Nvidia and AMD’s current products on the market. What does that mean?
Huawei is compensating for its less powerful individual processors by deploying many more of them and leveraging a highly scalable, high-bandwidth optical interconnect architecture. And though this requires more space and energy, that has not been a constraint for Huawei. The approach has, for now, made CloudMatrix a viable alternative in China, where Nvidia's cutting-edge GPUs are restricted or unavailable, despite the higher costs and energy use (domestic demand is likely so high that exporting it will not be a priority in the short term).
Lower-performance chips: Huawei's CloudMatrix 384 uses 384 Ascend 910C processors, which individually have lower performance and efficiency than Nvidia's more advanced Blackwell GPUs, but collectively deliver about 300 petaFLOPS of BF16 compute, roughly 1.7 times the ~180 petaFLOPS of Nvidia's GB200 NVL72, according to SemiAnalysis.
The CloudMatrix system achieves this through sheer scale and a sophisticated all-optical mesh network that provides much higher memory capacity (3.6× more HBM memory) and bandwidth (over twice Nvidia's).
Power efficiency trade-off: CloudMatrix consumes nearly four times more power and is about 2.3 times less efficient per watt than Nvidia's system.
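As a quick back-of-the-envelope check on how those ratios hang together (the inputs are the SemiAnalysis figures quoted above; the arithmetic and the rounding are mine):

```python
# SemiAnalysis figures as quoted in this piece
cloudmatrix_bf16_pflops = 300   # 384 x Ascend 910C
gb200_nvl72_bf16_pflops = 180   # Nvidia GB200 NVL72
power_ratio = 4.0               # CloudMatrix draws roughly 4x the power

compute_ratio = cloudmatrix_bf16_pflops / gb200_nvl72_bf16_pflops
perf_per_watt_penalty = power_ratio / compute_ratio

print(round(compute_ratio, 2))          # ~1.67 -> the "about 1.7x" compute figure
print(round(perf_per_watt_penalty, 2))  # ~2.4  -> close to the "2.3x less efficient per watt"
```

The small gap between ~2.4 here and the quoted 2.3x is just rounding in the headline numbers.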
Huawei's approach reflects a "brute force" strategy to compensate for less advanced chip manufacturing technology by maximizing parallelism, memory, and interconnect speed at the system level rather than per-chip performance.
And its capabilities were recognized by Jensen Huang. In the same Bloomberg interview mentioned above, he added that though Huawei’s best offering is currently only “comparable to an H200,” its CloudMatrix system is able to scale up into an even larger system than Nvidia’s latest Grace Blackwell.
“And so you know, Huawei is a formidable technology company,” said Huang.
As for the Alibabas and Tencents of the world, they have “no choice” but to buy from Huawei given the U.S. policies in place, said Jensen. He added that it makes sense and is “prudent” for Chinese developers to build their stack on Huawei. As we can see, he’s caught between a rock and a hard place.
Parting Words
This answers many of the questions around what Huawei is doing and whether it can really replace Nvidia for Chinese AI firms. But as I wrote in this Fortune op-ed, we are still at such an early stage of AI adoption that in many ways it is too early to say who has “won.” The decoupling and bifurcation in the industry have pushed, and will continue to push, China to be more self-reliant.
Whether in LLM development or semiconductor advancement, DeepSeek and Huawei, willingly or not, are now being touted as national champions, which will rally national-level capital, policy, and support in China. And if commercial choice is removed from companies (no more Nvidia chips), then capitalists (businesses) will also lean toward them. Hence Jensen Huang's plea not to close off the Chinese market to his business.
Speaking to Ray Wang, the semiconductor analyst, also reconfirmed my belief that export controls will hinder China's AI development only in the short term; in the long term, they are actually pushing China to find more self-reliant and innovative solutions. Although U.S. companies like OpenAI and Meta still hold a huge advantage in access to cheap compute and higher-quality hardware, that gap could be shrinking by the day.
Excellent read! I feel the bottleneck for Huawei's in-house silicon would be SMIC, or any foundry that can help Huawei move to a more advanced node. U.S. export controls enact rules that prohibit pure-play foundries (for example, TSMC and now Intel Foundry Services) from shipping to Huawei.
Keen to know how Huawei HiSilicon's future Ascend "AI chip" development will stand the test of U.S. export controls. Would they vertically stack dies for greater TOPS/mm², or perhaps hit a breakthrough in microarchitecture, to keep up with the need to train new models (and scale inference or improve its throughput)?
The risk that US export controls simply push Huawei et al. to innovate even further while depriving US firms of R&D dollars is absolutely real and commonly noted. But how do you think we should balance that against two other considerations?
1. The first is that in another policy debate – the tariff question – anti-tariff economists often point to the fact that protecting domestic industries makes them sluggish by depriving them of pressure to adopt best practices, also called X-inefficiency. Yes, Huawei is clearly getting space in the China market to profit and tinker because export controls push out TSMC/Nvidia. But by the same token, for all the resources Beijing is willing to put into AI, it still obeys the laws of physics, and its resources are finite. Beyond the progress Huawei makes because of export controls, every yuan spent on energy-inefficient architectures or domestic firm subsidization is money that cannot be spent on other, broader US concerns with China, such as PLAN cruisers or subsidies for electric cars. I'm not sure how big this effect is, though – is it big enough to warrant discussion alongside the positive effects of export controls for China's tech stack?
2. Second, designers and proponents of US controls since the 10/22 rule argue that the bite won't come for a couple of years. (Lennart Heim's DeepSeek blog post argued that a cluster started in 2023 after the export control revisions won't be replaced until 2026: https://blog.heim.xyz/deepseek-what-the-headlines-miss/.) How can we assess not only the impressiveness of Huawei's advances given current constraints but also their magnitude: i.e., will they be enough to overcome the obstacles for China that come with a time lag?