Competition intensifies in the era of AI compute, as multiple fronts accelerate breakthroughs in China's chip industry

AI compute power becomes the starting point for reshaping the chip industry.

In recent years, as Moore’s Law has slowed and single-chip performance has struggled to meet the explosive demand for compute power, the global industrial community has evolved two breakthrough paths: advanced packaging and system-level integration of ultra-node architectures. Against this backdrop, every link in China’s domestic chip industry chain—including EDA, advanced packaging, semiconductor equipment, and high-speed interconnect technologies—is accelerating its layout in the AI compute power field.

On domestic industry trends, Wang Xiaolong, director of the corporate division at Chips and Algorithms Research, told a Securities Times reporter that as China's drive for semiconductor self-reliance deepens, the domestic industry chain, despite constraints at advanced process nodes, can still carve out a distinctively Chinese path of semiconductor development through "moderate process nodes + advanced packaging + system and ecosystem optimization." This approach is expected to reduce the structural disadvantages and systemic risks China faces in the next round of competition in AI and advanced computing.

EDA competition shifts toward system-level integration

As the topmost layer of the chip industry, EDA practitioners have a strong sense of how AI is reshaping the chip design industry.

“From multi-die to ultra-nodes, system-level complexity is unprecedented. In the AI hardware arena, the challenge customers face is no longer just designing a single chip; it now includes the systemic risks introduced by Chiplet-based advanced packaging, heterogeneous integration, high-bandwidth memory, ultra-high-speed interconnects, efficient power-delivery networks, and AI data center architectures. Examples include full-system overheating and warpage caused by insufficient attention to thermal management; power-network design defects that cause fusing at package connections under heavy load; and the absence of a system-level signal-management perspective, which can leave a tape-out worth tens of millions of dollars unable to power on after assembly,” Ling Feng, founder and chairman of Chips and Semiconductor, said at a recent press conference.

Ling Feng noted that to address these issues, EDA vendors need to adopt the concept of system-technology co-optimization (STCO), achieving coordinated design across computation, networking, power delivery, cooling, and system architecture.

The three global EDA giants have backed this industry trend with real money in mergers and acquisitions. In 2025, Synopsys acquired Ansys, the world's largest simulation-EDA company, for USD 35 billion, filling the gap in multi-physics simulation capabilities and strengthening end-to-end analysis from chips to systems.

Domestic AI chip makers are also investing actively at the ecosystem level. Sun Guoliang, senior vice president and chief product officer of MetaX (沐曦股份), said at a recent SEMICON forum that MetaX has built a complete GPU product matrix on a unified, self-developed architecture, covering scenarios such as AI training, inference, graphics rendering, and scientific intelligence. The company also provides a self-developed software stack that fully supports mainstream ecosystems, and it is actively promoting the development of open-source ecosystems.

In Wang Xiaolong's view, a strong software ecosystem is crucial to improving hardware utilization, and will accelerate the transition of China-made AI chips from "usable substitutes" to "genuinely good to use." The mainstream breakout of domestic large models such as DeepSeek and Qwen, for example, rests on significant improvements in how efficiently domestic AI chips put their compute to work.

Hybrid bonding emerges as a core technology for boosting compute power

On the hardware side, as single chips in the era of large-scale AI compute run into the three bottlenecks of power consumption, area, and yield, advanced packaging has become the new "carrier of Moore's Law." Take TSMC's CoWoS as an example: each generation integrates more GPU dies, larger HBM stacks, and stronger interconnects. Major AI chip players, including NVIDIA and AMD, have already achieved generational leaps in AI chip compute through advanced packaging technologies.

At this year's SEMICON forum, Guo Xiaochao, marketing director of the foundry business unit at Wuhan Xinxin Integrated Circuit Co., Ltd., discussed the latest industry trends. He pointed out that the advanced packaging market, especially the 2.5D/3D segment, is expanding rapidly. Mainstream industry solutions have evolved from CoWoS-S to CoWoS-L, SoW, and 3.5D XDSiP, and the scale of integration keeps growing. Hybrid bonding is the key to high-density interconnects and the core technology for boosting compute power; it requires not only process breakthroughs but also coordinated work across design methodology, materials, and equipment.

On the domestic equipment front, NAURA (North Huachuang) recently released a 12-inch die-to-wafer (D2W) hybrid bonding tool. The equipment reportedly targets the extreme interconnect requirements of 3D-integration applications such as SoC, HBM, and Chiplet. It overcomes key process challenges including damage-free pickup of micron-thick ultra-thin dies, nanometer-scale ultra-high-precision alignment, and void-free, high-quality, stable bonding, striking a better balance between nanometer-scale alignment precision and high-throughput bonding. With this, NAURA became the first domestic manufacturer to complete customer-side process validation for D2W hybrid bonding equipment.

TEL Technology also unveiled a 3D IC series at the SEMICON forum, including new products for fusion bonding and laser debonding, focused on Chiplet heterogeneous integration, three-dimensional stacking, and HBM-related applications.

In recent years, hybrid bonding equipment has become the fastest-growing subsegment of semiconductor equipment. Market research firm Yole forecasts that its global market will exceed USD 1.7 billion by 2030, with D2W hybrid bonding equipment expected to grow at a CAGR as high as 21%.
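To put a 21% CAGR in perspective, the sketch below applies the standard compound-growth formula to a hypothetical base-year value (the base figure is an assumption for illustration, not a number from the Yole forecast):

```python
# Illustrative compound-growth calculation; the base market size is
# hypothetical, not taken from the Yole forecast cited above.

def project(base: float, cagr: float, years: int) -> float:
    """Compound a base market size forward by `years` at the given CAGR."""
    return base * (1 + cagr) ** years

# e.g. an assumed USD 500M segment growing at 21% per year for 6 years
projected = project(500.0, 0.21, 6)
print(round(projected, 1))  # 1569.2 (USD millions), roughly tripling
```

The same formula run in reverse explains why a segment-level CAGR above 20% makes hybrid bonding the fastest-growing equipment subsegment: the market roughly triples over six years.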

Executives at major semiconductor equipment companies cautioned, however, that despite rapid growth, the hybrid bonding equipment market still faces challenges in alignment precision, cleanroom requirements, and warpage control. Different hybrid bonding applications also differ in their choice of interface materials: combinations of copper with dielectric materials such as amorphous SiCN each have their own trade-offs, and surface morphology, particle control, and wafer warpage directly affect bonding yield. Three-dimensional integration therefore depends on coordinated effort across the industry.

White paper on the ultra-node technology system released

The other breakthrough path for expanding AI compute is ultra-node system integration. Using high-speed interconnects, compute units are scaled up from single nodes and rack-level ultra-nodes (hundreds of AI chips) to cluster-level ultra-nodes (tens of thousands of AI chips). Combined with advanced packaging, an ultra-node effectively becomes a "supercomputer" built from large numbers of AI chips, HBM, high-speed interconnect networks, and liquid-cooled thermal management systems.

Leading domestic companies have also brought innovations to market in the ultra-node field. On March 26, at the annual Zhongguancun Forum, Sugon (中科曙光) launched the world's first cable-free rack-scale ultra-node, scaleX40. Traditional ultra-nodes rely on fiber-optic and copper-cable interconnects and commonly suffer from long deployment cycles, high operational complexity, and many potential points of failure. scaleX40 adopts a single-tier orthogonal cable-free interconnect architecture, allowing compute nodes and switch nodes to plug directly into each other and eliminating at the source both the performance losses and the operational risks that cabling introduces.

Each scaleX40 node integrates 40 GPU cards, delivering more than 28 PFlops of total compute, over 5 TB of aggregate HBM capacity, and more than 80 TB/s of total memory bandwidth. This forms a high-density compute unit able to meet the training and inference needs of trillion-parameter large models.
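As a rough sanity check on those node totals (an illustrative division, not figures stated in the source), the implied per-GPU specifications are easy to derive:

```python
# Back-of-envelope check on the published scaleX40 node totals.
# Inputs are the aggregate figures quoted above; per-GPU values are derived.

gpus_per_node = 40
total_pflops = 28          # total compute, PFlops
total_hbm_tb = 5           # total HBM capacity, TB
total_bw_tbs = 80          # total memory bandwidth, TB/s

per_gpu_tflops = total_pflops * 1000 / gpus_per_node   # 700 TFlops per card
per_gpu_hbm_gb = total_hbm_tb * 1024 / gpus_per_node   # 128 GB HBM per card
per_gpu_bw_tbs = total_bw_tbs / gpus_per_node          # 2 TB/s per card

print(per_gpu_tflops, per_gpu_hbm_gb, per_gpu_bw_tbs)
```

The implied per-card figures (on the order of 700 TFlops, 128 GB of HBM, and 2 TB/s of memory bandwidth) are internally consistent with the quoted node totals.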

Li Bin, senior vice president of Sugon, said that the significance of scaleX40 lies not only in raising performance but in rebuilding the logic of compute delivery: pushing compute from "engineered construction" toward "productized supply," and sharply lowering the entry barrier and deployment cost of high-end compute.

At the industry level, the "Ultra-Node Technology System White Paper" (the "White Paper"), jointly produced by the Shanghai Artificial Intelligence Laboratory with upstream and downstream enterprises across the AI industry chain such as Qimeng Molar and Moore Threads, was officially released on March 29. The White Paper aims to support large-scale deployment of ultra-nodes by addressing core pain points such as difficult heterogeneous coordination, low cross-domain scheduling efficiency, and complex engineering deployment, offering theoretical guidance grounded in industry practice.

Qimeng Molar believes the value of future ultra-nodes will lie in whether compute, storage, interconnect, scheduling, and runtime resources can be organized into unified, coordinated system units, and whether high bandwidth, low latency, high utilization, and sustainable scalability can be maintained at ever larger scales. An ultra-node is no longer just "a combination of more accelerator chips" but a new architectural unit that determines whether a system can stay effectively coordinated at scale.

(Source: Securities Times)
