Google's new algorithm claims to compress the KV cache 6x, putting pressure on U.S. storage stocks


On Wednesday, after the U.S. stock market opened, the storage sector weakened even though overall market sentiment remained fairly positive. At the time of writing, Micron Technology was down 3.57% and SanDisk was down 4.12%; Western Digital and Seagate Technology had also declined.

Many sources attribute today’s market movements to Google. The AI giant had earlier unveiled a compression algorithm called TurboQuant that may reduce the memory requirements of AI systems.

According to Google, TurboQuant is designed to reduce the memory footprint of large language models and vector search engines. The algorithm mainly targets key-value (KV) caches, which AI systems use to store frequently accessed information. As context windows expand, these caches are becoming the main memory bottleneck.

TurboQuant can compress key-value caches to 3-bit precision without retraining or fine-tuning the models, while maintaining nearly the same accuracy. Tests on open-source models like Gemma and Mistral show that this technology can achieve approximately 6x compression of key-value cache memory.
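To put the compression figure in context, the following is a rough back-of-the-envelope sketch of how KV cache size scales with bit width. It is not TurboQuant itself. The model shape (layer count, KV heads, head dimension, context length) is hypothetical, and the 16-bit baseline is an assumption, since the article does not state the unquantized precision. Under these assumptions the raw bit-width ratio is 16/3, or about 5.3x, which is in the same range as the roughly 6x reported; the sketch ignores any metadata a real quantization scheme might store.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bits_per_value, batch_size=1):
    """Approximate KV cache size in bytes.

    Counts one key vector and one value vector (hence the factor of 2)
    for every layer, KV head, and token, at the given bit width.
    """
    num_values = 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size
    return num_values * bits_per_value / 8


# Hypothetical model shape, chosen only for illustration.
config = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=128_000)

baseline = kv_cache_bytes(**config, bits_per_value=16)   # assumed 16-bit baseline
compressed = kv_cache_bytes(**config, bits_per_value=3)  # 3-bit, as in the article
ratio = baseline / compressed

print(f"16-bit KV cache: {baseline / 2**30:.2f} GiB")
print(f"3-bit KV cache:  {compressed / 2**30:.2f} GiB")
print(f"Raw bit-width ratio: {ratio:.2f}x")
```

Because the cache grows linearly with context length, the absolute saving in this estimate grows with longer context windows while the ratio stays fixed.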

Additionally, testing on NVIDIA’s H100 accelerators shows that, compared to unquantized key vectors, this algorithm can deliver up to an 8x performance boost. Researchers also noted that this technology is not limited to AI models but also supports vector retrieval capabilities for large-scale search engines.

Google plans to showcase TurboQuant at the International Conference on Learning Representations (ICLR 2026) in April.

Although the technology’s application prospects remain uncertain, the market has clearly begun to price in expectations of a shift in memory demand.

Regarding the latest developments, Andrew Rocha, a TMT analyst at Wells Fargo, commented: “As context windows continue to expand, the data stored in KV caches grows explosively, increasing the demand for memory capacity. TurboQuant is directly compressing this cost curve. If widely adopted, this would be a positive for memory cost trends.”

Rocha also stated that this technology could influence future assessments of memory capacity requirements.

He wrote: “If the memory specifications needed for these AI applications are significantly reduced, the market will soon reassess how much memory capacity is truly necessary.”

However, Rocha also pointed out that it is still unclear whether this technology is only applicable within Google’s own ecosystem or if it can be extended to other AI labs. Additionally, it remains uncertain whether test results in laboratory environments can be smoothly translated into real-world production performance.

Notably, Google itself, the company that unsettled the storage sector, has not benefited much from the news. Its stock price briefly fell below $290 on Wednesday, down nearly 17% from the February high of $349 and not far from the psychologically important 20% decline mark.

(Article source: Cailian Press)
