Google's new algorithm claims 6x KV-cache compression, putting pressure on the U.S. memory and storage sector.
After the U.S. market opened on Wednesday, overall sentiment was still decent, but the storage sector bucked the trend and weakened. As of the time of writing, Micron Technology was down 3.57%, SanDisk had dropped 4.12%, and Western Digital and Seagate Technology had also declined.
Many market watchers attribute the day's moves to Google. The AI giant recently unveiled a compression algorithm called TurboQuant that may reduce the memory requirements of AI systems.
According to Google, TurboQuant aims to reduce the memory footprint of large language models and vector search engines. The algorithm targets the bottleneck created by key-value (KV) caches, which store the attention keys and values of previously processed tokens so they are not recomputed. As context windows expand, these caches are becoming the main memory bottleneck.
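The arithmetic behind that bottleneck is easy to sketch. The dimensions below (32 layers, a 4096-wide hidden state, fp16 storage) are illustrative assumptions for a 7B-class model, not figures from Google's announcement:

```python
# Back-of-envelope KV-cache sizing for a hypothetical 7B-class model.
# All dimensions here are illustrative assumptions, not TurboQuant specifics.
n_layers = 32
hidden_dim = 4096      # total across attention heads
bytes_fp16 = 2

# Per token: one key vector and one value vector per layer.
bytes_per_token = 2 * n_layers * hidden_dim * bytes_fp16   # 524,288 B ≈ 0.5 MiB

def kv_cache_gib(context_len: int) -> float:
    """KV-cache size in GiB for a given context length."""
    return bytes_per_token * context_len / 2**30

print(f"{kv_cache_gib(8_192):.1f} GiB at 8K context")      # 4.0 GiB
print(f"{kv_cache_gib(131_072):.1f} GiB at 128K context")  # 64.0 GiB
```

At a 128K-token context, the cache alone dwarfs the weights of the model itself, which is why the cache, not the weights, is the capacity driver as windows grow.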
TurboQuant can compress key-value caches to 3-bit precision without retraining or fine-tuning the models, while maintaining nearly the same accuracy. Tests on open-source models like Gemma and Mistral show that this technology can achieve approximately 6x compression of key-value cache memory.
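Google has not published TurboQuant's exact scheme in this report. As a rough illustration of what storing a KV tensor at 3-bit precision involves, the sketch below uses plain per-channel absmax quantization onto the 7 symmetric levels of a signed 3-bit code; the function names and choices here are my own, not TurboQuant's method:

```python
import numpy as np

def quantize_3bit(x: np.ndarray, axis: int = -1):
    """Per-channel absmax quantization onto the symmetric 3-bit levels -3..3.

    Illustrative only: TurboQuant's actual scheme is not public here.
    """
    scale = np.maximum(np.max(np.abs(x), axis=axis, keepdims=True), 1e-8) / 3.0
    q = np.clip(np.round(x / scale), -3, 3).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
kv = rng.standard_normal((8, 128)).astype(np.float32)  # toy stand-in for a KV tensor
q, s = quantize_3bit(kv)
err = np.abs(dequantize(q, s) - kv).max()

# Ignoring the fp scales, 16-bit -> 3-bit storage is ~5.3x; with grouped
# scales and bit-packing, reported figures land around the 6x mark.
print(f"max abs error: {err:.3f}, raw compression ≈ {16/3:.1f}x")
```

The "no retraining" claim in the article corresponds to exactly this kind of post-hoc step: the cache is quantized as it is written and dequantized on read, leaving the model weights untouched.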
Additionally, testing on NVIDIA’s H100 accelerators shows that, compared to unquantized key vectors, this algorithm can deliver up to an 8x performance boost. Researchers also noted that this technology is not limited to AI models but also supports vector retrieval capabilities for large-scale search engines.
Google plans to showcase TurboQuant at the International Conference on Learning Representations (ICLR 2026) in April.
It’s clear that although the application prospects of this technology still carry some uncertainty, the market has already begun to price in expectations of a shift in memory demand.
Regarding the latest developments, Andrew Rocha, a TMT analyst at Wells Fargo, commented: “As context windows continue to expand, the data stored in KV caches grows explosively, increasing the demand for memory capacity. TurboQuant is directly compressing this cost curve. If widely adopted, this would be a positive for memory cost trends.”
Rocha also stated that this technology could influence future assessments of memory capacity requirements.
He wrote: “If the memory specifications needed for these AI applications are significantly reduced, the market will soon reassess how much memory capacity is truly necessary.”
However, Rocha also pointed out that it is still unclear whether this technology is only applicable within Google’s own ecosystem or if it can be extended to other AI labs. Additionally, it remains uncertain whether test results in laboratory environments can be smoothly translated into real-world production performance.
Notably, despite being the player stirring up the storage sector, Google has not benefited much itself. The company's stock briefly fell below $290 on Wednesday, down nearly 17% from its February high of $349 and within striking distance of the 20% drawdown commonly viewed as correction territory.
(Article source: Cailian Press)