Researchers at Tsinghua University and Z.ai built IndexCache to eliminate redundant computation in sparse attention models like DeepSeek and GLM. The training-free technique cuts 75% of indexer ...