[2606.02011] Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery
Abstract page for arXiv paper 2606.02011: Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery
America Forever Bytes
I want to address some topics in quantization, with some specifics for games. We do "quantization" any time we take a high precision val...
Abstract page for arXiv paper 2605.29756: LFQ: Logit-aware Final-block Quantization for Boosting the Generation Quality of Low-Bit Quantized LLMs
Abstract page for arXiv paper 2605.28686: Constrained Symplectic Quantization: Disclosing the Deterministic Framework Behind Quantum Field Theory