[2606.01952] Randomized Least Squares Value Iteration itself is Joint Differentially Private
Abstract page for arXiv paper 2606.01952: Randomized Least Squares Value Iteration itself is Joint Differentially Private
America Forever Bytes
Abstract page for arXiv paper 2606.01655: MINTS: Minimalist Thompson Sampling
Abstract page for arXiv paper 2606.02355: SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training
Abstract page for arXiv paper 2510.10544: PAC-Bayesian Reinforcement Learning Trains Generalizable Policies
Abstract page for arXiv paper 2510.11711: Reinforced sequential Monte Carlo for amortised sampling
Abstract page for arXiv paper 2605.31273: Survival Reinforcement Learning: Toward Scalable Self-Supervised RL
Abstract page for arXiv paper 2605.31328: Reinforcement Learning Amplifies Emergent Misalignment from Harmless Rewards
Abstract page for arXiv paper 2605.31524: Value Functions as Supermartingale Certificates
Abstract page for arXiv paper 2605.31044: The Challenges of Using Reinforcement Learning for Controlling Industrial Energy Systems
Abstract page for arXiv paper 2605.30824: Planner-Centric Reinforcement Learning for Deep Research with Structure-Aware Reward
Abstract page for arXiv paper 2605.30576: Uncertainty-Aware and Temporally Regulated Expert Advice in Reinforcement Learning for Autonomous Driving
Abstract page for arXiv paper 2605.30461: Scalable Constrained Multi-Agent Reinforcement Learning via State Augmentation and Consensus for Separable Dynamics
Abstract page for arXiv paper 2605.31289: The Terminal Representation in Reinforcement Learning
New autonomous system helps stair-climbing robots recover from falls using reinforcement learning and a robotic arm.
Abstract page for arXiv paper 2605.28918: When LLM Reward Design Fails: Diagnostic-Driven Refinement for Sparse Structured RL
Abstract page for arXiv paper 2605.29032: Theoretical Foundations and Effective Algorithms for Policy-Aware Simulator Learning
Abstract page for arXiv paper 2605.29002: FedQHD: Closed-Form Function-Space Federated Reinforcement Learning
Abstract page for arXiv paper 2605.29564: VE2VF: Vision-Enabled to Vision-Free Distillation via Real-world Reinforcement Learning for Robust Contact-Rich Manipu...
Abstract page for arXiv paper 2605.30160: On Distributional Reinforcement Learning in Chaotic Dynamical Systems
Abstract page for arXiv paper 2605.30244: Reinforcement Learning with Robust Rubric Rewards
Abstract page for arXiv paper 2510.11499: Offline Reinforcement Learning with Generative Trajectory Policies
Abstract page for arXiv paper 2605.29190: When RL Suppresses Its Own Vocabulary: Recovering Reasoning Diversity in Puzzle-to-Math Transfer
Abstract page for arXiv paper 2605.28810: Affective Music Recommendation: A Rollout-Based World Model for Offline Preference Optimization
Abstract page for arXiv paper 2605.27556: Accelerating Reinforcement Learning Training Using Simulation Surrogate Models
Abstract page for arXiv paper 2509.26442: Extensions of Robbins-Siegmund Theorem with Applications in Reinforcement Learning
Abstract page for arXiv paper 2510.03534: Long-Term Mapping of the Douro River Plume with Multi-Agent Reinforcement Learning
Abstract page for arXiv paper 2605.28290: Adaptive Bandit Algorithms for Contextual Matching Markets
Abstract page for arXiv paper 2605.28317: Hybrid Neural World Models
Abstract page for arXiv paper 2605.28247: IRDS: Interpretable RLVR Data Selection via Verifier-Coupled Sparse Autoencoder Coverage
Abstract page for arXiv paper 2605.28273: Global Policy-Space Response Oracles for Two-Player Zero-Sum Games