General Lipschitz: Certified Robustness Against Resolvable Semantic Transformations via Transformation-Dependent Randomized Smoothing Paper • 2309.16710 • Published Aug 17, 2023
NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer Paper • 2209.14937 • Published Sep 29, 2022
Sparse and Transferable Universal Singular Vectors Attack Paper • 2401.14031 • Published Jan 25, 2024
SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers Paper • 2410.07383 • Published Oct 9, 2024
Stable Low-rank Tensor Decomposition for Compression of Convolutional Neural Network Paper • 2008.05441 • Published Aug 12, 2020
Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts Paper • 2506.05229 • Published Jun 5 • 37
Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models Paper • 2506.06751 • Published about 1 month ago • 72
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models Paper • 2506.06395 • Published Jun 5 • 126
Test-Time Reasoning Through Visual Human Preferences with VLMs and Soft Rewards Paper • 2503.19948 • Published Mar 25
Exploring the Latent Capacity of LLMs for One-Step Text Generation Paper • 2505.21189 • Published May 27 • 62
The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models Paper • 2311.05928 • Published Nov 10, 2023 • 1
Memory-Efficient Backpropagation through Large Linear Layers Paper • 2201.13195 • Published Jan 31, 2022 • 1
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction Paper • 2202.00441 • Published Feb 1, 2022 • 1
General Covariance Data Augmentation for Neural PDE Solvers Paper • 2301.12730 • Published Jan 30, 2023
CLEAR: Character Unlearning in Textual and Visual Modalities Paper • 2410.18057 • Published Oct 23, 2024 • 210
ConDiff: A Challenging Dataset for Neural Solvers of Partial Differential Equations Paper • 2406.04709 • Published Jun 7, 2024
MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding Paper • 2502.03183 • Published Feb 5 • 4
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published Mar 24 • 119
Combining Flow Matching and Transformers for Efficient Solution of Bayesian Inverse Problems Paper • 2503.01375 • Published Mar 3 • 5