Towards Principled Disentanglement for Domain Generalization Paper • 2111.13839 • Published Nov 27, 2021
Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation Paper • 2202.01336 • Published Feb 2, 2022
The Impact of Symbolic Representations on In-context Learning for Few-shot Reasoning Paper • 2212.08686 • Published Dec 16, 2022
Discovering Hierarchical Latent Capabilities of Language Models via Causal Representation Learning Paper • 2506.10378 • Published Jun 12, 2025 • 2
EvoLM: In Search of Lost Language Model Training Dynamics Paper • 2506.16029 • Published Jun 19, 2025
AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs? Paper • 2507.15887 • Published Jul 19, 2025
Prescriptive Scaling Reveals the Evolution of Language Model Capabilities Paper • 2602.15327 • Published Feb 17 • 3
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark Paper • 2304.03279 • Published Apr 6, 2023 • 2
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training Paper • 2406.10670 • Published Jun 15, 2024 • 4
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17, 2024 • 55
Eliminating Position Bias of Language Models: A Mechanistic Approach Paper • 2407.01100 • Published Jul 1, 2024 • 8
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models Paper • 2412.02674 • Published Dec 3, 2024