Interested
Large Language Model Unlearning via Embedding-Corrupted Prompts
Paper
• 2406.07933
• Published
• 9
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Paper
• 2406.02657
• Published
• 41
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
Paper
• 2406.12050
• Published
• 19
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper
• 2406.11813
• Published
• 31
Breaking the Attention Bottleneck
Paper
• 2406.10906
• Published
• 4
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Paper
• 2406.17557
• Published
• 100
Unlocking Continual Learning Abilities in Language Models
Paper
• 2406.17245
• Published
• 30
Scaling Laws for Linear Complexity Language Models
Paper
• 2406.16690
• Published
• 23
Aligning Teacher with Student Preferences for Tailored Training Data Generation
Paper
• 2406.19227
• Published
• 25
Is Programming by Example solved by LLMs?
Paper
• 2406.08316
• Published
• 13
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Paper
• 2406.14909
• Published
• 16
Can LLMs Learn by Teaching? A Preliminary Study
Paper
• 2406.14629
• Published
• 21
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
Paper
• 2407.01920
• Published
• 17
On Leakage of Code Generation Evaluation Datasets
Paper
• 2407.07565
• Published
• 6
Paper
• 2407.10671
• Published
• 168
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Paper
• 2407.10969
• Published
• 23
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Paper
• 2407.09121
• Published
• 6
Practical Unlearning for Large Language Models
Paper
• 2407.10223
• Published
• 4
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Paper
• 2407.13833
• Published
• 12
Jamba: A Hybrid Transformer-Mamba Language Model
Paper
• 2403.19887
• Published
• 112
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation
Paper
• 2408.02545
• Published
• 40
CoverBench: A Challenging Benchmark for Complex Claim Verification
Paper
• 2408.03325
• Published
• 15
Better Alignment with Instruction Back-and-Forth Translation
Paper
• 2408.04614
• Published
• 15
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
• 2408.04619
• Published
• 175
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper
• 2408.10914
• Published
• 45
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
Paper
• 2408.15496
• Published
• 12
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Paper
• 2409.04109
• Published
• 48
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation
Paper
• 2410.23090
• Published
• 55
Can Language Models Replace Programmers? REPOCOD Says 'Not Yet'
Paper
• 2410.21647
• Published
• 18
Paper
• 2410.21276
• Published
• 87
LongReward: Improving Long-context Large Language Models with AI Feedback
Paper
• 2410.21252
• Published
• 19
Hymba: A Hybrid-head Architecture for Small Language Models
Paper
• 2411.13676
• Published
• 47