TAUR-dev/qwen25_vl_7b_element_lookup_01format_09_coordinate_02reflect_thrsh20_no_feedback_10_20 Updated Oct 21
OLMo-150M and OLMo-1B Pretrained Models Collection Pretrained models from scratch used in "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining". • 12 items • Updated Jul 7 • 4
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation Paper • 2501.05414 • Published Jan 9 • 2
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18, 2024 • 39
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models Paper • 2505.13444 • Published May 19 • 17