Collections
Collections including paper arxiv:2407.07612

- The Impact of Positional Encoding on Length Generalization in Transformers
  Paper • 2305.19466 • Published • 2
- Transformers Can Do Arithmetic with the Right Embeddings
  Paper • 2405.17399 • Published • 54
- Teaching Transformers Causal Reasoning through Axiomatic Training
  Paper • 2407.07612 • Published • 2
- Round and Round We Go! What makes Rotary Positional Encodings useful?
  Paper • 2410.06205 • Published • 2

- Teaching Transformers Causal Reasoning through Axiomatic Training
  Paper • 2407.07612 • Published • 2
- Symbolic Learning Enables Self-Evolving Agents
  Paper • 2406.18532 • Published • 12
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 117
- a-m-team/AM-Thinking-v1
  Text Generation • 33B • Updated • 772 • 199

- DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention
  Paper • 2309.14327 • Published • 22
- MambaVision: A Hybrid Mamba-Transformer Vision Backbone
  Paper • 2407.08083 • Published • 32
- Memory^3: Language Modeling with Explicit Memory
  Paper • 2407.01178 • Published • 4
- Teaching Transformers Causal Reasoning through Axiomatic Training
  Paper • 2407.07612 • Published • 2

- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
  Paper • 2402.14848 • Published • 20
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 50
- How Far Are We from Intelligent Visual Deductive Reasoning?
  Paper • 2403.04732 • Published • 23
- Learning to Reason and Memorize with Self-Notes
  Paper • 2305.00833 • Published • 5