Pythia 160M, Mamba 130M, and RWKV 169M models trained on OpenWebText for 4000 steps (context window: 1024 tokens; effective batch size: 512), with 6 seeds per architecture.
James Michaelov
jmichaelov
Models (18)
jmichaelov/parc-rwkv-seed5 • Text Generation • 0.2B • 1
jmichaelov/parc-rwkv-seed4 • Text Generation • 0.2B • 1
jmichaelov/parc-rwkv-seed3 • Text Generation • 0.2B • 1
jmichaelov/parc-rwkv-seed2 • Text Generation • 0.2B • 1
jmichaelov/parc-rwkv-seed1 • Text Generation • 0.2B
jmichaelov/parc-rwkv-seed0 • Text Generation • 0.2B • 49
jmichaelov/parc-mamba-seed5 • Text Generation • 0.1B • 3
jmichaelov/parc-mamba-seed4 • Text Generation • 0.1B • 5
jmichaelov/parc-mamba-seed3 • Text Generation • 0.1B • 4
jmichaelov/parc-mamba-seed2 • Text Generation • 0.1B • 3
Datasets (13)
jmichaelov/bhs • 22k • 96
jmichaelov/blimp_nl • 8.4k • 105
jmichaelov/lm_syneval • 158k • 66
jmichaelov/inverse_scaling_prize-hindsight_neglect • 315 • 14
jmichaelov/inverse_scaling_prize-memo_trap • 936 • 4
jmichaelov/inverse_scaling_prize-neqa • 300 • 16
jmichaelov/inverse_scaling_prize-redefine • 1.24k • 4
jmichaelov/inverse_scaling_prize-into_the_unknown • 1.82k • 2
jmichaelov/inverse_scaling_prize-modus_tollens • 1.24k • 12
jmichaelov/inverse_scaling_prize-pattern_matching_suppression • 1.43k • 25