Austin207 committed
Commit e88c25e · verified · 1 Parent(s): c51942b

Update README.md

Files changed (1):
  1. README.md +6 -19
README.md CHANGED
@@ -53,10 +53,10 @@ model-index:
 
 ## Key Features
 
-- **Efficient Training**: Trained on RTX 5070 (8GB VRAM) in ~4 hours
+- **Efficient Training**: Trained on RTX 5070 Laptop GPU (8GB VRAM) in ~4 hours
 - **Extended Context**: 16,384 token context window (16x typical small models)
 - **Memory Efficient**: Only 1.3GB VRAM for 1,800 tokens inference
-- **Fast Inference**: ~10 tokens/second on consumer GPU
+- **Fast Inference**: ~150+ tokens/second on consumer GPU
 - **High Quality Data**: Trained on curated RefinedWeb subset
 
 ## Architecture Details
@@ -96,7 +96,7 @@ model-index:
 ## Training Procedure
 
 ### Training Configuration
-- **Hardware**: NVIDIA RTX 5070 (8GB VRAM)
+- **Hardware**: NVIDIA RTX 5070 Laptop GPU (8GB VRAM)
 - **Precision**: bfloat16 mixed precision
 - **Batch Size**: 1 per device
 - **Gradient Accumulation**: 32 steps
@@ -128,7 +128,7 @@ model-index:
 - **Convergence**: Smooth loss curve, no overfitting
 
 ### Inference Performance
-- **Speed**: ~10 tokens/second (RTX 5070)
+- **Speed**: ~150+ tokens/second (RTX 5070)
 - **Memory Usage**: 1.3GB for 1,800 token context
 - **Context Limit**: 3,600 tokens practical limit
 - **Temperature**: Recommended 0.7-0.9 for creative tasks
@@ -221,26 +221,13 @@ python interactive_chat.py
   title={MAP-NEO Mini: An Efficient 253M Parameter Language Model},
   author={[Antony Austin]},
   year={2025},
-  howpublished={\url{https://huggingface.co/[Austin207]/map-neo-mini}},
-  note={Trained on NVIDIA RTX 5070 with RefinedWeb data}
+  howpublished={\url{https://huggingface.co/Austin207/Map-NEO}},
+  note={Trained on NVIDIA RTX 5070 Laptop GPU with RefinedWeb data}
 }
 ```
 
 ## Technical Details
 
-### Files Structure
-```
-map-neo-mini/
-├── config.json              # Model configuration
-├── pytorch_model.bin        # Model weights
-├── tokenizer.json           # Tokenizer configuration
-├── tokenizer_config.json    # Tokenizer metadata
-├── special_tokens_map.json  # Special tokens
-├── vocab.json               # Vocabulary
-├── merges.txt               # BPE merges
-└── model_neo.py             # Model architecture code
-```
-
 ### Hardware Requirements
 - **Minimum**: 4GB VRAM for inference
 - **Recommended**: 8GB VRAM for extended context
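As context for the "Temperature: Recommended 0.7-0.9" line in the diff above: this refers to standard temperature scaling of the model's logits before sampling. A minimal self-contained sketch, using hypothetical logits rather than the model's actual values:

```python
import math

def sample_probs(logits, temperature):
    """Convert raw logits into a sampling distribution at a given temperature.

    Lower temperature sharpens the distribution (closer to greedy decoding);
    higher temperature flattens it, giving more diverse generations.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max before exp for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 4-token vocabulary.
logits = [2.0, 1.0, 0.5, 0.1]

sharp = sample_probs(logits, 0.2)     # near-deterministic
creative = sample_probs(logits, 0.8)  # within the README's recommended range
```

At temperature 0.2 almost all probability mass lands on the top token, while at 0.8 the tail tokens keep meaningful probability, which is why the 0.7-0.9 range is suggested for creative tasks.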