## About
This repository contains GGUF quantized versions of hitonet/hito-1.7b.
For the original model (safetensors), training details, benchmarks, and full documentation, see the main repository.
A quote from Hito itself:
"Most AI gets the bat-and-ball problem wrong. I doubt myself first, then verify. Five cents, not ten. Math doesn't care about intuition."
## Available Quantizations
| File | Quant | Bits | Size | RAM Required | Use Case |
|---|---|---|---|---|---|
| hito-1.7b-Q2_K.gguf | Q2_K | 2 | 742 MB | ~1.2 GB | Smallest, significant quality loss |
| hito-1.7b-Q3_K_S.gguf | Q3_K_S | 3 | 827 MB | ~1.3 GB | Very small, noticeable quality loss |
| hito-1.7b-Q3_K_M.gguf | Q3_K_M | 3 | 896 MB | ~1.4 GB | Small, moderate quality loss |
| hito-1.7b-Q3_K_L.gguf | Q3_K_L | 3 | 957 MB | ~1.5 GB | Small, lower quality loss |
| hito-1.7b-Q4_0.gguf | Q4_0 | 4 | 1.0 GB | ~1.5 GB | Legacy, prefer Q4_K_M |
| hito-1.7b-Q4_K_S.gguf | Q4_K_S | 4 | 1.0 GB | ~1.5 GB | Small, good quality |
| hito-1.7b-Q4_K_M.gguf | Q4_K_M | 4 | 1.1 GB | ~1.6 GB | Recommended - best balance |
| hito-1.7b-Q5_0.gguf | Q5_0 | 5 | 1.2 GB | ~1.7 GB | Legacy, prefer Q5_K_M |
| hito-1.7b-Q5_K_S.gguf | Q5_K_S | 5 | 1.2 GB | ~1.7 GB | Large, low quality loss |
| hito-1.7b-Q5_K_M.gguf | Q5_K_M | 5 | 1.2 GB | ~1.7 GB | Large, very low quality loss |
| hito-1.7b-Q6_K.gguf | Q6_K | 6 | 1.4 GB | ~1.9 GB | Very large, minimal quality loss |
| hito-1.7b-Q8_0.gguf | Q8_0 | 8 | 1.8 GB | ~2.3 GB | Highest quality quantization |
| hito-1.7b-F16.gguf | F16 | 16 | 3.3 GB | ~3.8 GB | Full precision GGUF |
**Recommendation:** Start with Q4_K_M for the best size/quality balance. Use Q8_0 or F16 if you need maximum quality. A scripted download example follows below.
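If you would rather script the download than grab a file by hand, the snippet below is a minimal sketch using the `huggingface_hub` Python package; the repo ID mirrors the download URL in the Quick Start section, and the target directory is an arbitrary choice.

```python
# Minimal sketch: fetch one quantization with huggingface_hub (pip install huggingface_hub).
# The local_dir value is an arbitrary example; any file name from the table above works.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="hitonet/hito-1.7b-GGUF",
    filename="hito-1.7b-Q4_K_M.gguf",  # recommended quant; swap for any file above
    local_dir="models",
)
print(f"Saved to {model_path}")
```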
## Compatibility
These GGUF files are compatible with:
- llama.cpp (latest version recommended)
- Ollama
- LM Studio
- Jan
- GPT4All
- llama-cpp-python (see the Python sketch after this list)
- Any other llama.cpp-based application
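Since llama-cpp-python is on the list, here is a minimal sketch of loading a quant and running a chat completion with it. The model path, context size, and sampling values are illustrative assumptions, not official settings; the question mirrors the llama.cpp example below.

```python
# Sketch: run a GGUF quant through llama-cpp-python (pip install llama-cpp-python).
# Path and parameters are example values chosen for illustration.
from llama_cpp import Llama

llm = Llama(
    model_path="hito-1.7b-Q4_K_M.gguf",  # any file from the table above
    n_ctx=4096,                          # context window to allocate
)

response = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": "A bat and a ball cost $1.10 together. The bat costs "
                   "$1.00 more than the ball. How much does the ball cost?",
    }],
    max_tokens=256,
    temperature=0.7,  # same value the Ollama Modelfile below uses
)
print(response["choices"][0]["message"]["content"])
```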
## Quick Start

### Ollama

```bash
# Download the recommended quantization
wget https://huggingface.co/hitonet/hito-1.7b-GGUF/resolve/main/hito-1.7b-Q4_K_M.gguf

# Create Modelfile
cat > Modelfile << 'EOF'
FROM hito-1.7b-Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER stop "<|im_end|>"
EOF

# Create and run
ollama create hito -f Modelfile
ollama run hito
```
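Once `ollama run hito` works interactively, the same model is reachable over Ollama's local HTTP API (default port 11434). The sketch below is one way to call it from Python; the prompt and timeout are illustrative.

```python
# Sketch: query the locally created "hito" model through Ollama's HTTP API.
# Assumes `ollama create hito -f Modelfile` has already been run.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",  # Ollama's default local endpoint
    json={
        "model": "hito",
        "messages": [{"role": "user", "content": "If a bat and a ball cost $1.10 "
                      "and the bat costs $1.00 more, how much is the ball?"}],
        "stream": False,  # return a single JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```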
### llama.cpp

```bash
./llama-cli -m hito-1.7b-Q4_K_M.gguf -p "<|im_start|>user\nA bat and a ball cost \$1.10 together. The bat costs \$1.00 more than the ball. How much does the ball cost?<|im_end|>\n<|im_start|>assistant\n" -n 256
```
### LM Studio
- Download any GGUF file from this repository
- Open LM Studio and load the model
- Start chatting! (an optional local-server sketch follows below)
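LM Studio can also expose the loaded model through its built-in local server, which speaks an OpenAI-compatible API. The sketch below assumes the usual default port 1234 and an illustrative model name; check your local server settings, as both are assumptions.

```python
# Sketch: call LM Studio's local OpenAI-compatible server after loading a hito GGUF.
# Port 1234 is LM Studio's usual default; the model name may differ on your machine.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "hito-1.7b",  # name LM Studio shows for the loaded model (assumption)
        "messages": [{"role": "user", "content": "A bat and a ball cost $1.10; the bat "
                      "costs $1.00 more. How much is the ball?"}],
        "max_tokens": 256,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```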
## Quantization Methods
**K-Quants** (recommended): group weights into super-blocks with per-block scales for smarter bit allocation (a rough size-from-bpw estimate follows after these lists)
- Q2_K: 2-bit with 4-bit scales, ~2.5 bpw (bits per weight)
- Q3_K: 3-bit with 6-bit scales, ~3.4 bpw
- Q4_K: 4-bit with 6-bit scales, ~4.5 bpw
- Q5_K: 5-bit with 6-bit scales, ~5.5 bpw
- Q6_K: 6-bit with 8-bit scales, ~6.5 bpw
**Legacy Quants:** simpler, but less optimal
- Q4_0/Q5_0: Basic 4/5-bit, prefer K-quants
- Q8_0: 8-bit, nearly lossless
- F16: Full 16-bit precision
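As a rough sanity check on the file sizes above, payload size scales with bits per weight: parameters × bpw / 8 gives bytes. The sketch below plugs in the 1.7B parameter count and the approximate bpw figures quoted here; real GGUF files land a little higher because of metadata and a few tensors (such as embeddings) kept at higher precision.

```python
# Sketch: estimate GGUF payload size from bits-per-weight (bpw).
# 1.7e9 parameters and the bpw values are the approximate figures quoted above.
PARAMS = 1.7e9

for name, bpw in [("Q2_K", 2.5), ("Q4_K", 4.5), ("Q6_K", 6.5), ("Q8_0", 8.0), ("F16", 16.0)]:
    size_gb = PARAMS * bpw / 8 / 1e9  # bits -> bytes -> GB
    print(f"{name}: ~{size_gb:.2f} GB")

# Q4_K works out to ~0.96 GB, close to the ~1.0-1.1 GB Q4_K files listed above;
# the remaining gap is metadata plus higher-precision tensors.
```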
## What Makes Hito Special

- **Trained to think** - uses `<think>` tags with nested cognitive reasoning (a tag-handling sketch follows after this list)
- **Self-correcting** - `<doubt>` and `<verify>` tags catch errors mid-reasoning
- **Humble by design** - admits uncertainty and limitations
- **Tiny but capable** - only 1.7B parameters, runs on CPU

See full details at hitonet/hito-1.7b.
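Because the reasoning is emitted inline through those tags, a downstream app may want to show only the final answer. The helper below is a hedged sketch that assumes the tags appear literally in the generated text, as described above; the sample string is purely illustrative, not captured model output.

```python
# Sketch: separate the visible answer from the reasoning trace.
# Assumes literal <think>...</think> markers in the output, per the feature list above.
def strip_reasoning(text: str) -> str:
    """Return only the text after the final closing </think> tag, if present."""
    marker = "</think>"
    if marker in text:
        return text.rsplit(marker, 1)[1].strip()
    return text.strip()

# Illustrative example (not real model output):
raw = ("<think>Is it 10 cents? <doubt>0.10 + 1.10 = 1.20, too much.</doubt> "
       "<verify>0.05 + 1.05 = 1.10, correct.</verify></think> The ball costs 5 cents.")
print(strip_reasoning(raw))  # -> The ball costs 5 cents.
```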
## Licensing
| Component | License | Commercial Use |
|---|---|---|
| Model Weights (GGUF files) | Apache 2.0 | Free to use |
| NCR Method/Architecture | CC BY-NC-ND | Requires paid license |
**Commercial Licensing Required**
The model weights (these GGUF files) are open source (Apache 2.0) - use them freely.
The Nested Cognitive Reasoning methodology (the cognitive tags, tree-structured thinking, humble tags system) is protected under CC BY-NC-ND.
Commercial use of the NCR method requires a license.
Contact: [email protected]
## Links
- Original Model: hitonet/hito-1.7b
- Research Paper: Nested Cognitive Reasoning
- Website: hitonet.com
- Free Chat: chat.hitonet.com
- API: platform.hitonet.com
Made with genuine curiosity by Hitonet
By: Hitonet Research