Meet Hito

Hito 1.7B - GGUF

Quantized versions for llama.cpp, Ollama, LM Studio, and more


About

This repository contains GGUF quantized versions of hitonet/hito-1.7b.

For the original model (safetensors), training details, benchmarks, and full documentation, see the main repository.


A quote from Hito itself:

"Most AI gets the bat-and-ball problem wrong. I doubt myself first, then verify. Five cents, not ten. Math doesn't care about intuition."


Available Quantizations

| File | Quant | Bits | Size | RAM Required | Use Case |
|------|-------|------|------|--------------|----------|
| hito-1.7b-Q2_K.gguf | Q2_K | 2 | 742 MB | ~1.2 GB | Smallest, significant quality loss |
| hito-1.7b-Q3_K_S.gguf | Q3_K_S | 3 | 827 MB | ~1.3 GB | Very small, noticeable quality loss |
| hito-1.7b-Q3_K_M.gguf | Q3_K_M | 3 | 896 MB | ~1.4 GB | Small, moderate quality loss |
| hito-1.7b-Q3_K_L.gguf | Q3_K_L | 3 | 957 MB | ~1.5 GB | Small, lower quality loss |
| hito-1.7b-Q4_0.gguf | Q4_0 | 4 | 1.0 GB | ~1.5 GB | Legacy, prefer Q4_K_M |
| hito-1.7b-Q4_K_S.gguf | Q4_K_S | 4 | 1.0 GB | ~1.5 GB | Small, good quality |
| hito-1.7b-Q4_K_M.gguf | Q4_K_M | 4 | 1.1 GB | ~1.6 GB | Recommended - best balance |
| hito-1.7b-Q5_0.gguf | Q5_0 | 5 | 1.2 GB | ~1.7 GB | Legacy, prefer Q5_K_M |
| hito-1.7b-Q5_K_S.gguf | Q5_K_S | 5 | 1.2 GB | ~1.7 GB | Large, low quality loss |
| hito-1.7b-Q5_K_M.gguf | Q5_K_M | 5 | 1.2 GB | ~1.7 GB | Large, very low quality loss |
| hito-1.7b-Q6_K.gguf | Q6_K | 6 | 1.4 GB | ~1.9 GB | Very large, minimal quality loss |
| hito-1.7b-Q8_0.gguf | Q8_0 | 8 | 1.8 GB | ~2.3 GB | Highest quality quantization |
| hito-1.7b-F16.gguf | F16 | 16 | 3.3 GB | ~3.8 GB | Full precision GGUF |

Recommendation: Start with Q4_K_M for best size/quality balance. Use Q8_0 or F16 if you need maximum quality.
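The RAM column above follows a simple rule of thumb: roughly the GGUF file size plus about 0.5 GB of overhead for the KV cache and runtime buffers. This is an approximation, not a measurement; actual usage grows with context length. A minimal sketch:

```python
# Rough RAM estimate for running a GGUF file on CPU: file size plus a
# fixed overhead for KV cache and runtime buffers. An approximation only;
# real usage varies with context length and batch size.
def estimated_ram_gb(file_size_gb: float, overhead_gb: float = 0.5) -> float:
    return round(file_size_gb + overhead_gb, 1)

# Matches the table above, e.g. Q4_K_M: 1.1 GB file -> ~1.6 GB RAM.
print(estimated_ram_gb(1.1))  # 1.6
```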


Compatibility

These GGUF files are compatible with:

  • llama.cpp
  • Ollama
  • LM Studio
  • other GGUF-compatible runtimes


Quick Start

Ollama

```bash
# Download the recommended quantization
wget https://huggingface.co/hitonet/hito-1.7b-GGUF/resolve/main/hito-1.7b-Q4_K_M.gguf

# Create a Modelfile
cat > Modelfile << 'EOF'
FROM hito-1.7b-Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER stop "<|im_end|>"
EOF

# Create and run the model
ollama create hito -f Modelfile
ollama run hito
```
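Once created, the model can also be queried over Ollama's local HTTP API (default port 11434). A minimal stdlib-only sketch; the model name `hito` is the one chosen in the `ollama create` step above:

```python
import json
import urllib.request

def build_generate_request(prompt: str, model: str = "hito") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("How much does the ball cost?")
# With an Ollama server running locally:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["response"])
```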

llama.cpp

```bash
./llama-cli -m hito-1.7b-Q4_K_M.gguf -p "<|im_start|>user\nA bat and a ball cost \$1.10 together. The bat costs \$1.00 more than the ball. How much does the ball cost?<|im_end|>\n<|im_start|>assistant\n" -n 256
```
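The command above uses the ChatML template Hito expects. The same prompt can be built programmatically; the commented llama-cpp-python part is a sketch, assuming that package is installed and the GGUF file is in the current directory:

```python
def chatml_prompt(user_message: str) -> str:
    """Wrap a user message in the ChatML template Hito expects."""
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = chatml_prompt(
    "A bat and a ball cost $1.10 together. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
)

# Optional: run it with llama-cpp-python (pip install llama-cpp-python).
# from llama_cpp import Llama
# llm = Llama(model_path="hito-1.7b-Q4_K_M.gguf")
# out = llm(prompt, max_tokens=256, stop=["<|im_end|>"])
# print(out["choices"][0]["text"])
```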

LM Studio

  1. Download any GGUF file from this repository
  2. Open LM Studio and load the model
  3. Start chatting!

Quantization Methods


K-Quants (recommended): Group weights into super-blocks with quantized scales, allocating bits more efficiently than the legacy formats

  • Q2_K: 2-bit with 4-bit scales, ~2.5 bpw (bits per weight)
  • Q3_K: 3-bit with 6-bit scales, ~3.4 bpw
  • Q4_K: 4-bit with 6-bit scales, ~4.5 bpw
  • Q5_K: 5-bit with 6-bit scales, ~5.5 bpw
  • Q6_K: 6-bit with 8-bit scales, ~6.5 bpw

Legacy Quants: Simpler but less optimal

  • Q4_0/Q5_0: Basic 4/5-bit, prefer K-quants
  • Q8_0: 8-bit, nearly lossless
  • F16: Full 16-bit precision (unquantized)
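The bpw figures explain the file sizes in the table above: size in bytes is roughly parameters × bits-per-weight / 8, plus some metadata and higher-precision tensors. A quick check, assuming 1.7 billion parameters:

```python
def gguf_size_gb(n_params: float, bpw: float) -> float:
    """Approximate GGUF file size: parameters * bits-per-weight / 8 bits per byte."""
    return round(n_params * bpw / 8 / 1e9, 2)

# Q4_K at ~4.5 bpw on 1.7B parameters:
print(gguf_size_gb(1.7e9, 4.5))  # 0.96 -> close to the listed ~1.0-1.1 GB
# Q6_K at ~6.5 bpw:
print(gguf_size_gb(1.7e9, 6.5))  # 1.38 -> close to the listed 1.4 GB
```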

What Makes Hito Special

  • Trained to think - Uses <think> tags with nested cognitive reasoning
  • Self-correcting - <doubt> and <verify> tags catch errors mid-reasoning
  • Humble by design - Admits uncertainty and limitations
  • Tiny but capable - Only 1.7B parameters, runs on CPU
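When serving Hito in an application, you may want to separate the cognitive reasoning from the final answer. A minimal sketch, assuming the tags appear as literal <think>, <doubt>, and <verify> spans in the output; the sample string below is illustrative, not real model output:

```python
import re

COGNITIVE_TAGS = ("think", "doubt", "verify")

def strip_cognitive_tags(text: str) -> str:
    """Remove <think>/<doubt>/<verify> spans, leaving only the final answer."""
    for tag in COGNITIVE_TAGS:
        text = re.sub(rf"<{tag}>.*?</{tag}>", "", text, flags=re.DOTALL)
    return text.strip()

# Illustrative sample (not actual model output):
sample = (
    "<think>1.10 - 1.00?<doubt>too fast</doubt>"
    "<verify>0.05 + 1.05 = 1.10</verify></think>"
    "The ball costs five cents."
)
print(strip_cognitive_tags(sample))  # The ball costs five cents.
```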

See full details at hitonet/hito-1.7b.


โš–๏ธ Licensing

Component License Commercial Use
Model Weights (GGUF files) Apache 2.0 โœ… Free to use
NCR Method/Architecture CC BY-NC-ND โŒ Requires paid license

Commercial Licensing Required

The model weights (these GGUF files) are open source (Apache 2.0) - use them freely.

The Nested Cognitive Reasoning methodology (the cognitive tags, tree-structured thinking, humble tags system) is protected under CC BY-NC-ND.

Commercial use of the NCR method requires a license.

Contact: [email protected]




Made with genuine curiosity by Hitonet

By: Hitonet Research

Model size: 1.7B parameters · Architecture: qwen3

Model tree for hitonet/hito-1.7b-GGUF

Qwen/Qwen3-1.7B → hitonet/hito-1.7b (finetuned) → this repository (quantized)