Supertonic-2 CoreML
This repository provides CoreML exports of Supertonic 2 for macOS and iOS.
It focuses on on-device inference with multiple >=8-bit quantization variants.
GitHub repo (code + demo app): https://github.com/Nooder/supertonic-2-coreml
Code & demo
The GitHub repo contains:
- Swift demo app (CoreML pipeline + UI): supertonic2-coreml-ios-test/
- CoreML tooling + tests: scripts/
- Docs: docs/
What is included
- models/: CoreML model packages by variant (>=8-bit only)
- resources/: voice styles, embeddings, and text normalization assets
- manifest.json: list of artifacts with checksums and sizes
- SHA256SUMS: sha256 checksums for all files
- tests/: smoke tests for CoreML model loading
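Before bundling, it can be worth checking the downloaded files against SHA256SUMS. The sketch below assumes the file follows the common sha256sum layout (one `<hex digest>  <relative path>` entry per line), which this README does not confirm. The function names are illustrative and are not part of the repository tooling.

```python
import hashlib
from pathlib import Path


def parse_sha256sums(text: str) -> dict[str, str]:
    """Map relative path -> expected hex digest from sha256sum-style lines."""
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, _, name = line.partition(" ")
        # sha256sum writes "<digest>  <path>" (text) or "<digest> *<path>" (binary).
        expected[name.lstrip(" *")] = digest.lower()
    return expected


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model weights are never fully in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_mismatches(root: Path, sums_text: str) -> list[str]:
    """Return the listed paths that are missing or whose digest differs."""
    bad = []
    for name, digest in parse_sha256sums(sums_text).items():
        target = root / name
        if not target.is_file() or sha256_of_file(target) != digest:
            bad.append(name)
    return bad
```

Calling `find_mismatches(Path("."), Path("SHA256SUMS").read_text())` from the download root should return an empty list when every listed file is present and intact.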
Quickstart (iOS / macOS)
- Pick a variant from models/ (see the quant matrix in docs/quant-matrix.md).
- Bundle the corresponding CoreML packages and resources/ into your app.
- Use the Swift demo app in the GitHub repo (supertonic-2-coreml) as the reference implementation.
Required files (checklist)
Bundle the following into your app:
- CoreML packages for your chosen variant:
  - duration_predictor_mlprogram.mlpackage
  - text_encoder_mlprogram.mlpackage
  - vector_estimator_mlprogram.mlpackage
  - vocoder_mlprogram.mlpackage
- resources/voice_styles/
- resources/embeddings/
- resources/onnx/unicode_indexer.json
- resources/onnx/tts.json
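The checklist above can be turned into a quick pre-build check. This is a minimal sketch, assuming the variant folder and the resources folder are plain directories on disk. `missing_items` is a hypothetical helper written for illustration, not something shipped in scripts/.

```python
from pathlib import Path

# File and folder names taken from the checklist above.
MODEL_PACKAGES = [
    "duration_predictor_mlprogram.mlpackage",
    "text_encoder_mlprogram.mlpackage",
    "vector_estimator_mlprogram.mlpackage",
    "vocoder_mlprogram.mlpackage",
]
RESOURCE_PATHS = [
    "voice_styles",
    "embeddings",
    "onnx/unicode_indexer.json",
    "onnx/tts.json",
]


def missing_items(variant_dir: Path, resources_dir: Path) -> list[str]:
    """Return every required package or resource that does not exist."""
    missing = []
    for name in MODEL_PACKAGES:
        # .mlpackage bundles are directories, so exists() is the right check.
        if not (variant_dir / name).exists():
            missing.append(str(variant_dir / name))
    for rel in RESOURCE_PATHS:
        if not (resources_dir / rel).exists():
            missing.append(str(resources_dir / rel))
    return missing
```

An empty return value means all eight checklist entries were found; anything else lists what still needs to be copied into the app.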
Minimal iOS integration
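    // TTSService is defined in the demo app (supertonic2-coreml-ios-test/).
    // Both calls can throw, so run this inside a do/catch block or a throwing function.
    let service = try TTSService(computeUnits: .all)
    let result = try service.synthesize(
        text: "Hello from CoreML!",
        language: .en,
        voiceName: "F1",
        steps: 20,
        speed: 1.0,
        silenceSeconds: 0.3
    )
    print("WAV file:", result.url)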
To select a specific variant, update the CoreML folder name in TTSService (the demo defaults to coreml_int8).
Example: iOS 18 int8_both
This variant uses int8 weights for multiple stages on iOS 18.
Bundle these files in your app:
    Resources/
      coreml_ios18_int8_both/
        duration_predictor_mlprogram.mlpackage
        text_encoder_mlprogram.mlpackage
        vector_estimator_mlprogram.mlpackage
        vocoder_mlprogram.mlpackage
      voice_styles/
      embeddings/
      onnx/
        unicode_indexer.json
        tts.json
In the Swift demo app, update the CoreML folder name to point at coreml_ios18_int8_both (the app defaults to coreml_int8).
Choosing a variant
Use the folder naming to select the right artifact:
- coreml_int8: faster than the full-precision coreml baseline, at lower fidelity
- coreml_compressed: smaller memory footprint (linear8)
- coreml_ios18_*: built for the iOS 18 CoreML runtime (>=8-bit only)

4-bit variants are intentionally excluded because of their quality loss.
Variant matrix (quick view)
| Variant folder | Quantization (by name) | Intended target | Notes |
| --- | --- | --- | --- |
| coreml | full precision (mixed) | general | baseline quality |
| coreml_int8 | int8 (all stages) | general | faster, lower fidelity |
| coreml_compressed | linear8 | general | smaller memory |
| coreml_ios18 | full precision (mlprogram) | iOS 18+ | best quality on iOS 18 |
| coreml_ios18_int8_vocoder_only | int8 (vocoder only) | iOS 18+ | balanced |
| coreml_ios18_int8_both | int8 (multiple stages) | iOS 18+ | fastest, more loss |
| coreml_compressed_ios18 | linear8 | iOS 18+ | smallest memory |
For deeper guidance, see docs/compatibility-matrix.md and docs/quant-matrix.md.
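The matrix can also be written down as a small lookup, which is handy in build scripts that copy one variant into an app bundle. In this sketch the folder names come from the matrix, but the `target` and `priority` labels are one reading of the "Intended target" and "Notes" columns, not terminology defined by the repository, and both helpers are hypothetical.

```python
from typing import Optional

# (target, priority) -> variant folder, transcribed from the matrix above.
# The priority labels are an interpretation of the Notes column.
VARIANTS = {
    ("general", "quality"): "coreml",
    ("general", "speed"): "coreml_int8",
    ("general", "memory"): "coreml_compressed",
    ("ios18", "quality"): "coreml_ios18",
    ("ios18", "balanced"): "coreml_ios18_int8_vocoder_only",
    ("ios18", "speed"): "coreml_ios18_int8_both",
    ("ios18", "memory"): "coreml_compressed_ios18",
}


def pick_variant(target: str, priority: str) -> Optional[str]:
    """Return the matching folder name, or None if the matrix has no such row."""
    return VARIANTS.get((target, priority))


def requires_ios18(folder: str) -> bool:
    """Every folder with 'ios18' in its name is listed as iOS 18+ in the matrix."""
    return "ios18" in folder
```

For example, `pick_variant("ios18", "balanced")` yields `coreml_ios18_int8_vocoder_only`, while a combination the matrix does not list, such as `("general", "balanced")`, yields `None` rather than a guess.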
Steps vs. quality (quick guide)
| Steps | Speed | Quality |
| --- | --- | --- |
| 10 | fastest | lowest |
| 20 | balanced | good |
| 30 | slowest | best |
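The guide above is qualitative, so it may help to time the three step counts on the hardware you care about. The following is a minimal timing sketch. `synthesize` is a placeholder for whatever callable runs your pipeline with a given step count; it is an assumption for illustration and not an API from this repository.

```python
import time
from typing import Callable, Dict, Iterable


def time_step_counts(
    synthesize: Callable[[int], object],
    step_counts: Iterable[int] = (10, 20, 30),
) -> Dict[int, float]:
    """Run synthesize(steps) once per step count and record wall-clock seconds."""
    timings = {}
    for steps in step_counts:
        start = time.perf_counter()
        synthesize(steps)
        timings[steps] = time.perf_counter() - start
    return timings
```

A single run per step count is noisy, and the first call often includes model load or compilation cost, so repeating the measurement and discarding the first result gives a fairer comparison.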
Troubleshooting
- Missing resource error: Ensure the resources/ folders are bundled and keep their exact names.
- Model not found: Confirm the CoreML folder name (e.g., coreml_ios18_int8_both).
- Fails to load on device: Check that the iOS deployment target matches your variant (folders with ios18 in their name target iOS 18+).
Tests
The tests/test_coreml_models.py script runs a simple smoke test that loads
all stages (duration predictor, text encoder, vector estimator, vocoder) with
dummy inputs.
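For a quick manual check outside that script, the four stages can be loaded with coremltools and their declared inputs listed. This sketch does not reproduce tests/test_coreml_models.py and does not run predictions with dummy inputs. It assumes coremltools is installed, is intended to be run on macOS, and relies on the package file names from the checklist. The helper names are hypothetical.

```python
from pathlib import Path

STAGES = ["duration_predictor", "text_encoder", "vector_estimator", "vocoder"]


def package_path(variant_dir, stage: str) -> Path:
    """Build the .mlpackage path for one stage inside a variant folder."""
    return Path(variant_dir) / f"{stage}_mlprogram.mlpackage"


def load_and_describe(variant_dir) -> dict:
    """Load each stage and return its declared input names."""
    # Imported lazily so the path helper works without coremltools installed.
    import coremltools as ct

    described = {}
    for stage in STAGES:
        model = ct.models.MLModel(str(package_path(variant_dir, stage)))
        spec = model.get_spec()
        described[stage] = [inp.name for inp in spec.description.input]
    return described
```

If any stage fails to load, the exception raised by `MLModel` names the offending package, which narrows down a bad download or a mismatched variant folder.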
Attribution and license
This CoreML export is derived from Supertone/supertonic-2.
Model weights are licensed under OpenRAIL-M (see LICENSE).
Sample code is MIT-licensed (see NOTICE and UPSTREAM.md).