Improve model size detection: replace ad-hoc string parsing with reliable params_b field in MODELS dict ab92e0d Luigi committed on Oct 12, 2025
Set better defaults for free-tier users: Qwen3-1.7B model, 1024 max tokens, search disabled 2cae073 Luigi committed on Oct 12, 2025
Adjust duration estimation for H200 performance: reduce overly conservative estimates de766da Luigi committed on Oct 12, 2025
Use actual parameter count for AOT decision instead of string matching e3e334f Luigi committed on Oct 12, 2025
Make AOT compilation conditional for models >= 2B parameters to optimize free tier usage 4500f92 Luigi committed on Oct 12, 2025
add four 20B+ models after enabling dynamic GPU duration fea2910 verified Luigi committed on Oct 12, 2025
disable two models that cannot run or run too slowly on HF Spaces with ZeroGPU 3dc7ced Luigi committed on Oct 11, 2025
feat(models): add Granite-4.0-Micro and Qwen3-4B-Instruct-2507 to MODELS registry c30a7f7 verified Luigi committed on Oct 9, 2025
add parser_model_ner_gemma_v0 based on gemma 3 370m it bc1bd75 verified Luigi committed on Aug 29, 2025
remove previously added breeze models (as they didn't work), add smollm 135m taiwan b3fd72e Luigi committed on Aug 4, 2025