Surn committed on
Commit
d57c213
·
1 Parent(s): 850b1df

Enhance AI word generation and update to v0.1.1


Updated Wrdler to version 0.1.1 with significant enhancements to AI word generation, including intelligent word saving, retry mechanisms, and a 1000-word file size limit. Introduced dual generation modes using Hugging Face Space API and local transformers. Improved logging for better pipeline visibility. Updated documentation and added a new test for word file validation. Added new AI-generated word lists and a cooking-themed word list.

CLAUDE.md CHANGED
@@ -7,13 +7,22 @@ Wrdler is a simplified vocabulary puzzle game based on BattleWords, with these k
7
  - **No scope/radar visualization**
8
  - **2 free letter guesses at game start** (all instances of chosen letters are revealed)
9
 
10
- **Current Version:** 0.1.0
11
  **Repository:** https://github.com/Oncorporation/Wrdler.git
12
  **Live Demo:** [DEPLOYMENT_URL_HERE]
13
 
14
  ## Recent Changes
15
 
16
- **v0.1.0 (Current):**
17
  - ✅ Version updated to 0.1.0 across all files
18
  - ✅ AI word generation functionality added
19
  - ✅ Word list management enhanced with AI support
@@ -157,12 +166,21 @@ wrdler/
157
  - No gameplay logic changes required
158
  - Works offline for basic functionality
159
 
160
- ### ✅ AI Word Generation (v0.1.0)
161
  - **AI-Powered Word Lists:** Generate custom word lists using Hugging Face Spaces or local transformers
162
  - **Topic-Based Generation:** Create words related to specific themes (e.g., "Ocean Life", "Space")
163
- - **Automatic Expansion:** New AI-generated words are saved to local files for future use
 
 
 
 
 
164
  - **Fallback Support:** Gracefully falls back to dictionary words if AI is unavailable
165
  - **Word Distribution:** Ensures exactly 25 words each of lengths 4, 5, and 6 per topic
 
 
 
 
166
 
167
  ### PLANNED: Local Player Storage (v0.3.0)
168
  - **Local Storage:**
@@ -210,7 +228,7 @@ wrdler/
210
 
211
  ### Development Status
212
 
213
- **Current Version:** 0.1.0 (Complete)
214
  - ✅ All 7 sprints complete
215
  - ✅ 100% test coverage (25/25 tests)
216
  - ✅ AI word generation implemented
@@ -347,6 +365,15 @@ The dataset repository will contain:
347
 
348
  ## Post-v0.0.2 Enhancements
349
 
 
 
 
 
 
 
 
 
 
350
  ### v0.1.0 (AI Word Generation)
351
  - AI-powered word list generation using Hugging Face Spaces
352
  - Topic-based word creation with automatic saving
@@ -496,19 +523,15 @@ From `pyproject.toml`:
496
 
497
  ## Version History Summary
498
 
499
- - **v0.1.0** (Current) - AI word generation, utility modules, version bump
500
- - **v0.0.4** (Previous) - Documentation sync, version update
 
501
  - **v0.0.2-0.0.3** - All 7 sprints complete, core Wrdler features
502
  - **v0.2.20-0.2.29** - Challenge Mode, PWA, remote storage (inherited from BattleWords)
503
  - **v0.1.x** - Initial BattleWords releases before Wrdler fork
504
 
505
- See README.md for complete changelog.
506
-
507
  ---
508
 
509
  **Last Updated:** 2025-01-31
510
- **Current Version:** 0.1.0
511
- **Status:** Production Ready - All Features Complete
512
-
513
- ## Test File Location
514
- All test files must be placed in the `/tests` folder. This ensures a clean project structure and makes it easy to discover and run all tests.
 
7
  - **No scope/radar visualization**
8
  - **2 free letter guesses at game start** (all instances of chosen letters are revealed)
9
 
10
+ **Current Version:** 0.1.1
11
  **Repository:** https://github.com/Oncorporation/Wrdler.git
12
  **Live Demo:** [DEPLOYMENT_URL_HERE]
13
 
14
  ## Recent Changes
15
 
16
+ **v0.1.1 (Current):**
17
+ - ✅ Enhanced AI word generation logic with intelligent word saving
18
+ - ✅ Automatic retry mechanism for insufficient word counts (up to 3 retries)
19
+ - ✅ 1000-word file size limit to prevent dictionary bloat
20
+ - ✅ Better new word detection (separates existing vs. new words before saving)
21
+ - ✅ Improved HF Space API integration with graceful fallback to local models
22
+ - ✅ Additional word generation when initial pass doesn't meet MIN_REQUIRED threshold
23
+ - ✅ Enhanced logging for word generation pipeline visibility
24
+
25
+ **v0.1.0 (Previous):**
26
  - ✅ Version updated to 0.1.0 across all files
27
  - ✅ AI word generation functionality added
28
  - ✅ Word list management enhanced with AI support
 
166
  - No gameplay logic changes required
167
  - Works offline for basic functionality
168
 
169
+ ### ✅ AI Word Generation (v0.1.0+)
170
  - **AI-Powered Word Lists:** Generate custom word lists using Hugging Face Spaces or local transformers
171
  - **Topic-Based Generation:** Create words related to specific themes (e.g., "Ocean Life", "Space")
172
+ - **Automatic Word Expansion:** New AI-generated words are saved to local files for future use
173
+ - Intelligent word detection: separates existing dictionary words from new AI-generated words
174
+ - Only new words are saved to prevent duplicates
175
+ - Automatic retry mechanism (up to 3 attempts) if insufficient words generated
176
+ - 1000-word file size limit prevents dictionary bloat
177
+ - Files auto-sorted by length then alphabetically
178
  - **Fallback Support:** Gracefully falls back to dictionary words if AI is unavailable
179
  - **Word Distribution:** Ensures exactly 25 words each of lengths 4, 5, and 6 per topic
180
+ - **Dual Generation Modes:**
181
+ - **HF Space API** (primary): Uses Hugging Face Space for word generation when `USE_HF_WORDS=true`
182
+ - **Local Models** (fallback): Falls back to local transformers models if HF Space unavailable
183
+ - **Enhanced Logging:** Detailed pipeline visibility for debugging and monitoring
184
 
185
  ### PLANNED: Local Player Storage (v0.3.0)
186
  - **Local Storage:**
 
228
 
229
  ### Development Status
230
 
231
+ **Current Version:** 0.1.1 (Complete)
232
  - ✅ All 7 sprints complete
233
  - ✅ 100% test coverage (25/25 tests)
234
  - ✅ AI word generation implemented
 
365
 
366
  ## Post-v0.0.2 Enhancements
367
 
368
+ ### v0.1.1 (AI Word Generation Enhancement)
369
+ - Enhanced AI word generation with intelligent word saving
370
+ - Automatic retry mechanism for insufficient word counts (up to 3 retries)
371
+ - 1000-word file size limit to prevent dictionary bloat
372
+ - Improved new word detection (separates existing vs. new words)
373
+ - Better HF Space API integration with fallback to local models
374
+ - Additional word generation when MIN_REQUIRED threshold not met
375
+ - Enhanced logging for generation pipeline visibility
376
+
377
  ### v0.1.0 (AI Word Generation)
378
  - AI-powered word list generation using Hugging Face Spaces
379
  - Topic-based word creation with automatic saving
 
523
 
524
  ## Version History Summary
525
 
526
+ - **v0.1.1** (Current) - Enhanced AI word generation with intelligent saving, retry logic, file size limits
527
+ - **v0.1.0** (Previous) - AI word generation, utility modules, version bump
528
+ - **v0.0.4** - Documentation sync, version update
529
  - **v0.0.2-0.0.3** - All 7 sprints complete, core Wrdler features
530
  - **v0.2.20-0.2.29** - Challenge Mode, PWA, remote storage (inherited from BattleWords)
531
  - **v0.1.x** - Initial BattleWords releases before Wrdler fork
532
 
 
 
533
  ---
534
 
535
  **Last Updated:** 2025-01-31
536
+ **Current Version:** 0.1.1
537
+ **Status:** Production Ready - AI Enhanced
 
 
 
README.md CHANGED
@@ -13,7 +13,8 @@ tags:
13
  - vocabulary
14
  - streamlit
15
  - education
16
- short_description: Fast paced word guessing game
 
17
  thumbnail: >-
18
  https://cdn-uploads.huggingface.co/production/uploads/6346595c9e5f0fe83fc60444/6rWS4AIaozoNMCbx9F5Rv.png
19
  ---
@@ -50,6 +51,20 @@ Wrdler is a vocabulary learning game with a simplified grid and strategic letter
50
  - Sound effects for hits, misses, correct/incorrect guesses
51
  - Responsive UI built with Streamlit
52
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
53
  ### Customization
54
  - Multiple word lists (classic, fourth_grade, wordlist)
55
  - Wordlist sidebar controls (picker + one-click sort)
@@ -181,6 +196,22 @@ All test files must be placed in the `/tests` folder. This ensures a clean proje
181
 
182
  ## Changelog
183
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
184
  ### v0.0.8
185
  - remove background animation
186
  - add "easy" mode (single guess per reveal)
 
13
  - vocabulary
14
  - streamlit
15
  - education
16
+ - ai
17
+ short_description: Fast paced word guessing game with AI-generated word lists
18
  thumbnail: >-
19
  https://cdn-uploads.huggingface.co/production/uploads/6346595c9e5f0fe83fc60444/6rWS4AIaozoNMCbx9F5Rv.png
20
  ---
 
51
  - Sound effects for hits, misses, correct/incorrect guesses
52
  - Responsive UI built with Streamlit
53
 
54
+ ### AI Word Generation
55
+ - **Topic-based word lists**: Generate custom word lists using AI for any theme
56
+ - **Intelligent word expansion**: New AI-generated words automatically saved to local files
57
+ - Smart detection separates existing dictionary words from new AI words
58
+ - Only saves new words to prevent duplicates
59
+ - Automatic retry mechanism (up to 3 attempts) for insufficient word counts
60
+ - 1000-word file size limit prevents bloat
61
+ - Auto-sorted by length then alphabetically
62
+ - **Dual generation modes**:
63
+ - **HF Space API** (primary): Uses Hugging Face Space when `USE_HF_WORDS=true`
64
+ - **Local transformers** (fallback): Falls back to local models if HF unavailable
65
+ - **Fallback support**: Gracefully uses dictionary words if AI generation fails
66
+ - **Guaranteed distribution**: Ensures exactly 25 words each of lengths 4, 5, and 6
67
+
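The "intelligent word expansion" bullets above can be illustrated with a small sketch. It is not the code from `word_loader_ai.py`; the helper name `merge_new_words` and the constant `MAX_FILE_WORDS` are hypothetical, and the sketch only assumes what the README states: new words are separated from existing ones, only new words are added, the file is capped at 1000 words, and the result is sorted by length and then alphabetically.

```python
from typing import Iterable, List, Tuple

MAX_FILE_WORDS = 1000  # assumed name for the documented 1000-word cap


def merge_new_words(
    existing: Iterable[str],
    candidates: Iterable[str],
    limit: int = MAX_FILE_WORDS,
) -> Tuple[List[str], List[str]]:
    """Return (merged_sorted_words, newly_added_words)."""
    # Normalize the existing entries and drop duplicates, keeping order.
    merged = list(dict.fromkeys(w.strip().upper() for w in existing if w.strip()))
    seen = set(merged)
    added: List[str] = []
    for raw in candidates:
        if len(merged) >= limit:
            break  # the list is full: stop adding
        word = raw.strip().upper()
        if word and word not in seen:
            seen.add(word)
            merged.append(word)
            added.append(word)
    # Sort by length first, then alphabetically within each length.
    merged.sort(key=lambda w: (len(w), w))
    return merged, added
```

Calling `merge_new_words(["COOK", "BAKE"], ["bake", "ROAST"])` would treat `bake` as already present and add only `ROAST`.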
68
  ### Customization
69
  - Multiple word lists (classic, fourth_grade, wordlist)
70
  - Wordlist sidebar controls (picker + one-click sort)
 
196
 
197
  ## Changelog
198
 
199
+ ### v0.1.1 (Current)
200
+ - ✅ Enhanced AI word generation with intelligent word saving
201
+ - ✅ Automatic retry mechanism for insufficient word counts (up to 3 retries)
202
+ - ✅ 1000-word file size limit to prevent dictionary bloat
203
+ - ✅ Improved new word detection (separates existing vs. new words before saving)
204
+ - ✅ Better HF Space API integration with graceful fallback to local models
205
+ - ✅ Additional word generation when initial pass doesn't meet MIN_REQUIRED threshold
206
+ - ✅ Enhanced logging for word generation pipeline visibility
207
+
208
+ ### v0.1.0
209
+ - ✅ AI word generation functionality added
210
+ - ✅ Topic-based custom word list creation
211
+ - ✅ Dual generation modes (HF Space API + local transformers)
212
+ - ✅ Utility modules integration (storage, file_utils, constants)
213
+ - ✅ Documentation synchronized across all files
214
+
215
  ### v0.0.8
216
  - remove background animation
217
  - add "easy" mode (single guess per reveal)
env.template CHANGED
@@ -15,6 +15,7 @@ TMPDIR=/tmp
15
 
16
  # Flash attention setting (optional)
17
  # USE_FLASH_ATTENTION=1
 
18
 
19
  CRYPTO_PK=btc_public_key_here
20
  IS_LOCAL=true
 
15
 
16
  # Flash attention setting (optional)
17
  # USE_FLASH_ATTENTION=1
18
+ TF_ENABLE_ONEDNN_OPTS=0
19
 
20
  CRYPTO_PK=btc_public_key_here
21
  IS_LOCAL=true
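For readers unfamiliar with the new setting: `TF_ENABLE_ONEDNN_OPTS=0` is a TensorFlow environment variable that turns off oneDNN custom operations, and it only takes effect if it is set before TensorFlow is imported. How Wrdler loads `env.template` is not shown in this diff, so the loader below is a hypothetical sketch (`apply_env_lines` is an invented name) of applying such `KEY=VALUE` lines early in startup.

```python
from typing import Iterable, MutableMapping


def apply_env_lines(
    lines: Iterable[str], environ: MutableMapping[str, str]
) -> MutableMapping[str, str]:
    """Apply KEY=VALUE lines to environ, skipping blanks and # comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault keeps any value already exported by the shell.
        environ.setdefault(key.strip(), value.strip())
    return environ
```

In a real application the second argument would be `os.environ`, and the call would need to run before any module imports TensorFlow.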
specs/requirements.md CHANGED
@@ -1,11 +1,11 @@
1
  # Wrdler: Implementation Requirements
2
- **Version:** 0.1.0
3
- **Status:** All Features Complete - Ready for Deployment
4
  **Last Updated:** 2025-01-31
5
 
6
  This document breaks down the implementation tasks for Wrdler using the game rules described in `specs.md`. Wrdler is based on BattleWords but with a simplified 8x6 grid, horizontal-only words, and free letter guesses at the start.
7
 
8
- **Current Status:** ✅ All Phase 1 requirements complete, 100% tested (25/25 tests passing)
9
 
10
  ## Key Differences from BattleWords
11
  - 8x6 grid instead of 12x12
@@ -14,14 +14,15 @@ This document breaks down the implementation tasks for Wrdler using the game rul
14
  - No radar/scope visualization
15
  - 2 free letter guesses at game start
16
 
17
- ## Implementation Details (v0.0.2)
18
- - **Tech Stack:** Python 3.12.8, Streamlit 1.51.0, numpy, matplotlib
19
  - **Architecture:** Single-player, local state in Streamlit session state
20
  - **Grid:** 8 columns × 6 rows (48 cells) with exactly six words
21
  - **Word Placement:** Horizontal-only, one word per row, no overlaps
 
22
  - **Entry Point:** `app.py`
23
  - **Testing:** pytest with 25/25 tests passing (100%)
24
- - **Development Time:** ~12.75 hours across 7 sprints
25
 
26
  ## Streamlit Components (Implemented in v0.0.2)
27
  - State & caching ✅
@@ -88,15 +89,27 @@ This document breaks down the implementation tasks for Wrdler using the game rul
88
 
89
  **Acceptance:** ✅ Types implemented and fully integrated (13/13 tests passing)
90
 
91
- ### 2) Word List Management ✅ (Sprint 1)
92
  - ✅ English word list filtered to alphabetic uppercase, lengths in {4,5,6}
93
  - ✅ Loader centralized in `word_loader.py` with caching
94
  - ✅ Three word lists: classic, fourth_grade, wordlist
95
- - ✅ AI word generation support via `word_loader_ai.py` (generates 75 words per topic)
 
 
 
 
 
 
 
 
 
 
 
 
96
  - ✅ Unified loader (`load_word_list_or_ai`) routes between file-based and AI-generated words
97
  - ✅ Saves new AI-generated words to local files for expansion
98
 
99
- **Acceptance:** ✅ Loading function returns lists by length with >= 25 words per length
100
 
101
  ### 3) Puzzle Generation (8x6 Horizontal) ✅ (Sprint 2)
102
  - ✅ Randomly place 6 words on 8x6 grid, one per row
@@ -194,6 +207,12 @@ This document breaks down the implementation tasks for Wrdler using the game rul
194
  - 📋 Player statistics
195
  - 📋 Enhanced UI animations
196
 
 
 
 
 
 
 
197
  ### v1.0.0 (Long Term)
198
  - 📋 Multiple difficulty levels
199
  - 📋 Daily puzzle mode
@@ -214,8 +233,60 @@ This document breaks down the implementation tasks for Wrdler using the game rul
214
  ---
215
 
216
  **Last Updated:** 2025-01-31
217
- **Version:** 0.1.0
218
- **Status:** All Features Complete - Ready for Deployment 🚀
219
 
220
  ## Test File Location
221
  All test files must be placed in the `/tests` folder. This ensures a clean project structure and makes it easy to discover and run all tests.
 
1
  # Wrdler: Implementation Requirements
2
+ **Version:** 0.1.1
3
+ **Status:** Production Ready - AI Enhanced
4
  **Last Updated:** 2025-01-31
5
 
6
  This document breaks down the implementation tasks for Wrdler using the game rules described in `specs.md`. Wrdler is based on BattleWords but with a simplified 8x6 grid, horizontal-only words, and free letter guesses at the start.
7
 
8
+ **Current Status:** ✅ All Phase 1 requirements complete, 100% tested (25/25 tests passing), AI word generation enhanced in v0.1.1
9
 
10
  ## Key Differences from BattleWords
11
  - 8x6 grid instead of 12x12
 
14
  - No radar/scope visualization
15
  - 2 free letter guesses at game start
16
 
17
+ ## Implementation Details (v0.1.1)
18
+ - **Tech Stack:** Python 3.12.8, Streamlit 1.51.0, numpy, matplotlib, transformers, gradio_client
19
  - **Architecture:** Single-player, local state in Streamlit session state
20
  - **Grid:** 8 columns × 6 rows (48 cells) with exactly six words
21
  - **Word Placement:** Horizontal-only, one word per row, no overlaps
22
+ - **AI Generation:** Topic-based word lists with intelligent saving and retry logic
23
  - **Entry Point:** `app.py`
24
  - **Testing:** pytest with 25/25 tests passing (100%)
25
+ - **Development Time:** ~12.75 hours across 7 sprints (Phase 1) + AI enhancements
26
 
27
  ## Streamlit Components (Implemented in v0.0.2)
28
  - State & caching ✅
 
89
 
90
  **Acceptance:** ✅ Types implemented and fully integrated (13/13 tests passing)
91
 
92
+ ### 2) Word List Management ✅ (Sprint 1, Enhanced in v0.1.0-0.1.1)
93
  - ✅ English word list filtered to alphabetic uppercase, lengths in {4,5,6}
94
  - ✅ Loader centralized in `word_loader.py` with caching
95
  - ✅ Three word lists: classic, fourth_grade, wordlist
96
+ - ✅ **AI word generation** support via `word_loader_ai.py`:
97
+ - Generates 75 words per topic (25 each of lengths 4, 5, 6)
98
+ - **Dual generation modes** (v0.1.0+):
99
+ - HF Space API (primary): Uses Hugging Face Space when `USE_HF_WORDS=true`
100
+ - Local transformers (fallback): Falls back to local models if HF unavailable
101
+ - **Intelligent word saving** (v0.1.1):
102
+ - Smart detection separates existing dictionary words from new AI-generated words
103
+ - Only saves new words to prevent duplicates
104
+ - Automatic retry mechanism (up to 3 attempts) if insufficient words generated
105
+ - 1000-word file size limit prevents dictionary bloat
106
+ - Auto-sorted by length then alphabetically
107
+ - **Additional word generation**: Automatically generates more words when MIN_REQUIRED threshold not met
108
+ - **Enhanced logging**: Detailed pipeline visibility for debugging
109
  - ✅ Unified loader (`load_word_list_or_ai`) routes between file-based and AI-generated words
110
  - ✅ Saves new AI-generated words to local files for expansion
111
 
112
+ **Acceptance:** ✅ Loading function returns lists by length with >= 25 words per length; AI generation produces valid words with intelligent saving and retry logic
113
 
114
  ### 3) Puzzle Generation (8x6 Horizontal) ✅ (Sprint 2)
115
  - ✅ Randomly place 6 words on 8x6 grid, one per row
 
207
  - 📋 Player statistics
208
  - 📋 Enhanced UI animations
209
 
210
+ ### v0.4.0 (AI Expansion)
211
+ - 📋 AI difficulty tuning based on player performance
212
+ - 📋 Custom topic suggestions
213
+ - 📋 Multi-language word generation
214
+ - 📋 Word difficulty analysis and visualization
215
+
216
  ### v1.0.0 (Long Term)
217
  - 📋 Multiple difficulty levels
218
  - 📋 Daily puzzle mode
 
233
  ---
234
 
235
  **Last Updated:** 2025-01-31
236
+ **Version:** 0.1.1
237
+ **Status:** Production Ready - AI Enhanced 🚀
238
+
239
+ ## AI Word Generation Pipeline (v0.1.1)
240
+
241
+ ### Architecture
242
+ ```
243
+ User Input (Topic)
244
+
245
+ Check USE_HF_WORDS flag
246
+
247
+ ┌─────────────────────────────────────┐
248
+ │ HF Space API (Primary) │
249
+ │ - gradio_client integration │
250
+ │ - Temperature: 0.95 │
251
+ │ - Max tokens: 512 │
252
+ └─────────────────────────────────────┘
253
+ ↓ (if fails or USE_HF_WORDS=false)
254
+ ┌─────────────────────────────────────┐
255
+ │ Local Transformers (Fallback) │
256
+ │ - Auto model selection │
257
+ │ - Device auto-detection │
258
+ │ - Temperature: 0.7 │
259
+ └─────────────────────────────────────┘
260
+
261
+ Parse & Filter Words
262
+
263
+ Identify New vs Existing
264
+
265
+ Check MIN_REQUIRED threshold
266
+ ↓ (if insufficient)
267
+ Generate Additional Words (up to 3 retries)
268
+
269
+ Save New Words to File
270
+
271
+ Validate & Sort File
272
+
273
+ Return 75 Words for Game
274
+ ```
275
+
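The top half of the architecture diagram (HF Space first, local transformers second, dictionary words last) can be sketched as a simple backend chain. This is an illustration under stated assumptions, not the project's implementation: `generate_with_fallback` and its callable parameters are hypothetical stand-ins for the real HF Space and local-model code paths.

```python
from typing import Callable, List, Sequence, Tuple

Backend = Callable[[str], List[str]]


def generate_with_fallback(
    topic: str,
    use_hf_words: bool,
    hf_backend: Backend,
    local_backend: Backend,
    dictionary_words: Sequence[str],
) -> Tuple[List[str], str]:
    """Return (words, source), trying backends in priority order."""
    backends: List[Tuple[str, Backend]] = []
    if use_hf_words:  # mirrors the USE_HF_WORDS flag
        backends.append(("hf_space", hf_backend))
    backends.append(("local", local_backend))

    for source, backend in backends:
        try:
            words = backend(topic)
        except Exception:
            continue  # this backend failed: try the next one
        if words:
            return words, source
    # Every AI backend failed or returned nothing: use dictionary words.
    return list(dictionary_words), "dictionary"
```

Returning the source label alongside the words makes it easy to log which path produced a given list, which fits the "enhanced logging" goal of this release.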
276
+ ### Word Saving Strategy
277
+ 1. **Detection Phase**: Separate new AI words from existing dictionary words
278
+ 2. **Validation Phase**: Check if file meets MIN_REQUIRED (25 words per length)
279
+ 3. **Retry Phase**: If insufficient, generate additional words (up to 3 attempts)
280
+ 4. **Save Phase**: Write only new words to topic-based file
281
+ 5. **Sort Phase**: Auto-sort by length then alphabetically
282
+ 6. **Limit Phase**: Stop adding if file reaches 1000 words
283
+
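The validation and retry phases above can be sketched as follows. The new test in this commit shows `_save_ai_words_to_file` returning a filename together with a collection of insufficient lengths; everything else here is assumed for illustration, including the helper names `find_insufficient` and `top_up`, the shape of the returned mapping (length to number of missing words), and the `generate_more` callback.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

VALID_LENGTHS = (4, 5, 6)
MIN_REQUIRED = 25  # words needed per length, per the docs
MAX_RETRIES = 3    # "up to 3 attempts"


def find_insufficient(
    words: Iterable[str], min_required: int = MIN_REQUIRED
) -> Dict[int, int]:
    """Map each under-filled length to the number of words still missing."""
    counts = Counter(len(w) for w in set(words))
    return {
        length: min_required - counts[length]
        for length in VALID_LENGTHS
        if counts[length] < min_required
    }


def top_up(
    words: Iterable[str],
    generate_more: Callable[[Dict[int, int]], List[str]],
    min_required: int = MIN_REQUIRED,
    max_retries: int = MAX_RETRIES,
) -> Tuple[List[str], Dict[int, int]]:
    """Request extra words until every length is covered or retries run out."""
    pool = list(dict.fromkeys(words))
    for _ in range(max_retries):
        missing = find_insufficient(pool, min_required)
        if not missing:
            break
        for word in generate_more(missing):
            if len(word) in VALID_LENGTHS and word not in pool:
                pool.append(word)
    return pool, find_insufficient(pool, min_required)
```

Passing the `missing` mapping to the generator is what allows a "targeted prompt": the follow-up request can ask only for the lengths that are still short.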
284
+ ### Error Handling
285
+ - **HF Space API failure**: Graceful fallback to local model
286
+ - **Model loading failure**: Try multiple models in priority order
287
+ - **Device compatibility**: Retry pipeline without device parameter on error 422
288
+ - **Insufficient words**: Automatic retry with targeted prompts
289
+ - **File operations**: Detailed logging and error recovery
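Two of the error-handling rules above (try multiple models in priority order, and retry pipeline construction without the device parameter) can be combined in one loop. The sketch below is hypothetical: `load_first_available` and the `build_pipeline` callable are invented for illustration, and it retries on any exception rather than on the specific error the document mentions.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


def load_first_available(
    model_names: Sequence[str],
    build_pipeline: Callable[..., Any],
    device: Optional[int] = None,
) -> Tuple[Optional[str], Any]:
    """Return (model_name, pipeline) for the first model that loads."""
    for name in model_names:
        # First try with the device argument, then once more without it.
        attempts = [{"device": device}, {}] if device is not None else [{}]
        for kwargs in attempts:
            try:
                return name, build_pipeline(name, **kwargs)
            except Exception:
                continue  # try the next attempt or the next model
    # Nothing loaded: the caller can fall back to dictionary words.
    return None, None
```

Returning `(None, None)` instead of raising keeps the dictionary fallback a normal code path rather than an exceptional one.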
290
 
291
  ## Test File Location
292
  All test files must be placed in the `/tests` folder. This ensures a clean project structure and makes it easy to discover and run all tests.
specs/specs.md CHANGED
@@ -1,6 +1,6 @@
1
  # Wrdler Game Specifications (specs.md)
2
- **Version:** 0.1.0
3
- **Status:** All Features Complete - Ready for Deployment
4
  **Last Updated:** 2025-01-31
5
 
6
  ## Overview
@@ -58,7 +58,22 @@ Wrdler is a simplified vocabulary puzzle game based on BattleWords, but with key
58
  - ✅ 10 incorrect guess limit per game
59
  - ✅ Two game modes: Classic (chain guesses) and Too Easy (single guess per reveal)
60
 
61
- ## Implemented Features (v0.0.2)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
62
 
63
  ### Challenge Mode
64
  - ✅ **Game ID Sharing:** Each puzzle generates a shareable link with `?game_id=<sid>` to challenge others with the same word list
@@ -175,19 +190,25 @@ HF_REPO_ID/
175
 
176
  ## Development Status
177
 
178
- ### Completed (v0.0.2)
179
- All 7 sprints complete, 100% test coverage (25/25 tests passing):
180
- - **Sprint 1:** Core data models (rectangular grid support)
181
- - **Sprint 2:** Puzzle generator (horizontal-only, one-per-row)
182
- - **Sprint 3:** Radar visualization removal
183
- - **Sprint 4:** Free letter selection UI
184
- - **Sprint 5:** Grid UI updates for 8×6 display
185
- - **Sprint 6:** Integration testing
186
- - **Sprint 7:** Documentation finalization
187
-
188
- **Development Time:** ~12.75 hours
189
- **Test Pass Rate:** 100% (25/25 tests)
190
- **Status:** Ready for deployment! 🚀
 
 
 
 
 
 
191
 
192
  ### Future Roadmap
193
  - **v0.3.0:** Local persistent storage, high score tracking, player statistics
 
1
  # Wrdler Game Specifications (specs.md)
2
+ **Version:** 0.1.1
3
+ **Status:** Production Ready - AI Enhanced
4
  **Last Updated:** 2025-01-31
5
 
6
  ## Overview
 
58
  - ✅ 10 incorrect guess limit per game
59
  - ✅ Two game modes: Classic (chain guesses) and Too Easy (single guess per reveal)
60
 
61
+ ## Implemented Features (v0.1.1)
62
+
63
+ ### AI Word Generation (v0.1.0+)
64
+ - ✅ **Topic-Based Generation:** Create custom word lists for any theme using AI
65
+ - ✅ **Dual Generation Modes:**
66
+ - HF Space API (primary): Uses Hugging Face Space when `USE_HF_WORDS=true`
67
+ - Local transformers (fallback): Falls back to local models if HF unavailable
68
+ - ✅ **Intelligent Word Management:**
69
+ - Smart detection separates existing dictionary words from new AI-generated words
70
+ - Only saves new words to prevent duplicates in word files
71
+ - Automatic retry mechanism (up to 3 attempts) if insufficient words generated
72
+ - 1000-word file size limit prevents dictionary bloat
73
+ - Auto-sorted by length then alphabetically
74
+ - ✅ **Guaranteed Distribution:** Ensures exactly 25 words each of lengths 4, 5, and 6
75
+ - ✅ **Graceful Fallback:** Uses dictionary words if AI generation fails
76
+ - ✅ **Enhanced Logging:** Detailed pipeline visibility for debugging
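The "Guaranteed Distribution" and "Graceful Fallback" items can be sketched together: bucket the AI words by length, then fill any shortfall from dictionary words. Note that the `word_loader_ai.py` hunk in this commit removes the explicit trim step from `_enforce_distribution`, so the real function may now keep extras; the sketch below follows the documented "exactly 25 per length" behavior instead, and the name `enforce_distribution` and its parameters are assumptions.

```python
from typing import Dict, Iterable, List

VALID_LENGTHS = (4, 5, 6)


def enforce_distribution(
    ai_words: Iterable[str],
    dictionary_by_length: Dict[int, List[str]],
    per_length: int = 25,
) -> List[str]:
    """Return up to per_length words for each valid length, AI words first."""
    by_len: Dict[int, List[str]] = {length: [] for length in VALID_LENGTHS}
    for word in ai_words:
        bucket = by_len.get(len(word))
        if bucket is not None and word not in bucket:
            bucket.append(word)

    result: List[str] = []
    for length in VALID_LENGTHS:
        chosen = by_len[length][:per_length]  # trim extras
        for filler in dictionary_by_length.get(length, []):
            if len(chosen) >= per_length:
                break
            if filler not in chosen:
                chosen.append(filler)  # fill gaps from the dictionary
        result.extend(chosen)
    return result
```

If the dictionary itself is too small for some length, this sketch returns fewer words for that length; the project's docstring mentions a further `FALLBACK_WORDS` source for that case.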
77
 
78
  ### Challenge Mode
79
  - ✅ **Game ID Sharing:** Each puzzle generates a shareable link with `?game_id=<sid>` to challenge others with the same word list
 
190
 
191
  ## Development Status
192
 
193
+ **Current Version:** 0.1.1 (Production Ready - AI Enhanced)
194
+
195
+ ### Completed
196
+ - **v0.1.1:** Enhanced AI word generation
197
+ - Intelligent word saving with duplicate prevention
198
+ - Automatic retry mechanism (up to 3 attempts)
199
+ - 1000-word file size limit
200
+ - Improved HF Space API integration
201
+ - Enhanced logging and error handling
202
+
203
+ - **v0.1.0:** AI word generation foundation
204
+ - Topic-based word list creation
205
+ - Dual generation modes (HF Space + local)
206
+ - Utility modules integration
207
+
208
+ - **v0.0.2:** All 7 sprints complete
209
+ - ✅ 100% test coverage (25/25 tests)
210
+ - 📊 Development time: ~12.75 hours (sprints 1-7)
211
+ - 📚 Complete documentation
212
 
213
  ### Future Roadmap
214
  - **v0.3.0:** Local persistent storage, high score tracking, player statistics
tests/test_word_file_validation.py ADDED
@@ -0,0 +1,67 @@
1
+ """
2
+ Test validation of word files for MIN_REQUIRED threshold compliance.
3
+ """
4
+
5
+ import os
6
+ import tempfile
7
+ import shutil
8
+ from wrdler.word_loader_ai import _save_ai_words_to_file
9
+ from wrdler.word_loader import MIN_REQUIRED
10
+
11
+
12
+ def test_save_ai_words_validates_min_required():
13
+ """Test that _save_ai_words_to_file returns insufficiency info."""
14
+ # Create a temporary directory for test files
15
+ test_dir = tempfile.mkdtemp()
16
+
17
+ try:
18
+ # Mock the words directory to point to our temp dir
19
+ import wrdler.word_loader_ai as wl_ai
20
+ original_dirname = wl_ai.os.path.dirname
21
+
22
+ def mock_dirname(path):
23
+ if "word_loader_ai.py" in path:
24
+ return test_dir
25
+ return original_dirname(path)
26
+
27
+ wl_ai.os.path.dirname = mock_dirname
28
+
29
+ # Test case 1: Insufficient words (should return non-empty dict)
30
+ insufficient_words = [
31
+ "COOK", "BAKE", "HEAT", # 3 x 4-letter (need 25)
32
+ "ROAST", "GRILL", "STEAM", # 3 x 5-letter (need 25)
33
+ "SIMMER", "BRAISE", # 2 x 6-letter (need 25)
34
+ ]
35
+
36
+ filename, insufficient = _save_ai_words_to_file("test_topic", insufficient_words)
37
+
38
+ assert filename == "test_topic.txt", f"Expected 'test_topic.txt', got '{filename}'"
39
+ assert len(insufficient) > 0, "Expected insufficient_lengths to be non-empty"
40
+ assert 4 in insufficient, "Expected 4-letter words to be insufficient"
41
+ assert 5 in insufficient, "Expected 5-letter words to be insufficient"
42
+ assert 6 in insufficient, "Expected 6-letter words to be insufficient"
43
+
44
+ # Test case 2: Sufficient words (should return empty dict)
45
+ sufficient_words = []
46
+ for length in [4, 5, 6]:
47
+ for i in range(MIN_REQUIRED):
48
+ # Generate unique words of the required length
49
+ word = chr(65 + (i % 26)) * length + str(i).zfill(length - 1)
50
+ sufficient_words.append(word[:length].upper())
51
+
52
+ filename2, insufficient2 = _save_ai_words_to_file("test_sufficient", sufficient_words)
53
+
54
+ assert filename2 == "test_sufficient.txt", f"Expected 'test_sufficient.txt', got '{filename2}'"
55
+ assert len(insufficient2) == 0, f"Expected empty insufficient_lengths, got {insufficient2}"
56
+
57
+ print("✅ All validation tests passed!")
58
+
59
+ finally:
60
+ # Restore original dirname
61
+ wl_ai.os.path.dirname = original_dirname
62
+ # Clean up temp directory
63
+ shutil.rmtree(test_dir, ignore_errors=True)
64
+
65
+
66
+ if __name__ == "__main__":
67
+ test_save_ai_words_validates_min_required()
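One observation on the test's word generator: `chr(65 + (i % 26)) * length` followed by slicing to `length` yields only repeated-letter words (`AAAA`, `BBBB`, ...), so it produces unique words only while `MIN_REQUIRED` is at most 26. A possible alternative, offered as a sketch rather than a required change, enumerates letter combinations instead; `synthetic_words` is a hypothetical helper name.

```python
from itertools import islice, product
from string import ascii_uppercase
from typing import List


def synthetic_words(length: int, count: int) -> List[str]:
    """Return `count` distinct uppercase A-Z strings of the given length."""
    # product() walks AAAA, AAAB, AAAC, ... so every result is unique.
    combos = product(ascii_uppercase, repeat=length)
    return ["".join(letters) for letters in islice(combos, count)]
```

With this helper, the "sufficient words" case could be built as `[w for n in (4, 5, 6) for w in synthetic_words(n, MIN_REQUIRED)]` regardless of how large the threshold becomes.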
wrdler/__init__.py CHANGED
@@ -8,5 +8,5 @@ Key differences from BattleWords:
8
  - 2 free letter guesses at game start
9
  """
10
 
11
- __version__ = "0.1.0"
12
  __all__ = ["models", "generator", "logic", "ui", "word_loader"]
 
8
  - 2 free letter guesses at game start
9
  """
10
 
11
+ __version__ = "0.1.1"
12
  __all__ = ["models", "generator", "logic", "ui", "word_loader"]
wrdler/modules/__init__.py CHANGED
@@ -4,6 +4,11 @@ Shared utility modules for Wrdler.
4
 
5
  These modules are imported from the OpenBadge project and provide
6
  reusable functionality for storage, constants, and file utilities.
 
 
 
 
 
7
  """
8
 
9
  from .storage import (
 
4
 
5
  These modules are imported from the OpenBadge project and provide
6
  reusable functionality for storage, constants, and file utilities.
7
+
8
+ The AI word generation system (word_loader_ai.py) uses these modules for:
9
+ - File operations and path management (file_utils)
10
+ - Storage configuration and HF integration (constants)
11
+ - Remote storage and URL generation (storage)
12
  """
13
 
14
  from .storage import (
wrdler/word_loader_ai.py CHANGED
@@ -30,7 +30,8 @@ except Exception: # pragma: no cover
30
  # Local imports
31
  from .word_loader import (
32
  load_word_list,
33
- FALLBACK_WORDS,
 
34
  compute_word_difficulties, # Use current v3 difficulty metric
35
  )
36
  from .modules.constants import AI_MODELS, TMPDIR, USE_HF_WORDS, HF_WORD_LIST_REPO_ID
@@ -59,7 +60,7 @@ _USED_MODEL_NAME: Optional[str] = None
59
 
60
  DEFAULT_MODEL_NAME = os.environ.get(
61
  "WRDLER_AI_MODEL",
62
- AI_MODELS[0] if AI_MODELS else "meta-llama/Meta-Llama-3-8B-Instruct"
63
  )
64
 
65
  # Safety: limit max new tokens to keep latency reasonable (increased to accommodate 75+ words)
@@ -71,7 +72,7 @@ BASE_PROMPT_TEMPLATE = (
71
  "Return AT LEAST 75 UNIQUE WORDS related to the topic: '{topic}'.\n"
72
  "FORMAT RULES:\n"
73
  "- Output ONLY a single comma-separated list (no numbering, no extra text)\n"
74
- "- Include at least: 25 words of length 4 letters, 25 words of length 5 letters, 25 words of length 6 letters\n"
75
  "- Use ONLY uppercase A-Z letters (no diacritics, hyphens, or spaces)\n"
76
  "- No duplicates. No explanations.\n"
77
  "List:"
@@ -79,7 +80,7 @@ BASE_PROMPT_TEMPLATE = (
79
 
80
  VALID_LENGTHS = (4, 5, 6)
81
  RE_WORD = re.compile(r"^[A-Z]+$")
82
- WORDS_PER_LENGTH = 25 # Target 25 words for each length
83
 
84
 
85
  # ---------------------------------------------------------------------------
@@ -101,9 +102,8 @@ def _generate_via_hf_space(topic: str) -> Tuple[str, str]:
101
  """
102
  if not _GRADIO_CLIENT_AVAILABLE:
103
  raise Exception("gradio_client not installed; install with: pip install gradio_client")
104
-
105
- prompt = BASE_PROMPT_TEMPLATE.format(topic=topic.upper())
106
-
107
  try:
108
  logger.info(f"🌐 Calling HF Space API: {HF_WORD_LIST_REPO_ID}")
109
  client = Client(HF_WORD_LIST_REPO_ID)
@@ -131,6 +131,7 @@ def _generate_via_hf_space(topic: str) -> Tuple[str, str]:
131
  def _load_model(model_name: str = DEFAULT_MODEL_NAME):
132
  """
133
  Try to load the requested model first, then fall back through AI_MODELS in order.
 
134
  """
135
  if not _TRANSFORMERS_AVAILABLE:
136
  logger.warning("⚠️ Transformers not available; falling back to dictionary words.")
@@ -154,12 +155,25 @@ def _load_model(model_name: str = DEFAULT_MODEL_NAME):
154
  torch_dtype="auto",
155
  device_map="auto" if device == 0 else None,
156
  )
157
- gen = pipeline(
158
- "text-generation",
159
- model=model,
160
- tokenizer=tokenizer,
161
- device=device,
162
- )
163
  global _USED_MODEL_NAME
164
  _USED_MODEL_NAME = current
165
  logger.info(f"✅ Model loaded successfully: {current}")
@@ -197,7 +211,7 @@ def _extract_words_from_output(prompt: str, raw_output: str) -> List[str]:
197
 
198
  def _enforce_distribution(words: List[str], wordlist_map: Dict[int, List[str]]) -> List[str]:
199
  """
200
- Ensure we have exactly 25 of each required length (4,5,6). Truncate extras.
201
  Missing slots are filled from dictionary words (wordlist_map), then FALLBACK_WORDS if needed.
202
 
203
  Args:
@@ -205,7 +219,7 @@ def _enforce_distribution(words: List[str], wordlist_map: Dict[int, List[str]])
205
  wordlist_map: Dictionary of canonical words by length from load_word_list
206
 
207
  Returns:
208
- List of exactly 75 words (25 each of lengths 4, 5, 6)
209
  """
210
  by_len: Dict[int, List[str]] = {4: [], 5: [], 6: []}
211
  for w in words:
@@ -213,10 +227,6 @@ def _enforce_distribution(words: List[str], wordlist_map: Dict[int, List[str]])
213
  if L in by_len and w not in by_len[L]:
214
  by_len[L].append(w)
215
 
216
- # Trim to at most 25 each
217
- for L in VALID_LENGTHS:
218
- by_len[L] = by_len[L][:WORDS_PER_LENGTH]
219
-
220
  # Fill missing using dictionary words, then fallback words if still needed
221
  for L in VALID_LENGTHS:
222
  if len(by_len[L]) < WORDS_PER_LENGTH:
@@ -261,6 +271,89 @@ def _filter_and_dedupe(words: List[str]) -> List[str]:
261
  return result
262
 
263
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
264
  def _score_words(full_wordlist_path: Optional[str], words: List[str]) -> Dict[str, float]:
265
  """
266
  Use existing difficulty metric for the subset derived.
@@ -276,7 +369,7 @@ def _score_words(full_wordlist_path: Optional[str], words: List[str]) -> Dict[st
276
  return {}
277
 
278
 
279
- def _save_ai_words_to_file(topic: str, words: List[str]) -> str:
280
  """
281
  Save AI-generated words to a file in the words folder.
282
  If the file exists, append new words without duplicates and sort.
@@ -285,9 +378,11 @@ def _save_ai_words_to_file(topic: str, words: List[str]) -> str:
285
  Args:
286
  topic: The topic used for generation
287
  words: List of words to save
288
-
289
  Returns:
290
- The filename of the saved file
 
 
291
  """
292
  from .generator import sort_word_file
293
 
@@ -314,7 +409,7 @@ def _save_ai_words_to_file(topic: str, words: List[str]) -> str:
314
  # Check if file already has 1000+ words
315
  if len(existing_words) >= 1000:
316
  logger.info(f"ℹ️ File {filename} already has {len(existing_words)} words (≥1000). Not adding new words.")
317
- return filename
318
 
319
  except Exception as e:
320
  logger.warning(f"⚠️ Error reading existing file {filename}: {e}")
@@ -357,15 +452,55 @@ def _save_ai_words_to_file(topic: str, words: List[str]) -> str:
357
  for word in sorted_words:
358
  f.write(f"{word}\n")
359
 
360
  logger.info(f"✅ Successfully saved and sorted {len(all_words)} words to {filename}")
361
- return filename
362
 
363
  except Exception as e:
364
  logger.error(f"❌ Error saving words to {filename}: {e}")
365
- return ""
366
  else:
367
  logger.info(f"ℹ️ No new words to add to {filename}")
368
- return filename
369
 
370
 
371
  # ---------------------------------------------------------------------------
@@ -385,7 +520,7 @@ def generate_ai_words(
385
 
386
  Returns:
387
  words: List[str] - Final 75 words (uppercase A–Z).
388
- difficulties: Dict[str,float] - Difficulty scores using compute_word_difficulties().
389
  metadata: Dict[str,str] - Source / diagnostic info.
390
 
391
  Parameters:
@@ -407,13 +542,14 @@ def generate_ai_words(
407
  raw_generated_text = ""
408
  ai_words: List[str] = []
409
  generation_source = "none"
 
410
 
411
  # Check if USE_HF_WORDS is enabled
412
  if USE_HF_WORDS:
413
  # Try HF Space API first
414
  try:
415
  raw_generated_text, generation_source = _generate_via_hf_space(topic)
416
- prompt = BASE_PROMPT_TEMPLATE.format(topic=topic.upper())
417
  parsed = _extract_words_from_output(prompt, raw_generated_text)
418
  logger.debug(f"Parsed {len(parsed)} words from HF Space output")
419
  ai_words = _filter_and_dedupe(parsed)
@@ -432,7 +568,7 @@ def generate_ai_words(
432
  generator = _load_model(model_name or DEFAULT_MODEL_NAME)
433
 
434
  if generator is not None:
435
- prompt = BASE_PROMPT_TEMPLATE.format(topic=topic.upper())
436
  try:
437
  logger.info(f"📝 Generating words from local AI model...")
438
  outputs = generator(
@@ -462,7 +598,7 @@ def generate_ai_words(
462
  ai_words = []
463
 
464
  # CORRECT ORDER:
465
- # 1. FIRST identify and save new words (before any filtering)
466
  new_words_to_save: List[str] = []
467
  if ai_words:
468
  existing_words = [w for w in ai_words if w in canonical_set]
@@ -470,13 +606,86 @@ def generate_ai_words(
470
 
471
  logger.info(f"📊 Word analysis: {len(ai_words)} total = {len(existing_words)} existing + {len(new_words_to_save)} NEW")
472
 
473
- # Save the NEW words to expand the dictionary
474
  if new_words_to_save:
475
- saved_filename = _save_ai_words_to_file(topic, new_words_to_save)
476
- if saved_filename:
477
- logger.info(f"💾 Saved {len(new_words_to_save)} NEW words to {saved_filename}")
478
 
479
- # 2. THEN apply dictionary filter if requested (for game word selection)
480
  if use_dictionary_filter and ai_words:
481
  before_filter = len(ai_words)
482
  filtered_out_words = [w for w in ai_words if w not in canonical_set]
@@ -494,7 +703,7 @@ def generate_ai_words(
494
  else:
495
  by_len['other'].append(w)
496
 
497
- logger.info(f"🚫 Filtered out words NOT in dictionary:")
498
  for length in [4, 5, 6, 'other']:
499
  if by_len[length]:
500
  logger.info(f" {length}-letter: {', '.join(sorted(by_len[length]))}")
 
30
  # Local imports
31
  from .word_loader import (
32
  load_word_list,
33
+ FALLBACK_WORDS,
34
+ MIN_REQUIRED,
35
  compute_word_difficulties, # Use current v3 difficulty metric
36
  )
37
  from .modules.constants import AI_MODELS, TMPDIR, USE_HF_WORDS, HF_WORD_LIST_REPO_ID
 
60
 
61
  DEFAULT_MODEL_NAME = os.environ.get(
62
  "WRDLER_AI_MODEL",
63
+ AI_MODELS[0] if AI_MODELS else "meta-llama/Llama-3.1-8B-Instruct"
64
  )
65
 
66
  # Safety: limit max new tokens to keep latency reasonable (increased to accommodate 75+ words)
 
72
  "Return AT LEAST 75 UNIQUE WORDS related to the topic: '{topic}'.\n"
73
  "FORMAT RULES:\n"
74
  "- Output ONLY a single comma-separated list (no numbering, no extra text)\n"
75
+ "- Include at least: {WORDS_PER_LENGTH} words of length 4 letters, {WORDS_PER_LENGTH} words of length 5 letters, {WORDS_PER_LENGTH} words of length 6 letters\n"
76
  "- Use ONLY uppercase A-Z letters (no diacritics, hyphens, or spaces)\n"
77
  "- No duplicates. No explanations.\n"
78
  "List:"
 
80
 
81
  VALID_LENGTHS = (4, 5, 6)
82
  RE_WORD = re.compile(r"^[A-Z]+$")
83
+ WORDS_PER_LENGTH = MIN_REQUIRED # Target MIN_REQUIRED words for each length
84
 
85
 
86
  # ---------------------------------------------------------------------------
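
The prompt template now carries a second named placeholder, `{WORDS_PER_LENGTH}`, which is why every `.format(...)` call site in this diff gains a matching keyword argument. A minimal sketch of that dependency, assuming `MIN_REQUIRED` is 25 (the value the updated docstring mentions) and using a shortened, hypothetical `TEMPLATE` rather than the real `BASE_PROMPT_TEMPLATE`:

```python
# Assumed value; the diff only states "MIN_REQUIRED (25)" in a docstring.
MIN_REQUIRED = 25
WORDS_PER_LENGTH = MIN_REQUIRED

# Shortened stand-in for BASE_PROMPT_TEMPLATE (illustrative only).
TEMPLATE = (
    "Return AT LEAST 75 UNIQUE WORDS related to the topic: '{topic}'.\n"
    "- Include at least: {WORDS_PER_LENGTH} words of length 4 letters\n"
    "List:"
)


def build_prompt(topic: str) -> str:
    """Fill both placeholders; the topic is upper-cased as in the diff."""
    return TEMPLATE.format(topic=topic.upper(), WORDS_PER_LENGTH=WORDS_PER_LENGTH)


def missing_field_name() -> str:
    """Show what happens when a call site forgets the new keyword."""
    try:
        TEMPLATE.format(topic="COOKING")
    except KeyError as exc:
        # str.format raises KeyError with the missing field name.
        return exc.args[0]
    return ""


prompt = build_prompt("cooking")
```

A call site that still passes only `topic=` would raise `KeyError`, so the three updated `.format(...)` calls have to change together with the template.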
 
102
  """
103
  if not _GRADIO_CLIENT_AVAILABLE:
104
  raise Exception("gradio_client not installed; install with: pip install gradio_client")
105
+
106
+ prompt = BASE_PROMPT_TEMPLATE.format(topic=topic.upper(), WORDS_PER_LENGTH=WORDS_PER_LENGTH)
 
107
  try:
108
  logger.info(f"🌐 Calling HF Space API: {HF_WORD_LIST_REPO_ID}")
109
  client = Client(HF_WORD_LIST_REPO_ID)
 
131
  def _load_model(model_name: str = DEFAULT_MODEL_NAME):
132
  """
133
  Try to load the requested model first, then fall back through AI_MODELS in order.
134
+ Detect error 422 or 'cannot be moved to a specific device' and retry pipeline without device argument.
135
  """
136
  if not _TRANSFORMERS_AVAILABLE:
137
  logger.warning("⚠️ Transformers not available; falling back to dictionary words.")
 
155
  torch_dtype="auto",
156
  device_map="auto" if device == 0 else None,
157
  )
158
+ try:
159
+ gen = pipeline(
160
+ "text-generation",
161
+ model=model,
162
+ tokenizer=tokenizer,
163
+ device=device,
164
+ )
165
+ except Exception as e:
166
+ # Detect error 422 or accelerate device error
167
+ msg = str(e)
168
+ if "cannot be moved to a specific device" in msg or "422" in msg:
169
+ logger.warning(f"⚠️ Retrying pipeline for {current} without device argument due to error: {msg}")
170
+ gen = pipeline(
171
+ "text-generation",
172
+ model=model,
173
+ tokenizer=tokenizer,
174
+ )
175
+ else:
176
+ raise
177
  global _USED_MODEL_NAME
178
  _USED_MODEL_NAME = current
179
  logger.info(f"✅ Model loaded successfully: {current}")
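
The retry added to `_load_model` can be read as a general pattern: build with an explicit device first, then retry once without it when the error message matches a known signature. A minimal sketch, assuming a hypothetical `build_pipeline` callable in place of `transformers.pipeline`; the helper name `build_with_device_fallback` and the stub `fake_pipeline` are illustrative and not part of Wrdler:

```python
from typing import Any, Callable, Optional

# Substrings the commit treats as "retry without the device argument" signals.
RETRYABLE_MARKERS = ("cannot be moved to a specific device", "422")


def build_with_device_fallback(
    build_pipeline: Callable[..., Any],
    device: Optional[int],
    **kwargs: Any,
) -> Any:
    """Call build_pipeline with device=...; retry once without it on a known error."""
    try:
        return build_pipeline(device=device, **kwargs)
    except Exception as exc:
        message = str(exc)
        if any(marker in message for marker in RETRYABLE_MARKERS):
            # Retry and let the underlying library decide placement.
            return build_pipeline(**kwargs)
        # Unknown failures are re-raised unchanged.
        raise


def fake_pipeline(task: str, device: Optional[int] = None) -> str:
    """Hypothetical stand-in that rejects an explicit device."""
    if device is not None:
        raise ValueError("The model cannot be moved to a specific device")
    return f"{task}-pipeline"


result = build_with_device_fallback(fake_pipeline, device=0, task="text-generation")
```

Matching on message substrings is brittle (a bare `"422"` could appear in unrelated errors), so this is best treated as a pragmatic workaround rather than a robust classification of failures.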
 
211
 
212
  def _enforce_distribution(words: List[str], wordlist_map: Dict[int, List[str]]) -> List[str]:
213
  """
214
+ Ensure we have at least MIN_REQUIRED (25) of each required length (4,5,6).
215
  Missing slots are filled from dictionary words (wordlist_map), then FALLBACK_WORDS if needed.
216
 
217
  Args:
 
219
  wordlist_map: Dictionary of canonical words by length from load_word_list
220
 
221
  Returns:
222
+ List of at least 75 words (at least 25 each of lengths 4, 5, and 6)
223
  """
224
  by_len: Dict[int, List[str]] = {4: [], 5: [], 6: []}
225
  for w in words:
 
227
  if L in by_len and w not in by_len[L]:
228
  by_len[L].append(w)
229
 
 
 
 
 
230
  # Fill missing using dictionary words, then fallback words if still needed
231
  for L in VALID_LENGTHS:
232
  if len(by_len[L]) < WORDS_PER_LENGTH:
 
271
  return result
272
 
273
 
274
+ def _generate_additional_words(
275
+ topic: str,
276
+ needed_by_length: Dict[int, int],
277
+ existing_words: set,
278
+ generator,
279
+ generation_source: str
280
+ ) -> List[str]:
281
+ """
282
+ Generate additional words when initial generation didn't produce enough new words.
283
+
284
+ Args:
285
+ topic: The topic for generation
286
+ needed_by_length: Dict mapping length to number of words needed
287
+ existing_words: Set of words that already exist (to avoid duplicates)
288
+ generator: The AI model generator (None if using HF Space)
289
+ generation_source: Source identifier for logging
290
+
291
+ Returns:
292
+ List of newly generated words
293
+ """
294
+ # Build targeted prompt requesting specific quantities
295
+ total_needed = sum(needed_by_length.values())
296
+ if total_needed == 0:
297
+ return []
298
+
299
+ length_requirements = ", ".join([
300
+ f"{count} words of length {length} letters"
301
+ for length, count in needed_by_length.items()
302
+ if count > 0
303
+ ])
304
+
305
+ targeted_prompt = (
306
+ f"You are an assistant generating words for a word deduction game.\n"
307
+ f"Generate AT LEAST {total_needed} MORE UNIQUE WORDS related to the topic: '{topic.upper()}'.\n"
308
+ f"FORMAT RULES:\n"
309
+ f"- Output ONLY a single comma-separated list (no numbering, no extra text)\n"
310
+ f"- Include AT LEAST: {length_requirements}\n"
311
+ f"- Use ONLY uppercase A-Z letters (no diacritics, hyphens, or spaces)\n"
312
+ f"- No duplicates. No explanations.\n"
313
+ f"List:"
314
+ )
315
+
316
+ logger.info(f"📝 Generating {total_needed} additional words: {length_requirements}")
317
+
318
+ additional_words: List[str] = []
319
+
320
+ try:
321
+ if generation_source == HF_WORD_LIST_REPO_ID and _GRADIO_CLIENT_AVAILABLE:
322
+ # Use HF Space API
323
+ client = Client(HF_WORD_LIST_REPO_ID)
324
+ result = client.predict(
325
+ message=targeted_prompt,
326
+ temperature=0.95,
327
+ max_new_tokens=MAX_NEW_TOKENS,
328
+ api_name="/chat"
329
+ )
330
+ parsed = _extract_words_from_output(targeted_prompt, result)
331
+ additional_words = _filter_and_dedupe(parsed)
332
+
333
+ elif generator is not None:
334
+ # Use local model
335
+ outputs = generator(
336
+ targeted_prompt,
337
+ max_new_tokens=MAX_NEW_TOKENS,
338
+ num_return_sequences=1,
339
+ temperature=0.8,
340
+ do_sample=True,
341
+ )
342
+ raw_output = outputs[0]["generated_text"]
343
+ parsed = _extract_words_from_output(targeted_prompt, raw_output)
344
+ additional_words = _filter_and_dedupe(parsed)
345
+
346
+ # Filter out words that already exist
347
+ additional_words = [w for w in additional_words if w not in existing_words]
348
+
349
+ logger.info(f"✅ Generated {len(additional_words)} additional unique words")
350
+ return additional_words
351
+
352
+ except Exception as e:
353
+ logger.error(f"❌ Failed to generate additional words: {e}")
354
+ return []
355
+
356
+
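
The two pure-Python steps in `_generate_additional_words` (describing the per-length shortfall for the prompt, then dropping words that are already known) can be exercised without any model. A minimal sketch; the helper names `describe_requirements` and `keep_new_words` are hypothetical and only mirror the logic shown in the diff:

```python
from typing import Dict, Iterable, List, Set


def describe_requirements(needed_by_length: Dict[int, int]) -> str:
    """Render e.g. {4: 3, 6: 10} as a phrase for the targeted prompt."""
    return ", ".join(
        f"{count} words of length {length} letters"
        for length, count in needed_by_length.items()
        if count > 0  # lengths with no shortfall are omitted
    )


def keep_new_words(candidates: Iterable[str], existing: Set[str]) -> List[str]:
    """Drop candidates that already exist, preserving generation order."""
    return [word for word in candidates if word not in existing]


requirements = describe_requirements({4: 3, 5: 0, 6: 10})
fresh = keep_new_words(["BAKE", "SEAR", "WHISK"], {"BAKE"})
```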
357
  def _score_words(full_wordlist_path: Optional[str], words: List[str]) -> Dict[str, float]:
358
  """
359
  Use existing difficulty metric for the subset derived.
 
369
  return {}
370
 
371
 
372
+ def _save_ai_words_to_file(topic: str, words: List[str]) -> Tuple[str, Dict[int, int]]:
373
  """
374
  Save AI-generated words to a file in the words folder.
375
  If the file exists, append new words without duplicates and sort.
 
378
  Args:
379
  topic: The topic used for generation
380
  words: List of words to save
381
+
382
  Returns:
383
+ Tuple of (filename, insufficient_lengths) where:
384
+ - filename: The filename of the saved file (empty string on error)
385
+ - insufficient_lengths: Dict mapping length -> shortfall count (empty if all lengths meet MIN_REQUIRED)
386
  """
387
  from .generator import sort_word_file
388
 
 
409
  # Check if file already has 1000+ words
410
  if len(existing_words) >= 1000:
411
  logger.info(f"ℹ️ File {filename} already has {len(existing_words)} words (≥1000). Not adding new words.")
412
+ return filename, {} # Return empty dict = no insufficiency
413
 
414
  except Exception as e:
415
  logger.warning(f"⚠️ Error reading existing file {filename}: {e}")
 
452
  for word in sorted_words:
453
  f.write(f"{word}\n")
454
 
455
+ # Validate file now has MIN_REQUIRED words per length
456
+ words_by_len = {4: [], 5: [], 6: []}
457
+ for w in sorted_words:
458
+ L = len(w)
459
+ if L in words_by_len:
460
+ words_by_len[L].append(w)
461
+
462
+ insufficient_lengths = {L: MIN_REQUIRED - len(words_by_len[L])
463
+ for L in (4, 5, 6)
464
+ if len(words_by_len[L]) < MIN_REQUIRED}
465
+
466
+ if insufficient_lengths:
467
+ logger.warning(
468
+ f"⚠️ File {filename} still below MIN_REQUIRED threshold: "
469
+ f"{', '.join(f'{L}-letter: {len(words_by_len[L])}/{MIN_REQUIRED}' for L in insufficient_lengths.keys())}"
470
+ )
471
+ else:
472
+ logger.info(f"✅ File {filename} meets MIN_REQUIRED threshold for all lengths")
473
+
474
  logger.info(f"✅ Successfully saved and sorted {len(all_words)} words to {filename}")
475
+ return filename, insufficient_lengths
476
 
477
  except Exception as e:
478
  logger.error(f"❌ Error saving words to {filename}: {e}")
479
+ return "", {}
480
  else:
481
  logger.info(f"ℹ️ No new words to add to {filename}")
482
+ # Still validate existing file
483
+ try:
484
+ sorted_words = sort_word_file(filepath)
485
+ words_by_len = {4: [], 5: [], 6: []}
486
+ for w in sorted_words:
487
+ L = len(w)
488
+ if L in words_by_len:
489
+ words_by_len[L].append(w)
490
+
491
+ insufficient_lengths = {L: MIN_REQUIRED - len(words_by_len[L])
492
+ for L in (4, 5, 6)
493
+ if len(words_by_len[L]) < MIN_REQUIRED}
494
+
495
+ if insufficient_lengths:
496
+ logger.warning(
497
+ f"⚠️ File {filename} still below MIN_REQUIRED threshold: "
498
+ f"{', '.join(f'{L}-letter: {len(words_by_len[L])}/{MIN_REQUIRED}' for L in insufficient_lengths.keys())}"
499
+ )
500
+
501
+ return filename, insufficient_lengths
502
+ except Exception:
503
+ return filename, {}
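
Both branches of `_save_ai_words_to_file` compute `insufficient_lengths` with the same bucketing logic. A minimal sketch of that calculation as a single function, assuming a hypothetical helper name `find_shortfalls` (the commit itself inlines the logic twice):

```python
from typing import Dict, Iterable, Tuple


def find_shortfalls(
    words: Iterable[str],
    min_required: int,
    lengths: Tuple[int, ...] = (4, 5, 6),
) -> Dict[int, int]:
    """Map each under-filled length to the number of words still missing."""
    counts = {length: 0 for length in lengths}
    for word in words:
        if len(word) in counts:  # words of other lengths are ignored
            counts[len(word)] += 1
    return {
        length: min_required - count
        for length, count in counts.items()
        if count < min_required
    }


shortfalls = find_shortfalls(["BAKE", "BOIL", "BACON", "BATTER"], min_required=2)
```

An empty result means every length meets the threshold, which matches the `{}` the diff returns for "no insufficiency".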
504
 
505
 
506
  # ---------------------------------------------------------------------------
 
520
 
521
  Returns:
522
  words: List[str] - Final 75 words (uppercase A–Z).
523
+ difficulties: Dict[str,float] - Difficulty scores using compute_word_difficulty().
524
  metadata: Dict[str,str] - Source / diagnostic info.
525
 
526
  Parameters:
 
542
  raw_generated_text = ""
543
  ai_words: List[str] = []
544
  generation_source = "none"
545
+ generator = None # Track generator for potential additional generation
546
 
547
  # Check if USE_HF_WORDS is enabled
548
  if USE_HF_WORDS:
549
  # Try HF Space API first
550
  try:
551
  raw_generated_text, generation_source = _generate_via_hf_space(topic)
552
+ prompt = BASE_PROMPT_TEMPLATE.format(topic=topic.upper(), WORDS_PER_LENGTH=WORDS_PER_LENGTH)
553
  parsed = _extract_words_from_output(prompt, raw_generated_text)
554
  logger.debug(f"Parsed {len(parsed)} words from HF Space output")
555
  ai_words = _filter_and_dedupe(parsed)
 
568
  generator = _load_model(model_name or DEFAULT_MODEL_NAME)
569
 
570
  if generator is not None:
571
+ prompt = BASE_PROMPT_TEMPLATE.format(topic=topic.upper(), WORDS_PER_LENGTH=WORDS_PER_LENGTH)
572
  try:
573
  logger.info(f"📝 Generating words from local AI model...")
574
  outputs = generator(
 
598
  ai_words = []
599
 
600
  # CORRECT ORDER:
601
+ # 1. FIRST identify new words (before any filtering)
602
  new_words_to_save: List[str] = []
603
  if ai_words:
604
  existing_words = [w for w in ai_words if w in canonical_set]
 
606
 
607
  logger.info(f"📊 Word analysis: {len(ai_words)} total = {len(existing_words)} existing + {len(new_words_to_save)} NEW")
608
 
609
+ # 2. Check if we have MIN_REQUIRED new words for each length
610
+ new_words_by_length = {4: [], 5: [], 6: []}
611
+ for w in new_words_to_save:
612
+ L = len(w)
613
+ if L in new_words_by_length:
614
+ new_words_by_length[L].append(w)
615
+
616
+ # Calculate how many more words we need per length
617
+ needed_by_length = {}
618
+ for L in VALID_LENGTHS:
619
+ current_count = len(new_words_by_length[L])
620
+ if current_count < MIN_REQUIRED:
621
+ needed_by_length[L] = MIN_REQUIRED - current_count
622
+ logger.info(f"⚠️ Only {current_count}/{MIN_REQUIRED} new {L}-letter words. Need {needed_by_length[L]} more.")
623
+
624
+ # 3. If we need more words, generate them
625
+ if needed_by_length:
626
+ logger.info(f"🔄 Attempting to generate additional words to meet MIN_REQUIRED threshold...")
627
+ all_existing = canonical_set.union(set(new_words_to_save))
628
+ additional_words = _generate_additional_words(
629
+ topic=topic,
630
+ needed_by_length=needed_by_length,
631
+ existing_words=all_existing,
632
+ generator=generator,
633
+ generation_source=generation_source
634
+ )
635
+
636
+ if additional_words:
637
+ # Add additional words to new_words_to_save and ai_words
638
+ new_words_to_save.extend(additional_words)
639
+ ai_words.extend(additional_words)
640
+
641
+ # Update counts
642
+ for w in additional_words:
643
+ L = len(w)
644
+ if L in new_words_by_length:
645
+ new_words_by_length[L].append(w)
646
+
647
+ logger.info(f"✅ Added {len(additional_words)} additional words. New totals:")
648
+ for L in VALID_LENGTHS:
649
+ logger.info(f" {L}-letter: {len(new_words_by_length[L])} new words")
650
+ else:
651
+ logger.warning(f"⚠️ Could not generate additional words. Proceeding with current set.")
652
+
653
+ # 4. Save the NEW words to expand the dictionary
654
  if new_words_to_save:
655
+ max_save_retries = 3
656
+ retry_count = 0
657
+
658
+ while retry_count < max_save_retries:
659
+ saved_filename, insufficient_lengths = _save_ai_words_to_file(topic, new_words_to_save)
660
+
661
+ if saved_filename:
662
+ logger.info(f"💾 Saved {len(new_words_to_save)} NEW words to {saved_filename}")
663
+
664
+ # If file meets MIN_REQUIRED or we've exhausted retries, break
665
+ if not insufficient_lengths or retry_count >= max_save_retries - 1:
666
+ break
667
+
668
+ # File still insufficient - generate more words for the missing lengths
669
+ logger.info(f"🔄 File {saved_filename} needs more words. Retry {retry_count + 1}/{max_save_retries}")
670
+
671
+ # Generate additional words to fill the gap
672
+ additional_fill_words = _generate_additional_words(
673
+ topic=topic,
674
+ needed_by_length=insufficient_lengths,
675
+ existing_words=canonical_set.union(set(new_words_to_save)),
676
+ generator=generator,
677
+ generation_source=generation_source
678
+ )
679
+
680
+ if additional_fill_words:
681
+ logger.info(f"✅ Generated {len(additional_fill_words)} words to fill file gap")
682
+ new_words_to_save.extend(additional_fill_words)
683
+ retry_count += 1
684
+ else:
685
+ logger.warning(f"⚠️ Could not generate additional words to fill file. Stopping retries.")
686
+ break
687
 
688
+ # 5. THEN apply dictionary filter if requested (for game word selection)
689
  if use_dictionary_filter and ai_words:
690
  before_filter = len(ai_words)
691
  filtered_out_words = [w for w in ai_words if w not in canonical_set]
 
703
  else:
704
  by_len['other'].append(w)
705
 
706
+ logger.info(f"🚫 Filtered out words ALREADY in dictionary:")
707
  for length in [4, 5, 6, 'other']:
708
  if by_len[length]:
709
  logger.info(f" {length}-letter: {', '.join(sorted(by_len[length]))}")
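
The save step in `generate_ai_words` is a bounded retry loop: save, check the per-length shortfall, generate more words, and stop after three attempts or as soon as generation returns nothing. A minimal sketch of that control flow with injected callables; `save_with_retries`, `fake_save`, and `fake_generate` are hypothetical stand-ins, not Wrdler functions:

```python
from typing import Callable, Dict, List, Tuple


def save_with_retries(
    words: List[str],
    save: Callable[[List[str]], Dict[int, int]],
    generate_more: Callable[[Dict[int, int]], List[str]],
    max_retries: int = 3,
) -> Tuple[List[str], int]:
    """Save, then top up missing lengths until satisfied or retries run out."""
    words = list(words)  # avoid mutating the caller's list
    retries = 0
    while retries < max_retries:
        shortfalls = save(words)
        # Stop when the file is sufficient or this was the last allowed attempt.
        if not shortfalls or retries >= max_retries - 1:
            break
        extra = generate_more(shortfalls)
        if not extra:
            break  # nothing new came back; give up early
        words.extend(extra)
        retries += 1
    return words, retries


def fake_save(words: List[str]) -> Dict[int, int]:
    """Pretend file check: require two 4-letter words."""
    four = sum(1 for word in words if len(word) == 4)
    return {4: 2 - four} if four < 2 else {}


_pool = iter([["BOIL"], ["SEAR"]])


def fake_generate(shortfalls: Dict[int, int]) -> List[str]:
    """Pretend model call: hand out one canned batch per request."""
    return next(_pool, [])


final_words, retries_used = save_with_retries(["BAKE"], fake_save, fake_generate)
```

Passing the save and generate steps in as callables keeps the loop testable without a model or a file system; the commit instead calls `_save_ai_words_to_file` and `_generate_additional_words` directly.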
wrdler/words/cooking.txt ADDED
@@ -0,0 +1,180 @@
1
+ # AI-generated word list
2
+ # Topic: COOKing
3
+ # Last updated: 2025-11-28 12:16:45
4
+ # Total words: 174
5
+ # Format: one word per line, sorted by length then alphabetically
6
+ #
7
+ BAKE
8
+ BEEN
9
+ BLUE
10
+ BOIL
11
+ BUFF
12
+ CAKE
13
+ CALV
14
+ CARD
15
+ COOK
16
+ FARM
17
+ FAST
18
+ FILL
19
+ FIRE
20
+ FISH
21
+ FIZZ
22
+ FOOD
23
+ FRYD
24
+ HARV
25
+ HERB
26
+ JUST
27
+ KEEN
28
+ KERN
29
+ KING
30
+ MAKE
31
+ MANU
32
+ MASH
33
+ MEAT
34
+ MINI
35
+ PICK
36
+ PLAN
37
+ POTS
38
+ POUR
39
+ PREP
40
+ PUFF
41
+ RACK
42
+ RECT
43
+ ROLL
44
+ SEAR
45
+ SEEK
46
+ SIFT
47
+ SIMP
48
+ SKIL
49
+ SKIP
50
+ SLOW
51
+ SOFT
52
+ SOUP
53
+ STEW
54
+ STIR
55
+ TAKE
56
+ THAW
57
+ TURK
58
+ VEAL
59
+ WARM
60
+ WASH
61
+ WELL
62
+ WHIP
63
+ WRAP
64
+ BACON
65
+ BAKED
66
+ BASTE
67
+ BLANK
68
+ BLEND
69
+ BLTIG
70
+ BOATS
71
+ BOILD
72
+ BREAD
73
+ BROWN
74
+ BUTTI
75
+ CANDY
76
+ CARBS
77
+ CHICK
78
+ CHILL
79
+ CLEAN
80
+ CRUMB
81
+ CRUST
82
+ DRIED
83
+ FLAKY
84
+ FRIED
85
+ FRUIT
86
+ FRYER
87
+ GLAZE
88
+ GRAIN
89
+ GRASS
90
+ GRILL
91
+ HARVE
92
+ HERBS
93
+ HOUSE
94
+ KITCH
95
+ KNEAD
96
+ MARIN
97
+ MIXED
98
+ MIXER
99
+ PASTA
100
+ PLATE
101
+ PREPS
102
+ PUREE
103
+ QUICK
104
+ ROAST
105
+ RUBED
106
+ SALAD
107
+ SALTY
108
+ SAUTE
109
+ SCONE
110
+ SHAVE
111
+ SHRIL
112
+ SLICE
113
+ SLICK
114
+ SMAKE
115
+ SPICE
116
+ SPICY
117
+ SPIRE
118
+ START
119
+ STEAK
120
+ STICK
121
+ STIRD
122
+ STIRP
123
+ STOVE
124
+ SWEET
125
+ TASTE
126
+ TASTY
127
+ TEMPP
128
+ THINK
129
+ TOAST
130
+ TREAT
131
+ TRIMM
132
+ TWIST
133
+ UNDER
134
+ VEGET
135
+ WHISK
136
+ WIELD
137
+ YIELD
138
+ ZESTY
139
+ ASSERT
140
+ BARELY
141
+ BARKED
142
+ BAROBA
143
+ BATTER
144
+ BAUGHT
145
+ BAZAAR
146
+ BITTER
147
+ BOILED
148
+ BROWNY
149
+ CARROT
150
+ CHEESE
151
+ COOKED
152
+ COOKER
153
+ COOKIN
154
+ DETAIL
155
+ DILUTE
156
+ EATING
157
+ FLOURD
158
+ FLYING
159
+ FROZEN
160
+ GRATED
161
+ GRILLD
162
+ KNEADD
163
+ LITTLE
164
+ MARINA
165
+ MASHED
166
+ METHOD
167
+ MUFFIN
168
+ NOODLE
169
+ PACKED
170
+ PICKED
171
+ PIERCE
172
+ RECIPE
173
+ SHIELD
174
+ SMELLY
175
+ SMOOTH
176
+ SNAKES
177
+ SPICED
178
+ TASTES
179
+ TOUCHS
180
+ VEGGIE
wrdler/words/english.txt CHANGED
@@ -1,7 +1,7 @@
1
  # AI-generated word list
2
- # Topic: EnglisH
3
- # Last updated: 2025-11-24 09:23:12
4
- # Total words: 336
5
  # Format: one word per line, sorted by length then alphabetically
6
  #
7
  ALMS
@@ -15,6 +15,7 @@ BASE
15
  BEAN
16
  BELL
17
  BEND
 
18
  BORE
19
  BORN
20
  BUNK
@@ -49,9 +50,11 @@ GIRD
49
  GIVE
50
  GLOB
51
  HALE
 
52
  HONE
53
  HOSE
54
  IRON
 
55
  LAKE
56
  LAND
57
  LANE
@@ -70,7 +73,9 @@ MINE
70
  NAIL
71
  OILY
72
  PACE
 
73
  PAIN
 
74
  POET
75
  PORE
76
  PORT
@@ -92,9 +97,11 @@ SLIT
92
  SLUG
93
  SORE
94
  STEM
 
95
  TAGS
96
  TAIL
97
  TALE
 
98
  TIME
99
  TYPE
100
  WALK
@@ -104,6 +111,7 @@ WISH
104
  WORD
105
  YELL
106
  ZONE
 
107
  ALICE
108
  ALIKE
109
  BANKS
@@ -111,6 +119,7 @@ BEAST
111
  BLANK
112
  BLIND
113
  BOAST
 
114
  BRAID
115
  BRAKE
116
  BRASH
@@ -207,6 +216,7 @@ LOYAL
207
  LUNCH
208
  LURID
209
  LURKY
 
210
  MAIZE
211
  MANTO
212
  MAPLE
@@ -224,6 +234,7 @@ NODES
224
  NURSE
225
  OMITS
226
  OUTGO
 
227
  PEERS
228
  PINKO
229
  PINTS
@@ -320,9 +331,13 @@ STAIR
320
  STALK
321
  STICK
322
  SUNNY
 
323
  VOICE
324
  WAIST
325
  WASTE
 
 
 
326
  AUTHOR
327
  BEACON
328
  BRIDGE
@@ -336,7 +351,13 @@ GENTLE
336
  HAMMER
337
  ISLAND
338
  LENGTH
 
 
339
  PRAISE
340
  REASON
341
  REFORM
342
  REYLES
 
 
 
 
 
1
  # AI-generated word list
2
+ # Topic: English
3
+ # Last updated: 2025-11-28 10:56:13
4
+ # Total words: 357
5
  # Format: one word per line, sorted by length then alphabetically
6
  #
7
  ALMS
 
15
  BEAN
16
  BELL
17
  BEND
18
+ BOOK
19
  BORE
20
  BORN
21
  BUNK
 
50
  GIVE
51
  GLOB
52
  HALE
53
+ HELP
54
  HONE
55
  HOSE
56
  IRON
57
+ JUMP
58
  LAKE
59
  LAND
60
  LANE
 
73
  NAIL
74
  OILY
75
  PACE
76
+ PAGE
77
  PAIN
78
+ PENS
79
  POET
80
  PORE
81
  PORT
 
97
  SLUG
98
  SORE
99
  STEM
100
+ STEP
101
  TAGS
102
  TAIL
103
  TALE
104
+ TEXT
105
  TIME
106
  TYPE
107
  WALK
 
111
  WORD
112
  YELL
113
  ZONE
114
+ ALIAS
115
  ALICE
116
  ALIKE
117
  BANKS
 
119
  BLANK
120
  BLIND
121
  BOAST
122
+ BOOKS
123
  BRAID
124
  BRAKE
125
  BRASH
 
216
  LUNCH
217
  LURID
218
  LURKY
219
+ MACRO
220
  MAIZE
221
  MANTO
222
  MAPLE
 
234
  NURSE
235
  OMITS
236
  OUTGO
237
+ PAGES
238
  PEERS
239
  PINKO
240
  PINTS
 
331
  STALK
332
  STICK
333
  SUNNY
334
+ TUTOR
335
  VOICE
336
  WAIST
337
  WASTE
338
+ WORDS
339
+ WRITE
340
+ ASSIST
341
  AUTHOR
342
  BEACON
343
  BRIDGE
 
351
  HAMMER
352
  ISLAND
353
  LENGTH
354
+ LETTER
355
+ PHRASE
356
  PRAISE
357
  REASON
358
  REFORM
359
  REYLES
360
+ SCRIPT
361
+ SPRINT
362
+ SURVEY
363
+ SYMBOL
wrdler/words/wordlist.txt CHANGED
@@ -1,5 +1,9 @@
1
- # Optional: place a large A–Z word list here (one word per line).
2
- # The app falls back to built-in pools if fewer than 500 words per length are found.
3
  ABLE
4
  ACID
5
  AGED
@@ -268,7 +272,6 @@ MISS
268
  MODE
269
  MOOD
270
  MOON
271
- MOON
272
  MORE
273
  MOST
274
  MOVE
@@ -287,6 +290,7 @@ NICK
287
  NINE
288
  NOSE
289
  NOTE
 
290
  OBEY
291
  ODDS
292
  OILY
@@ -481,7 +485,6 @@ WIFE
481
  WILD
482
  WILL
483
  WIND
484
- WIND
485
  WINE
486
  WING
487
  WIRE
@@ -498,6 +501,7 @@ YELL
498
  YOGA
499
  ZERO
500
  ZONE
 
501
  APPLE
502
  BLAST
503
  BOARD
@@ -506,40 +510,50 @@ BREAD
506
  CHAIR
507
  CHALK
508
  CHESS
 
509
  CLOUD
510
  CRANE
 
511
  DANCE
512
  EARTH
513
  FAITH
 
514
  FLAME
515
  FLUTE
 
516
  GHOST
517
  GRAPE
518
  GRASS
519
  GREAT
520
  HEART
521
- HEART
 
522
  LEMON
523
  LIGHT
524
  MARCH
525
  MOUSE
526
  NURSE
527
  PANEL
528
- PANEL
529
  PLANT
530
  PRIZE
531
  QUEST
 
532
  RIVER
533
  SCALE
534
  SHINE
535
  SMILE
536
  STONE
537
  TIGER
 
 
538
  YOUNG
 
539
  BUNDLE
540
  CANDLE
541
  CHERRY
542
  CIRCLE
 
 
543
  DOCTOR
544
  DOMAIN
545
  FAMILY
@@ -553,13 +567,20 @@ LADDER
553
  LAUNCH
554
  LOGGER
555
  MARKET
 
556
  MOTHER
 
557
  ORANGE
 
558
  PALACE
 
559
  POCKET
 
560
  SILVER
561
  SPIRIT
562
  STREAM
 
 
563
  THRIVE
564
  TUNNEL
565
  WINNER
 
1
+ # AI-generated word list
2
+ # Topic: wordlisT
3
+ # Last updated: 2025-11-28 12:18:24
4
+ # Total words: 581
5
+ # Format: one word per line, sorted by length then alphabetically
6
+ #
7
  ABLE
8
  ACID
9
  AGED
 
272
  MODE
273
  MOOD
274
  MOON
 
275
  MORE
276
  MOST
277
  MOVE
 
290
  NINE
291
  NOSE
292
  NOTE
293
+ NUNC
294
  OBEY
295
  ODDS
296
  OILY
 
485
  WILD
486
  WILL
487
  WIND
 
488
  WINE
489
  WING
490
  WIRE
 
501
  YOGA
502
  ZERO
503
  ZONE
504
+ ADJAR
505
  APPLE
506
  BLAST
507
  BOARD
 
510
  CHAIR
511
  CHALK
512
  CHESS
513
+ CLASS
514
  CLOUD
515
  CRANE
516
+ CYCLE
517
  DANCE
518
  EARTH
519
  FAITH
520
+ FINAL
521
  FLAME
522
  FLUTE
523
+ FORUM
524
  GHOST
525
  GRAPE
526
  GRASS
527
  GREAT
528
  HEART
529
+ INDEX
530
+ LABEL
531
  LEMON
532
  LIGHT
533
  MARCH
534
  MOUSE
535
  NURSE
536
  PANEL
 
537
  PLANT
538
  PRIZE
539
  QUEST
540
+ RATIO
541
  RIVER
542
  SCALE
543
  SHINE
544
  SMILE
545
  STONE
546
  TIGER
547
+ VERBS
548
+ VOICE
549
  YOUNG
550
+ AUTHOR
551
  BUNDLE
552
  CANDLE
553
  CHERRY
554
  CIRCLE
555
+ DEVICE
556
+ DIGITS
557
  DOCTOR
558
  DOMAIN
559
  FAMILY
 
567
  LAUNCH
568
  LOGGER
569
  MARKET
570
+ METHOD
571
  MOTHER
572
+ OFFSET
573
  ORANGE
574
+ ORIGIN
575
  PALACE
576
+ PHRASE
577
  POCKET
578
+ SENSOR
579
  SILVER
580
  SPIRIT
581
  STREAM
582
+ SYMBOL
583
+ SYSTEM
584
  THRIVE
585
  TUNNEL
586
  WINNER