---
title: BioGuideMCP
emoji: πŸ‘
colorFrom: purple
colorTo: yellow
sdk: gradio
sdk_version: 5.49.1
app_file: gradio_app.py
pinned: false
license: mit
short_description: 'An MCP allowing users to analyze congressional biographies. '
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Congressional Bioguide MCP Server

A Model Context Protocol (MCP) server that provides access to Congressional member profiles with both structured SQL queries and semantic search capabilities.

## Deployment Options

### 1. Gradio MCP (Hugging Face Spaces)

Run this MCP as a Gradio app with web interface + MCP server:

```bash
python gradio_app.py
```

This launches a web interface at `http://localhost:7860` and exposes the same 9 tools through both the web UI and the MCP server.

**Deploy to Hugging Face Spaces:**
1. Create a new Space on Hugging Face
2. Set SDK to `gradio` (version 5.49.1+)
3. Upload all files including `gradio_app.py`, `congress.db`, `congress_faiss.index`, and `congress_bio_ids.pkl`
4. The app will automatically launch with `mcp_server=True`

### 2. Traditional MCP Server

Use the original MCP server for integration with Claude Desktop or other MCP clients:

```bash
python server.py
```

Test the server backend with `npx @modelcontextprotocol/inspector python server.py` or integrate it into your Claude setup.

## Features

### Gradio MCP Tools (9 Tools)

The Gradio app (`gradio_app.py`) exposes these 9 MCP tools:

1. **search_by_name** - Search members by name (first/last name)
2. **search_by_party** - Find by political party affiliation
3. **search_by_state** - Search by state/region representation
4. **semantic_search_biography** - AI-powered natural language search of biographies
5. **get_member_profile** - Get complete profile by Bioguide ID
6. **count_members_by_party** - Count members grouped by party
7. **count_members_by_state** - Count members grouped by state
8. **execute_sql_query** - Execute custom SQL queries (read-only)
9. **get_database_schema** - View database structure

### Traditional MCP Server Tools (14 Tools)

The traditional server (`server.py`) provides all tools:

**Search Tools** (return concise results by default):
1. **search_by_name** - Search members by name (returns: name, dates, party, congress)
2. **search_by_party** - Find by political party affiliation
3. **search_by_state** - Search by state/region representation
4. **search_by_congress** - Get all members from specific Congress
5. **search_by_date_range** - Find members who served during specific dates
6. **semantic_search_biography** - Natural language AI search of biographies
7. **search_biography_regex** - Regex pattern search (keywords, phrases)
8. **search_by_relationship** - Find members with family relationships

**Aggregation & Analysis Tools** (efficient for large datasets):
9. **count_members** - Count members by party, state, position, congress, or year
10. **temporal_analysis** - Analyze trends over time (party shifts, demographics, etc.)
11. **count_by_biography_content** - Count members mentioning specific keywords (e.g., "Harvard", "lawyer")

**Profile & Query Tools**:
12. **get_member_profile** - Get complete profile by Bioguide ID
13. **execute_sql_query** - Execute custom SQL queries (read-only)
14. **get_database_schema** - View database structure

### Database Schema

- **members** - Core biographical data (13,047+ profiles)
- **job_positions** - Congressional positions and affiliations
- **images** - Profile images
- **relationships** - Family relationships between members
- **creative_works** - Publications by members
- **assets** - Additional media assets

## Requirements

- **Python 3.10+**, including Python 3.14
- ✅ **Python 3.14 is now supported!** (with single-threaded mode for FAISS)

## Setup

### Quick Start

```bash
./setup.sh
```

This automated script will:
1. Create a Python virtual environment
2. Install all dependencies
3. Ingest all Congressional profiles into SQLite
4. Build the FAISS semantic search index

### Manual Setup

If you prefer manual setup:

#### 1. Install Dependencies

```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

#### 2. Ingest Data

Run the ingestion script to create the SQLite database and FAISS index:

```bash
python3 ingest_data.py
```

This will:
- Create `congress.db` SQLite database (13,047+ members)
- Build `congress_faiss.index` for semantic search
- Generate `congress_bio_ids.pkl` for ID mapping

Expected output:
```
Starting Congressional Bioguide ingestion...
============================================================
✓ Database schema created
Ingesting 13047 profiles...
  Processed 1000/13047 profiles...
  ...
✓ Ingested 13047 profiles into database
Building FAISS index for semantic search...
  Encoding 13047 biographies...
    Encoded 3200/13047 biographies...
    ...
✓ FAISS index created with 13047 vectors
  Index dimension: 384
============================================================
✓ Ingestion complete!
```

**Note**: Ingestion takes approximately 5-10 minutes depending on your system.

#### 3. Test the System (Optional)

```bash
python3 test_queries.py
```

#### 4. Run the Server

```bash
python3 server.py
```

## Usage Examples

### Name Search
```json
{
  "name": "search_by_name",
  "arguments": {
    "family_name": "Lincoln"
  }
}
```

### Party Search
```json
{
  "name": "search_by_party",
  "arguments": {
    "party": "Republican",
    "congress_number": 117
  }
}
```

### State Search
```json
{
  "name": "search_by_state",
  "arguments": {
    "state_code": "CA",
    "congress_number": 117
  }
}
```

### Semantic Search
```json
{
  "name": "semantic_search_biography",
  "arguments": {
    "query": "Civil War veterans who became lawyers",
    "top_k": 5
  }
}
```

### Regex Search - Find Keywords
```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "Harvard",
    "limit": 5
  }
}
```

### Regex Search - Filter by Party
```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "lawyer",
    "filter_party": "Republican",
    "limit": 10
  }
}
```

### Regex Search - Filter by State
```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "served.*Confederate Army",
    "filter_state": "VA",
    "limit": 5
  }
}
```

**Note**: Regex search returns concise results (name, dates, party, state) by default. Set `return_full_profile: true` to get biography text.
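
The filtering and concise-result behaviour described above can be sketched in plain Python. This is a minimal illustration, not the code in `server.py`: the sample rows are hypothetical, the function takes the rows as an argument instead of reading `congress.db`, and case-insensitive matching is an assumption.

```python
import re

# Hypothetical sample rows; the real server reads members from congress.db.
SAMPLE_MEMBERS = [
    {"name": "Doe, Jane", "party": "Democrat", "state": "VA",
     "biography": "Studied law; served in the Confederate Army; lawyer."},
    {"name": "Roe, Richard", "party": "Republican", "state": "MA",
     "biography": "Graduated from Harvard; practiced as a lawyer."},
]


def search_biography_regex(rows, pattern, filter_party=None, filter_state=None,
                           limit=5, return_full_profile=False):
    """Return rows whose biography matches `pattern`, applying optional filters."""
    regex = re.compile(pattern, re.IGNORECASE)  # case-insensitivity is assumed
    results = []
    for row in rows:
        if filter_party and row["party"] != filter_party:
            continue
        if filter_state and row["state"] != filter_state:
            continue
        if not regex.search(row["biography"]):
            continue
        if return_full_profile:
            results.append(dict(row))
        else:
            # Concise result: drop the biography text.
            results.append({k: v for k, v in row.items() if k != "biography"})
        if len(results) >= limit:
            break
    return results


matches = search_biography_regex(
    SAMPLE_MEMBERS, "served.*Confederate Army", filter_state="VA"
)
print(matches)
```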

### Count Members by Party
```json
{
  "name": "count_members",
  "arguments": {
    "group_by": "party"
  }
}
```

### Count Republicans by State in 117th Congress
```json
{
  "name": "count_members",
  "arguments": {
    "group_by": "state",
    "filter_party": "Republican",
    "filter_congress": 117
  }
}
```

### Temporal Analysis - Party Changes Over Time
```json
{
  "name": "temporal_analysis",
  "arguments": {
    "analysis_type": "party_over_time",
    "time_unit": "congress",
    "start_date": "1900-01-01",
    "end_date": "2000-12-31"
  }
}
```

### Demographics Analysis - Average Age by Congress
```json
{
  "name": "temporal_analysis",
  "arguments": {
    "analysis_type": "demographics",
    "time_unit": "congress"
  }
}
```

### Count Members Who Attended Harvard
```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["Harvard"]
  }
}
```

### Count Lawyers by Party
```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["lawyer", "attorney"],
    "breakdown_by": "party"
  }
}
```

### Count Members Mentioning Any of Several Keywords, by State
```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["lawyer", "military", "army"],
    "match_all": false,
    "breakdown_by": "state"
  }
}
```
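
How `match_all` combines keywords is not spelled out in this README. One plausible reading, assumed in the sketch below, is that `false` counts a member when any keyword appears in the biography and `true` requires every keyword. The sample rows and the substring matching are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows standing in for the members table.
SAMPLE_ROWS = [
    {"state": "VA", "biography": "A lawyer who served in the army."},
    {"state": "VA", "biography": "A farmer and merchant."},
    {"state": "MA", "biography": "Practiced as a lawyer in Boston."},
]


def count_by_keywords(rows, keywords, match_all=False, breakdown_by=None):
    """Count rows whose biography contains any (or all) of the keywords."""
    combine = all if match_all else any
    lowered = [k.lower() for k in keywords]
    counts = Counter()
    for row in rows:
        text = row["biography"].lower()
        if combine(k in text for k in lowered):
            key = row[breakdown_by] if breakdown_by else "total"
            counts[key] += 1
    return dict(counts)


any_match = count_by_keywords(SAMPLE_ROWS, ["lawyer", "army"], breakdown_by="state")
all_match = count_by_keywords(SAMPLE_ROWS, ["lawyer", "army"], match_all=True)
print(any_match, all_match)
```

Under this reading, a "lawyers AND veterans" count would need `match_all` set to `true` with one keyword per concept.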

### SQL Query - Find Longest Serving Members
```json
{
  "name": "execute_sql_query",
  "arguments": {
    "query": "SELECT family_name, given_name, COUNT(DISTINCT congress_number) as congresses FROM members m JOIN job_positions j ON m.bio_id = j.bio_id GROUP BY m.bio_id HAVING congresses > 5 ORDER BY congresses DESC LIMIT 10"
  }
}
```
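
To try this query without the full dataset, a small in-memory SQLite database is enough. The sketch below creates only the columns the query references (`bio_id`, `family_name`, `given_name`, `congress_number`, `party`). The real tables in `congress.db` likely have more columns, and the two members are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members (bio_id TEXT PRIMARY KEY, family_name TEXT, given_name TEXT);
    CREATE TABLE job_positions (bio_id TEXT, congress_number INTEGER, party TEXT);
    """
)
# Invented members: one served seven Congresses, the other only one.
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?)",
    [("A000001", "Alpha", "Ann"), ("B000002", "Beta", "Bob")],
)
positions = [("A000001", n, "Whig") for n in range(30, 37)]
positions.append(("B000002", 30, "Whig"))
conn.executemany("INSERT INTO job_positions VALUES (?, ?, ?)", positions)

QUERY = """
SELECT family_name, given_name, COUNT(DISTINCT congress_number) AS congresses
FROM members m JOIN job_positions j ON m.bio_id = j.bio_id
GROUP BY m.bio_id
HAVING congresses > 5
ORDER BY congresses DESC
LIMIT 10
"""
longest_serving = conn.execute(QUERY).fetchall()
print(longest_serving)
```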

### Get Full Member Profile
```json
{
  "name": "get_member_profile",
  "arguments": {
    "bio_id": "L000313"
  }
}
```

### Search by Congress Number
```json
{
  "name": "search_by_congress",
  "arguments": {
    "congress_number": 117,
    "chamber": "Senator"
  }
}
```

### Search by Date Range
```json
{
  "name": "search_by_date_range",
  "arguments": {
    "start_date": "1861-03-04",
    "end_date": "1865-03-04"
  }
}
```
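
Date ranges and Congress numbers are related by a simple rule: the 1st Congress convened in 1789 and a new Congress begins every two years. The helper names below are hypothetical and not part of the server. They work at year granularity only, so the weeks at the start of an odd year, before the new Congress convenes, are attributed to the incoming Congress.

```python
def congress_start_year(congress_number: int) -> int:
    """Year in which the given Congress convened (1st Congress: 1789)."""
    if congress_number < 1:
        raise ValueError("congress_number must be >= 1")
    return 1789 + 2 * (congress_number - 1)


def congress_for_year(year: int) -> int:
    """Congress that convened in, or most recently before, the given year."""
    if year < 1789:
        raise ValueError("year must be >= 1789")
    return (year - 1789) // 2 + 1


print(congress_start_year(117))   # 2021
print(congress_for_year(1861))    # 37
```

The range in the example above (1861 to 1865) therefore spans the 37th and 38th Congresses.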

### Find Family Relationships
```json
{
  "name": "search_by_relationship",
  "arguments": {
    "relationship_type": "father"
  }
}
```

### Complex SQL - Party Transitions
```json
{
  "name": "execute_sql_query",
  "arguments": {
    "query": "SELECT m.bio_id, m.family_name, m.given_name, GROUP_CONCAT(DISTINCT j.party) as parties FROM members m JOIN job_positions j ON m.bio_id = j.bio_id WHERE j.party IS NOT NULL GROUP BY m.bio_id HAVING COUNT(DISTINCT j.party) > 1 LIMIT 20"
  }
}
```
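
Each example in this section shows the `name` and `arguments` of a tool call. Over MCP, a client sends these as the `params` of a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that envelope, with a hypothetical helper name. Transport, initialization, and response handling are left to an MCP client library.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def tools_call_request(name, arguments):
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tools_call_request("search_by_name", {"family_name": "Lincoln"})
print(json.dumps(request, indent=2))
```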

## Data Source

Data comes from the US Congressional Bioguide, containing biographical information for all members of Congress throughout history.

## Technical Details

- **Database**: SQLite for structured queries
- **Semantic Search**: FAISS with sentence-transformers (all-MiniLM-L6-v2)
- **Embedding Dimension**: 384
- **Index Type**: Flat IP (Inner Product) with L2 normalization for cosine similarity
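
The reason a flat inner-product index yields cosine similarity is that, once every vector is scaled to unit length, the inner product of two vectors equals the cosine of the angle between them. The sketch below shows this with plain Python lists and a brute-force scan. It stands in for the FAISS index purely for illustration and uses 3-dimensional toy vectors instead of 384-dimensional embeddings.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, vectors, k):
    """Brute-force search: (score, index) pairs with the highest cosine similarity."""
    q = l2_normalize(query)
    scored = [(inner_product(q, l2_normalize(v)), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)
    return scored[:k]


toy_vectors = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
hits = top_k([1.0, 0.0, 0.0], toy_vectors, k=2)
print(hits)
```

Vector 0 points in the same direction as the query despite its different length, so it scores 1.0. Vector 2 sits at 45 degrees and scores about 0.707.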

## MCP Configuration

Add to your MCP settings file (for Claude Desktop, usually `~/Library/Application Support/Claude/claude_desktop_config.json` on macOS or `%APPDATA%\Claude\claude_desktop_config.json` on Windows):

```json
{
  "mcpServers": {
    "congressional-bioguide": {
      "command": "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP/venv/bin/python",
      "args": [
        "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP/server.py"
      ],
      "cwd": "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP"
    }
  }
}
```

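**Note**: This uses the virtual environment's Python, which has all the required dependencies installed. The paths shown are from the author's machine; replace them with the absolute path to your own checkout.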
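
Since the configuration needs absolute paths, it can be convenient to generate the entry from the location of your checkout. The helper below is a hypothetical convenience script, not part of this repository, and `/path/to/BioGuideMCP` is a placeholder. It assumes the `venv/bin/python` layout used on macOS/Linux; on Windows the interpreter lives under `venv\Scripts` instead.

```python
import json
from pathlib import Path


def build_config(project_dir):
    """Build the mcpServers entry for a checkout located at project_dir."""
    root = Path(project_dir)
    return {
        "mcpServers": {
            "congressional-bioguide": {
                "command": str(root / "venv" / "bin" / "python"),
                "args": [str(root / "server.py")],
                "cwd": str(root),
            }
        }
    }


config = build_config("/path/to/BioGuideMCP")  # placeholder path
print(json.dumps(config, indent=2))
```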

## License

Data is public domain from the US Congressional Bioguide.