---
title: BioGuideMCP
emoji: π
colorFrom: purple
colorTo: yellow
sdk: gradio
sdk_version: 5.49.1
app_file: gradio_app.py
pinned: false
license: mit
short_description: 'An MCP allowing users to analyze congressional biographies.'
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Congressional Bioguide MCP Server

A Model Context Protocol (MCP) server that provides access to Congressional member profiles with both structured SQL queries and semantic search capabilities.

## Deployment Options

### 1. Gradio MCP (Hugging Face Spaces)

Run this MCP as a Gradio app with web interface + MCP server:

```bash
python gradio_app.py
```

This will launch a web interface at `http://localhost:7860` with 9 tools exposed as both a web UI and MCP tools.

**Deploy to Hugging Face Spaces:**

1. Create a new Space on Hugging Face
2. Set SDK to `gradio` (version 5.49.1+)
3. Upload all files including `gradio_app.py`, `congress.db`, `congress_faiss.index`, and `congress_bio_ids.pkl`
4. The app will automatically launch with `mcp_server=True`

### 2. Traditional MCP Server

Use the original MCP server for integration with Claude Desktop or other MCP clients:

```bash
python server.py
```

Test the server backend with `npx @modelcontextprotocol/inspector python server.py` or integrate it into your Claude setup.

## Features

### Gradio MCP Tools (9 Tools)

The Gradio app (`gradio_app.py`) exposes these 9 MCP tools:

1. **search_by_name** - Search members by name (first/last name)
2. **search_by_party** - Find by political party affiliation
3. **search_by_state** - Search by state/region representation
4. **semantic_search_biography** - AI-powered natural language search of biographies
5. **get_member_profile** - Get complete profile by Bioguide ID
6. **count_members_by_party** - Count members grouped by party
7. **count_members_by_state** - Count members grouped by state
8. **execute_sql_query** - Execute custom SQL queries (read-only)
9. **get_database_schema** - View database structure

### Traditional MCP Server Tools (14 Tools)

The traditional server (`server.py`) provides all tools:

**Search Tools** (return concise results by default):

1. **search_by_name** - Search members by name (returns: name, dates, party, congress)
2. **search_by_party** - Find by political party affiliation
3. **search_by_state** - Search by state/region representation
4. **search_by_congress** - Get all members from specific Congress
5. **search_by_date_range** - Find members who served during specific dates
6. **semantic_search_biography** - Natural language AI search of biographies
7. **search_biography_regex** - Regex pattern search (keywords, phrases)
8. **search_by_relationship** - Find members with family relationships

**Aggregation & Analysis Tools** (efficient for large datasets):

9. **count_members** - Count members by party, state, position, congress, or year
10. **temporal_analysis** - Analyze trends over time (party shifts, demographics, etc.)
11. **count_by_biography_content** - Count members mentioning specific keywords (e.g., "Harvard", "lawyer")

**Profile & Query Tools**:

12. **get_member_profile** - Get complete profile by Bioguide ID
13. **execute_sql_query** - Execute custom SQL queries (read-only)
14. **get_database_schema** - View database structure
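
The `execute_sql_query` tool is described as read-only. The sketch below shows one way such a guarantee could be enforced with the standard `sqlite3` module; it is not necessarily how `server.py` does it. The helper name `run_read_only` and the toy table are hypothetical.

```python
import sqlite3


def run_read_only(conn: sqlite3.Connection, query: str, limit: int = 100) -> list:
    """Run a query with writes disabled and return at most `limit` rows."""
    # PRAGMA query_only makes SQLite reject statements that would modify the database.
    conn.execute("PRAGMA query_only = ON")
    try:
        return conn.execute(query).fetchmany(limit)
    finally:
        # Discard any transaction opened by a rejected write, then restore normal mode.
        conn.rollback()
        conn.execute("PRAGMA query_only = OFF")


# Toy in-memory table standing in for congress.db (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (bio_id TEXT PRIMARY KEY, family_name TEXT)")
conn.execute("INSERT INTO members VALUES ('X000001', 'Example')")
conn.commit()

rows = run_read_only(conn, "SELECT family_name FROM members")

try:
    run_read_only(conn, "DELETE FROM members")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True

print(rows, write_blocked)
```

An alternative with the same effect is to open the file read-only from the start, for example `sqlite3.connect("file:congress.db?mode=ro", uri=True)`.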
### Database Schema

- **members** - Core biographical data (13,047+ profiles)
- **job_positions** - Congressional positions and affiliations
- **images** - Profile images
- **relationships** - Family relationships between members
- **creative_works** - Publications by members
- **assets** - Additional media assets
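
The table list above can be inspected directly with `sqlite3`, independent of the `get_database_schema` tool. The following is a minimal sketch: `list_tables` is a hypothetical helper, and the two tables created here are simplified stand-ins whose columns are assumptions, not the real schema.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> dict:
    """Map each user table to its list of column names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' "
            "AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    return {
        t: [col[1] for col in conn.execute(f'PRAGMA table_info("{t}")')]
        for t in tables
    }


# Hypothetical stand-in tables; the real column sets live in congress.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (bio_id TEXT PRIMARY KEY, family_name TEXT, given_name TEXT)")
conn.execute("CREATE TABLE job_positions (bio_id TEXT, congress_number INTEGER, party TEXT)")

schema = list_tables(conn)
for table, columns in schema.items():
    print(table, columns)
```

To run it against the real database, replace the in-memory connection with `sqlite3.connect("congress.db")` after ingestion.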
## Requirements

- **Python 3.10+**, including Python 3.14
- **Python 3.14 is now supported** (with FAISS running in single-threaded mode)

## Setup

### Quick Start

```bash
./setup.sh
```

This automated script will:

1. Create a Python virtual environment
2. Install all dependencies
3. Ingest all Congressional profiles into SQLite
4. Build the FAISS semantic search index

### Manual Setup

If you prefer manual setup:

#### 1. Install Dependencies

```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

#### 2. Ingest Data

Run the ingestion script to create the SQLite database and FAISS index:

```bash
python3 ingest_data.py
```

This will:

- Create `congress.db` SQLite database (13,047+ members)
- Build `congress_faiss.index` for semantic search
- Generate `congress_bio_ids.pkl` for ID mapping
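
Before starting either server, it can help to confirm that ingestion produced all three files. The check below is a small sketch; `missing_artifacts` is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path

# File names taken from the ingestion step described above.
REQUIRED_ARTIFACTS = ("congress.db", "congress_faiss.index", "congress_bio_ids.pkl")


def missing_artifacts(project_dir: str = ".") -> list:
    """Return the ingestion outputs that are absent or empty in project_dir."""
    root = Path(project_dir)
    return [
        name
        for name in REQUIRED_ARTIFACTS
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]


if __name__ == "__main__":
    missing = missing_artifacts()
    print("All artifacts present" if not missing else f"Missing: {', '.join(missing)}")
```

If anything is reported missing, rerun `python3 ingest_data.py`.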

Expected output:

```
Starting Congressional Bioguide ingestion...
============================================================
✓ Database schema created
Ingesting 13047 profiles...
Processed 1000/13047 profiles...
...
✓ Ingested 13047 profiles into database
Building FAISS index for semantic search...
Encoding 13047 biographies...
Encoded 3200/13047 biographies...
...
✓ FAISS index created with 13047 vectors
Index dimension: 384
============================================================
✓ Ingestion complete!
```

**Note**: Ingestion takes approximately 5-10 minutes depending on your system.

#### 3. Test the System (Optional)

```bash
python3 test_queries.py
```

#### 4. Run the Server

```bash
python3 server.py
```

## Usage Examples

### Name Search

```json
{
  "name": "search_by_name",
  "arguments": {
    "family_name": "Lincoln"
  }
}
```
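
Each JSON object in this section is the `name`/`arguments` payload of an MCP tool call. On the wire, MCP clients wrap that payload in a JSON-RPC 2.0 `tools/call` request, which a client such as Claude Desktop or the MCP Inspector builds for you. The sketch below only illustrates the envelope; `tool_call_request` is a hypothetical helper.

```python
import json
from itertools import count

# Simple incrementing request IDs; any unique value per request would do.
_request_ids = count(1)


def tool_call_request(name: str, arguments: dict) -> str:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 `tools/call` request."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


message = tool_call_request("search_by_name", {"family_name": "Lincoln"})
print(message)
```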

### Party Search

```json
{
  "name": "search_by_party",
  "arguments": {
    "party": "Republican",
    "congress_number": 117
  }
}
```

### State Search

```json
{
  "name": "search_by_state",
  "arguments": {
    "state_code": "CA",
    "congress_number": 117
  }
}
```

### Semantic Search

```json
{
  "name": "semantic_search_biography",
  "arguments": {
    "query": "Civil War veterans who became lawyers",
    "top_k": 5
  }
}
```

### Regex Search - Find Keywords

```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "Harvard",
    "limit": 5
  }
}
```

### Regex Search - Filter by Party

```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "lawyer",
    "filter_party": "Republican",
    "limit": 10
  }
}
```

### Regex Search - Filter by State

```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "served.*Confederate Army",
    "filter_state": "VA",
    "limit": 5
  }
}
```

**Note**: Regex search returns concise results (name, dates, party, state) by default. Set `return_full_profile: true` to get biography text.
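
SQLite has no built-in regular-expression engine, so a regex search over biographies needs help from Python. One possible approach, sketched below, registers a `REGEXP` function on the connection. This is an illustration and not necessarily how `search_biography_regex` is implemented. The `biography` column name and the sample rows are assumptions.

```python
import re
import sqlite3
from typing import Optional


def regexp(pattern: str, text: Optional[str]) -> bool:
    """Back SQLite's REGEXP operator with Python's re module (case-insensitive)."""
    return text is not None and re.search(pattern, text, re.IGNORECASE) is not None


conn = sqlite3.connect(":memory:")
# SQLite evaluates `X REGEXP Y` as regexp(Y, X), so the pattern arrives first.
conn.create_function("REGEXP", 2, regexp)

# Hypothetical rows; the real biographies live in congress.db.
conn.execute("CREATE TABLE members (bio_id TEXT, family_name TEXT, biography TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?)",
    [
        ("A000001", "Adams", "Studied law; served in the Confederate Army during the Civil War."),
        ("B000002", "Baker", "Graduated from Harvard University; practiced law in Boston."),
    ],
)

matches = conn.execute(
    "SELECT family_name FROM members WHERE biography REGEXP ? LIMIT ?",
    (r"served.*Confederate Army", 5),
).fetchall()
print(matches)
```

Passing the pattern and limit as bound parameters keeps user input out of the SQL text.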

### Count Members by Party

```json
{
  "name": "count_members",
  "arguments": {
    "group_by": "party"
  }
}
```

### Count Republicans by State in 117th Congress

```json
{
  "name": "count_members",
  "arguments": {
    "group_by": "state",
    "filter_party": "Republican",
    "filter_congress": 117
  }
}
```

### Temporal Analysis - Party Changes Over Time

```json
{
  "name": "temporal_analysis",
  "arguments": {
    "analysis_type": "party_over_time",
    "time_unit": "congress",
    "start_date": "1900-01-01",
    "end_date": "2000-12-31"
  }
}
```

### Demographics Analysis - Average Age by Congress

```json
{
  "name": "temporal_analysis",
  "arguments": {
    "analysis_type": "demographics",
    "time_unit": "congress"
  }
}
```

### Count Members Who Attended Harvard

```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["Harvard"]
  }
}
```

### Count Lawyers by Party

```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["lawyer", "attorney"],
    "breakdown_by": "party"
  }
}
```

### Count Members Mentioning Any of Several Keywords (Lawyer, Military, Army) by State

```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["lawyer", "military", "army"],
    "match_all": false,
    "breakdown_by": "state"
  }
}
```
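
The `match_all` flag presumably switches between "any keyword" and "every keyword" matching; the README does not spell this out, so treat that reading as an assumption. Under that assumption, the counting logic could look like the sketch below, where `count_by_keywords` and the sample rows are hypothetical.

```python
from collections import Counter


def count_by_keywords(rows, keywords, match_all=False, breakdown_by=None):
    """Count rows whose biography mentions any (or all) of the keywords.

    rows: iterable of dicts with a "biography" key plus optional grouping keys.
    Returns a Counter keyed by the breakdown value, or by "total" without a breakdown.
    """
    combine = all if match_all else any
    counts = Counter()
    for row in rows:
        text = (row.get("biography") or "").lower()
        if combine(kw.lower() in text for kw in keywords):
            key = row.get(breakdown_by, "unknown") if breakdown_by else "total"
            counts[key] += 1
    return counts


# Hypothetical rows for illustration only.
rows = [
    {"biography": "Lawyer; served in the Union Army.", "state": "OH"},
    {"biography": "Practiced as a lawyer in Richmond.", "state": "VA"},
    {"biography": "Farmer and merchant.", "state": "VA"},
]

any_match = count_by_keywords(rows, ["lawyer", "army"], match_all=False, breakdown_by="state")
all_match = count_by_keywords(rows, ["lawyer", "army"], match_all=True, breakdown_by="state")
print(dict(any_match), dict(all_match))
```

With "any" matching, both lawyers are counted; with "all" matching, only the biography that mentions both keywords is.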

### SQL Query - Find Longest Serving Members

```json
{
  "name": "execute_sql_query",
  "arguments": {
    "query": "SELECT family_name, given_name, COUNT(DISTINCT congress_number) as congresses FROM members m JOIN job_positions j ON m.bio_id = j.bio_id GROUP BY m.bio_id HAVING congresses > 5 ORDER BY congresses DESC LIMIT 10"
  }
}
```
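
To see what this query computes, it can be run against a small in-memory database. The sketch below uses only the columns the query itself names; the rest of the schema, the sample rows, and the lowered `HAVING` threshold are assumptions made for illustration.

```python
import sqlite3

# Same query shape as the example above, with the HAVING threshold parameterized.
QUERY = """
SELECT family_name, given_name, COUNT(DISTINCT congress_number) AS congresses
FROM members m
JOIN job_positions j ON m.bio_id = j.bio_id
GROUP BY m.bio_id
HAVING congresses > ?
ORDER BY congresses DESC
LIMIT 10
"""

conn = sqlite3.connect(":memory:")
# Minimal hypothetical schema: only the columns the query touches.
conn.executescript(
    """
    CREATE TABLE members (bio_id TEXT PRIMARY KEY, family_name TEXT, given_name TEXT);
    CREATE TABLE job_positions (bio_id TEXT, congress_number INTEGER, party TEXT);
    INSERT INTO members VALUES ('A000001', 'Adams', 'Ann'), ('B000002', 'Baker', 'Bob');
    INSERT INTO job_positions VALUES
        ('A000001', 100, 'Democrat'), ('A000001', 101, 'Democrat'), ('A000001', 102, 'Democrat'),
        ('B000002', 101, 'Republican'), ('B000002', 101, 'Republican');
    """
)

result = conn.execute(QUERY, (1,)).fetchall()
print(result)
```

`COUNT(DISTINCT congress_number)` matters here: Baker has two position rows for the same Congress, which count once, so only Adams passes the threshold.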

### Get Full Member Profile

```json
{
  "name": "get_member_profile",
  "arguments": {
    "bio_id": "L000313"
  }
}
```

### Search by Congress Number

```json
{
  "name": "search_by_congress",
  "arguments": {
    "congress_number": 117,
    "chamber": "Senator"
  }
}
```

### Search by Date Range

```json
{
  "name": "search_by_date_range",
  "arguments": {
    "start_date": "1861-03-04",
    "end_date": "1865-03-04"
  }
}
```
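
Several tools accept either a Congress number or a date range, so it helps to convert between the two. Congresses last two years and the 1st convened in 1789, which gives the simple arithmetic sketched below. Both function names are hypothetical, and the year-based lookup is an approximation that ignores the exact changeover date within the year.

```python
FIRST_CONGRESS_YEAR = 1789  # The 1st Congress convened in 1789.


def congress_start_year(congress_number: int) -> int:
    """Calendar year in which the given Congress began."""
    if congress_number < 1:
        raise ValueError("congress_number must be >= 1")
    return FIRST_CONGRESS_YEAR + 2 * (congress_number - 1)


def congress_for_year(year: int) -> int:
    """Congress whose two-year term started in `year` or the year before (approximate)."""
    if year < FIRST_CONGRESS_YEAR:
        raise ValueError("year predates the 1st Congress")
    return (year - FIRST_CONGRESS_YEAR) // 2 + 1


print(congress_start_year(117))  # 2021
print(congress_for_year(1861))   # 37
print(congress_for_year(1864))   # 38
```

The date range in the example above (1861-03-04 to 1865-03-04) therefore spans the 37th and 38th Congresses.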

### Find Family Relationships

```json
{
  "name": "search_by_relationship",
  "arguments": {
    "relationship_type": "father"
  }
}
```

### Complex SQL - Party Transitions

```json
{
  "name": "execute_sql_query",
  "arguments": {
    "query": "SELECT m.bio_id, m.family_name, m.given_name, GROUP_CONCAT(DISTINCT j.party) as parties FROM members m JOIN job_positions j ON m.bio_id = j.bio_id WHERE j.party IS NOT NULL GROUP BY m.bio_id HAVING COUNT(DISTINCT j.party) > 1 LIMIT 20"
  }
}
```

## Data Source

Data comes from the US Congressional Bioguide, containing biographical information for all members of Congress throughout history.

## Technical Details

- **Database**: SQLite for structured queries
- **Semantic Search**: FAISS with sentence-transformers (all-MiniLM-L6-v2)
- **Embedding Dimension**: 384
- **Index Type**: Flat IP (Inner Product) with L2 normalization for cosine similarity
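
The last point relies on a standard identity: the inner product of two L2-normalized vectors equals their cosine similarity. The pure-Python sketch below checks this on tiny 3-dimensional vectors standing in for the 384-dimensional embeddings. It does not use FAISS, and all function names are illustrative.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Textbook cosine similarity, computed without normalizing first."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


# Tiny stand-ins for a query embedding and a biography embedding.
query = [1.0, 2.0, 2.0]
doc = [2.0, 0.0, 1.0]

ip_of_normalized = inner_product(l2_normalize(query), l2_normalize(doc))
print(round(ip_of_normalized, 6), round(cosine_similarity(query, doc), 6))
```

In FAISS terms, this corresponds to normalizing embeddings (for example with `faiss.normalize_L2`) before adding them to an `IndexFlatIP` index and before searching.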

## MCP Configuration

Add the server to your Claude Desktop configuration file (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS or `%APPDATA%\Claude\claude_desktop_config.json` on Windows):

```json
{
  "mcpServers": {
    "congressional-bioguide": {
      "command": "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP/venv/bin/python",
      "args": [
        "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP/server.py"
      ],
      "cwd": "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP"
    }
  }
}
```

**Note**: The paths above come from the author's machine; replace them with the absolute path to your own checkout. The `command` points at the virtual environment's Python, which has all the required dependencies installed.
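
Because the configuration needs absolute paths, it can be easier to generate the entry than to type it. The sketch below assumes the layout described in this README, with `server.py` and a `venv` directory at the project root. `build_mcp_config` is a hypothetical helper.

```python
import json
import os
from pathlib import Path


def build_mcp_config(project_dir: str) -> dict:
    """Build the `mcpServers` entry for a BioGuideMCP checkout at project_dir."""
    root = Path(project_dir).expanduser().resolve()
    # Virtual environments place the interpreter in Scripts\ on Windows and bin/ elsewhere.
    python = root / "venv" / ("Scripts/python.exe" if os.name == "nt" else "bin/python")
    return {
        "mcpServers": {
            "congressional-bioguide": {
                "command": str(python),
                "args": [str(root / "server.py")],
                "cwd": str(root),
            }
        }
    }


config = build_mcp_config(".")
print(json.dumps(config, indent=2))
```

Run it from the project root and merge the printed `mcpServers` entry into your existing configuration file.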

## License

The Space metadata declares the MIT license. Data is public domain from the US Congressional Bioguide.