---
title: BioGuideMCP
emoji: 👁
colorFrom: purple
colorTo: yellow
sdk: gradio
sdk_version: 5.49.1
app_file: gradio_app.py
pinned: false
license: mit
short_description: 'An MCP allowing users to analyze congressional biographies. '
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Congressional Bioguide MCP Server

A Model Context Protocol (MCP) server that provides access to Congressional member profiles with both structured SQL queries and semantic search capabilities.

## Deployment Options

### 1. Gradio MCP (Hugging Face Spaces)

Run this MCP as a Gradio app with web interface + MCP server:

```bash
python gradio_app.py
```

This will launch a web interface at `http://localhost:7860` with 9 tools exposed as both a web UI and MCP tools.

**Deploy to Hugging Face Spaces:**

1. Create a new Space on Hugging Face
2. Set SDK to `gradio` (version 5.49.1+)
3. Upload all files including `gradio_app.py`, `congress.db`, `congress_faiss.index`, and `congress_bio_ids.pkl`
4. The app will automatically launch with `mcp_server=True`

### 2. Traditional MCP Server

Use the original MCP server for integration with Claude Desktop or other MCP clients:

```bash
python server.py
```

Test the server backend with `npx @modelcontextprotocol/inspector python server.py` or integrate it into your Claude setup.

## Features

### Gradio MCP Tools (9 Tools)

The Gradio app (`gradio_app.py`) exposes these 9 MCP tools:

1. **search_by_name** - Search members by name (first/last name)
2. **search_by_party** - Find by political party affiliation
3. **search_by_state** - Search by state/region representation
4. **semantic_search_biography** - AI-powered natural language search of biographies
5. **get_member_profile** - Get complete profile by Bioguide ID
6. **count_members_by_party** - Count members grouped by party
7. **count_members_by_state** - Count members grouped by state
8. **execute_sql_query** - Execute custom SQL queries (read-only)
9. **get_database_schema** - View database structure

### Traditional MCP Server Tools (14 Tools)

The traditional server (`server.py`) provides all 14 tools:

**Search Tools** (return concise results by default):

1. **search_by_name** - Search members by name (returns: name, dates, party, congress)
2. **search_by_party** - Find by political party affiliation
3. **search_by_state** - Search by state/region representation
4. **search_by_congress** - Get all members from a specific Congress
5. **search_by_date_range** - Find members who served during specific dates
6. **semantic_search_biography** - Natural language AI search of biographies
7. **search_biography_regex** - Regex pattern search (keywords, phrases)
8. **search_by_relationship** - Find members with family relationships

**Aggregation & Analysis Tools** (efficient for large datasets):

9. **count_members** - Count members by party, state, position, congress, or year
10. **temporal_analysis** - Analyze trends over time (party shifts, demographics, etc.)
11. **count_by_biography_content** - Count members mentioning specific keywords (e.g., "Harvard", "lawyer")

**Profile & Query Tools**:

12. **get_member_profile** - Get complete profile by Bioguide ID
13. **execute_sql_query** - Execute custom SQL queries (read-only)
14. **get_database_schema** - View database structure

### Database Schema

- **members** - Core biographical data (13,047+ profiles)
- **job_positions** - Congressional positions and affiliations
- **images** - Profile images
- **relationships** - Family relationships between members
- **creative_works** - Publications by members
- **assets** - Additional media assets

## Requirements

- **Python 3.10+**
- ✅ **Python 3.14 is supported** (with single-threaded mode for FAISS)

## Setup

### Quick Start

```bash
./setup.sh
```

This automated script will:

1. Create a Python virtual environment
2. Install all dependencies
3. Ingest all Congressional profiles into SQLite
4. Build the FAISS semantic search index

### Manual Setup

If you prefer manual setup:

#### 1. Install Dependencies

```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

#### 2. Ingest Data

Run the ingestion script to create the SQLite database and FAISS index:

```bash
python3 ingest_data.py
```

This will:

- Create `congress.db` SQLite database (13,047+ members)
- Build `congress_faiss.index` for semantic search
- Generate `congress_bio_ids.pkl` for ID mapping

Expected output:

```
Starting Congressional Bioguide ingestion...
============================================================
✓ Database schema created
Ingesting 13047 profiles...
Processed 1000/13047 profiles...
...
✓ Ingested 13047 profiles into database
Building FAISS index for semantic search...
Encoding 13047 biographies...
Encoded 3200/13047 biographies...
...
✓ FAISS index created with 13047 vectors
Index dimension: 384
============================================================
✓ Ingestion complete!
```

**Note**: Ingestion takes approximately 5-10 minutes depending on your system.

#### 3. Test the System (Optional)

```bash
python3 test_queries.py
```

#### 4. Run the Server

```bash
python3 server.py
```

## Usage Examples

### Name Search

```json
{
  "name": "search_by_name",
  "arguments": {
    "family_name": "Lincoln"
  }
}
```

### Party Search

```json
{
  "name": "search_by_party",
  "arguments": {
    "party": "Republican",
    "congress_number": 117
  }
}
```

### State Search

```json
{
  "name": "search_by_state",
  "arguments": {
    "state_code": "CA",
    "congress_number": 117
  }
}
```

### Semantic Search

```json
{
  "name": "semantic_search_biography",
  "arguments": {
    "query": "Civil War veterans who became lawyers",
    "top_k": 5
  }
}
```

### Regex Search - Find Keywords

```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "Harvard",
    "limit": 5
  }
}
```

### Regex Search - Filter by Party

```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "lawyer",
    "filter_party": "Republican",
    "limit": 10
  }
}
```

### Regex Search - Filter by State

```json
{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "served.*Confederate Army",
    "filter_state": "VA",
    "limit": 5
  }
}
```

**Note**: Regex search returns concise results (name, dates, party, state) by default. Set `return_full_profile: true` to include the biography text.

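To make the regex options above more concrete, here is a minimal Python sketch of how a filtered pattern search could behave. It is not the code in `server.py`: the in-memory `SAMPLE_MEMBERS` records, their field names, and the helper `regex_search` are hypothetical stand-ins, and case-insensitive matching is an assumption.

```python
import re

# Hypothetical sample records for illustration only; the real server reads
# profiles from congress.db, and its column names may differ.
SAMPLE_MEMBERS = [
    {"bio_id": "X000001", "name": "Alpha, Ann", "party": "Republican", "state": "VA",
     "biography": "A lawyer who served in the Confederate Army."},
    {"bio_id": "X000002", "name": "Beta, Ben", "party": "Democrat", "state": "VA",
     "biography": "Graduated from Harvard and practiced as a lawyer."},
    {"bio_id": "X000003", "name": "Gamma, Gil", "party": "Republican", "state": "NY",
     "biography": "A merchant and banker."},
]


def regex_search(members, pattern, filter_party=None, filter_state=None,
                 limit=5, return_full_profile=False):
    """Return members whose biography matches `pattern`, after optional filters."""
    compiled = re.compile(pattern, re.IGNORECASE)  # assumption: case-insensitive
    results = []
    for member in members:
        # Apply the cheap equality filters before running the regex.
        if filter_party and member["party"] != filter_party:
            continue
        if filter_state and member["state"] != filter_state:
            continue
        if not compiled.search(member["biography"]):
            continue
        if return_full_profile:
            results.append(dict(member))
        else:
            # Concise result: everything except the biography text.
            results.append({k: v for k, v in member.items() if k != "biography"})
        if len(results) >= limit:
            break
    return results


if __name__ == "__main__":
    print(regex_search(SAMPLE_MEMBERS, r"served.*Confederate Army", filter_state="VA"))
```

Filtering on party and state before matching keeps the regex from running over records that would be discarded anyway, which is the same reason the concise default omits the biography text.
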
### Count Members by Party

```json
{
  "name": "count_members",
  "arguments": {
    "group_by": "party"
  }
}
```

### Count Republicans by State in 117th Congress

```json
{
  "name": "count_members",
  "arguments": {
    "group_by": "state",
    "filter_party": "Republican",
    "filter_congress": 117
  }
}
```

### Temporal Analysis - Party Changes Over Time

```json
{
  "name": "temporal_analysis",
  "arguments": {
    "analysis_type": "party_over_time",
    "time_unit": "congress",
    "start_date": "1900-01-01",
    "end_date": "2000-12-31"
  }
}
```

### Demographics Analysis - Average Age by Congress

```json
{
  "name": "temporal_analysis",
  "arguments": {
    "analysis_type": "demographics",
    "time_unit": "congress"
  }
}
```

### Count Members Who Attended Harvard

```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["Harvard"]
  }
}
```

### Count Lawyers by Party

```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["lawyer", "attorney"],
    "breakdown_by": "party"
  }
}
```

### Count Lawyers or Veterans by State

```json
{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["lawyer", "military", "army"],
    "match_all": false,
    "breakdown_by": "state"
  }
}
```

### SQL Query - Find Longest Serving Members

```json
{
  "name": "execute_sql_query",
  "arguments": {
    "query": "SELECT family_name, given_name, COUNT(DISTINCT congress_number) as congresses FROM members m JOIN job_positions j ON m.bio_id = j.bio_id GROUP BY m.bio_id HAVING congresses > 5 ORDER BY congresses DESC LIMIT 10"
  }
}
```

### Get Full Member Profile

```json
{
  "name": "get_member_profile",
  "arguments": {
    "bio_id": "L000313"
  }
}
```

### Search by Congress Number

```json
{
  "name": "search_by_congress",
  "arguments": {
    "congress_number": 117,
    "chamber": "Senator"
  }
}
```

### Search by Date Range

```json
{
  "name": "search_by_date_range",
  "arguments": {
    "start_date": "1861-03-04",
    "end_date": "1865-03-04"
  }
}
```

### Find Family Relationships

```json
{
  "name": "search_by_relationship",
  "arguments": {
    "relationship_type": "father"
  }
}
```

### Complex SQL - Party Transitions

```json
{
  "name": "execute_sql_query",
  "arguments": {
    "query": "SELECT m.bio_id, m.family_name, m.given_name, GROUP_CONCAT(DISTINCT j.party) as parties FROM members m JOIN job_positions j ON m.bio_id = j.bio_id WHERE j.party IS NOT NULL GROUP BY m.bio_id HAVING COUNT(DISTINCT j.party) > 1 LIMIT 20"
  }
}
```

## Data Source

Data comes from the US Congressional Bioguide, which contains biographical information for all members of Congress throughout history.

## Technical Details

- **Database**: SQLite for structured queries
- **Semantic Search**: FAISS with sentence-transformers (all-MiniLM-L6-v2)
- **Embedding Dimension**: 384
- **Index Type**: Flat IP (Inner Product) with L2 normalization for cosine similarity

## MCP Configuration

Add the server to your MCP settings file. For Claude Desktop this is usually `~/Library/Application Support/Claude/claude_desktop_config.json` on macOS or `%APPDATA%\Claude\claude_desktop_config.json` on Windows:

```json
{
  "mcpServers": {
    "congressional-bioguide": {
      "command": "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP/venv/bin/python",
      "args": [
        "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP/server.py"
      ],
      "cwd": "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP"
    }
  }
}
```

**Note**: The paths above are from the author's machine; replace them with the absolute paths to your own checkout. The `command` entry uses the virtual environment's Python, which has all the required dependencies installed.

## License

Data is public domain from the US Congressional Bioguide.