stefanjwojcik

Add setup script and comprehensive tests for Congressional Bioguide MCP Server

15de73a 30 days ago

metadata

title: BioGuideMCP
emoji: 👁
colorFrom: purple
colorTo: yellow
sdk: gradio
sdk_version: 5.49.1
app_file: gradio_app.py
pinned: false
license: mit
short_description: 'An MCP allowing users to analyze congressional biographies. '

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Congressional Bioguide MCP Server

A Model Context Protocol (MCP) server that provides access to Congressional member profiles with both structured SQL queries and semantic search capabilities.

Deployment Options

1. Gradio MCP (Hugging Face Spaces)

Run the project as a Gradio app that serves both a web interface and an MCP server:

python gradio_app.py

This launches a web interface at http://localhost:7860 and exposes the same 9 tools both through the web UI and as MCP tools.

Deploy to Hugging Face Spaces:

Create a new Space on Hugging Face
Set SDK to gradio (version 5.49.1+)
Upload all files including gradio_app.py, congress.db, congress_faiss.index, and congress_bio_ids.pkl
The app will automatically launch with mcp_server=True
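
A minimal sketch of how a Gradio app can expose one function as both a web UI and an MCP tool. The function `search_by_name_demo` and its toy data are hypothetical stand-ins, not the code in gradio_app.py. `launch(mcp_server=True)` is the Gradio option mentioned above, and it assumes Gradio was installed with its MCP extra.

```python
# Toy data standing in for congress.db (hypothetical, for illustration only).
_MEMBERS = [
    {"bio_id": "X000001", "given_name": "Ada", "family_name": "Example"},
    {"bio_id": "X000002", "given_name": "Ben", "family_name": "Sample"},
]


def search_by_name_demo(family_name: str) -> list[dict]:
    """Return toy members whose family name matches, ignoring case.

    Gradio uses the docstring and type hints to describe the MCP tool.
    """
    wanted = family_name.strip().lower()
    return [m for m in _MEMBERS if m["family_name"].lower() == wanted]


def build_demo():
    """Build the Gradio interface.

    gradio is imported lazily so the search function can be used and
    tested without gradio installed.
    """
    import gradio as gr

    return gr.Interface(
        fn=search_by_name_demo,
        inputs=gr.Textbox(label="Family name"),
        outputs=gr.JSON(label="Matches"),
    )


# To serve the web UI and the MCP endpoint together:
#     build_demo().launch(mcp_server=True)
```

Keeping the tool logic in a plain function, separate from the interface, makes it easy to test without starting a server.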

2. Traditional MCP Server

Use the original MCP server for integration with Claude Desktop or other MCP clients:

python server.py

Test the server backend with npx @modelcontextprotocol/inspector python server.py or integrate it into your Claude setup.

Features

Gradio MCP Tools (9 Tools)

The Gradio app (gradio_app.py) exposes these 9 MCP tools:

search_by_name - Search members by name (first/last name)
search_by_party - Find by political party affiliation
search_by_state - Search by state/region representation
semantic_search_biography - AI-powered natural language search of biographies
get_member_profile - Get complete profile by Bioguide ID
count_members_by_party - Count members grouped by party
count_members_by_state - Count members grouped by state
execute_sql_query - Execute custom SQL queries (read-only)
get_database_schema - View database structure

Traditional MCP Server Tools (14 Tools)

The traditional server (server.py) provides all 14 tools:

Search Tools (return concise results by default):

search_by_name - Search members by name (returns: name, dates, party, congress)
search_by_party - Find by political party affiliation
search_by_state - Search by state/region representation
search_by_congress - Get all members from specific Congress
search_by_date_range - Find members who served during specific dates
semantic_search_biography - Natural language AI search of biographies
search_biography_regex - Regex pattern search (keywords, phrases)
search_by_relationship - Find members with family relationships

Aggregation & Analysis Tools (efficient for large datasets):

count_members - Count members by party, state, position, congress, or year
temporal_analysis - Analyze trends over time (party shifts, demographics, etc.)
count_by_biography_content - Count members mentioning specific keywords (e.g., "Harvard", "lawyer")

Profile & Query Tools:

get_member_profile - Get complete profile by Bioguide ID
execute_sql_query - Execute custom SQL queries (read-only)
get_database_schema - View database structure

Database Schema

members - Core biographical data (13,047+ profiles)
job_positions - Congressional positions and affiliations
images - Profile images
relationships - Family relationships between members
creative_works - Publications by members
assets - Additional media assets
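
A sketch of listing tables and columns directly with Python's sqlite3 module, roughly the kind of information a schema tool could return. The helper `describe_schema` is hypothetical, and the toy tables use only column names that appear in this README's SQL examples, not the full schema.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each table name to its column names."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in columns]
    return schema


# Toy in-memory database using only columns seen in this README's queries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (bio_id TEXT, family_name TEXT, given_name TEXT)")
conn.execute("CREATE TABLE job_positions (bio_id TEXT, congress_number INTEGER, party TEXT)")
schema = describe_schema(conn)
```

Pointing `sqlite3.connect` at congress.db instead of `:memory:` would show the real tables listed above.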

Requirements

Python 3.10 or newer
✅ Python 3.14 is supported (FAISS runs in single-threaded mode on 3.14)

Setup

Quick Start

./setup.sh

This automated script will:

Create a Python virtual environment
Install all dependencies
Ingest all Congressional profiles into SQLite
Build the FAISS semantic search index
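
After setup finishes, a quick check that the generated data files exist can save a confusing startup error. A small sketch: the file names are the ones this README lists, while the helper `missing_artifacts` is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names taken from this README; the helper itself is hypothetical.
REQUIRED_FILES = ("congress.db", "congress_faiss.index", "congress_bio_ids.pkl")


def missing_artifacts(project_dir: str = ".") -> list[str]:
    """Return the required files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All data files are present.")
```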

Manual Setup

If you prefer manual setup:

1. Install Dependencies

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

2. Ingest Data

Run the ingestion script to create the SQLite database and FAISS index:

python3 ingest_data.py

This will:

Create congress.db SQLite database (13,047+ members)
Build congress_faiss.index for semantic search
Generate congress_bio_ids.pkl for ID mapping

Expected output:

Starting Congressional Bioguide ingestion...

✓ Database schema created
Ingesting 13047 profiles...
Processed 1000/13047 profiles...
...
✓ Ingested 13047 profiles into database
Building FAISS index for semantic search...
Encoding 13047 biographies...
Encoded 3200/13047 biographies...
...
✓ FAISS index created with 13047 vectors
Index dimension: 384

✓ Ingestion complete!

Note: Ingestion takes approximately 5-10 minutes depending on your system.

3. Test the System (Optional)

python3 test_queries.py

4. Run the Server

python3 server.py

Usage Examples

Name Search

{
  "name": "search_by_name",
  "arguments": {
    "family_name": "Lincoln"
  }
}
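
The payloads in this section are the `name` and `arguments` part of an MCP `tools/call` request. A sketch of wrapping one in the JSON-RPC 2.0 envelope that MCP uses on the wire; the helper `make_tool_call` is hypothetical, and most MCP clients build this envelope for you.

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request carries its own id


def make_tool_call(name: str, arguments: dict) -> dict:
    """Wrap a tool name and arguments in an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = make_tool_call("search_by_name", {"family_name": "Lincoln"})
print(json.dumps(request, indent=2))
```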

Party Search

{
  "name": "search_by_party",
  "arguments": {
    "party": "Republican",
    "congress_number": 117
  }
}

State Search

{
  "name": "search_by_state",
  "arguments": {
    "state_code": "CA",
    "congress_number": 117
  }
}
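
Several tools take a congress_number. A rough sketch of estimating it from a calendar year, assuming the standard numbering in which the 1st Congress convened in 1789 and a new Congress begins every two years. The helper is hypothetical and ignores the exact start day (March 4 before 1935, January 3 since), so the first days of an odd-numbered year may still belong to the previous Congress.

```python
def congress_for_year(year: int) -> int:
    """Estimate the Congress in session for most of the given year."""
    if year < 1789:
        raise ValueError("The 1st Congress convened in 1789")
    # Each Congress spans two years, starting in an odd-numbered year.
    return (year - 1789) // 2 + 1
```

For example, `congress_for_year(2021)` gives 117, matching the value used in the examples above.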

Semantic Search

{
  "name": "semantic_search_biography",
  "arguments": {
    "query": "Civil War veterans who became lawyers",
    "top_k": 5
  }
}

Regex Search - Find Keywords

{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "Harvard",
    "limit": 5
  }
}

Regex Search - Filter by Party

{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "lawyer",
    "filter_party": "Republican",
    "limit": 10
  }
}

Regex Search - Filter by State

{
  "name": "search_biography_regex",
  "arguments": {
    "pattern": "served.*Confederate Army",
    "filter_state": "VA",
    "limit": 5
  }
}

Note: Regex search returns concise results (name, dates, party, state) by default. Set return_full_profile: true to get biography text.
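
To see what a pattern matches before sending it to the server, it can be tried locally with Python's re module. This assumes the server's regex engine behaves like Python's for simple patterns, which this README does not state. The biography strings and the helper `regex_matches` are invented for illustration.

```python
import re

# Invented sample biographies, for illustration only.
BIOS = {
    "X000001": "Studied law; served in the Confederate Army during the Civil War.",
    "X000002": "Graduated from Harvard; practiced law in Boston.",
}


def regex_matches(pattern: str, bios: dict[str, str]) -> list[str]:
    """Return the ids whose biography contains a match for the pattern."""
    compiled = re.compile(pattern)
    return [bio_id for bio_id, text in bios.items() if compiled.search(text)]


hits = regex_matches(r"served.*Confederate Army", BIOS)
```

Here `.*` lets any text sit between "served" and "Confederate Army", so only the first sample biography matches.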

Count Members by Party

{
  "name": "count_members",
  "arguments": {
    "group_by": "party"
  }
}

Count Republicans by State in 117th Congress

{
  "name": "count_members",
  "arguments": {
    "group_by": "state",
    "filter_party": "Republican",
    "filter_congress": 117
  }
}

Temporal Analysis - Party Changes Over Time

{
  "name": "temporal_analysis",
  "arguments": {
    "analysis_type": "party_over_time",
    "time_unit": "congress",
    "start_date": "1900-01-01",
    "end_date": "2000-12-31"
  }
}

Demographics Analysis - Average Age by Congress

{
  "name": "temporal_analysis",
  "arguments": {
    "analysis_type": "demographics",
    "time_unit": "congress"
  }
}

Count Members Who Attended Harvard

{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["Harvard"]
  }
}

Count Lawyers by Party

{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["lawyer", "attorney"],
    "breakdown_by": "party"
  }
}

Count Members Who Were Lawyers OR Veterans, by State

{
  "name": "count_by_biography_content",
  "arguments": {
    "keywords": ["lawyer", "military", "army"],
    "match_all": false,
    "breakdown_by": "state"
  }
}
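
The match_all flag presumably switches between requiring any keyword and requiring every keyword. A sketch of that assumed behavior; the function `matches_keywords` and the sample text are hypothetical, and the server's actual matching rules may differ.

```python
def matches_keywords(text: str, keywords: list[str], match_all: bool = False) -> bool:
    """Case-insensitive keyword test: all keywords if match_all, else any."""
    lowered = text.lower()
    found = (kw.lower() in lowered for kw in keywords)
    return all(found) if match_all else any(found)


bio = "Lawyer; served in the Union Army."  # invented example text
any_match = matches_keywords(bio, ["lawyer", "military", "army"])
all_match = matches_keywords(bio, ["lawyer", "military", "army"], match_all=True)
```

Under this reading, a biography that never uses the word "military" passes with match_all set to false but fails with it set to true, so synonym lists generally belong in any-keyword mode.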

SQL Query - Find Longest Serving Members

{
  "name": "execute_sql_query",
  "arguments": {
    "query": "SELECT family_name, given_name, COUNT(DISTINCT congress_number) as congresses FROM members m JOIN job_positions j ON m.bio_id = j.bio_id GROUP BY m.bio_id HAVING congresses > 5 ORDER BY congresses DESC LIMIT 10"
  }
}
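
The query above can be tried against a toy database before running it on congress.db. A sketch using Python's sqlite3 module; the two tables contain only the columns the query references, and the rows are invented.

```python
import sqlite3

QUERY = (
    "SELECT family_name, given_name, "
    "COUNT(DISTINCT congress_number) AS congresses "
    "FROM members m JOIN job_positions j ON m.bio_id = j.bio_id "
    "GROUP BY m.bio_id HAVING congresses > 5 "
    "ORDER BY congresses DESC LIMIT 10"
)

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members (bio_id TEXT PRIMARY KEY, family_name TEXT, given_name TEXT);
    CREATE TABLE job_positions (bio_id TEXT, congress_number INTEGER);
    INSERT INTO members VALUES ('X000001', 'Example', 'Ada'), ('X000002', 'Sample', 'Ben');
    """
)
# Ada serves in 7 congresses and Ben in 2, so only Ada passes HAVING > 5.
conn.executemany(
    "INSERT INTO job_positions VALUES (?, ?)",
    [("X000001", n) for n in range(100, 107)] + [("X000002", 100), ("X000002", 101)],
)
rows = conn.execute(QUERY).fetchall()
```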

Get Full Member Profile

{
  "name": "get_member_profile",
  "arguments": {
    "bio_id": "L000313"
  }
}

Search by Congress Number

{
  "name": "search_by_congress",
  "arguments": {
    "congress_number": 117,
    "chamber": "Senator"
  }
}

Search by Date Range

{
  "name": "search_by_date_range",
  "arguments": {
    "start_date": "1861-03-04",
    "end_date": "1865-03-04"
  }
}
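
A date-range search like this is presumably an interval-overlap test between each term of service and the requested range. A sketch under that assumption; the function `served_during` and the sample terms are hypothetical.

```python
from datetime import date


def served_during(term_start: date, term_end: date,
                  range_start: date, range_end: date) -> bool:
    """True if the term [term_start, term_end] overlaps the query range."""
    return term_start <= range_end and term_end >= range_start


query_range = (date(1861, 3, 4), date(1865, 3, 4))
# A term that straddles the start of the range still counts as overlapping.
overlaps = served_during(date(1859, 3, 4), date(1863, 3, 3), *query_range)
# A term that ended before the range began does not.
outside = served_during(date(1855, 3, 4), date(1859, 3, 3), *query_range)
```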

Find Family Relationships

{
  "name": "search_by_relationship",
  "arguments": {
    "relationship_type": "father"
  }
}

Complex SQL - Party Transitions

{
  "name": "execute_sql_query",
  "arguments": {
    "query": "SELECT m.bio_id, m.family_name, m.given_name, GROUP_CONCAT(DISTINCT j.party) as parties FROM members m JOIN job_positions j ON m.bio_id = j.bio_id WHERE j.party IS NOT NULL GROUP BY m.bio_id HAVING COUNT(DISTINCT j.party) > 1 LIMIT 20"
  }
}

Data Source

Data comes from the US Congressional Bioguide, containing biographical information for all members of Congress throughout history.

Technical Details

Database: SQLite for structured queries
Semantic Search: FAISS with sentence-transformers (all-MiniLM-L6-v2)
Embedding Dimension: 384
Index Type: Flat IP (Inner Product) with L2 normalization for cosine similarity
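
The reason an inner-product index gives cosine similarity is that, once vectors are scaled to unit length, their dot product equals the cosine of the angle between them. A dependency-free sketch of that identity; it uses plain Python lists rather than FAISS, and the vectors are arbitrary.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a: list[float], b: list[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed directly from the definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [1.0, 2.0, 3.0], [3.0, 1.0, 0.5]
inner_product_of_normalized = dot(l2_normalize(a), l2_normalize(b))
```

Normalizing once at indexing time means each query only needs a dot product, with no per-comparison division by vector lengths.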

MCP Configuration

Add to your MCP settings file (for Claude Desktop, usually ~/Library/Application Support/Claude/claude_desktop_config.json on macOS or %APPDATA%\Claude\claude_desktop_config.json on Windows; other MCP clients use their own locations):

{
  "mcpServers": {
    "congressional-bioguide": {
      "command": "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP/venv/bin/python",
      "args": [
        "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP/server.py"
      ],
      "cwd": "/Users/electron/workspace/Nanocentury AI/NIO/BioGuideMCP"
    }
  }
}

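Note: This uses the virtual environment's Python, which has all the required dependencies installed. The paths shown are from the author's machine; replace them with the absolute paths to your own checkout.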
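
A sketch of generating this entry for your own checkout instead of editing the paths by hand. The helper `build_mcp_config` and the placeholder path are hypothetical. It assumes the POSIX virtual environment layout from Manual Setup (venv/bin/python).

```python
import json
from pathlib import Path


def build_mcp_config(project_dir: str) -> dict:
    """Build the mcpServers entry for a checkout at project_dir."""
    root = Path(project_dir)
    # Assumes the POSIX venv layout created in Manual Setup (venv/bin/python);
    # on Windows the interpreter is venv\Scripts\python.exe instead.
    python = root / "venv" / "bin" / "python"
    return {
        "mcpServers": {
            "congressional-bioguide": {
                "command": str(python),
                "args": [str(root / "server.py")],
                "cwd": str(root),
            }
        }
    }


config = build_mcp_config("/path/to/BioGuideMCP")
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the settings file; passing an absolute path keeps the entry independent of the client's working directory.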

License

This Space is licensed under MIT, as stated in the metadata above. The underlying data is in the public domain and comes from the US Congressional Bioguide.