Data Science
ReadmitRisk
Hospital Readmission Prevention Platform

Overview
ReadmitRisk is a full-stack care management platform that identifies high-risk patients and prioritizes post-discharge interventions to reduce preventable hospital readmissions. It features an AI-powered chat interface that lets users query patient risk data and run live ML predictions conversationally, powered by a custom MCP server deployed on Railway and the Anthropic Claude API.
Key Features
- Multi-source ML pipeline processing 280K+ patient records from MIMIC-IV (ICU), UCI Diabetes, and CMS HRRP datasets
- Gradient Boosting classifier with SMOTE oversampling to handle severe class imbalance (positive class rebalanced from 8.8% to 50%)
- 61-feature clinical model with demographic normalization and comprehensive feature importance analysis
- Interactive care management dashboard with risk stratification tiers (60%, 70%, 80% thresholds) and cost estimation
- Real-time data visualizations built with Recharts, including ROC curves with AUC, precision-recall metrics, and intervention tracking
- Remote MCP server deployed on Railway exposing 7 tools (patient risk lookup, live ML predictions, hospital metrics) that any AI assistant can call via SSE transport
- AI-powered conversational interface with guided and freeform queries against live patient data, connecting a React chat widget → FastAPI backend → Anthropic Claude API → MCP server → ML models and datasets
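The SMOTE step listed above can be illustrated with a small, dependency-free sketch. This is not the project's code: it is a simplified version of the technique, in which each synthetic row is interpolated between a minority row and one of its nearest minority neighbours. The function name `smote_oversample`, the toy 2-feature data, and the 1,000-row sample size are assumptions chosen only to reproduce the 8.8% starting rate. In practice a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used.

```python
import math
import random


def smote_oversample(minority, n_majority, k=5, seed=0):
    """Return synthetic minority rows so the minority count matches n_majority.

    Simplified SMOTE: each synthetic row is a random point on the line segment
    between a minority row and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    n_needed = max(0, n_majority - len(minority))
    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row, excluding the row itself
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data with the 8.8% positive rate quoted above (values are hypothetical).
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(912)]
minority = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(88)]

synthetic = smote_oversample(minority, n_majority=len(majority))
balanced_minority = minority + synthetic
positive_rate = len(balanced_minority) / (len(balanced_minority) + len(majority))
print(f"{len(minority) / 1000:.1%} -> {positive_rate:.1%}")
```

Oversampling of this kind is normally applied to the training split only, so that validation and test metrics still reflect the original class distribution.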
System Architecture
- Chat widget embedded in the ReadmitRisk frontend (Next.js on Vercel)
- FastAPI proxy backend deployed on Railway that handles Anthropic API calls and MCP routing
- MCP server deployed on Railway exposing 7 tools: patient risk lookup, live predictions, dataset comparison, feature importance, risk distribution, hospital metrics, and model information
- Claude processes natural language queries and selects appropriate MCP tools to answer
- Responses stream back via SSE for a real-time conversational experience
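The tool-selection step in this flow can be sketched as a name-to-handler registry plus a dispatcher. This is an illustration of the routing idea only: it uses neither the MCP SDK nor the Anthropic API, and the tool names, the `{"name": ..., "input": {...}}` call shape, and the in-memory `PATIENT_RISK` data are all hypothetical. Only two of the seven tools are stubbed.

```python
from typing import Any, Callable, Dict

# Hypothetical in-memory stand-in for the project's patient risk data.
PATIENT_RISK = {"P001": 0.82, "P002": 0.41}

ToolHandler = Callable[..., Dict[str, Any]]
TOOLS: Dict[str, ToolHandler] = {}


def tool(name: str) -> Callable[[ToolHandler], ToolHandler]:
    """Register a handler under a tool name (illustrative, not the MCP SDK)."""
    def register(handler: ToolHandler) -> ToolHandler:
        TOOLS[name] = handler
        return handler
    return register


@tool("patient_risk_lookup")
def patient_risk_lookup(patient_id: str) -> Dict[str, Any]:
    """Return the stored readmission risk for one patient."""
    risk = PATIENT_RISK.get(patient_id)
    if risk is None:
        return {"error": f"unknown patient {patient_id}"}
    return {"patient_id": patient_id, "readmission_risk": risk}


@tool("risk_distribution")
def risk_distribution(threshold: float = 0.6) -> Dict[str, Any]:
    """Count patients whose risk meets or exceeds the threshold."""
    flagged = sum(1 for r in PATIENT_RISK.values() if r >= threshold)
    return {"threshold": threshold, "flagged": flagged, "total": len(PATIENT_RISK)}


def dispatch(tool_call: Dict[str, Any]) -> Dict[str, Any]:
    """Route a model-selected tool call to its registered handler."""
    handler = TOOLS.get(tool_call["name"])
    if handler is None:
        return {"error": f"unknown tool {tool_call['name']}"}
    return handler(**tool_call.get("input", {}))


print(dispatch({"name": "patient_risk_lookup", "input": {"patient_id": "P001"}}))
```

Keeping the registry separate from the handlers means a new tool can be added without touching the dispatch logic, which is the same separation a tool-serving protocol provides between tool definitions and the model that selects them.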
Challenges & Solutions
The project tackled several significant ML and engineering challenges:
- Handling severe class imbalance in readmission data with SMOTE oversampling
- Extracting and processing 211K ICU admissions from Google BigQuery using PhysioNet credentials
- Engineering 61 clinical features from raw MIMIC-IV data while maintaining HIPAA-compliant practices
- Implementing configurable probability thresholds for different risk tolerance levels
- Integrating multiple heterogeneous data sources (MIMIC-IV, UCI, CMS) with different schemas and feature sets
- Building an intuitive care management interface that translates complex ML outputs into actionable clinical insights with cost-benefit analysis
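The configurable probability thresholds mentioned above can be sketched as a small tiering function. The 60%, 70%, and 80% cut-offs come from the dashboard description. The tier labels, the function name `risk_tier`, and the sample patients are hypothetical, and the project's actual tiering logic may differ.

```python
from bisect import bisect_right
from typing import Sequence

# Cut-offs from the dashboard description; the tier labels are hypothetical.
DEFAULT_THRESHOLDS = (0.60, 0.70, 0.80)
TIER_LABELS = ("below threshold", "elevated", "high", "very high")


def risk_tier(probability: float,
              thresholds: Sequence[float] = DEFAULT_THRESHOLDS) -> str:
    """Map a predicted readmission probability to a tier label.

    `thresholds` must hold exactly three ascending cut-offs, one fewer
    than the number of labels.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if len(thresholds) != len(TIER_LABELS) - 1:
        raise ValueError("expected exactly three thresholds")
    # bisect_right counts how many cut-offs the probability meets or exceeds.
    return TIER_LABELS[bisect_right(thresholds, probability)]


# Hypothetical predictions, listed highest risk first for outreach.
patients = {"A": 0.55, "B": 0.83, "C": 0.72, "D": 0.64}
for pid, p in sorted(patients.items(), key=lambda kv: kv[1], reverse=True):
    print(pid, f"{p:.0%}", risk_tier(p))
```

Passing a different `thresholds` tuple, for example a stricter `(0.70, 0.80, 0.90)`, changes how many patients are flagged without retraining the model, which is the point of keeping the cut-offs configurable.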
Tech Stack
Project Type
Data Science