Manu Murugesan edited this page Mar 13, 2026 · 2 revisions

medicaid-utils

Open-source Python toolkit for Medicaid claims data analysis

PyPI · Python 3.11+ · License: MIT

medicaid-utils is a Python package for constructing patient-level analytic files from Medicaid claims data published by the Centers for Medicare & Medicaid Services (CMS). It implements validated cleaning routines, variable construction methods, and public-domain clinical algorithms for both the MAX (Medicaid Analytic eXtract) and TAF (T-MSIS Analytic Files, where T-MSIS is the Transformed Medicaid Statistical Information System) file formats.

The package is built on Dask for scalable, distributed processing of large claims datasets.


Wiki Pages

Getting Started

  • Installation — How to install and set up medicaid-utils
  • Data Layout — Required folder structure for MAX and TAF Parquet files
  • Quick Start — Load claims, clean them, and extract a cohort in minutes

User Guide

  • Preprocessing — What cleaning and variable construction routines do
  • Cohort Extraction — Build patient-level analytic files from diagnosis and procedure codes
  • Risk Adjustment Algorithms — Elixhauser, CDPS-Rx, BETOS, PQI, NYU/Billings, PMCA, low-value care
  • MAX vs TAF — Key differences between MAX and TAF file formats and how the package handles them

Recipes & How-Tos

Reference

  • Glossary — CMS terminology, acronyms, and column name conventions
  • Publications — Peer-reviewed papers built with medicaid-utils
  • FAQ — Frequently asked questions
  • Contributing — How to contribute to the project
