GitHub - AseemPrasad/Legalassist-AI

Legalassist AI

The challenge is the Information Barrier in the Judiciary that prevents citizens from understanding their own legal outcomes. Specifically, court judgments are inaccessible to the public because of complex legal jargon and language diversity.

This barrier leads to:

Lack of trust in the judicial system.
Citizen dependency on expensive, slow intermediaries for basic case updates.

It must be solved by an automated, multilingual, plain-language translation layer applied to final judgment documents.

Legalassist AI is an AI-powered, multilingual translation engine that converts complex, jargon-filled judicial judgments into three key points of clear, actionable information for the citizen. It directly addresses the Information Barrier defined above by providing instant clarity and removing the reliance on expensive, slow intermediaries for basic understanding.

The entire process is designed to complete in under 60 seconds. The interface requires only one significant action from the user (uploading or pasting the judgment); the system handles the legal interpretation and translation.

Impact on the Target Audience (The Citizen Litigant)

The core impact is shifting the citizen's status from a dependent bystander to an informed participant.

Before: Citizens wait years for closure and cannot navigate courts due to language and cost barriers, relying solely on intermediaries for basic updates. The judiciary is stuck with manual records and PDFs, leaving the citizen confused.

After: The solution eliminates the information gap, leading to:

Emotional Relief & Clarity: The primary source of post-judgment anxiety (not knowing what the document means) is removed by providing instant, actionable clarity.

Zero Dependency Cost: Citizens are no longer forced to pay or wait for legal aid/middlemen merely to understand the outcome of their case, directly addressing the cost barrier.

Trust Building: By offering tamper-proof clarity, the solution begins to rebuild trust in the legal system, countering the perceived absence of transparency.

The benefits are defined by the direct, automated replacement of flawed manual processes.

Automation of Clarity (AI Advantage): The system auto-generates plain-language judgment explainers instantly. This is a quantum leap over the slow, manual process of a lawyer explaining a complex document.

Accessibility (Digital Divide Bridge): By instantly converting legal jargon into local language summaries, the solution bridges the Digital Divide and promotes inclusive justice for ordinary people who cannot navigate the courts due to language barriers.

CLI Tool for Batch Processing

LegalEase AI now supports command-line processing for legal aid teams handling many judgments each day.

Installation

Create and activate a virtual environment (recommended).
Install dependencies:

pip install -r requirements.txt

Set API environment variables:

# Windows PowerShell
$env:OPENROUTER_API_KEY="your_key_here"
$env:OPENROUTER_BASE_URL="https://openrouter.ai/api/v1"
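
Before running the CLI, it can help to confirm that both variables are visible to Python. The check below is a minimal sketch; the helper name `missing_env_vars` is hypothetical and not part of the repository.

```python
import os

# The two variable names come from the installation step above.
REQUIRED_VARS = ("OPENROUTER_API_KEY", "OPENROUTER_BASE_URL")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```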

CLI Commands

Show full help:

python cli.py --help

Process a single file:

python cli.py process --file judgment.pdf --language Hindi

Process a scanned/image PDF using OCR (Hindi + English):

python cli.py process --file scanned_judgment.pdf --enable-ocr --ocr-languages eng+hin

Batch process a folder (parallel workers):

python cli.py batch --folder ./documents --output results.csv --workers 4

Alias form (also supported):

python cli.py process_batch --input ./judgments_folder --output ./results.csv

Key Features

Reads all PDFs from a folder
Generates summary and remedies advice for each PDF
Parallel processing (--workers, default 4)
Resume capability via checkpoint file
Per-file error handling (one failure does not stop the run)
Real-time progress bar with status and running cost
Exports to CSV/JSON (--format csv|json|both, default both)
Language controls: fixed (--language Hindi) or auto-detect (--language auto)
OCR fallback for scanned PDFs (--enable-ocr)
OCR language packs for local scripts (--ocr-languages eng+hin)
OCR quality signal via extraction confidence in output
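
The combination of parallel workers and per-file error handling listed above can be sketched as follows. This is an illustration only: it assumes a thread pool and a caller-supplied `process_one` function, both of which are hypothetical and may differ from what cli.py actually does.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_all(paths, process_one, workers=4):
    """Run process_one on every path; record failures instead of raising."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to the file it belongs to.
        futures = {pool.submit(process_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                summary = future.result()
                results.append({"file_path": path, "status": "success",
                                "summary": summary, "error": ""})
            except Exception as exc:
                # One bad file becomes an error record; the run continues.
                results.append({"file_path": path, "status": "error",
                                "summary": "", "error": str(exc)})
    return results
```

Records arrive in completion order rather than input order, so sort by `file_path` if a stable order matters.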

Resume Behavior

Default mode resumes automatically.
Checkpoint path defaults to <output>.checkpoint.jsonl.
Successful files in checkpoint are skipped on re-run.
Use --no-resume to start from scratch.
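
The skip-on-resume behavior can be pictured with the sketch below. It assumes each checkpoint line is a JSON object carrying `file_path` and `status` keys; that layout is a guess based on the output fields, not a documented schema, and the function names are hypothetical.

```python
import json
from pathlib import Path


def load_completed(checkpoint_path):
    """Return file paths recorded as successful in a JSONL checkpoint."""
    done = set()
    path = Path(checkpoint_path)
    if not path.exists():
        return done
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if record.get("status") == "success":
                done.add(record["file_path"])
    return done


def pending_files(all_files, checkpoint_path, resume=True):
    """Drop already-successful files unless resume is disabled."""
    if not resume:
        return list(all_files)
    done = load_completed(checkpoint_path)
    return [f for f in all_files if f not in done]
```

Files that previously failed are not in the completed set, so they are retried on the next run.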

Output Format

The exported CSV/JSON includes one record per PDF with:

file_name, file_path
status (success or error), error
language
summary
what_happened, can_appeal, appeal_days, appeal_court, cost_estimate, first_action, deadline
prompt_tokens, completion_tokens, total_tokens
api_cost_usd (estimated)
duration_seconds, processed_at
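
As an illustration of consuming the export, the sketch below tallies a results CSV using the `file_name`, `status`, and `total_tokens` columns listed above. The function name and the inline sample rows are made up for the example.

```python
import csv
import io


def summarize_results(csv_file):
    """Count successes and errors and sum total_tokens from an exported CSV."""
    summary = {"success": 0, "error": 0, "total_tokens": 0, "failed_files": []}
    for row in csv.DictReader(csv_file):
        if row.get("status") == "success":
            summary["success"] += 1
            # Empty token cells are treated as zero.
            summary["total_tokens"] += int(row.get("total_tokens") or 0)
        else:
            summary["error"] += 1
            summary["failed_files"].append(row.get("file_name", ""))
    return summary


# Illustrative rows only; a real export has more columns.
sample = io.StringIO(
    "file_name,status,error,total_tokens\n"
    "a.pdf,success,,1200\n"
    "b.pdf,error,could not extract text,\n"
)
print(summarize_results(sample))
```

For a real run, pass an open file handle instead, for example `open("results.csv", newline="", encoding="utf-8")`.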

Cost Estimation

CLI prints total tokens and total estimated API cost at the end of batch runs.

By default, both rates are 0.0, so the estimated cost stays at zero until they are configured. Set these flags to match your provider pricing:

python cli.py batch \
  --folder ./documents \
  --output ./results.csv \
  --workers 4 \
  --prompt-cost-per-1k 0.0002 \
  --completion-cost-per-1k 0.0002

Estimated cost formula:

$$ \text{total_cost_usd} = \left(\frac{\text{prompt_tokens}}{1000}\right)\cdot p + \left(\frac{\text{completion_tokens}}{1000}\right)\cdot c $$

where $p$ and $c$ are prompt/completion USD rates per 1K tokens.
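
The formula translates directly into a few lines of Python. The function name and the token counts below are illustrative; the rates match the example flags above.

```python
def estimate_cost_usd(prompt_tokens, completion_tokens,
                      prompt_cost_per_1k, completion_cost_per_1k):
    """Apply the cost formula: tokens / 1000 times the per-1K rate, summed."""
    prompt_cost = (prompt_tokens / 1000) * prompt_cost_per_1k
    completion_cost = (completion_tokens / 1000) * completion_cost_per_1k
    return prompt_cost + completion_cost


# 12,000 prompt tokens and 3,000 completion tokens at 0.0002 USD per 1K each.
cost = estimate_cost_usd(12000, 3000, 0.0002, 0.0002)
print(f"{cost:.4f}")  # 0.0030
```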

Example: 10+ PDFs

python cli.py batch --folder ./tests/samples --output ./outputs/results.csv --workers 4 --recursive

This command is suitable for validating a 10+ file run with concurrency, checkpoint resume, and export outputs.

📊 Analytics Dashboard

LegalEase AI now includes a comprehensive analytics dashboard that tracks case outcomes and helps users make informed appeal decisions.

Features

📈 Case Analytics

Track all processed cases (anonymized)
Monitor success rates by jurisdiction, court, and judge
Identify trends and patterns in case outcomes

🎯 Appeal Success Estimator

Estimate your appeal success probability based on similar cases
Get cost and time estimates
See confidence levels based on data quantity

📝 Outcome Feedback Form

Report your case results and appeal outcomes
Help improve predictions for future users
Anonymous and confidential

📊 Judge Performance Analytics

See which judges have higher appeal success rates
Regional comparisons
Identify high-performing courts

Getting Started

1. Initialize Analytics Database

python -c "from database import init_db; init_db()"

2. Generate Sample Data (Optional, for testing)

# Generate 100 sample cases
python scripts/generate_sample_analytics_data.py 100

# Generate more cases for better estimates
python scripts/generate_sample_analytics_data.py 500

# Clear sample data when done
python scripts/generate_sample_analytics_data.py clear

3. Start the App

streamlit run pages/0_Home.py

Note: The main application entry point is pages/0_Home.py.

The app uses a multi-page structure with Streamlit's automatic routing. Pages are located in the pages/ directory:
- 0_Home.py - Judgment analysis (main feature)
- 1_Deadlines.py - Appeal deadline management
- 2_History.py - Notification history
- 3_Settings.py - User preferences

Core utilities have been extracted to core/app_utils.py. Legacy files (app.py, app_integrated.py) have been deprecated and consolidated into this unified structure.

4. Access the Pages

After uploading a judgment:

Analytics Dashboard → View case statistics and trends
Appeal Estimator → Get your appeal success probability
Report Outcome → Submit feedback about your case

How Appeal Success Estimation Works

Enter your case details (type, jurisdiction, court, judge)
System finds similar cases from the database
Calculates success rate based on similar cases
Adjusts for your specifics (decision clarity, case value, etc.)
Returns probability with confidence level

Example:

Case: Civil case in Delhi High Court before Justice Sharma

Similar Cases Found: 23
Appeal Success Rate: 22%
Confidence: Medium

Estimated Cost: ₹12,000 - ₹25,000
Typical Duration: 12-24 months
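
The estimation steps can be sketched roughly as below. This is a simplified illustration, not the logic in analytics_engine.py: it matches on exact field equality, skips the case-specific adjustment step, and uses confidence thresholds (10 and 50 cases) that are assumptions chosen for the example. The field names are also hypothetical.

```python
def estimate_appeal_success(
    cases,
    query,
    match_fields=("case_type", "jurisdiction", "court", "judge"),
):
    """Estimate appeal success from past cases that match the query fields."""
    similar = [
        c for c in cases
        if all(c.get(f) == query.get(f) for f in match_fields)
    ]
    n = len(similar)
    if n == 0:
        return {"similar_cases": 0, "success_rate": None, "confidence": "None"}

    wins = sum(1 for c in similar if c.get("appeal_success"))
    rate = wins / n

    # Assumed thresholds: more matching cases means a more reliable estimate.
    if n < 10:
        confidence = "Low"
    elif n < 50:
        confidence = "Medium"
    else:
        confidence = "High"

    return {"similar_cases": n, "success_rate": rate, "confidence": confidence}
```

Tying confidence to the number of similar cases keeps a rate computed from a handful of records from being presented with the same weight as one backed by dozens.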

Privacy & Anonymization

✅ What's protected:

No case numbers or party names stored
No identifiable personal information
User feedback is anonymous
Data aggregated before display

✅ What's tracked (anonymized):

Case type, jurisdiction, court, judge
Outcomes (won/lost/settlement)
Appeal filing and success rates
Timeline data
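
One way to enforce this split between protected and tracked data is an allow-list applied before a record is stored. The sketch below is an assumption about how that could look; the field names are hypothetical, and the actual schema is described in ANALYTICS.md.

```python
# Hypothetical field names for the tracked, non-identifying attributes.
ALLOWED_FIELDS = (
    "case_type",
    "jurisdiction",
    "court",
    "judge",
    "outcome",
    "appeal_filed",
    "appeal_success",
    "duration_days",
)


def anonymize_case(record):
    """Keep only allow-listed fields; everything else is dropped."""
    return {key: record[key] for key in ALLOWED_FIELDS if key in record}
```

An allow-list fails safe: a new identifying field added upstream is dropped by default instead of leaking into the analytics store.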

Data Available

The analytics dashboard works best with real case data. Sample data is provided for testing:

100+ sample cases across 10 jurisdictions
Realistic success rates and timelines
Multiple case types (Civil, Criminal, Family, Commercial, Labor)

Analytics Engine

The system uses:

Similarity Matching: Finds cases similar to yours (50+ parameters)
Statistical Analysis: Calculates success rates by demographics
Confidence Scoring: Rates estimate reliability based on data quantity
Trend Analysis: Identifies regional and judge-specific patterns

For Developers

See ANALYTICS.md for:

Detailed architecture
API reference
Database schema
Sample data generation
Integration examples

