AI Screen Overlay

A powerful AI-powered screen capture and chat overlay application built with Electron, React, and TypeScript. Capture any area of your screen and analyze it with multiple AI providers including OpenAI GPT-4o, Claude 3.7 Sonnet, and DeepSeek with comprehensive cost tracking, intelligent token optimization, and professional conversation features.

🌟 Key Features

💰 NEW: Comprehensive Cost Tracking System

Accurate Model Pricing: Precise pricing for 40+ AI models across all providers
- OpenAI: Standard, Batch, Flex, and Priority tiers
- Claude: Sonnet, Haiku, and Opus models with tier-based pricing
- DeepSeek: V2, V2.5, Chat, and Coder models
Real-time Cost Display: Live cost updates in chat interface and history
Mixed-Model Support: Track costs accurately when switching models mid-conversation
Cost Transparency: See exactly what each message costs based on actual usage
Optimization Tracking: Records which token optimization strategy was used

🎨 NEW: Professional User Interface

SVG Icon System: Professional scalable icons throughout the application
Enhanced About Section: Comprehensive app information with dynamic version display
Professional Settings: Clean, enterprise-ready interface design
Status Indicators: Real-time feedback with professional status badges

Advanced Screen Capture & Image Input

Global Hotkey: Press Ctrl+Shift+S anywhere to capture screen areas
File Upload: Upload images directly from files with file browser
Clipboard Integration: Paste images from clipboard with Ctrl+V or dedicated button
Smart Workflow: Images added to current chat with option to move to new chat
Precision Selection: Click and drag to select exact screen regions
Format Support: Supports PNG, JPG, GIF, and all common image formats
File Validation: Automatic size and format validation with user feedback
High Quality: Full resolution capture and processing

Professional Image Editing Suite

Canvas Editor: Built-in drawing tools for image annotation
Drawing Tools: Pen and eraser with customizable colors and brush sizes
Undo/Redo: Full history management with ImageData-based state tracking
Responsive Design: Works seamlessly on all screen sizes and devices
Real-time Editing: Live canvas updates with smooth drawing performance
Color Palette: Complete color selection with opacity controls
Brush Controls: Variable brush sizes for precise annotations
Save Integration: Edited images automatically replace originals in conversation

Multi-LLM Intelligence

OpenAI Models: GPT-4o, GPT-4o Mini, GPT-4 Turbo with vision capabilities
Claude Models: Sonnet 3.7, Sonnet 4, Opus 4.1, Opus 4, Haiku 3.5
DeepSeek Models: Chat and Reasoner with cost-effective analysis
Model Selection: Dropdown menus with pricing information for each model
Provider Status: Real-time API key validation and connection testing
Smart Identity: Each AI model maintains proper identity and capabilities

Professional Chat Experience

Intelligent Naming: Auto-generated chat titles based on conversation content
Manual Renaming: Inline editing with keyboard shortcuts (Enter/Escape)
Smart Numbering: Automatic "Chat 1", "Chat 2", "Screen Capture Chat 1" naming
Markdown Rendering: Full markdown support with syntax-highlighted code blocks
Text Selection: Select and copy any message content with one click
Provider Attribution: Each response shows AI provider and specific model used
Chat History: Persistent conversations with SQLite storage
Context Awareness: Full conversation context maintained across messages

Modern Interface & Theming

Dynamic Theme System: Three distinct themes - Glassmorphism (Default), Dark, and Light
Adaptive Opacity: Automatically adjusts transparency based on system theme detection
Glassmorphism Design: Beautiful translucent overlay with blur effects and gradient backgrounds
Smart Background Detection: System theme awareness for optimal visibility
Always On Top: Stays accessible while you work on other applications
Drag & Resize: Repositionable and resizable overlay window with smooth interactions
Enhanced Contrast: Optimized readability on any background with text shadows
Responsive Layout: Adapts to different screen sizes and orientations

System Requirements

Operating System: Windows, macOS, or Linux
Node.js: Version 16 or higher
npm: Version 8 or higher
Git: For version control and repository management
Display Resolution: Minimum 1024x768 (HD or higher recommended)
GPU: Hardware acceleration support recommended for smooth overlay rendering
Memory: Minimum 4GB RAM (8GB+ recommended for AI processing)
System Theme Support: Windows 10/11, macOS 10.14+, or Linux desktop environments

Quick Start

Installation

# Clone the repository
git clone https://github.com/Black-Lights/ai_screen_overlay.git
cd ai_screen_overlay

# Install dependencies
npm install

# Setup environment
cp .env.example .env
nano .env  # Add your API keys

# Build and start
npm run build
npm start

API Key Configuration

Edit .env file with your API keys:

# OpenAI (GPT-4o, GPT-4o Mini, GPT-4 Turbo)
OPENAI_API_KEY=sk-your-openai-key-here

# Anthropic (Claude models)
CLAUDE_API_KEY=sk-ant-your-claude-key-here

# DeepSeek (Chat and Reasoner)
DEEPSEEK_API_KEY=your-deepseek-key-here

Get API Keys:

OpenAI: platform.openai.com/api-keys
Claude: console.anthropic.com
DeepSeek: platform.deepseek.com

Usage Guide

Basic Workflow

Launch the application with npm start
Position the overlay on your preferred screen location
Add images via screen capture (Ctrl+Shift+S), file upload, or clipboard paste (Ctrl+V)
Edit images using the built-in canvas editor with drawing tools
Select your preferred AI provider and model
Analyze images with natural language questions
Chat with full markdown rendering and copy support

Image Input Methods

Screen Capture: Global hotkey Ctrl+Shift+S works from any application
File Upload: Click upload button to browse and select image files
Clipboard Paste: Use Ctrl+V or click clipboard button to paste copied images
Format Support: Supports PNG, JPG, GIF, and all common image formats
Size Validation: Automatic file size and format validation with error messages

Image Editing Features

Canvas Editor: Click edit button on any image to open drawing tools
Drawing Tools: Choose between pen and eraser with color selection
Brush Controls: Adjustable brush size and opacity for precise annotations
Undo/Redo: Full history tracking with unlimited undo/redo operations
Responsive Interface: Canvas editor works on all screen sizes
Save Integration: Edited images automatically replace originals in conversation

Chat Management

New Chat: Click the "+" button to create a new conversation
Rename Chats: Click the edit icon next to any chat title
Auto-naming: Chats are automatically renamed based on content after the first AI response
Image Integration: All images flow seamlessly into your conversations

AI Provider Setup

API Keys: Add your OpenAI, Claude, and/or DeepSeek API keys in settings
Model Selection: Choose specific models with pricing information displayed
Status Monitoring: Indicators show real-time API connectivity status
Provider Switching: Change providers mid-conversation as needed

Keyboard Shortcuts

Shortcut	Action
`Ctrl+Shift+S`	Capture screen area and add to current chat
`Ctrl+V`	Paste image from clipboard
`Enter`	Send message / Save chat rename
`Shift+Enter`	New line in message
`Esc`	Cancel screen capture / Cancel chat rename
`Ctrl+C`	Copy selected text

Latest Features (v1.1.0) - 🚀 Major Update

💰 Comprehensive Cost Tracking

Accurate Billing: See exactly what each message costs based on the actual model used
40+ Model Support: Precise pricing for OpenAI (4 tiers), Claude, and DeepSeek models
Real-time Updates: Dynamic cost calculation and display in chat interface
Cost History: Complete cost tracking across all conversations
Mixed-Model Support: Accurate cost calculation when switching models mid-conversation

🎨 Professional UI Overhaul

SVG Icon System: Replaced all emojis with professional scalable icons
Enhanced About Section: Dynamic version display and comprehensive app information
Enterprise-Ready: Professional appearance suitable for business environments
Consistent Design: Color-coded features with semantic iconography

🚀 Token Optimization Enhancements

Smart Cost Estimation: Comprehensive cost disclaimers and accuracy improvements
Dynamic Token Counter: Auto-updating with real-time cost calculations
Optimization Tracking: Database records of optimization methods used per message
Performance Insights: Clear indication of applied optimization strategies

🛠️ Technical Improvements

Windows Compatibility: Fixed API key storage for Windows installer builds
Database Enhancements: Added comprehensive cost tracking schema
Build System: Improved webpack configuration and asset handling
Error Handling: Enhanced stability and error recovery

Technical Architecture

Technology Stack

Frontend: React 18 + TypeScript + Tailwind CSS
Backend: Electron 27 + Node.js + better-sqlite3
Build: Webpack 5 + TypeScript compiler
Styling: Tailwind CSS + Custom glassmorphism
Markdown: ReactMarkdown + Prism syntax highlighting

Project Structure

src/
├── main/                    # Electron main process
│   ├── main.ts             # Application entry point with global shortcuts
│   ├── database.ts         # SQLite database service with chat management
│   ├── screen-capture.ts   # Advanced screen capture with area selection
│   ├── ipc-handlers.ts     # IPC communication handlers for all features
│   └── preload.ts          # Secure preload script with event handling
├── renderer/               # React frontend
│   ├── App.tsx             # Main app with chat state management
│   ├── components/         # UI components
│   │   ├── Overlay.tsx     # Main overlay container with window controls
│   │   ├── ChatInterface.tsx # Chat UI with image upload and editing
│   │   ├── ChatHistory.tsx # Sidebar with inline renaming functionality
│   │   ├── LLMSelector.tsx # AI provider selection with status indicators
│   │   ├── ImageCanvas.tsx # Canvas editor for image annotation
│   │   └── Settings.tsx    # Configuration panel with API key management
│   ├── services/           # Frontend services
│   │   └── chatNamingService.ts # LLM-based automatic chat title generation
│   └── styles/             # CSS and styling with glassmorphism
├── shared/                 # Shared TypeScript definitions
│   ├── types.ts           # Complete type definitions for all features
│   └── models.ts          # AI model configurations with pricing
└── types/                  # Global type declarations

Database Schema

-- Chat conversations with intelligent naming
CREATE TABLE chats (
  id INTEGER PRIMARY KEY,
  title TEXT NOT NULL,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Chat messages with full AI attribution and image support
CREATE TABLE messages (
  id INTEGER PRIMARY KEY,
  chat_id INTEGER REFERENCES chats(id),
  role TEXT CHECK (role IN ('user', 'assistant')),
  content TEXT NOT NULL,
  image_path TEXT,
  provider TEXT,
  model TEXT,
  timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Application settings with API key management
CREATE TABLE settings (
  key TEXT PRIMARY KEY,
  value TEXT NOT NULL
);

Development

Development Setup

# Clone repository
git clone https://github.com/Black-Lights/ai_screen_overlay.git
cd ai_screen_overlay

# Install Node.js dependencies
npm install

# Setup environment variables
cp .env.example .env
nano .env  # Add your API keys

# Start development mode
npm run dev          # Starts both Electron and React dev servers
npm run dev:react    # React development server only
npm run dev:electron # Electron development mode only

Build Commands

npm run build           # Build everything
npm run build:react     # Build React frontend
npm run build:electron  # Build Electron main process
npm run clean           # Clean build artifacts

Code Quality

npm run lint           # ESLint code checking
npm run type-check     # TypeScript validation
npm test              # Run test suite

Distribution & Deployment

Building Distributions

Create installable applications for all platforms:

# Setup app icons (first time only)
./setup-icons.sh

# Build for all platforms (Windows, macOS, Linux)
./build-dist.sh all

# Build for specific platforms
./build-dist.sh win      # Windows installer + portable
./build-dist.sh mac      # macOS DMG + ZIP
./build-dist.sh linux    # AppImage, DEB, RPM, Snap
./build-dist.sh portable # Portable versions only

See DISTRIBUTION.md for complete distribution guide.

Distribution Formats

Windows

.exe Installer (NSIS): Full installer with Start Menu shortcuts
.exe Portable: No installation required, run directly
Features: Desktop shortcuts, auto-updater, uninstaller

macOS

.dmg Installer: Standard macOS app installer
Universal Binary: Supports both Intel (x64) and Apple Silicon (arm64)
Features: Code signing ready, Gatekeeper compatible

Linux

.AppImage: Portable, runs on any Linux distribution (Recommended)
.deb Package: For Debian/Ubuntu systems (apt install)
.rpm Package: For RedHat/Fedora systems (yum/dnf install)
.snap Package: Universal Linux package (snap install)

Troubleshooting

Common Issues

Screen Capture Not Working

# Check X11 permissions
xhost +local:
echo $DISPLAY

# Install missing dependencies
sudo apt-get install libxtst6 libxrandr2 libx11-6

API Connection Errors

Check API Keys: Verify keys are valid and have credits
Network Issues: Test internet connection
Rate Limits: Wait if rate limited by provider
Model Availability: Some models have regional restrictions

Build Failures

# Clear caches and reinstall
rm -rf node_modules package-lock.json dist
npm install
npm run build

Debug Mode

# Enable verbose logging
DEBUG=* npm start

# Check Electron logs
tail -f electron-debug.log

Performance

Optimizations

Lazy Loading: Components loaded on demand
Code Splitting: Vendor and main bundles separated
Database Indexing: Optimized queries for chat history
Memory Management: Automatic cleanup of temporary images
Bundle Size: Optimized to ~950KB total

Benchmarks

Startup Time: ~2-3 seconds cold start
Screen Capture: <500ms from hotkey to selection
AI Response: Varies by provider (2-10 seconds)
Memory Usage: ~150-200MB RAM

Privacy & Security

Data Protection

Local Storage: All data remains on your machine
No Telemetry: Zero tracking or analytics
Secure APIs: Direct HTTPS communication with AI providers
Temporary Files: Images auto-deleted after processing

Security Practices

Environment Variables: API keys stored in .env file
IPC Security: Secure communication between processes
Input Validation: All user inputs sanitized
Error Handling: Graceful error recovery without data loss

Roadmap

Upcoming Features

Multi-language Support: Internationalization
Plugin System: Custom AI provider plugins
Export Options: PDF, HTML conversation export
Voice Input: Speech-to-text integration
Cloud Sync: Optional cloud backup (encrypted)
OCR: Text extraction from images
Drag & Drop: Direct file drag and drop support

Performance Improvements

Streaming Responses: Real-time AI response streaming
Caching: Intelligent response caching
Compression: Database and image optimization
Background Processing: Non-blocking AI operations

Adding New Features

New AI Provider

Add provider config to src/shared/models.ts
Implement API client in src/main/ipc-handlers.ts
Update UI selectors in src/renderer/components/LLMSelector.tsx
Add API key handling in settings

New UI Components

Create component in src/renderer/components/
Add styling in src/renderer/styles/globals.css
Import and use in parent components
Update TypeScript types if needed

License

MIT License - see LICENSE file for full details.

Contributing

We welcome contributions! Please see our contributing guidelines:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Development Guidelines

Follow TypeScript best practices
Add tests for new features
Update documentation
Ensure accessibility compliance
Test on multiple platforms

Acknowledgments

Special thanks to:

Electron Team - Cross-platform desktop framework
React Community - Frontend framework and ecosystem
AI Providers - OpenAI, Anthropic, and DeepSeek for powerful APIs
Open Source - All the amazing libraries that make this possible

Star this repository if you find it helpful!

Author: Ammar (Black-Lights)
Project: AI Screen Overlay - Professional AI-powered screen capture and chat application

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
assets		assets
src		src
.env.example		.env.example
.eslintrc.json		.eslintrc.json
.gitignore		.gitignore
API_SETUP_GUIDE.md		API_SETUP_GUIDE.md
CHANGELOG.md		CHANGELOG.md
DISTRIBUTION.md		DISTRIBUTION.md
DISTRIBUTION_TEST_RESULTS.md		DISTRIBUTION_TEST_RESULTS.md
LICENSE		LICENSE
README.md		README.md
RELEASE_NOTES_v1.0.0.md		RELEASE_NOTES_v1.0.0.md
UBUNTU_INSTALL.md		UBUNTU_INSTALL.md
WINDOWS_DISTRIBUTION_FIXES.md		WINDOWS_DISTRIBUTION_FIXES.md
build-dist.sh		build-dist.sh
electron.config.js		electron.config.js
first-run-setup.sh		first-run-setup.sh
node-requirements.txt		node-requirements.txt
package-lock.json		package-lock.json
package.json		package.json
postcss.config.js		postcss.config.js
requirements.txt		requirements.txt
setup-icons.sh		setup-icons.sh
tailwind.config.js		tailwind.config.js
test-global.html		test-global.html
tsconfig.json		tsconfig.json
webpack.config.js		webpack.config.js

Folders and files

Latest commit

History

Repository files navigation

AI Screen Overlay

🌟 Key Features

💰 NEW: Comprehensive Cost Tracking System

🎨 NEW: Professional User Interface

Advanced Screen Capture & Image Input

Professional Image Editing Suite

Multi-LLM Intelligence

Professional Chat Experience

Modern Interface & Theming

System Requirements

Quick Start

Installation

API Key Configuration

Usage Guide

Basic Workflow

Image Input Methods

Image Editing Features

Chat Management

AI Provider Setup

Keyboard Shortcuts

Latest Features (v1.1.0) - 🚀 Major Update

💰 Comprehensive Cost Tracking

🎨 Professional UI Overhaul

🚀 Token Optimization Enhancements

🛠️ Technical Improvements

Technical Architecture

Technology Stack

Project Structure

Database Schema

Development

Development Setup

Build Commands

Code Quality

Distribution & Deployment

Building Distributions

Distribution Formats

Windows

macOS

Linux

Troubleshooting

Common Issues

Screen Capture Not Working

API Connection Errors

Build Failures

Debug Mode

Performance

Optimizations

Benchmarks

Privacy & Security

Data Protection

Security Practices

Roadmap

Upcoming Features

Performance Improvements

Adding New Features

New AI Provider

New UI Components

License

Contributing

Development Guidelines

Acknowledgments

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages