This major release removes ALL cloud AI dependencies. Your Second Brain now runs entirely on local models - no API keys, no monthly fees, complete privacy!
- REMOVED all OpenAI dependencies - no more API costs!
- REMOVED all Anthropic dependencies - complete privacy!
- REMOVED cloud-based embeddings - everything is local now!
LM Studio Integration (port 1234)
- LLaVA 1.6 Mistral 7B Q6_K for text generation
- Nomic Embed Text v1.5 for text embeddings (768-dim)
- Full vision support with multimodal capabilities
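Because LM Studio exposes an OpenAI-compatible HTTP API, fetching Nomic embeddings needs nothing beyond the standard library. A minimal sketch (the `/v1/embeddings` path follows the OpenAI convention; the model identifier is the one LM Studio reports for Nomic Embed Text v1.5):

```python
import json
import urllib.request

LM_STUDIO_URL = "http://127.0.0.1:1234/v1"

def embedding_payload(texts, model="text-embedding-nomic-embed-text-v1.5"):
    # Request body for the OpenAI-compatible /v1/embeddings endpoint.
    return {"model": model, "input": list(texts)}

def embed(texts):
    # Returns one 768-dim vector per input text, served locally by LM Studio.
    req = urllib.request.Request(
        f"{LM_STUDIO_URL}/embeddings",
        data=json.dumps(embedding_payload(texts)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```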
CLIP Service (port 8002)
- OpenAI CLIP ViT-L/14 for image embeddings
- 768-dimensional vectors for semantic image search
- ~300ms processing time per image
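At query time, semantic image search over these 768-dim CLIP vectors reduces to a nearest-cosine lookup. pgvector does this server-side; the same computation in plain Python, for illustration:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, library, k=5):
    # `library`: (photo_id, vector) pairs already embedded by the CLIP service.
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in library]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]
```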
LLaVA Service (port 8003)
- LLaVA 1.6 Mistral 7B with 4-bit quantization
- Deep image understanding and OCR
- 4096-dimensional embeddings for rich visual features
Google Drive Integration
- Full OAuth 2.0 implementation
- Automatic document synchronization
- Multimodal processing of all file types
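Automatic synchronization boils down to periodically asking Drive which files changed since the last pass. A sketch of the `q` filter for a Drive v3 `files.list` call — the sync bookkeeping is hypothetical, but the clause syntax is the Drive API's query language:

```python
def sync_query(since_iso=None, folder_id=None):
    # Drive v3 query language: individual clauses joined with `and`.
    clauses = ["trashed = false"]
    if since_iso:
        # Only files modified after the last successful sync.
        clauses.append(f"modifiedTime > '{since_iso}'")
    if folder_id:
        # Restrict the sync to one folder.
        clauses.append(f"'{folder_id}' in parents")
    return " and ".join(clauses)
```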
Tested on RTX 4090:
- Text embeddings: ~100ms per document
- Image embeddings: ~300ms per image
- Vision analysis: 2-5 seconds per image
- Memory usage: ~12GB VRAM (all services combined)
- Processing speed: 200 docs/minute, 20 images/minute
- Batch processing of multi-part Google Takeout archives
- HEIC support for iPhone photos with automatic conversion
- Real-time dashboard with WebSocket updates
- Deduplication to avoid reprocessing existing photos
- Successfully tested with 60,000+ photos indexed
- Processes 10GB+ archive files without memory issues
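The deduplication step hashes file contents rather than names — Takeout archives frequently carry the same photo under several paths, and re-embedding a duplicate wastes ~300ms of GPU time per copy. A minimal sketch:

```python
import hashlib

def content_hash(path, chunk_size=1 << 20):
    # SHA-256 of the file contents, read in 1 MiB chunks so 10GB+
    # archive members never need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def filter_new(paths, seen_hashes):
    # `seen_hashes`: set of digests already indexed (e.g. loaded from Postgres).
    fresh = []
    for path in paths:
        h = content_hash(path)
        if h not in seen_hashes:
            seen_hashes.add(h)
            fresh.append(path)
    return fresh
```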
- Complete Docker-only workflow - no Python venv needed
- Cross-platform support with automatic OS detection
- Windows 11 + WSL2 optimized configuration
- Port standardization - main backend now on port 8000 (was 8001)
- Host service connectivity via docker-compose.override.yml
Port 8000: Main FastAPI backend (changed from 8001)
Port 8002: CLIP image embeddings
Port 8003: LLaVA vision understanding
Port 1234: LM Studio (text + embeddings)
Port 5432: PostgreSQL with pgvector
Port 8080: Adminer database UI
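With five services to bring up, a quick liveness check saves debugging time. A sketch, assuming each FastAPI service exposes a trivial `/health` route (hypothetical here; LM Studio's OpenAI-compatible `/v1/models` listing doubles as its check):

```python
import urllib.request

# Port map from this release; the /health paths are an assumption.
SERVICES = {
    "backend":   "http://127.0.0.1:8000/health",
    "clip":      "http://127.0.0.1:8002/health",
    "llava":     "http://127.0.0.1:8003/health",
    "lm_studio": "http://127.0.0.1:1234/v1/models",
    "adminer":   "http://127.0.0.1:8080/",
}

def check_services(timeout=2):
    # Returns {name: True/False}; connection errors count as down.
    status = {}
    for name, url in SERVICES.items():
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                status[name] = resp.status == 200
        except OSError:
            status[name] = False
    return status
```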
Install LM Studio and load these models:
- llava-1.6-mistral-7b Q6_K (6.57 GB)
- text-embedding-nomic-embed-text-v1.5
Start services:

```bash
# PostgreSQL
docker-compose up -d postgres

# GPU services
python services/gpu/clip/clip_api.py    # Port 8002
python services/gpu/llava/llava_api.py  # Port 8003

# LM Studio - start manually on port 1234

# Main backend
uvicorn app.main:app --port 8000
```

Update your .env:
```env
# Remove these:
# OPENAI_API_KEY=...
# ANTHROPIC_API_KEY=...

# Add these:
LM_STUDIO_URL=http://127.0.0.1:1234/v1
CLIP_SERVICE_URL=http://127.0.0.1:8002
LLAVA_SERVICE_URL=http://127.0.0.1:8003
```

- 100% Private: No data leaves your machine
- Zero API Costs: Run unlimited queries
- Multimodal Search: Find by text, image, or both
- Vision Understanding: Extract text from images, analyze diagrams
- Google Drive Sync: Process all your documents locally
- Knowledge Graph: Automatic relationship discovery
- Offline Mode: Works without internet
- Remove API keys from `.env`
- Install LM Studio
- Load required models
- Update service URLs
- Restart all services
- Old: 1536 dimensions (OpenAI)
- New: 768 dimensions (Nomic/CLIP)
- Existing embeddings will need regeneration
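Because pgvector columns are fixed-width, the dimension change means any stored 1536-dim vector must be re-embedded, not converted. A hypothetical regeneration pass (`embed_fn` stands in for a call to the local embeddings endpoint):

```python
NEW_DIM = 768   # Nomic / CLIP
OLD_DIM = 1536  # OpenAI-era vectors

def needs_regeneration(vector):
    # Any vector not already at the new width is stale.
    return len(vector) != NEW_DIM

def regenerate(documents, embed_fn):
    # `documents`: iterable of (doc_id, text, stored_vector) rows;
    # yields (doc_id, new_vector) only for rows still at the old dimension.
    for doc_id, text, stored in documents:
        if needs_regeneration(stored):
            yield doc_id, embed_fn(text)
```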
| Cloud AI | Local AI |
|---|---|
| $20-200/month | $0/month |
| Data leaves your network | 100% private |
| Internet required | Works offline |
| Rate limits | Unlimited usage |
| Vendor lock-in | Full control |
Document Processing (1000 files):
- Total time: ~5 minutes
- Text extraction: 200 docs/min
- Image analysis: 20 imgs/min
- Embedding generation: 100ms/doc
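The ~5 minute headline figure follows directly from the per-type rates. A quick sanity check, assuming documents and images are processed sequentially:

```python
def batch_minutes(n_docs, n_images=0, docs_per_min=200, imgs_per_min=20):
    # Rough upper bound: the two pipelines are treated as sequential here,
    # though in practice they can overlap.
    return n_docs / docs_per_min + n_images / imgs_per_min
```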
Search Performance:
- Vector search: <50ms
- Hybrid search: <100ms
- Image similarity: <200ms
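One common way hybrid search is wired (a hypothetical sketch, not necessarily this codebase's implementation): blend the pgvector similarity score with a keyword score, weighted by `alpha`:

```python
def hybrid_rank(vector_hits, keyword_hits, alpha=0.7):
    # `vector_hits` / `keyword_hits`: (doc_id, score) pairs with scores
    # already normalized to [0, 1]; alpha weights the semantic side.
    scores = {}
    for doc_id, score in vector_hits:
        scores[doc_id] = alpha * score
    for doc_id, score in keyword_hits:
        scores[doc_id] = scores.get(doc_id, 0.0) + (1 - alpha) * score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```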
- REMOVED all OpenAI references from codebase (was causing test failures)
- FIXED `save_to_disk` AttributeError in periodic persistence
- FIXED CI/CD pipeline - all tests passing
- FIXED WebSocket real-time updates for photo pipeline
- IMPROVED Photo processor to handle corrupted files gracefully
- UPDATED Docker mounts for Google Drive integration
- RENAMED dev.bat/sh to manage.bat/sh for clarity
- ADDED Cross-platform startup scripts (start-dev.bat/sh)
- LM Studio must be started manually
- First model load takes 30-60 seconds
- Google Drive files must be "Available offline" for Docker access
- Vision API requires proper CUDA setup
- Ollama integration
- Automatic model downloading
- Web UI for model management
- Multi-GPU support
- Apple Silicon optimization
transformers>=4.36.0
torch>=2.1.0
torchvision>=0.16.0
bitsandbytes>=0.41.0
accelerate>=0.25.0
sentence-transformers>=2.2.2
- LM Studio team for the excellent local inference server
- Hugging Face for model hosting
- The open-source AI community
Built with ❤️ for privacy and self-sovereignty
No cloud. No tracking. No API keys. Just you and your second brain.