Doc0x1/geoclip-service

geoclip-service

A Flask microservice that predicts where on Earth a photo was taken using the GeoCLIP machine learning model. Returns the top 3 candidate locations with GPS coordinates and confidence scores.

Built as the backend for an image geolocation feature in a Next.js web application.

How It Works

Client
  │
  ├─ GET /api/get-csrf-token  (X-API-Key header)
  │       └─ returns time-limited HMAC-signed CSRF token
  │
  └─ POST /api/geolocate  (X-API-Key + X-CSRF-Token headers)
          │  image file upload  OR  base64 data URI in form body
          │
          ├─ PIL: validate + convert to JPEG in memory
          │
          ├─ GeoCLIP: predict top 3 GPS coordinates + probabilities
          │           (CLIP-based model trained on geotagged images)
          │
          ├─ reverse_geocoder: (lat, lon) → "City, Region, CC"
          │
          └─ JSON response:
               {
                 "predictions": [
                   {"location": "Paris, Île-de-France, FR", "confidence": 0.82},
                   ...
                 ],
                 "coordinates": [
                   {"lat": 48.8566, "lon": 2.3522},
                   ...
                 ]
               }

API Reference

All endpoints are under /api/. Protected endpoints require an X-API-Key header.

GET /api/health

Health check. No authentication required.

Response: {"status": "healthy"}


GET /api/get-csrf-token

Returns a short-lived CSRF token. Required before calling /api/geolocate.

Headers: X-API-Key: <your key>

Response: {"csrf_token": "<token>"}

Tokens are HMAC-SHA256 signed and expire after 30 minutes.
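A time-limited HMAC-SHA256 token of this kind can be sketched with the standard library alone. This is a minimal illustration, not the service's actual code: the token layout (`<issued_at>.<signature>`) and the function names `make_csrf_token` and `verify_csrf_token` are assumptions; only the algorithm and the 30-minute expiry come from the description above.

```python
import hashlib
import hmac
import time
from typing import Optional

TOKEN_TTL_SECONDS = 30 * 60  # 30-minute expiry, as stated above


def make_csrf_token(secret: bytes, now: Optional[float] = None) -> str:
    """Return '<issued_at>.<hex signature>' (hypothetical token layout)."""
    issued_at = str(int(time.time() if now is None else now))
    signature = hmac.new(secret, issued_at.encode(), hashlib.sha256).hexdigest()
    return f"{issued_at}.{signature}"


def verify_csrf_token(secret: bytes, token: str, now: Optional[float] = None) -> bool:
    """Check the signature in constant time, then check the token's age."""
    issued_at, sep, signature = token.partition(".")
    if not sep:
        return False
    try:
        issued_at_seconds = int(issued_at)
    except ValueError:
        return False
    expected = hmac.new(secret, issued_at.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    current = time.time() if now is None else now
    return 0 <= current - issued_at_seconds <= TOKEN_TTL_SECONDS
```

Passing `now` explicitly makes the expiry logic easy to test; in a request handler both functions would be called without it.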


POST /api/geolocate

Predicts the geographic location of an image.

Headers:

  • X-API-Key: <your key>
  • X-CSRF-Token: <token from get-csrf-token>

Body (multipart/form-data), one of:

  • image — file upload (PNG, JPG, JPEG, WebP, max 5MB)
  • image_data — base64 data URI string (e.g. data:image/jpeg;base64,...)

Response:

{
  "predictions": [
    {"location": "Tokyo, Tokyo, JP", "confidence": 0.74},
    {"location": "Yokohama, Kanagawa, JP", "confidence": 0.15},
    {"location": "Osaka, Osaka, JP", "confidence": 0.06}
  ],
  "coordinates": [
    {"lat": 35.6762, "lon": 139.6503},
    {"lat": 35.4437, "lon": 139.6380},
    {"lat": 34.6937, "lon": 135.5023}
  ]
}

Error responses:

Status  Meaning
400     No image provided, invalid file type, or malformed base64
401     Missing or invalid API key
403     CSRF token missing, invalid, or expired
413     Image exceeds 5MB limit
429     Rate limit exceeded (10 req/min per endpoint, 200/day global)
500     Internal server error
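For the image_data option, the base64 data URI can be built and decoded with the standard library. The sketch below is illustrative only: `to_data_uri` and `from_data_uri` are hypothetical helper names, and the service's own parsing may differ. It shows the expected string shape and one way a malformed payload could be detected (the case reported as 400 above).

```python
import base64
import binascii
from typing import Tuple


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Client side: wrap raw image bytes in a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(data_uri: str) -> Tuple[str, bytes]:
    """Server side: split a data URI into (mime_type, raw bytes)."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    mime_type = header[len("data:"):-len(";base64")]
    try:
        # validate=True rejects characters outside the base64 alphabet
        return mime_type, base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("malformed base64 payload") from exc
```

A client would send the string returned by `to_data_uri` as the image_data form field.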

Security

  • API key — static shared secret passed via X-API-Key header
  • CSRF tokens — HMAC-SHA256 signed, 30-minute expiry, validated on every mutation
  • CORS — restricted to the configured NEXTJS_DOMAIN origin
  • Rate limiting — Flask-Limiter with per-IP enforcement (200/day, 50/hour global; 10/min on prediction endpoint)
  • Security headers — Content-Security-Policy and other headers via Flask-Talisman
  • File validation — allowlist of image MIME types, 5MB content-length cap, conversion to JPEG before inference
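The file-validation rule above can be sketched as a small pure function. The names `ALLOWED_MIME_TYPES`, `MAX_UPLOAD_BYTES`, and `upload_error_status` are hypothetical, the MIME types are inferred from the accepted formats (PNG, JPG/JPEG, WebP), and treating "5MB" as 5 × 1024 × 1024 bytes is an assumption.

```python
from typing import Optional

# Inferred from the accepted formats listed in the API reference
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "image/webp"}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumption: "5MB" means 5 MiB


def upload_error_status(mime_type: str, content_length: int) -> Optional[int]:
    """Return the HTTP status to reject with, or None if the upload is acceptable."""
    if content_length > MAX_UPLOAD_BYTES:
        return 413  # image exceeds the size cap
    if mime_type.lower() not in ALLOWED_MIME_TYPES:
        return 400  # invalid file type
    return None
```

Checking the size first means an oversized upload is rejected without inspecting its type.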

Setup

Prerequisites

  • Python 3.9+
  • build-essential (for compiling some pip packages)
  • ~2GB disk space for the CPU build of PyTorch and the model weights

Environment variables

Copy .env.example to .env and fill in the values:

cp .env.example .env

API_KEY=        # shared secret for the X-API-Key header
CSRF_SECRET=    # random secret for HMAC signing (generate below)
NEXTJS_DOMAIN=  # origin allowed by CORS, e.g. https://yourdomain.com
PORT=9999       # port to listen on
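A startup check for these variables might look like the sketch below. It reads from a plain mapping (by default `os.environ`) rather than through python-dotenv, and the names `REQUIRED_VARS`, `DEFAULT_PORT`, and `load_settings` are hypothetical. The fallback port of 5000 is taken from the development section of this README.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = ("API_KEY", "CSRF_SECRET", "NEXTJS_DOMAIN")
DEFAULT_PORT = 5000  # fallback used when PORT is unset or empty


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Collect the settings from env, failing fast if a required one is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    settings: Dict[str, object] = {name: env[name] for name in REQUIRED_VARS}
    settings["PORT"] = int(env.get("PORT") or DEFAULT_PORT)
    return settings
```

Failing at startup with the list of missing names is usually easier to debug than a 401 or 403 surfacing later at request time.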

Generate secrets:

python -c "import secrets; print(secrets.token_hex(32))"

Development

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

FLASK_ENV=development python app.py

The dev server listens on http://localhost:5000 by default, or on the port set in the PORT environment variable, with debug mode enabled.

Production (Linux / systemd)

The install.sh script sets up a Python virtualenv and registers a systemd service using Gunicorn:

# Edit install.sh to set your username and deployment path, then:
chmod +x install.sh
./install.sh

The service is configured for a 6-core machine (12 Gunicorn workers, 2 threads each, 120s timeout).
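Expressed as a Gunicorn configuration file, those numbers would look roughly like this. The file name gunicorn.conf.py and the bind address are assumptions; install.sh may pass the same settings as command-line flags instead.

```python
# gunicorn.conf.py (hypothetical file; values taken from the description above)
import os

workers = 12   # two worker processes per core on a 6-core machine
threads = 2    # threads per worker
timeout = 120  # seconds before an unresponsive worker is restarted

# Assumption: listen on all interfaces, on the port from the PORT variable
bind = f"0.0.0.0:{os.environ.get('PORT', '9999')}"
```

On a machine with a different core count, `workers` is the value to adjust first; the 120-second timeout leaves room for slow model inference on CPU.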

Check status:

sudo systemctl status geoclip.service
sudo journalctl -u geoclip.service -f

Dependencies

Key packages (see requirements.txt for the full list):

Package                                  Purpose
geoclip                                  Core ML model for GPS prediction
torch (CPU)                              PyTorch runtime for GeoCLIP
Pillow                                   Image loading and format conversion
reverse-geocoder                         Offline (lat, lon) → city/region/country
Flask + flask-limiter + flask-talisman   Web framework, rate limiting, security headers
gunicorn                                 Production WSGI server
python-dotenv                            .env file loading

Notes on error logs

The error.log from a live deployment will show Gunicorn warnings like:

Invalid HTTP request line: 'SSH-2.0-WanScannerBot'
Invalid HTTP Version: 'RTSP/1.0'
Invalid HTTP request line: '\x03\x00\x00/*...'

These come from internet background scanners probing the HTTP port for SSH, RTSP, and RDP services, in the order shown above. They are harmless: Gunicorn rejects them immediately, and they never reach the Flask application.

About

Flask microservice for geolocation predictions
