Skip to content

lilhuss26/sakila25

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

52 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Sakila25

A modernized version of the classic Sakila database with 2025 data, supporting multiple database platforms.

Features

  • 🎬 Real movie data from TMDB API
  • πŸ’³ Realistic customer and payment data
  • πŸ—„οΈ Multi-database support: MySQL, PostgreSQL, SQL Server, MongoDB
  • πŸ“Š CSV exports for data analysis
  • πŸ”„ Fully reproducible with Python scripts

Supported Databases

MySQL PostgreSQL SQL Server MongoDB

Pre-built Database Backups

Ready-to-use database dumps are available in the Sakila25/ folder. Simply restore them using your preferred database tool:

# MySQL
mysql -u username -p sakila25 < Sakila25/sakila25_mysql.sql

# PostgreSQL
psql -U username -d sakila25 -f Sakila25/sakila25_postgresql.sql

# SQL Server
sqlcmd -S localhost -d sakila25 -i Sakila25/sakila25_sqlserver.sql

Database Schema

Sakila25 Schema

The schema includes:

  • Films with actors, categories, and languages
  • Customers with addresses and payment cards
  • Providers (streaming services) with inventory
  • Subscriptions and Payments
  • Views for common queries (actor_info, customer_list, film_list, revenue_by_provider)

Differences from Original Sakila

  • βœ… Modern 2025 movie data
  • βœ… Streaming providers instead of rental stores
  • βœ… Subscription-based model with payments
  • βœ… Credit card information
  • βœ… Multi-database support including NoSQL
  • βœ… Automated data generation from APIs

Prerequisites

  • Python 3.12+
  • Database Server (one or more):
    • MySQL 8.0+
    • PostgreSQL 13+
    • SQL Server 2019+
    • MongoDB 5.0+
  • TMDB API Key - Free from themoviedb.org

Quick Start

1. Installation

git clone https://github.com/lilhuss26/sakila25.git
cd sakila25
pip install -r requirements.txt

2. Configuration

Create a .env file based on .env.example:

TMDB_API_KEY="your_tmdb_api_key"
TMDB_TOKEN="your_tmdb_token"
MYSQL_STRING="mysql+pymysql://username:password@localhost:3306"
POSTGRESQL_STRING="postgresql+psycopg2://username:password@localhost:5432/sakila25"
SQLSERVER_STRING="mssql+pymssql://username:password@localhost:1433/sakila25"
MONGODB_STRING="mongodb://username:password@localhost:27017/"

3. Create Databases

Important: For PostgreSQL and SQL Server, create the database first:

-- PostgreSQL
CREATE DATABASE sakila25;

-- SQL Server
CREATE DATABASE sakila25;

Then run the creation scripts:

# MySQL (auto-creates database)
python scripts/create_mysql.py

# PostgreSQL
python scripts/create_postgresql.py

# SQL Server
python scripts/create_sqlserver.py

# MongoDB (auto-creates database)
python scripts/create_mongodb.py

# Generate CSV files
python scripts/create_csv.py

Project Structure

sakila25/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ FetchData/              # API data fetchers
β”‚   β”‚   β”œβ”€β”€ Fetchers/           # Individual fetcher modules
β”‚   β”‚   └── FetchData.py        # Main fetcher orchestrator
β”‚   └── DatabasesCreation/      # Database implementations
β”‚       β”œβ”€β”€ MySQL/              # MySQL schema, insertion, views
β”‚       β”œβ”€β”€ PostgreSQL/         # PostgreSQL schema, insertion, views
β”‚       β”œβ”€β”€ SQLServer/          # SQL Server schema, insertion, views
β”‚       β”œβ”€β”€ MongoDB/            # MongoDB denormalized structure
β”‚       └── CSV/                # CSV export functionality
β”œβ”€β”€ scripts/                    # Database creation scripts
β”œβ”€β”€ Sakila25/                   # Pre-built database backups
└── .env.example                # Environment configuration template

Technology Stack

  • SQLAlchemy - ORM and schema management
  • pymongo - MongoDB driver
  • pymssql - SQL Server drivers
  • psycopg2 - PostgreSQL driver
  • pymysql - MySQL driver
  • requests - API data fetching
  • pandas - CSV generation
  • python-dotenv - Environment configuration

Data Sources

MongoDB Structure

MongoDB uses a denormalized structure for optimal performance:

  • films collection: Embedded actors, categories, and language
  • providers collection: Embedded inventory
  • customers collection: Embedded address, card, subscriptions, and payments

Troubleshooting

Connection Issues

  • Ensure database servers are running
  • Verify connection strings in .env
  • Check firewall settings

SQL Server Views Error

  • SQL Server doesn't support ORDER BY in views without TOP
  • Views are created without ordering; add ORDER BY when querying

MongoDB Connection

  • MongoDB doesn't require database creation beforehand
  • Database is created automatically on first insert

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

About

Re-creation of sakila database with 2025 data from TMDB API, available with multiple database engines

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages