log-analyzer-cli

A polished, Rich-powered Python CLI that generates realistic NCSA Combined access logs and analyzes them for traffic, errors, and security signals.

log-analyzer-cli is a single-file install, zero-config tool for engineers who want to demo log-analysis pipelines, prototype incident-response queries, or hand juniors a sandbox full of believable-looking traffic, including SQL-injection probes, brute-force POSTs, dirbusting sweeps, and .env leak attempts.

Highlights

Realistic synthetic logs: long-tail IP distribution, diurnal traffic curve, browser/bot mix, full status-code spread, and ~0.5% embedded suspicious requests by default.
Object-oriented analyzers: one class per concern; every analyzer returns a JSON-serializable AnalysisResult and provides its own Rich renderer.
Three report formats: Rich tables, GitHub-flavored Markdown, and machine-readable JSON.
First-class security view: admin probes, config leaks, SQLi/XSS payloads, path traversal, RCE probes, dirbusting, brute-force POSTs, and offensive-tool User-Agents.
Tested: 50+ pytest cases across the parser, generator, analyzers, report builder, and CLI runner.
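To make the "long-tail IP distribution" and "diurnal traffic curve" ideas concrete, here is a minimal standard-library sketch. It is not the project's LogGenerator: the function names, the Zipf-style exponent, the sine-shaped hourly curve, and the fixed request line are all assumptions chosen for illustration.

```python
import math
import random
from datetime import datetime, timedelta, timezone


def zipf_weights(n: int, s: float = 1.2) -> list[float]:
    """Long-tail weights: the IP at rank r gets weight 1 / r**s."""
    return [1.0 / (rank ** s) for rank in range(1, n + 1)]


def diurnal_weights() -> list[float]:
    """24 hourly weights peaking at 15:00 and bottoming out at 03:00."""
    return [1.0 + 0.8 * math.sin((hour - 9) / 24 * 2 * math.pi) for hour in range(24)]


def generate_lines(count: int, start: datetime, seed: int = 0) -> list[str]:
    """Return `count` Combined-format lines for one day beginning at `start`."""
    rng = random.Random(seed)  # seeded, so the output is reproducible
    # 203.0.113.0/24 is a documentation-only range (hypothetical IP pool).
    ips = [f"203.0.113.{i}" for i in range(1, 51)]
    ip_w = zipf_weights(len(ips))
    hour_w = diurnal_weights()
    lines = []
    for _ in range(count):
        ip = rng.choices(ips, weights=ip_w, k=1)[0]
        hour = rng.choices(range(24), weights=hour_w, k=1)[0]
        ts = start + timedelta(
            hours=hour, minutes=rng.randrange(60), seconds=rng.randrange(60)
        )
        stamp = ts.strftime("%d/%b/%Y:%H:%M:%S %z")
        lines.append(
            f'{ip} - - [{stamp}] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"'
        )
    return lines
```

A few heavily weighted IPs end up producing most of the traffic, while the hourly weights skew timestamps toward the afternoon; a real generator would also vary paths, status codes, and User-Agents.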

Demo

$ log-analyzer generate --out access.log --lines 100000 --days 7
                          Generation complete
+-----------------------------------------------------------------+
| File: access.log                                                |
| Lines: 100,000                                                  |
| Suspicious: 489 (0.49%)                                         |
| Unique IPs: 348                                                 |
| Window: 7 day(s)                                                |
+-----------------------------------------------------------------+

$ log-analyzer suspicious access.log
                            Suspicious requests
+----------+--------------+------+-----------------------------------+
| Severity | Category     | Hits | Example                           |
+----------+--------------+------+-----------------------------------+
| CRITICAL | config-leak  |   71 | 203.0.113.5 GET /.env             |
| CRITICAL | sql-injection|   58 | 198.51.100.7 GET /api/v1/products |
| CRITICAL | rce-probe    |   12 | 192.0.2.4 GET /actuator/env       |
| HIGH     | admin-probe  |  142 | 198.51.100.7 GET /wp-login.php    |
| HIGH     | xss          |   34 | 203.0.113.5 GET /search?q=<svg... |
| HIGH     | brute-force  |  121 | 192.0.2.4 -> 121 POST /login      |
| MEDIUM   | dirbuster    |   77 | 198.51.100.7 GET /backup          |
| MEDIUM   | bad-ua       |  483 | 192.0.2.4 GET /admin              |
+----------+--------------+------+-----------------------------------+

Install

git clone https://github.com/MSeyyidDev/log-analyzer-cli.git
cd log-analyzer-cli
python -m pip install -e .

Python 3.11+ required. Dev extras (pytest, coverage) install via pip install -e ".[dev]".

Commands

| Command | What it does |
| --- | --- |
| `generate` | Produce a synthetic NCSA Combined access log. |
| `parse` | Show the first/last N parsed entries as a Rich table. |
| `top-ips` | Rank source IPs by request volume. |
| `status-codes` | Aggregate HTTP status codes (with class buckets). |
| `not-found` | List the most-requested 404 paths. |
| `server-errors` | Group 5xx responses by path and code. |
| `suspicious` | Detect admin probing, SQLi/XSS, dirbusting, brute force, etc. |
| `report` | Run all analyzers and emit a Rich, Markdown, or JSON report. |

Run log-analyzer <command> --help for full per-command examples.

Architecture

flowchart LR
    subgraph CLI [Typer + Rich CLI]
        A[generate] --> G
        B[parse / top-ips / ...] --> P
        C[report] --> R
    end

    G[LogGenerator] -->|writes NCSA lines| F[(access.log)]
    F --> P[LogParser]
    P -->|stream of LogEntry| ANA[Analyzer suite]
    ANA --> R[ReportBuilder]
    R --> O1[Rich console]
    R --> O2[Markdown]
    R --> O3[JSON]

    subgraph ANA [Analyzers]
        A1[TopIPAnalyzer]
        A2[StatusCodeAnalyzer]
        A3[NotFoundAnalyzer]
        A4[ServerErrorAnalyzer]
        A5[SuspiciousPatternAnalyzer]
        A6[UserAgentAnalyzer]
        A7[TrafficByHourAnalyzer]
    end
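The LogParser stage in the diagram turns NCSA Combined lines into a stream of LogEntry objects. The sketch below shows one way that step could look using only the standard library. The field names on LogEntry, the regular expression, and the helper names parse_line and parse_lines are assumptions for illustration, not the project's actual definitions.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Optional

# host ident user [time] "request" status size "referer" "user-agent"
COMBINED_RE = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


@dataclass(frozen=True)
class LogEntry:  # hypothetical shape; the real class may differ
    ip: str
    timestamp: datetime
    method: str
    path: str
    status: int
    size: int
    referer: str
    user_agent: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one Combined-format line; return None if it does not match."""
    m = COMBINED_RE.match(line)
    if m is None:
        return None
    try:
        timestamp = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None
    parts = m["request"].split()
    return LogEntry(
        ip=m["ip"],
        timestamp=timestamp,
        method=parts[0] if parts else "-",
        path=parts[1] if len(parts) > 1 else "-",
        status=int(m["status"]),
        size=0 if m["size"] == "-" else int(m["size"]),
        referer=m["referer"],
        user_agent=m["user_agent"],
    )


def parse_lines(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Lazily yield entries, silently skipping malformed lines."""
    for line in lines:
        entry = parse_line(line.rstrip("\n"))
        if entry is not None:
            yield entry
```

Yielding entries lazily keeps memory flat on large files. One known limitation of this sketch: request lines containing an unescaped double quote will not match the pattern.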

Each analyzer implements:

class BaseAnalyzer:
    def analyze(self, entries: Iterable[LogEntry]) -> AnalysisResult: ...
    def render(self, result: AnalysisResult, console: Console) -> None: ...

AnalysisResult is a small dataclass with name, title, summary, and a JSON-friendly data payload. This split is what lets report emit Rich, Markdown, or JSON from the same in-memory results.
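As an illustration of that split, the following sketch defines an AnalysisResult with the four fields named above and renders one result as both JSON and Markdown. StatusCodeCounter, to_json, and to_markdown are hypothetical names, the input is simplified to a list of status codes rather than LogEntry objects, and the Rich renderer is left out to keep the example dependency-free.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass, field
from typing import Any, Iterable


@dataclass
class AnalysisResult:
    name: str
    title: str
    summary: str
    data: dict[str, Any] = field(default_factory=dict)  # JSON-friendly payload


class StatusCodeCounter:
    """Hypothetical analyzer: counts status codes and their class buckets."""

    def analyze(self, statuses: Iterable[int]) -> AnalysisResult:
        counts = Counter(statuses)
        classes = Counter(f"{code // 100}xx" for code in counts.elements())
        total = sum(counts.values())
        return AnalysisResult(
            name="status-codes",
            title="Status codes",
            summary=f"{total} requests, {len(counts)} distinct status codes",
            data={
                # String keys so the payload survives a JSON round trip unchanged.
                "by_code": {str(code): n for code, n in sorted(counts.items())},
                "by_class": dict(sorted(classes.items())),
            },
        )


def to_json(result: AnalysisResult) -> str:
    """Serialize the result as-is; no analyzer logic is needed here."""
    return json.dumps(asdict(result), indent=2)


def to_markdown(result: AnalysisResult) -> str:
    """Render the same result as a small Markdown section."""
    rows = [f"## {result.title}", "", result.summary, "", "| Code | Hits |", "| --- | --- |"]
    rows += [f"| {code} | {hits} |" for code, hits in result.data["by_code"].items()]
    return "\n".join(rows)
```

Because the analyzer only produces data, each output format is a pure function of the result, and adding a format does not require touching any analyzer.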

Detected suspicious patterns

| Category | Severity | Detection |
| --- | --- | --- |
| `config-leak` | critical | `/.env`, `/.git/config`, `/.aws/credentials`, leaked backups |
| `sql-injection` | critical | classic SQLi payloads (`' OR 1=1`, `UNION SELECT`, `DROP TABLE`) |
| `rce-probe` | critical | Ignition, Spring `actuator/env`, PHPUnit eval-stdin |
| `shellshock` | critical | Shellshock probes embedded in User-Agent |
| `admin-probe` | high | `/wp-login.php`, `/phpmyadmin`, `/admin/...` |
| `xss` | high | `<script>`, `onerror=`, `javascript:`, `<svg onload=...>` |
| `path-traversal` | high | `../../`, percent-encoded variants |
| `brute-force` | high | repeated `POST /login` per IP above threshold |
| `dirbuster` | medium | hidden-resource enumeration (`/backup`, `/staging`, ...) |
| `bad-user-agent` | medium | `sqlmap`, `nikto`, `nmap`, `masscan`, etc. |
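Several of the categories above boil down to pattern matching on the request path or User-Agent. The sketch below shows how a subset of them could be expressed as regular expressions. The rule list, the exact patterns, and the classify function are assumptions made for this example and are much narrower than a production rule set; they are not taken from SuspiciousPatternAnalyzer.

```python
import re
from urllib.parse import unquote_plus

# (category, severity, pattern) for a subset of the table above.
RULES = [
    ("config-leak", "critical",
     re.compile(r"/\.env\b|/\.git/config|/\.aws/credentials", re.I)),
    ("sql-injection", "critical",
     re.compile(r"'\s*or\s+1=1|union\s+select|drop\s+table", re.I)),
    ("admin-probe", "high",
     re.compile(r"/wp-login\.php|/phpmyadmin|^/admin(?:[/?]|$)", re.I)),
    ("xss", "high",
     re.compile(r"<script|onerror\s*=|javascript:|<svg\s+onload", re.I)),
    ("path-traversal", "high", re.compile(r"\.\./\.\./")),
]
BAD_UA = re.compile(r"sqlmap|nikto|nmap|masscan", re.I)


def classify(path: str, user_agent: str = "") -> list[tuple[str, str]]:
    """Return every (category, severity) whose pattern matches the request."""
    # Decode first so percent-encoded payloads (%2e%2e%2f, %27, ...) are caught.
    decoded = unquote_plus(path)
    hits = [(cat, sev) for cat, sev, rx in RULES if rx.search(decoded)]
    if BAD_UA.search(user_agent):
        hits.append(("bad-user-agent", "medium"))
    return hits
```

For example, `classify("/..%2f..%2fetc/passwd")` reports `path-traversal` because the path is decoded before matching, and a single request can land in more than one category.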

Example reports

Sample output lives in examples/:

examples/sample.log - 1,000-line deterministic synthetic log
examples/report.md - Markdown report generated from the above
examples/report.json - JSON report generated from the above

Regenerate with:

make demo                      # generate + show top-ips, status-codes, suspicious
log-analyzer report examples/sample.log --format markdown --out examples/report.md
log-analyzer report examples/sample.log --format json --out examples/report.json

Run

log-analyzer generate --out access.log --lines 100000 --days 7
log-analyzer report access.log --format rich
log-analyzer report access.log --format markdown --out report.md
log-analyzer suspicious access.log --brute-force-threshold 5
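The `--brute-force-threshold` option in the last command controls how many login POSTs from one IP count as brute force. A minimal sketch of that kind of check follows. The function name, the `(ip, method, path)` tuple input, the strict "greater than" comparison, and the absence of a time window are all assumptions; the tool's actual semantics may differ.

```python
from collections import Counter
from typing import Iterable


def brute_force_ips(
    requests: Iterable[tuple[str, str, str]],
    threshold: int = 5,
    login_path: str = "/login",
) -> dict[str, int]:
    """Map each IP to its login POST count if that count exceeds `threshold`.

    `requests` yields (ip, method, path) tuples. Query strings are ignored
    when comparing the path against `login_path`.
    """
    posts = Counter(
        ip
        for ip, method, path in requests
        if method == "POST" and path.split("?", 1)[0] == login_path
    )
    return {ip: n for ip, n in posts.items() if n > threshold}
```

Lowering the threshold surfaces slower attackers at the cost of flagging users who simply mistype a password a few times.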

Test

python -m pip install -e ".[dev]"
pytest -v

Demo artifacts

Sample output lives in examples/:

examples/sample.log - deterministic 1,000-line access log.
examples/report.md - Markdown report generated from the sample.
examples/report.json - machine-readable report for automation.

License

MIT - see LICENSE.
