Every module in this project has one job. encoders.py transforms data. detector.py scores formats. peeler.py orchestrates recursive decoding. formatter.py renders output. cli.py wires user input to functions. No module reaches into another module's concern.
This isn't over-engineering for a small project. It's how you keep a small project from becoming an unmanageable one. When you need to add a new encoding format, you touch encoders.py and detector.py (plus a new EncodingFormat member in constants.py). That's it. The CLI, formatter, and peeler don't change.
cli.py
├── constants.py (EncodingFormat, ExitCode, PEEL_MAX_DEPTH)
├── encoders.py (encode, decode, encode_url, decode_url)
├── detector.py (detect_encoding)
├── peeler.py (peel)
├── formatter.py (print_encoded, print_decoded, print_detection, print_peel_result, print_chain_result)
└── utils.py (resolve_input_bytes, resolve_input_text)
peeler.py
├── constants.py (CONFIDENCE_THRESHOLD, PEEL_MAX_DEPTH, EncodingFormat)
├── detector.py (detect_best)
└── utils.py (safe_bytes_preview, truncate)
detector.py
├── constants.py (charsets, thresholds, EncodingFormat)
├── encoders.py (try_decode)
└── utils.py (is_printable_text)
formatter.py
├── constants.py (EncodingFormat, PREVIEW_LENGTH)
├── detector.py (DetectionResult, type only)
├── peeler.py (PeelResult, type only)
└── utils.py (safe_bytes_preview)
encoders.py
└── constants.py (EncodingFormat)
utils.py
└── (no internal deps)
constants.py
└── (no internal deps)
The dependency arrows always point downward. constants.py and utils.py sit at the bottom with zero internal dependencies. cli.py sits at the top, importing from everything. Nothing in the middle reaches upward. This is a directed acyclic graph (DAG). If you ever create a circular import, Python will usually tell you at import time with an ImportError or AttributeError that mentions a partially initialized module.
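The acyclic property can also be checked mechanically. The sketch below is not part of the project: it copies the import lists from the tree above into a dictionary (the names DEPENDENCIES and import_order are made up for illustration) and lets the standard library's graphlib sort it, which fails if a cycle exists.

```python
from graphlib import CycleError, TopologicalSorter

# Internal imports copied from the dependency tree above:
# each module maps to the set of modules it imports from.
DEPENDENCIES: dict[str, set[str]] = {
    "cli.py": {"constants.py", "encoders.py", "detector.py",
               "peeler.py", "formatter.py", "utils.py"},
    "peeler.py": {"constants.py", "detector.py", "utils.py"},
    "detector.py": {"constants.py", "encoders.py", "utils.py"},
    "formatter.py": {"constants.py", "detector.py", "peeler.py", "utils.py"},
    "encoders.py": {"constants.py"},
    "utils.py": set(),
    "constants.py": set(),
}


def import_order(deps: dict[str, set[str]]) -> list[str]:
    """Return modules ordered so each one comes after everything it imports.

    Raises graphlib.CycleError if the graph contains a circular import.
    """
    return list(TopologicalSorter(deps).static_order())


order = import_order(DEPENDENCIES)
print(order)
```

Because cli.py depends on every other module, it always comes last in the resulting order. Adding an upward edge, such as constants.py importing cli.py, makes import_order raise CycleError.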
User Input (str or file or stdin)
│
▼
resolve_input_bytes() (utils.py:12)
│ Converts any input source to raw bytes
▼
encode(raw, fmt) (encoders.py:88)
│ Dispatches via ENCODER_REGISTRY to format-specific function
▼
encode_base64(data) (or other) (encoders.py:22)
│ Returns encoded string
▼
print_encoded(result, fmt) (formatter.py:31)
│ Rich panel if terminal, raw stdout if piped
▼
Output
User Input (str or file or stdin)
│
▼
resolve_input_text() (utils.py:29)
│ Converts any input source to stripped text
▼
decode(text, fmt) (encoders.py:93)
│ Dispatches via ENCODER_REGISTRY to format-specific function
▼
decode_base64(data) (or other) (encoders.py:26)
│ Returns decoded bytes
▼
print_decoded(result) (formatter.py:44)
│ Safe preview (UTF-8 if possible, hex fallback)
▼
Output
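The "safe preview" step at the end of the decode flow could look roughly like this. It is a guess at the behavior described (UTF-8 if possible, hex fallback), not the project's code: the real signature of safe_bytes_preview and the value of PREVIEW_LENGTH are not shown in this document, so the `length` parameter and its default of 40 are assumptions.

```python
def safe_bytes_preview(data: bytes, length: int = 40) -> str:
    """Return a short printable preview of decoded bytes.

    Assumed behavior: show UTF-8 text when the bytes decode cleanly,
    otherwise fall back to a hex dump of the leading bytes.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: hex is always printable.
        return data[:length].hex()
    # Truncate after decoding so a multi-byte character is never split.
    return text[:length]


print(safe_bytes_preview(b"hello"))     # text path
print(safe_bytes_preview(b"\xff\xfe"))  # hex fallback path
```

Decoding before truncating is deliberate: slicing the raw bytes first could cut a multi-byte character in half and push valid text down the hex path.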
User Input (str)
│
▼
detect_encoding(text) (detector.py:206)
│
├──► _score_base64(text) (detector.py:31)
├──► _score_base64url(text) (detector.py:70)
├──► _score_base32(text) (detector.py:97)
├──► _score_hex(text) (detector.py:126)
├──► _score_url(text) (detector.py:174)
│
│ Each scorer returns 0.0 to 1.0
│ Results filtered by CONFIDENCE_THRESHOLD (0.6)
│ Sorted by confidence descending
▼
print_detection(results) (formatter.py:58)
│ Rich table: format, confidence %, decoded preview
▼
Output
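The fan-out, filter, and sort in the detection flow can be sketched in a few lines. This is a simplified illustration, not the code in detector.py: the scorers are fixed-value stand-ins, the DetectionResult fields are assumed, and it is a guess that the threshold comparison is inclusive.

```python
from dataclasses import dataclass
from typing import Callable

CONFIDENCE_THRESHOLD = 0.6  # value quoted in the flow above


@dataclass(frozen=True)
class DetectionResult:
    # Field names are assumptions for this sketch.
    format: str
    confidence: float


def detect_encoding(
    text: str, scorers: dict[str, Callable[[str], float]]
) -> list[DetectionResult]:
    """Score the text with every scorer, drop weak results, best first."""
    scored = [DetectionResult(name, scorer(text)) for name, scorer in scorers.items()]
    kept = [r for r in scored if r.confidence >= CONFIDENCE_THRESHOLD]
    return sorted(kept, key=lambda r: r.confidence, reverse=True)


# Stand-in scorers with fixed outputs, only to exercise the pipeline.
fake_scorers = {
    "base64": lambda text: 0.85,
    "hex": lambda text: 0.95,
    "url": lambda text: 0.10,
}
results = detect_encoding("48656c6c6f", fake_scorers)
print([(r.format, r.confidence) for r in results])
```

With these stand-ins the url result falls below the threshold and is dropped, and hex sorts ahead of base64.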
User Input (str)
│
▼
peel(text, max_depth=20) (peeler.py:33)
│
├──► LOOP (up to max_depth iterations):
│    │
│    ├── detect_best(current_text) (detector.py:226)
│    │     Returns highest-confidence detection
│    │
│    ├── Break if: no detection, below threshold, decode fails
│    │
│    ├── Record PeelLayer (depth, format, confidence, previews)
│    │
│    └── decoded_bytes → current_text for next iteration
│          (break if bytes aren't valid UTF-8)
│
▼
PeelResult(layers, final_output, success)
│
▼
print_peel_result(result) (formatter.py:94)
│ Layer-by-layer display + final output panel
▼
Output
User Input (str) + --steps "base64,hex,url"
│
▼
resolve_input_bytes() (utils.py:12)
│
▼
_parse_chain_steps("base64,hex,url") (cli.py:264)
│ Validates each format name against EncodingFormat enum
│ Returns [BASE64, HEX, URL]
▼
LOOP over formats:
│
├── encode(current_bytes, fmt) (encoders.py:88)
├── Record (fmt, encoded_string)
├── encoded_string → bytes for next iteration
│
▼
print_chain_result(steps, final) (formatter.py:130)
│ Step-by-step display + final panel
▼
Output
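The chain flow reduces to parsing the step list and folding the input through each encoder. The sketch below is illustrative only: parse_chain_steps and encode_chain are hypothetical names, the enum carries just three members, and the encoders are thin wrappers around the standard library rather than the project's own functions.

```python
import base64
from enum import Enum
from urllib.parse import quote


class EncodingFormat(str, Enum):
    BASE64 = "base64"
    HEX = "hex"
    URL = "url"


# Minimal stand-ins for the project's encoders (bytes in, str out).
ENCODERS = {
    EncodingFormat.BASE64: lambda data: base64.b64encode(data).decode("ascii"),
    EncodingFormat.HEX: lambda data: data.hex(),
    EncodingFormat.URL: lambda data: quote(data),
}


def parse_chain_steps(raw: str) -> list[EncodingFormat]:
    """Validate each comma-separated name; unknown names raise ValueError."""
    return [EncodingFormat(name.strip().lower()) for name in raw.split(",")]


def encode_chain(
    data: bytes, steps: list[EncodingFormat]
) -> list[tuple[EncodingFormat, str]]:
    """Apply each encoder in order, recording every intermediate string."""
    recorded: list[tuple[EncodingFormat, str]] = []
    current = data
    for fmt in steps:
        encoded = ENCODERS[fmt](current)
        recorded.append((fmt, encoded))
        current = encoded.encode("utf-8")  # string becomes bytes for the next step
    return recorded


steps = encode_chain(b"hi", parse_chain_steps("base64,hex,url"))
for fmt, encoded in steps:
    print(fmt.value, encoded)
```

Here the URL step leaves the hex string untouched, because hex digits never need percent-escaping.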
Instead of a chain of if fmt == "base64": ... elif fmt == "base64url": ..., every encoder and decoder pair is registered in a dictionary:
ENCODER_REGISTRY: dict[EncodingFormat, tuple[EncoderFn, DecoderFn]]
Adding a new format means adding one entry to the registry and writing the two functions. The dispatch functions encode() and decode() never change. This is the open-closed principle: open for extension, closed for modification.
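As a minimal sketch of that registry, assuming only two formats and standard-library codecs (the project's actual function bodies are not shown here), the dispatch looks like this. The aliases are written as plain assignments so the snippet also runs on interpreters older than 3.12.

```python
import base64
from enum import Enum
from typing import Callable

EncoderFn = Callable[[bytes], str]
DecoderFn = Callable[[str], bytes]


class EncodingFormat(Enum):
    BASE64 = "base64"
    HEX = "hex"


def encode_base64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


def decode_base64(text: str) -> bytes:
    return base64.b64decode(text, validate=True)


def encode_hex(data: bytes) -> str:
    return data.hex()


def decode_hex(text: str) -> bytes:
    return bytes.fromhex(text)


# One entry per format: (encoder, decoder).
ENCODER_REGISTRY: dict[EncodingFormat, tuple[EncoderFn, DecoderFn]] = {
    EncodingFormat.BASE64: (encode_base64, decode_base64),
    EncodingFormat.HEX: (encode_hex, decode_hex),
}


def encode(data: bytes, fmt: EncodingFormat) -> str:
    encoder, _ = ENCODER_REGISTRY[fmt]
    return encoder(data)


def decode(text: str, fmt: EncodingFormat) -> bytes:
    _, decoder = ENCODER_REGISTRY[fmt]
    return decoder(text)


print(encode(b"hello", EncodingFormat.HEX))
print(decode("aGVsbG8=", EncodingFormat.BASE64))
```

Supporting a third format would mean one new enum member, two new functions, and one new registry entry. Neither encode() nor decode() would be edited.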
All result types use @dataclass(frozen=True, slots=True). Frozen means the fields can't be reassigned after creation: an assignment raises FrozenInstanceError. Slots means no __dict__ per instance, which uses less memory and makes attribute access slightly faster. For data that flows through a pipeline and should never be changed, frozen dataclasses are the right tool.
The tool detects whether stdout is a terminal or a pipe. When piped (echo "data" | b64tool decode | other_tool), it writes raw text to stdout with no Rich formatting. When interactive, it shows panels, tables, and colors. This happens via is_piped() checking sys.stdout.isatty().
Rich output goes to stderr (Console(stderr=True) at formatter.py:19), so diagnostic messages never contaminate piped data. This is a standard Unix convention that many CLI tools get wrong.
Detection uses the same registry pattern as encoding. Each format has a scorer function with the signature Callable[[str], float]. The _SCORERS dictionary maps EncodingFormat to its scorer. This means adding detection for a new format requires writing one scorer function and adding one dict entry.
Every scorer follows the same structure:
- Quick rejection (charset check, length check)
- Accumulate a confidence score based on structural signals
- Attempt actual decoding
- Bonus if decoded output is printable text
- Return clamped to [0.0, 1.0]
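The five steps above map onto code fairly directly. This hex scorer is a sketch, not the project's _score_hex: the weights (0.5, 0.2, 0.4), the minimum length of 4, and the is_printable_text helper body are all invented for illustration.

```python
import string

HEX_CHARS = frozenset(string.hexdigits)


def is_printable_text(data: bytes) -> bool:
    """Assumed helper: True when the bytes are printable UTF-8 text."""
    try:
        return data.decode("utf-8").isprintable()
    except UnicodeDecodeError:
        return False


def score_hex(text: str) -> float:
    # 1. Quick rejection: too short, odd length, or characters outside 0-9a-fA-F.
    if len(text) < 4 or len(text) % 2 != 0 or not set(text) <= HEX_CHARS:
        return 0.0

    # 2. Structural signals: valid charset is the baseline,
    #    consistent letter case is a mild extra hint.
    score = 0.5
    if text == text.lower() or text == text.upper():
        score += 0.2

    # 3. Attempt the actual decode.
    try:
        decoded = bytes.fromhex(text)
    except ValueError:
        return 0.0

    # 4. Bonus when the decoded bytes read as text.
    if is_printable_text(decoded):
        score += 0.4

    # 5. Clamp to the [0.0, 1.0] range.
    return max(0.0, min(1.0, score))


print(score_hex("48656c6c6f"))  # hex for "Hello"
print(score_hex("not hex"))
```

The clamp in step 5 matters here: a lowercase hex string that decodes to printable text accumulates more than 1.0 before clamping.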
type EncoderFn = Callable[[bytes], str]
type DecoderFn = Callable[[str], bytes]
Python 3.12+ type statements (PEP 695) replace TypeAlias from typing. They're lazily evaluated and more readable. These aliases document the contract: encoders take bytes and return strings, decoders take strings and return bytes.
Errors are handled at two levels:
Module level: Functions like try_decode() (encoders.py:98) catch encoding-specific exceptions and return None. The detector and peeler use this to gracefully handle decode failures without crashing.
CLI level: Each command (cli.py) wraps its body in a try/except. typer.BadParameter is re-raised (Typer formats these nicely). All other exceptions get a [red]Error:[/red] message and exit code 1. This prevents stack traces from leaking to end users.
The intermediate modules (detector, peeler) never catch exceptions themselves. They call try_decode() and check for None. This keeps error handling at the boundaries, not scattered through business logic.
None of the core modules use classes for behavior (only for data: EncodingFormat, DetectionResult, PeelLayer, PeelResult). The encoder functions are pure functions. The scorers are pure functions. The peeler is a function. There's no shared mutable state to encapsulate, so there's no reason for a class.
An Encoder class with encode() and decode() methods would add indirection without adding value. The registry dict achieves the same polymorphism with less ceremony. This is idiomatic Python: use classes for data, functions for behavior, unless you have state to manage.