
FiddleDyn

Structure-Aware Configuration for Fiddle and NeMo Run

FiddleDyn extends Fiddle and NeMo Run with CLI overrides, global DAG references, and round-trip serialization.

Installation

# Core installation (Fiddle backend only)
pip install fiddledyn

# With NeMo Run support (recommended)
pip install fiddledyn[nemo]

Note: nemo_run is an optional dependency. Without it, only the Fiddle backend (Backend.FIDDLE) is available. The NEMO backend requires pip install fiddledyn[nemo].


Quick Start

# main.py
import fiddledyn as dyn
import fiddle as fdl

config = dyn.parse_cli()
trainer = fdl.build(config)

python main.py -f config.yaml model.lr=0.001 data@data.yaml

Features

| Feature | Description |
| --- | --- |
| CLI Overrides | Deep nested values via dot notation (model.encoder.dim=1024) |
| DAG References | Share instances across files with _id_/_ref_ |
| Partial Configs | Deferred instantiation with _partial_: true |
| Callable References | Pass any callable (class, function, method) using _call_: false |
| File Overrides | Replace branches with key@file.yaml |
| Positional Args | Support for *args via _args_ |
| Include Defaults | Serialize configs with parameter defaults |
| Shallow/Deep Defaults | Control recursive expansion of callable defaults |
| Round-Trip Safe | Full serialization preserving all metadata |
| Backend Agnostic | Works with both nemo_run and fiddle |

YAML Syntax

FiddleDyn uses special keys prefixed with _ to control configuration behavior:

| Key | Type | Description |
| --- | --- | --- |
| _target_ | str | Dotted path to the class/function to instantiate |
| _partial_ | bool | If true, creates a Partial instead of a Config |
| _call_ | bool | If false, returns the raw class/function reference |
| _id_ | str | Registers the object in the global registry |
| _ref_ | str | References a registered object by its ID |
| _args_ | list | Positional arguments to pass to the target |
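To build intuition for what _target_ resolution involves, here is a minimal stdlib sketch of turning a dotted path into a Python object. This is an illustration only (module-level attributes, no error handling), not FiddleDyn's actual resolve_target implementation:

```python
import importlib

def resolve_dotted_path(path: str):
    """Resolve a dotted path like 'math.sqrt' to the object it names.

    Simplified sketch: assumes the last segment is an attribute of an
    importable module, which is not how a full resolver must behave
    (e.g. for nested class attributes).
    """
    module_path, _, attr = path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, attr)

print(resolve_dotted_path("math.sqrt")(16.0))  # 4.0
```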

Detailed Feature Guide

1. Basic Configuration (_target_)

Define Python objects in YAML using _target_:

# model.yaml
_target_: mylib.Model
hidden_size: 512
dropout: 0.1

import fiddledyn as dyn
import fiddle as fdl

config = dyn.load_yaml("model.yaml")
model = fdl.build(config)  # Creates Model(hidden_size=512, dropout=0.1)

2. Nested Configurations

Configurations can be arbitrarily nested:

# trainer.yaml
_target_: mylib.Trainer
model:
  _target_: mylib.Model
  encoder:
    _target_: mylib.Encoder
    vocab_size: 50000
    hidden_dim: 768
  dropout: 0.1
optimizer:
  _target_: torch.optim.Adam
  lr: 0.001
max_epochs: 100

3. Partial Configurations (_partial_: true)

Create partial function applications that can be called later with additional arguments:

# optimizer.yaml
_target_: torch.optim.Adam
_partial_: true
lr: 0.001
weight_decay: 0.01

config = dyn.load_yaml("optimizer.yaml")
# config is a Partial - missing the 'params' argument

# Later, complete the partial
optimizer = fdl.build(config)(params=model.parameters())
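Conceptually, _partial_: true behaves like functools.partial: keyword arguments are frozen now and the call is completed later. A self-contained stdlib analogy (the adam function below is a stand-in, not torch.optim.Adam):

```python
from functools import partial

# Stand-in for torch.optim.Adam so the sketch runs without torch.
def adam(params, lr=0.01, weight_decay=0.0):
    return {"params": params, "lr": lr, "weight_decay": weight_decay}

# A Partial config freezes keyword arguments, like _partial_: true does:
make_optimizer = partial(adam, lr=0.001, weight_decay=0.01)

# Later, complete it with the missing argument:
optimizer = make_optimizer(params=["w1", "w2"])
print(optimizer["lr"])  # 0.001
```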

4. Class References (_call_: false)

Pass raw classes or functions instead of instantiating them:

# factory.yaml
_target_: mylib.OptimizerFactory
optimizer_cls:
  _target_: torch.optim.Adam
  _call_: false # Returns the Adam class, not an instance
scheduler_cls:
  _target_: torch.optim.lr_scheduler.CosineAnnealingLR
  _call_: false

config = dyn.load_yaml("factory.yaml")
factory = fdl.build(config)
# factory.optimizer_cls is torch.optim.Adam (the class itself)
# factory.scheduler_cls is CosineAnnealingLR (the class itself)

5. DAG References (_id_ and _ref_)

Share object instances across your configuration using _id_ and _ref_:

# config.yaml
shared_encoder:
  _target_: mylib.Encoder
  _id_: enc # Register with ID "enc"
  vocab_size: 50000

model1:
  _target_: mylib.Model
  encoder:
    _ref_: enc # Reference the shared encoder

model2:
  _target_: mylib.Model
  encoder:
    _ref_: enc # Same encoder instance!

ctx = dyn.ParserContext()
config = dyn.load_yaml("config.yaml", ctx)
dyn.resolve_placeholders(config, ctx.registry)

# Both models share the exact same encoder object
assert config["model1"].encoder is config["model2"].encoder

Cross-File References: References work across multiple files:

# backbone.yaml
_target_: mylib.Backbone
_id_: backbone
dim: 512

# heads.yaml
classifier:
  _target_: mylib.Classifier
  backbone:
    _ref_: backbone
detector:
  _target_: mylib.Detector
  backbone:
    _ref_: backbone

ctx = dyn.ParserContext()
backbone = dyn.load_yaml("backbone.yaml", ctx)
heads = dyn.load_yaml("heads.yaml", ctx)
dyn.resolve_placeholders(heads, ctx.registry)

# Both heads share the same backbone
assert heads["classifier"].backbone is heads["detector"].backbone
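The mechanics can be pictured as a recursive walk over plain dicts: nodes tagged _id_ go into a registry, and _ref_ placeholders are swapped for the registered object. This sketch is illustrative only (it requires _id_ definitions to precede _ref_ uses in iteration order; FiddleDyn's DeferredReference/resolve_placeholders machinery need not share that limitation):

```python
def resolve_refs(node, registry):
    """Replace {'_ref_': id} placeholders with the object registered under _id_."""
    if isinstance(node, dict):
        if "_ref_" in node:
            return registry[node["_ref_"]]
        obj_id = node.get("_id_")
        # Resolve children first, dropping the bookkeeping _id_ key.
        resolved = {k: resolve_refs(v, registry)
                    for k, v in node.items() if k != "_id_"}
        if obj_id is not None:
            registry[obj_id] = resolved
        return resolved
    if isinstance(node, list):
        return [resolve_refs(v, registry) for v in node]
    return node

cfg = {
    "shared_encoder": {"_id_": "enc", "vocab_size": 50000},
    "model1": {"encoder": {"_ref_": "enc"}},
    "model2": {"encoder": {"_ref_": "enc"}},
}
out = resolve_refs(cfg, {})
assert out["model1"]["encoder"] is out["model2"]["encoder"]  # one shared object
```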

6. Positional Arguments (_args_)

For functions requiring positional arguments:

# layer.yaml
_target_: mylib.create_layer
_args_: [64, 128] # Positional args
bias: true # Keyword arg

Equivalent to: create_layer(64, 128, bias=True)

7. Callable Serialization

Any callable (class, function, built-in, method) bound to a Config is serialized with _target_ and _call_: false:

from math import sqrt

class Model:
    def __init__(self, activation=sqrt):
        ...

config = fdl.Config(Model, activation=sqrt)
dyn.config_to_dict(config)
# {"_target_": "mylib.Model", "activation": {"_target_": "math.sqrt", "_call_": false}}

This works for all callable types:

  • Classes: cls=MyClass → {"_target_": "mylib.MyClass", "_call_": false}
  • Functions: fn=my_func → {"_target_": "mylib.my_func", "_call_": false}
  • Built-ins: fn=sqrt → {"_target_": "math.sqrt", "_call_": false}
  • Methods: fn=obj.method → {"_target_": "mylib.MyClass.method", "_call_": false}
  • C-extension functions: fn=torch.abs → {"_target_": "torch.abs", "_call_": false}

C-Extension Support: Functions from packages like torch that originate from internal C++ classes are correctly handled. The serializer uses the public module path rather than internal qualified names (e.g. torch.abs instead of torch._C._VariableFunctions.abs).
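For the common (pure-Python) cases, a qualified name can be derived from a callable's __module__ and __qualname__. This is a simplification of what the description above implies, not FiddleDyn's get_target_name itself, which also remaps C-extension internals:

```python
import math

def target_name(fn) -> str:
    """Dotted import path from a callable's module and qualified name.

    Simplified sketch: a real serializer must also remap C-extension
    internals (e.g. torch._C._VariableFunctions.abs -> torch.abs).
    """
    module = getattr(fn, "__module__", None) or "builtins"
    return f"{module}.{fn.__qualname__}"

print(target_name(math.sqrt))  # math.sqrt
print(target_name(len))        # builtins.len
```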

8. Include Defaults (include_defaults)

Capture complete configuration graphs with all parameter defaults:

class Encoder:
    def __init__(self, vocab_size: int, hidden_dim: int = 256, num_layers: int = 4):
        ...

config = fdl.Config(Encoder, vocab_size=50000)

# Without defaults - only explicit values
dyn.config_to_dict(config)
# {"_target_": "...", "vocab_size": 50000}

# With defaults - all values
dyn.config_to_dict(config, include_defaults=True)
# {"_target_": "...", "vocab_size": 50000, "hidden_dim": 256, "num_layers": 4}

9. Shallow vs Deep Defaults (deep_defaults)

Control how callable defaults are expanded using deep_defaults:

class Factory:
    def __init__(self, cls: type = Encoder, name: str = "default"):
        ...

config = fdl.Config(Factory)

# Deep (default): Callable defaults include THEIR parameter defaults
dyn.config_to_dict(config, include_defaults=True, deep_defaults=True)
# {
#   "_target_": "Factory",
#   "cls": {
#     "_target_": "Encoder",
#     "_call_": false,
#     "vocab_size": 32000,   # Encoder's defaults included
#     "hidden_dim": 256,
#     "num_layers": 4
#   },
#   "name": "default"
# }

# Shallow: Callable defaults are NOT expanded
dyn.config_to_dict(config, include_defaults=True, deep_defaults=False)
# {
#   "_target_": "Factory",
#   "cls": {"_target_": "Encoder", "_call_": false},  # No Encoder defaults
#   "name": "default"
# }

CLI Usage

With Factory File (-f)

# Load base config
python main.py -f config.yaml

# Override scalar values
python main.py -f config.yaml model.lr=0.001 epochs=100

# Override nested values with dot notation
python main.py -f config.yaml model.encoder.vocab_size=100000

# Override with file content
python main.py -f config.yaml optimizer@optimizer.yaml

# Multiple overrides
python main.py -f config.yaml model=@large_model.yaml optimizer@optimizer.yaml epochs=200

Without Factory File (Direct @ Syntax)

Build configurations entirely from CLI without a base file:

# Load entire config from file
python main.py model@model.yaml optimizer@optimizer.yaml

# Mix file loading with overrides
python main.py model@model.yaml model.dropout=0.2 optimizer.lr=0.001

# Inline YAML values
python main.py model="{_target_: mylib.Model, hidden_size: 512}"

List Index Overrides

# Override specific list elements
python main.py -f config.yaml callbacks.0.patience=10 callbacks.1.save_path=/new/path
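A dot-path override, including numeric list indices, boils down to walking the containers named by each segment and assigning at the leaf. A minimal sketch over plain dicts (illustrative names, not FiddleDyn's internals):

```python
def apply_override(cfg, path: str, value):
    """Set a nested value, treating numeric path segments as list indices."""
    *parents, leaf = path.split(".")
    node = cfg
    for key in parents:
        node = node[int(key)] if isinstance(node, list) else node[key]
    if isinstance(node, list):
        node[int(leaf)] = value
    else:
        node[leaf] = value

cfg = {"model": {"lr": 0.01}, "callbacks": [{"patience": 5}]}
apply_override(cfg, "model.lr", 0.001)
apply_override(cfg, "callbacks.0.patience", 10)
print(cfg)  # {'model': {'lr': 0.001}, 'callbacks': [{'patience': 10}]}
```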

API Reference

CLI

config = dyn.parse_cli()                    # Load from CLI args (NEMO backend)
config = dyn.parse_cli(backend="fiddle")    # Use Fiddle backend
data = dyn.parse_cli(as_dict=True)          # Return raw dict (no Config objects)

I/O

# Load YAML
config = dyn.load_yaml("config.yaml")                      # Auto-create context
config = dyn.load_yaml("config.yaml", ctx)                 # With explicit context
data = dyn.load_yaml("config.yaml", as_dict=True)          # Return raw dict

# Save YAML
dyn.dump_yaml(config, "output.yaml")                       # Write to file
yaml_str = dyn.dump_yaml(config)                           # Return string
dyn.dump_yaml(config, "full.yaml", include_defaults=True)  # Include defaults (deep)
dyn.dump_yaml(config, "shallow.yaml", include_defaults=True, deep_defaults=False)  # Shallow

Parsing

# Dictionary to Config
config = dyn.dict_to_config(data)                          # Fiddle backend (default)
config = dyn.dict_to_config(data, backend=dyn.Backend.NEMO)
config = dyn.dict_to_config(data, ctx=ctx)                 # With shared context

# Class-based API
parser = dyn.ConfigParser(backend=dyn.Backend.FIDDLE)
config = parser.parse(data, ctx)

Serialization

# Config to dictionary
data = dyn.config_to_dict(config)
data = dyn.config_to_dict(config, include_defaults=True)              # Include defaults (deep)
data = dyn.config_to_dict(config, include_defaults=True, deep_defaults=False)  # Shallow defaults

# Class-based API
serializer = dyn.ConfigSerializer(include_defaults=True, deep_defaults=True)
data = serializer.serialize(config)
yaml_str = serializer.to_yaml(config)

Reference Resolution

# Resolve DeferredReference placeholders
ctx = dyn.ParserContext()
config1 = dyn.load_yaml("file1.yaml", ctx)
config2 = dyn.load_yaml("file2.yaml", ctx)
dyn.resolve_placeholders(config2, ctx.registry)

# Class-based API
resolver = dyn.ReferenceResolver()
resolver.resolve(config, registry)

Utilities

# Resolve string path to Python object
cls = dyn.resolve_target("torch.optim.Adam")

# Get qualified name of callable
name = dyn.get_target_name(torch.optim.Adam)
# "torch.optim.adam.Adam"

Backend Selection

FiddleDyn supports two backends:

| Backend | Config Type | Partial Type | Requirement |
| --- | --- | --- | --- |
| fiddle (default) | fiddle.Config | fiddle.Partial | (included) |
| nemo | nemo_run.Config | nemo_run.Partial | pip install fiddledyn[nemo] |

# Check if NeMo Run is available
from fiddledyn import HAS_NEMO_RUN
if HAS_NEMO_RUN:
    config = dyn.parse_cli(backend="nemo")
else:
    config = dyn.parse_cli(backend="fiddle")

# With ParserContext
ctx = dyn.ParserContext(backend=dyn.Backend.FIDDLE)
config = dyn.load_yaml("config.yaml", ctx)

Example: Complete Training Configuration

# train_config.yaml
shared_encoder:
  _target_: mylib.Encoder
  _id_: encoder
  vocab_size: 50000
  hidden_dim: 768
  num_layers: 12

model:
  _target_: mylib.TransformerModel
  encoder:
    _ref_: encoder
  decoder:
    _target_: mylib.Decoder
    hidden_dim: 768

optimizer:
  _target_: torch.optim.AdamW
  _partial_: true
  lr: 0.0001
  weight_decay: 0.01

trainer:
  _target_: mylib.Trainer
  model:
    _ref_: model
  optimizer:
    _ref_: optimizer
  callbacks:
    - _target_: mylib.EarlyStopping
      patience: 5
    - _target_: mylib.ModelCheckpoint
      save_path: checkpoints/

python train.py \
  -f train_config.yaml \
  shared_encoder.vocab_size=100000 \
  optimizer.lr=0.00005 \
  trainer.callbacks.0.patience=10

Architecture

src/fiddledyn/
├── core/           # Backend, ParserContext, DeferredReference
├── parsing/        # dict_to_config, ConfigParser
├── serialization/  # config_to_dict, ConfigSerializer
├── resolution/     # resolve_placeholders, ReferenceResolver
├── cli.py          # parse_cli
├── io.py           # load_yaml, dump_yaml
└── utils.py        # resolve_target, get_target_name

License

MIT

About

A dynamic YAML loader and parser with permissive CLI overrides, compatible with Google's Fiddle config library (and its NVIDIA NeMo Run extension).
