What
Currently, cuprate_database's backend can be changed by compiling it with different feature flags, i.e.:
cargo build --features {heed,redb}
This issue is for discussing various methods cuprate_database could use to hot-swap backends at runtime.
Why
This would allow end-users to choose a backend at runtime, e.g. via config or CLI.
Method 1: dyn Env
The concrete object that represents the database environment is cuprate_database::ConcreteEnv.
This is a non-generic object; it's just a struct with some internals that switch depending on the backend feature flag.
This struct implements trait Env, the database environment trait, from where all other database operations can occur.
Passing around a dyn Env that all backends implement would allow this, but there are a few problems:
trait Env is not object-safe because...
- It uses associated types/constants because...
- It must specify certain concrete types (e.g. the transaction type) because...
Envs are only compatible with their own types
For example, even though heed::RoTxn and redb::ReadTransaction both implement trait TxRo, you cannot pass a heed::RoTxn to redb and expect it to work. This means Env cannot be object-safe, and that the backends' types are not interchangeable.
Another problem is performance: dyn dispatches dynamically at runtime on every call. This compounds, as the other traits (TxRo, DatabaseRo, etc.) will probably have to be behind dyn as well.
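The object-safety problem can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in (HeedEnv, HeedTx, RedbEnv, RedbTx and the simplified Env/TxRo traits are not the real cuprate_database, heed or redb types), and it only models the associated-type part of the problem, not associated constants:

```rust
// Simplified, hypothetical stand-ins for the real traits.
trait TxRo {
    fn backend_name(&self) -> &'static str;
}

trait Env {
    // The associated type ties each environment to its own transaction type.
    type TxRo: TxRo;
    fn tx_ro(&self) -> Self::TxRo;
    // Reads only accept the transaction type that belongs to this environment.
    fn read(&self, tx: &Self::TxRo) -> String;
}

// Mock backends (assumptions, not the real heed/redb types).
struct HeedEnv;
struct HeedTx;
struct RedbEnv;
struct RedbTx;

impl TxRo for HeedTx {
    fn backend_name(&self) -> &'static str { "heed" }
}
impl TxRo for RedbTx {
    fn backend_name(&self) -> &'static str { "redb" }
}

impl Env for HeedEnv {
    type TxRo = HeedTx;
    fn tx_ro(&self) -> HeedTx { HeedTx }
    fn read(&self, tx: &HeedTx) -> String {
        format!("read via {}", tx.backend_name())
    }
}
impl Env for RedbEnv {
    type TxRo = RedbTx;
    fn tx_ro(&self) -> RedbTx { RedbTx }
    fn read(&self, tx: &RedbTx) -> String {
        format!("read via {}", tx.backend_name())
    }
}

// A bare `Box<dyn Env>` is rejected by the compiler: the associated type
// must be named, and naming it pins the trait object to a single backend.
fn heed_only() -> Box<dyn Env<TxRo = HeedTx>> {
    Box::new(HeedEnv)
}

fn main() {
    let env = heed_only();
    let tx = env.tx_ro();
    println!("{}", env.read(&tx));
    // Neither of these compiles, which is the incompatibility described above:
    // let other: Box<dyn Env<TxRo = HeedTx>> = Box::new(RedbEnv);
    // env.read(&RedbTx);
}
```

Under these assumptions, a single dyn Env type that covers both backends would require erasing the transaction type too, which is where the "large changes" come in.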
Pros
- Uses the type system
- Most maintainable
Cons
- Slowest method
- Probably not possible without large changes
Method 2: enum for each trait
This is the same idea as dyn, except there is a concrete enum that defines all backends.
There would have to be an enum for each trait and the backend's specific type, e.g.:
enum EnvEnum {
    Heed(heed::Env),
    Redb(redb::Database),
}
enum TxRoEnum<'a> {
    Heed(heed::RoTxn<'a>),
    Redb(redb::ReadTransaction),
}
/* continue for each trait */
and cuprate_database would expose EnvEnum where users would have to match at every layer.
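To make the "match at every layer" cost concrete, here is a minimal sketch using hypothetical mock types in place of heed::Env, redb::Database and their transactions. The tx_ro and read methods are illustrative names, not an existing API:

```rust
// Mock backend types (assumptions standing in for the real heed/redb types).
struct HeedEnv;
struct RedbEnv;
struct HeedTx;
struct RedbTx;

enum EnvEnum {
    Heed(HeedEnv),
    Redb(RedbEnv),
}

enum TxRoEnum {
    Heed(HeedTx),
    Redb(RedbTx),
}

impl EnvEnum {
    // Every operation needs a match that forwards to the right backend.
    fn tx_ro(&self) -> TxRoEnum {
        match self {
            EnvEnum::Heed(_) => TxRoEnum::Heed(HeedTx),
            EnvEnum::Redb(_) => TxRoEnum::Redb(RedbTx),
        }
    }

    // Mixing backends is no longer a compile error: it becomes a runtime
    // case that each layer has to handle (here, by returning None).
    fn read(&self, tx: &TxRoEnum) -> Option<&'static str> {
        match (self, tx) {
            (EnvEnum::Heed(_), TxRoEnum::Heed(_)) => Some("heed"),
            (EnvEnum::Redb(_), TxRoEnum::Redb(_)) => Some("redb"),
            _ => None,
        }
    }
}

fn main() {
    let env = EnvEnum::Heed(HeedEnv);
    let tx = env.tx_ro();
    println!("{:?}", env.read(&tx));

    // A redb transaction handed to the heed environment is only caught at runtime.
    let wrong_tx = EnvEnum::Redb(RedbEnv).tx_ro();
    println!("{:?}", env.read(&wrong_tx));
}
```

The mismatched-backend arm is the part that tends to hurt maintainability: it has to be repeated, and handled, for every enum pair.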
Pros
- Faster than dyn
- Doesn't run into the object safety problem
Cons
- Terrible maintainability
- Terrible usability
Method 3: Branching at the high level
Another method is shifting the responsibility for "hot-swapping" upwards, i.e. instead of making cuprate_database hot-swap, the crates building on top of it will do so.
This comes with the pro that the "branch" to determine which backend is used only needs to be done once.
The con is that each crate building on top must take on this responsibility (although there are only 2 currently, cuprate-blockchain and cuprate-txpool).
For example, cuprate_blockchain::service could look something like this:
// storage/blockchain/src/service/free.rs
pub fn init(config: Config) -> Result<(DatabaseReadHandle, DatabaseWriteHandle), InitError> {
    let db = if config.backend == Backend::Heed {
        /* init heed backend */
    } else {
        /* init redb backend */
    };
    /* spawn threadpool with backend */
}

// storage/blockchain/src/service/read.rs
pub struct DatabaseReadHandle {
    // old field
    // env: Arc<ConcreteEnv>,
    // new field
    spawn_fn: fn(BCReadRequest) -> InfallibleOneshotReceiver,
}
- The blockchain read/write handle now only holds a function pointer that spawns some work to be done inside the rayon threadpool, instead of owning the Env itself
- Each handler function would have to take in <E: Env> instead of ConcreteEnv
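A rough sketch of how the pieces above could fit together follows. It makes several assumptions that are not part of the proposal: the backends are mocks, ReadRequest and a plain String stand in for BCReadRequest and InfallibleOneshotReceiver, no threadpool is involved, and the handle stores a boxed closure rather than a bare fn pointer so that the closure can own the Arc<E>. make_handle and handle_read are hypothetical names.

```rust
use std::sync::Arc;

// Simplified, hypothetical stand-in for the real `Env` trait.
trait Env: Send + Sync + 'static {
    fn name(&self) -> &'static str;
}

// Mock backends (assumptions, not the real heed/redb environments).
struct HeedEnv;
struct RedbEnv;
impl Env for HeedEnv {
    fn name(&self) -> &'static str { "heed" }
}
impl Env for RedbEnv {
    fn name(&self) -> &'static str { "redb" }
}

#[derive(Clone, Copy)]
enum Backend {
    Heed,
    Redb,
}

// Stands in for `BCReadRequest`.
struct ReadRequest(u64);

// Handlers are generic over `E`, so they are monomorphized per backend
// and contain no backend branch of their own.
fn handle_read<E: Env>(env: &E, req: ReadRequest) -> String {
    format!("{} handled request {}", env.name(), req.0)
}

struct DatabaseReadHandle {
    // Assumption: a boxed closure instead of `fn(..)`, so it can capture the env.
    spawn_fn: Box<dyn Fn(ReadRequest) -> String + Send + Sync>,
}

fn make_handle<E: Env>(env: Arc<E>) -> DatabaseReadHandle {
    DatabaseReadHandle {
        spawn_fn: Box::new(move |req: ReadRequest| handle_read(&*env, req)),
    }
}

fn init(backend: Backend) -> DatabaseReadHandle {
    // The only place where the backend is chosen.
    match backend {
        Backend::Heed => make_handle(Arc::new(HeedEnv)),
        Backend::Redb => make_handle(Arc::new(RedbEnv)),
    }
}

fn main() {
    let handle = init(Backend::Redb);
    println!("{}", (handle.spawn_fn)(ReadRequest(7)));
}
```

Note that the boxed closure reintroduces one dynamic call per request (not per database operation), and this sketch does not address how the work would be scheduled onto rayon.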
Pros
- Fastest method (one branch at init())
Problems
- Who owns the Arc<Env> now? rayon doesn't have custom storage, and recreating handler logic for rayon threads (instead of spawning work as needed) means we lose, or have to re-create, rayon's work-stealing logic