Non-Library Alternatives#
In this document, we highlight why a user may want to favor using dilib
over alternatives that use no or very light frameworks. (Some of these
are DI-compatible and some are not.)
Comparison#
Method |
Global addressability |
Static auto-complete & type safety |
Delayed instantation |
Self-consistent perturb |
---|---|---|---|---|
|
*️⃣ |
❌ |
❌ |
❌ |
|
*️⃣ |
✅ |
❌ |
❌ |
Text config (e.g., YAML) |
✅ |
❌ |
✅ |
✅ |
Global variables |
*️⃣ |
✅ |
❌ |
❌ |
Nested getter functions |
❌ |
✅ |
✅ |
❌ |
Generic container |
✅ |
❌ |
✅ |
✅ |
Custom containers |
*️⃣ |
✅ |
✅ |
*️⃣ |
|
✅ |
✅ |
✅ |
✅ |
*️⃣ Achievable with a little helper code. E.g., one could write a helper
for the TypedDict
variant that translates "engine_config.engine"
to getting attrs one level at a time, much like dilib
does.
("engine_config"
first, followed by "engine"
).
dict[str, Any]
#
# Engine config/container
container["address"] = "some-db-address"
container["db_engine"] = DBEngine(container["address"])
container["mock_engine"] = MockEngine()
container["engine"] = container["engine_config"]["alt_engine"]
# Car config/container
container = {"engine_config": container}
container["car_a"] = Car(container["engine_config"]["engine"])
Note that there’s no type safety, no auto-complete, and no delayed instantiation.
TypedDict
or dataclasses
#
class EngineConfig(TypedDict):
address: str
db_engine: DBEngine
mock_engine: MockEngine
engine: Engine
class CarConfig(TypedDict):
engine_config: EngineConfig
car: Car
address = "some-db-address"
db_engine = DBEngine(address)
engine = db_engine
engine_config = EngineConfig({
address: address,
db_engine: db_engine,
mock_engine: MockEngine(),
engine: engine,
})
car_config = CarConfig({
"engine_config": engine_config,
"car": Car(engine_config["engine"]),
})
We’ve gained type safety and auto-complete, but we still don’t have
delayed instantiation. In addition, composing the objects and configs
is unwieldy and requires thought (e.g., for engine
to point to the
same instance as db_engine
, one is required to create them outside
of the dicts).
One could imagine the same setup but with dataclasses.dataclass()
instead of TypedDict
.
Text config (e.g., YAML)#
With native YAML aliases:
engine_config:
address: &address "some-db-address"
db_engine: &engine
class: "module_a.DBEngine"
address: *address
mock_engine:
class: "module_b.MockEngine"
engine: *engine
car_config:
car: "module_c.Car"
engine: *engine
Or one could handle aliases with custom config loading logic instead
(here we interpret ref:
as a reference to another value in the config):
engine_config:
address: "some-db-address"
db_engine:
class: "module_a.DBEngine"
address: "ref:..address"
mock_engine:
class: "module_b.MockEngine"
engine: "ref:db_engine"
car_config:
car: "module_c.Car"
engine: "ref:..engine_config.engine"
You now have delayed instantation and explicit wiring of dependencies between objects and values, but the syntax can become unwieldy and you don’t have auto-complete.
Global variables#
# At module level
ADDRESS = "some-db-address"
DB_ENGINE = DBEngine(ADDRESS)
MOCK_ENGINE = MockEngine()
ENGINE = DB_ENGINE
CAR = Car(ENGINE)
It’s generally considered an anti-pattern to do a lot of work (whether expensive compute or IO calls) at import time. In addition, there’s no way to perturb the parameters programmatically and no way to hold multiple views of the universe of objects in the same process.
Nested getter functions#
def get_engine() -> Engine:
return DBEngine(...)
def get_car() -> Car:
engine = get_engine()
return Car(engine)
One issue with this approach is that you will probably want to cache
the engine instance across various downstream users of it–probably
with something like functools.cache
. The issue then is that
you can only have exactly 1 set of objects per process.
In addition, there’s no obvious way to enable perturbations.
One approach could be to pass through parameters, but this doesn’t
scale well to hundreds of objects, each with their own parameters.
Additionally, the config user can’t introduce new implementaitons of Engine
that the config author didn’t know about. E.g.:
@functools.cache
def get_engine(use_mock_engine: bool) -> Engine:
if use_mock_engine:
return MockEngine()
return DBEngine(...)
@functools.cache
def get_car(use_mock_engine: bool) -> Car:
engine = get_engine(use_mock_engine=use_mock_engine)
return Car(engine)
Generic container#
One could imagine creating a generic dict-like container, like:
class Container:
def register(
self,
name: str,
factory: Callable[[...], Any],
dependencies: Sequence[str] | None = None,
) -> None:
"""Register a factory function with optional dependencies."""
...
def get(self, name: str) -> Any:
"""Instantiate and return the requested object lazily."""
...
container = Container()
container.register("address", lambda: "some-db-address")
container.register(
"db_engine", lambda addr: DBEngine(addr), dependencies=["address"]
)
container.register("engine", lambda engine: engine, dependencies=["engine"])
container.register("car", lambda engine: Car(engine), dependencies=["engine"])
However, much like with dicts, we have no static type checking or auto-complete functionality here. Plus, it’s unclear how downstream containers could use this container as a child container.
Custom containers#
@dataclasses.dataclass(frozen=True)
class EngineContainer:
@functools.cached_property
def db_engine(self) -> DBEngine:
return DBEngine(...)
@functools.cached_property
def engine(self) -> Engine:
return self.db_engine
@dataclasses.dataclass(frozen=True)
class CarContainer:
engine_container: EngineContainer
@functools.cached_property
def car(self) -> Car:
return Car(self.engine_container.engine)
This is very similar to the functionality provided by dilib
!
However, it would require some more work on the part of the user.
First, if the containers are heavily nested, you have to worry about using the same container reference in all the downstream containers, thus creating a DI problem for containers in a DI solution for objects.
Second, although one could monkey-patch perturbations on the container objects, there’s no 3-step process, so you don’t have a guarantee that no objects have been instantiated when you start perturbing. (Because if you perturb after instantiating, you have to worry about whether the perturbed value would have created a different object instance than the one already cached.)
One could imagine creating some helpers to address some of these issues:
class EngineContainer(dilib.Container):
@dilib.singleton
def db_engine(self) -> DBEngine:
return DBEngine(...)
@dilib.forward
def engine(self) -> Engine:
return self.db_engine
class CarContainer(dilib.Container):
engine_container: EngineContainer
@dilib.singleton
def car(self) -> Car:
return Car(self.engine_container.engine)
container = CarContainer.get()
engine = container.engine_container.engine
This library would have to:
Create child containers by type in the
.get()
Prevent property perturbing after any object has been instantianted in any container or its child containers
Expose nested attribute retrieval
Support global and local inputs
Though, at this point, you basically have dilib
with slightly more
verbose syntax.
TODO: We should consider adding an option to specify config fields as properties for cases where there are a lot of anonymous inner singletons.