Factory System
Introduction
The factory system is aiNXT's configuration-driven object creation mechanism. It enables you to build datasets, models, metrics, and visualizations from simple YAML configuration files rather than writing instantiation code.
Think of it as a "recipe system" where YAML files describe what you want to build, and the factory system handles how to build it.
Looking for practical examples?
This page provides detailed API reference documentation. For a practical guide with real-world examples, see Factory Objects Guide.
Architecture
graph TB
CONFIG[config.yaml] --> CONTEXT[Context]
subgraph Context["Context (Container)"]
DATASET_FACTORY[dataset_builder: Factory]
MODEL_FACTORY[model_builder: Factory]
METRIC_FACTORY[metric_builder: Factory]
VIZ_FACTORY[viz_builder: Factory]
end
CONTEXT --> DATASET_FACTORY
CONTEXT --> MODEL_FACTORY
CONTEXT --> METRIC_FACTORY
CONTEXT --> VIZ_FACTORY
DATASET_FACTORY -.uses.-> BUILDERS1[Builders]
MODEL_FACTORY -.uses.-> BUILDERS2[Builders]
METRIC_FACTORY -.uses.-> BUILDERS3[Builders]
VIZ_FACTORY -.uses.-> BUILDERS4[Builders]
DATASET_FACTORY -.creates.-> DATASET[Dataset]
MODEL_FACTORY -.creates.-> MODEL[Model]
METRIC_FACTORY -.creates.-> METRIC[Metric]
VIZ_FACTORY -.creates.-> VIZ[Visualization]
classDef orangeBox fill:#FF6B35,stroke:#333,stroke-width:2px,color:#fff
classDef mediumBlueBox fill:#5A8A9C,stroke:#333,stroke-width:2px
classDef lightBlueBox fill:#0097B1,stroke:#333,stroke-width:2px,color:#fff
class CONFIG,DATASET,MODEL,METRIC,VIZ orangeBox
class CONTEXT,BUILDERS1,BUILDERS2,BUILDERS3,BUILDERS4 mediumBlueBox
class DATASET_FACTORY,MODEL_FACTORY,METRIC_FACTORY,VIZ_FACTORY lightBlueBox
The factory system consists of three main components:
- Builder: Maps (task, name) tuples to constructor functions
- Factory: Combines multiple builders and handles decorator application
- Context: Global container holding multiple factories (one per object type)
Builder: Constructor Registry
Purpose
A Builder is a mapping from BuilderKey (task, name) tuples to constructor functions. It provides:
- Smart constructor resolution with wildcard matching
- Type-safe object creation from configuration
- Introspection tools for debugging and documentation
BuilderKey Structure
- task: ML task type (e.g., "classification", "regression", None for wildcard)
- name: Constructor identifier (e.g., "random_forest", "svm")
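As a rough mental model, a builder behaves like a dictionary keyed by (task, name) tuples. The sketch below is not aiNXT's implementation; `MiniBuilder` and `CSVDataset` are hypothetical names used only for illustration, and wildcard handling is deliberately left out.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")

# A key is a (task, name) pair; task=None stands for "any task".
BuilderKey = tuple[Optional[str], str]


class MiniBuilder(Generic[T]):
    """Hypothetical stand-in for a builder: a registry of constructors."""

    def __init__(self) -> None:
        self._constructors: dict[BuilderKey, Callable[..., T]] = {}

    def __setitem__(self, key: BuilderKey, constructor: Callable[..., T]) -> None:
        self._constructors[key] = constructor

    def __getitem__(self, key: BuilderKey) -> Callable[..., T]:
        return self._constructors[key]

    def build(self, task: Optional[str], name: str, **kwargs) -> T:
        # Exact lookup only; wildcard resolution is not part of this sketch.
        return self[(task, name)](**kwargs)


class CSVDataset:
    def __init__(self, path: str) -> None:
        self.path = path


builder: MiniBuilder[CSVDataset] = MiniBuilder()
builder[("classification", "csv")] = CSVDataset
dataset = builder.build("classification", "csv", path="data.csv")
```

The keyword arguments passed to `build` are forwarded to the registered constructor unchanged.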
Registering Constructors
Option 1: Direct Registration
from ainxt.factory import Builder
from ainxt.data import Dataset
class DatasetBuilder(Builder[Dataset]):
    def __init__(self):
        super().__init__()
        # Register with specific task
        self[("classification", "my_dataset")] = MyDataset
        # Register with wildcard (works for any task)
        self[(None, "generic_dataset")] = GenericDataset
Option 2: Using Decorators
from ainxt.data import Dataset
from ainxt.factory import builder_name
@builder_name(task="classification", name="seeds_dataset")
class Seeds_Dataset(Dataset):
    def __init__(self, path: str):
        self.path = path
        # ... dataset implementation
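One way a registration decorator of this kind could work is to record the key in a shared registry and return the class unchanged. The following is a minimal sketch under that assumption; `REGISTRY`, `register_as`, and `SeedsDataset` are hypothetical names, and the real `builder_name` may store the key differently.

```python
from typing import Callable, Optional

# Hypothetical module-level registry keyed by (task, name).
REGISTRY: dict[tuple[Optional[str], str], Callable] = {}


def register_as(task: Optional[str], name: str) -> Callable[[Callable], Callable]:
    """Return a decorator that stores its target under (task, name)."""

    def decorator(constructor: Callable) -> Callable:
        REGISTRY[(task, name)] = constructor
        return constructor  # the class or function itself is left untouched

    return decorator


@register_as(task="classification", name="seeds_dataset")
class SeedsDataset:
    def __init__(self, path: str) -> None:
        self.path = path


# The registered class can now be looked up by its key and instantiated.
dataset = REGISTRY[("classification", "seeds_dataset")](path="data.csv")
```

Because the decorator returns its target unchanged, the class can still be imported and instantiated directly.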
Building Objects
From task and name:
builder = DatasetBuilder()
# Exact match
dataset = builder.build("classification", "seeds_dataset", path="data.csv")
# Wildcard matching (None matches any)
dataset = builder.build(None, "seeds_dataset", path="data.csv")
From configuration dict:
config = {
    "task": "classification",
    "name": "seeds_dataset",
    "path": "data.csv"
}
dataset = builder.build_from_config(config)
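Conceptually, building from a config means separating the lookup key from the constructor arguments. A minimal sketch, assuming `task` and `name` are reserved keys and everything else is passed through as keyword arguments (the helper `split_config` is hypothetical):

```python
from typing import Any, Mapping, Optional


def split_config(
    config: Mapping[str, Any],
) -> tuple[tuple[Optional[str], str], dict[str, Any]]:
    """Split a config into its (task, name) key and the remaining kwargs."""
    kwargs = dict(config)  # copy so the caller's config is not mutated
    task = kwargs.pop("task", None)  # assumption: a missing task acts as a wildcard
    name = kwargs.pop("name")  # raises KeyError if no name is given
    return (task, name), kwargs


config = {"task": "classification", "name": "seeds_dataset", "path": "data.csv"}
key, kwargs = split_config(config)
```

The resulting `key` identifies the constructor, and `kwargs` holds what would be passed to it.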
Constructor Resolution
Builders use similarity-based matching to find the best constructor:
# Registered constructors:
builder[(None, "linear")] = LinearModel
builder[("classification", "linear")] = LinearClassifier
# Query resolution
builder.resolve((None, "linear"))
# Returns: ("classification", "linear") - More specific match preferred
builder.resolve(("classification", "linear"))
# Returns: ("classification", "linear") - Exact match
Similarity Scoring:
- Both task and name match: score = 2 (highest priority)
- Either task or name matches: score = 1
- Neither matches: score = 0 (won't be selected)
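The scoring rules can be sketched as follows. This is an illustration of the described behavior, not aiNXT's code: it assumes `None` acts as a wildcard on either side and that ties are broken in favor of the key with fewer wildcards, which reproduces the two `resolve` results shown above. All function names are hypothetical.

```python
from typing import Optional, Sequence

Key = tuple[Optional[str], Optional[str]]


def field_matches(query: Optional[str], registered: Optional[str]) -> bool:
    """A field matches when the values are equal or either side is a wildcard."""
    return query is None or registered is None or query == registered


def similarity(query: Key, key: Key) -> int:
    """Count matching fields: 2 = task and name, 1 = one of them, 0 = neither."""
    return sum(field_matches(q, k) for q, k in zip(query, key))


def specificity(key: Key) -> int:
    """Number of non-wildcard fields; used only as a tie-breaker (assumption)."""
    return sum(part is not None for part in key)


def resolve(query: Key, keys: Sequence[Key]) -> Key:
    """Return the best-scoring key, ignoring keys with a score of 0."""
    candidates = [k for k in keys if similarity(query, k) > 0]
    if not candidates:
        raise KeyError(query)
    return max(candidates, key=lambda k: (similarity(query, k), specificity(k)))


registered = [(None, "linear"), ("classification", "linear")]
wildcard_result = resolve((None, "linear"), registered)
exact_result = resolve(("classification", "linear"), registered)
fallback_result = resolve(("regression", "linear"), registered)
```

In this sketch, a query for a task with no dedicated registration (here `"regression"`) falls back to the wildcard entry, because the wildcard key scores 2 while the classification key scores only 1.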
Introspection
Search for constructors:
# Find all classification constructors
matches = list(builder.search("classification", None))
# Find specific constructor
matches = list(builder.search("classification", "svm"))
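A search of this kind amounts to filtering the registered keys, with `None` in the query matching anything. The sketch below works over a plain list of keys; the `search_keys` helper is hypothetical, and whether the real `search` also returns wildcard registrations is not specified here (this sketch only returns keys whose fields equal the query).

```python
from typing import Iterable, Iterator, Optional

Key = tuple[Optional[str], str]


def search_keys(
    keys: Iterable[Key], task: Optional[str], name: Optional[str]
) -> Iterator[Key]:
    """Yield keys matching the query; None in the query matches anything."""
    for key_task, key_name in keys:
        if task is not None and key_task != task:
            continue
        if name is not None and key_name != name:
            continue
        yield (key_task, key_name)


registered = [
    ("classification", "svm"),
    ("classification", "random_forest"),
    ("regression", "linear"),
]
classification_keys = list(search_keys(registered, "classification", None))
svm_keys = list(search_keys(registered, "classification", "svm"))
```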
Tabulate available constructors:
Output:
╒═══════════════╤══════════════╤════════════════════╤═══════════════════╤══════════════╕
│ Task │ Name │ Required Arguments │ Optional Arguments│ Return Type │
╞═══════════════╪══════════════╪════════════════════╪═══════════════════╪══════════════╡
│classification │ svm │ kernel: str │ C: float = 1.0 │ SVMModel │
│classification │ random_forest│ n_trees: int │ depth: int = 10 │ RFModel │
╘═══════════════╧══════════════╧════════════════════╧═══════════════════╧══════════════╛
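The argument columns of such a table can be derived from a constructor's signature. The sketch below uses the standard `inspect` module to split parameters into required and optional ones. It is an assumption about how a table like this could be produced, and `describe` and this toy `SVMModel` are illustrative names only.

```python
import inspect
from typing import Callable


class SVMModel:
    def __init__(self, kernel: str, C: float = 1.0) -> None:
        self.kernel = kernel
        self.C = C


def describe(constructor: Callable) -> tuple[list[str], list[str]]:
    """Return (required, optional) argument descriptions for a constructor."""
    required, optional = [], []
    for param in inspect.signature(constructor).parameters.values():
        # Use the type's short name when available, else its string form.
        annotation = getattr(param.annotation, "__name__", str(param.annotation))
        if param.default is inspect.Parameter.empty:
            required.append(f"{param.name}: {annotation}")
        else:
            optional.append(f"{param.name}: {annotation} = {param.default}")
    return required, optional


required, optional = describe(SVMModel)
```

For a class, `inspect.signature` reports the parameters of `__init__` without `self`, so the result lines up with the "Required Arguments" and "Optional Arguments" columns.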
Factory: Combining Builders with Decorators
Purpose
A Factory combines multiple builders and optionally applies decorators to modify created objects.
Basic Factory Usage
from ainxt.factory import Factory
# Create factory with multiple builders
dataset_builder = DatasetBuilder()
model_builder = ModelBuilder()
factory = Factory(dataset_builder, model_builder)
# Access constructors through factory
dataset = factory[("classification", "seeds_dataset")](path="data.csv")
Decorators: Modifying Objects After Creation
Decorators allow you to apply transformations to objects automatically based on keyword arguments:
# Create a decorator builder
decorator = Factory()
decorator.register(None, "normalize", lambda dataset, method: dataset.normalize(method))
decorator.register(None, "balance", lambda dataset: dataset.balance_classes())
# Create factory with decorators
factory = Factory(dataset_builder, decorator=decorator)
# Build dataset with automatic normalization
dataset = factory[("classification", "my_dataset")](
    path="data.csv",
    normalize="min_max",  # Triggers normalize decorator
    balance=True  # Triggers balance decorator
)
# Equivalent to:
# dataset = MyDataset(path="data.csv")
# dataset = normalize(dataset, method="min_max")
# dataset = balance(dataset)
How Decorators Work:
- Factory builds the base object using the constructor
- Checks kwargs for names matching registered decorators
- Applies matching decorators in sequence
- Each decorator receives only its matching keyword argument
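The four steps above can be sketched as a single function. This is a simplified illustration, not the library's implementation: it assumes each decorator is a callable taking the object and the keyword's value, and the names `build_with_decorators`, `Numbers`, `scale`, and `reverse` are hypothetical.

```python
from typing import Any, Callable, Mapping


def build_with_decorators(
    constructor: Callable[..., Any],
    decorators: Mapping[str, Callable[[Any, Any], Any]],
    **kwargs: Any,
) -> Any:
    # Separate decorator keywords from constructor keywords.
    decorator_args = {k: kwargs.pop(k) for k in list(kwargs) if k in decorators}
    obj = constructor(**kwargs)  # build the base object
    for name, value in decorator_args.items():  # apply in keyword order
        obj = decorators[name](obj, value)  # each gets only its own argument
    return obj


class Numbers:
    def __init__(self, values: list[float]) -> None:
        self.values = values


def scale(data: Numbers, factor: float) -> Numbers:
    return Numbers([v * factor for v in data.values])


def reverse(data: Numbers, enabled: bool) -> Numbers:
    return Numbers(data.values[::-1]) if enabled else data


result = build_with_decorators(
    Numbers,
    {"scale": scale, "reverse": reverse},
    values=[1.0, 2.0, 3.0],
    scale=2.0,
    reverse=True,
)
```

Here `values` reaches the constructor, while `scale` and `reverse` are consumed by the matching decorators and never passed to `Numbers.__init__`.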
Adding Builders Dynamically
factory = Factory(dataset_builder)
# Register additional builder
factory.register_builder(model_builder)
# Register additional decorator
factory.register_decorator(preprocessing_decorator)
# Combine factories
combined = factory1 + factory2 # Merges builders and decorators
Context: Global Access Point
Purpose
The Context class provides a centralized access point to all factories in aiNXT. It simplifies script writing by bundling all builders together.
Structure
@dataclass
class Context[X]:
    encoder: ainxtJSONEncoder[X]
    decoder: ainxtJSONDecoder[X]
    dataset_builder: Builder[Dataset[X]]
    model_builder: Builder[Model[X]]
    metric_builder: Builder[Metric]
    visualization_builder: Builder[Visualization]
    parsers: Mapping[str, Builder]
Global CONTEXT Object
aiNXT provides a pre-configured global CONTEXT instance:
from context import CONTEXT
# Load dataset from config
config = {
    "task": "classification",
    "name": "seeds_dataset",
    "path": "data/train.csv"
}
dataset = CONTEXT.load_dataset(config)
# Load model from config
model_config = {
    "task": "classification",
    "name": "random_forest",
    "n_estimators": 100
}
model = CONTEXT.load_model(model_config)
# Load metrics from config
metrics_config = [
    {"name": "accuracy"},
    {"name": "f1_score", "average": "macro"}
]
metrics = CONTEXT.load_metrics(metrics_config, task="classification")
Context Methods
| Method | Purpose | Returns |
|---|---|---|
| load_dataset(config) | Build dataset from config | Dataset[X] |
| load_model(config) | Build model from config | Model[X] |
| load_metrics(configs, task) | Build multiple metrics | Sequence[Metric] |
| load_visualizations(configs, task) | Build visualizations | Sequence[Visualization] |
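To illustrate how a method like `load_metrics(configs, task)` might relate to a builder, the sketch below merges the shared task into each metric config before looking up a constructor. This is an assumption about the behavior, not the actual `Context` code; `load_all`, `METRICS`, and the toy `Accuracy` and `F1Score` classes are hypothetical.

```python
from typing import Any, Callable, Mapping, Sequence


class Accuracy:
    pass


class F1Score:
    def __init__(self, average: str = "binary") -> None:
        self.average = average


# Hypothetical registry standing in for a metric builder.
METRICS: dict[tuple[str, str], Callable[..., Any]] = {
    ("classification", "accuracy"): Accuracy,
    ("classification", "f1_score"): F1Score,
}


def load_all(configs: Sequence[Mapping[str, Any]], task: str) -> list[Any]:
    """Build one object per config, using `task` unless a config sets its own."""
    objects = []
    for config in configs:
        kwargs = {"task": task, **config}  # a per-config task wins if present
        key = (kwargs.pop("task"), kwargs.pop("name"))
        objects.append(METRICS[key](**kwargs))
    return objects


metrics = load_all(
    [{"name": "accuracy"}, {"name": "f1_score", "average": "macro"}],
    task="classification",
)
```

Passing the task once, rather than repeating it in every metric config, keeps the metric list short and lets the same list be reused for another task.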
Configuration-Driven Workflow
YAML Configuration Files
data.yaml:
model.yaml:
metrics.yaml:
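The YAML contents are not reproduced here. Judging from the dictionaries shown earlier and from how the script in the next section indexes `metrics_config["metrics"]` and `model_config["task"]`, the parsed configs would look roughly like the Python dictionaries below. The exact keys in a real project may differ, and the `check_config` helper is hypothetical.

```python
# Assumed shape of the parsed configs, mirroring the dictionaries used earlier.
data_config = {
    "task": "classification",
    "name": "seeds_dataset",
    "path": "data/train.csv",
}
model_config = {
    "task": "classification",
    "name": "random_forest",
    "n_estimators": 100,
}
metrics_config = {
    "metrics": [
        {"name": "accuracy"},
        {"name": "f1_score", "average": "macro"},
    ]
}


def check_config(config: dict) -> None:
    """Raise if a config lacks the 'name' key needed to look up a constructor."""
    if "name" not in config:
        raise ValueError(f"config is missing 'name': {config}")


# Validate every object config before handing it to a builder.
for config in (data_config, model_config, *metrics_config["metrics"]):
    check_config(config)
```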
Using Configurations in Scripts
from context import CONTEXT
from ainxt.serving import parse_config_file
# Load configs
data_config = parse_config_file("config/data.yaml")
model_config = parse_config_file("config/model.yaml")
metrics_config = parse_config_file("config/metrics.yaml")
# Create objects
dataset = CONTEXT.load_dataset(data_config)
model = CONTEXT.load_model(model_config)
metrics = CONTEXT.load_metrics(
    metrics_config["metrics"],
    task=model_config["task"]
)
# Train
model.fit(dataset)
# Evaluate
for metric in metrics:
    score = metric(model, dataset)
    print(f"{metric.name}: {score}")
Registration Patterns
Pattern 1: Class Decorator
from ainxt.factory import builder_name
from ainxt.models import TrainableModel
@builder_name(task="classification", name="my_classifier")
class MyClassifier(TrainableModel):
    def __init__(self, learning_rate: float = 0.001):
        self.lr = learning_rate
    def fit(self, dataset):
        # Training logic
        pass
    def predict(self, instance):
        # Prediction logic
        pass
Pattern 2: Function Registration
from ainxt.factory import builder_name
@builder_name(task="classification", name="simple_classifier")
def create_simple_classifier(threshold: float = 0.5):
    """Factory function to create classifier"""
    return SimpleClassifier(threshold)
Pattern 3: Builder Inheritance
from ainxt.factory import Builder
from ainxt.data import Dataset
class MyDatasetBuilder(Builder[Dataset]):
    def __init__(self):
        super().__init__()
        self[("classification", "csv")] = CSVDataset
        self[("classification", "json")] = JSONDataset
        self[(None, "mock")] = MockDataset
# Add to global context
from context import CONTEXT
CONTEXT.dataset_builder.register_builder(MyDatasetBuilder())
Advanced Features
Wildcard Matching
# Register for all tasks
builder[(None, "generic_model")] = GenericModel
# Works for any task
model = builder.build("classification", "generic_model")
model = builder.build("regression", "generic_model")
model = builder.build(None, "generic_model")
Argument Type Conversion
Builders automatically convert config values to expected types:
class MyModel:
    def __init__(self, layers: list[int]):
        self.layers = layers
# The config can provide the value as a plain list
config = {"task": "...", "name": "...", "layers": [128, 64, 32]}
model = builder.build_from_config(config)
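How the conversion is performed is not described here. One plausible approach, shown purely as a sketch, is to read the constructor's type annotations and coerce simple scalar values accordingly. `coerce_kwargs` and `Threshold` are hypothetical, and the real builder may handle more types or work differently.

```python
import inspect
from typing import Any, Callable


def coerce_kwargs(
    constructor: Callable[..., Any], kwargs: dict[str, Any]
) -> dict[str, Any]:
    """Coerce values to int/float/str when the annotation asks for one of them."""
    params = inspect.signature(constructor).parameters
    converted = {}
    for name, value in kwargs.items():
        annotation = params[name].annotation if name in params else None
        # Only plain scalar annotations are handled in this sketch.
        if annotation in (int, float, str) and not isinstance(value, annotation):
            value = annotation(value)
        converted[name] = value
    return converted


class Threshold:
    def __init__(self, value: float, label: str = "default") -> None:
        self.value = value
        self.label = label


# A config loaded from text might carry the number as a string.
kwargs = coerce_kwargs(Threshold, {"value": "0.5", "label": "strict"})
model = Threshold(**kwargs)
```

Values that already have the annotated type, and parameters with annotations this sketch does not recognize (such as `list[int]`), are passed through unchanged.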
Chaining Decorators
decorator = Factory()
decorator[(None, "normalize")] = normalize_data
decorator[(None, "augment")] = augment_data
decorator[(None, "balance")] = balance_classes
factory = Factory(dataset_builder, decorator=decorator)
# All three decorators applied in sequence
dataset = factory.build(
    "classification", "my_dataset",
    path="data.csv",
    normalize="z_score",
    augment={"rotation": 15},
    balance=True
)
Summary
| Component | Purpose | Key Methods |
|---|---|---|
| Builder | Maps (task, name) to constructors | build(), build_from_config(), search(), resolve() |
| Factory | Combines builders with decorators | register(), register_builder(), register_decorator() |
| Context | Global access to all factories | load_dataset(), load_model(), load_metrics() |
Key Benefits:
- ✅ Configuration-Driven: Define objects in YAML, not code
- ✅ Type-Safe: Generic types ensure correctness
- ✅ Extensible: Easy to add new datasets, models, metrics
- ✅ Discoverable: Introspection tools show what's available
- ✅ Reusable: Same configs work across projects
Next Steps
- Factory Objects Guide - Practical examples and workflows
- Training Pipeline - Using the factory system in train scripts
- Evaluation Pipeline - Loading models and metrics for evaluation
- Core Abstractions - Understanding what objects the factory creates