Tutorial: Creating Custom Components with aiNXT
Introduction
In this hands-on tutorial, we'll build a complete image classification system from scratch using aiNXT's Factory system. You'll learn how to:
- Create custom parsers for configuration handling
- Set up loaders to discover your components
- Build factories that tie everything together
- Use Context to orchestrate the system
Time required: 30-45 minutes
Prerequisites: Basic Python knowledge (classes, functions, dictionaries)
Project Setup
Let's create a simple project structure:
my_vision_project/
├── config/
│   └── experiments/
│       └── simple_classifier.yaml
├── my_vision/
│   ├── __init__.py
│   ├── data/
│   │   ├── __init__.py
│   │   └── datasets/
│   │       ├── __init__.py
│   │       └── classification.py
│   ├── models/
│   │   ├── __init__.py
│   │   └── classification.py
│   ├── parsers/
│   │   ├── __init__.py
│   │   └── augmentation.py
│   └── serving/
│       ├── __init__.py
│       └── singletons.py
├── context.py
└── train.py
Step 1: Create Your First Dataset
Let's start with a simple dataset class:
# my_vision/data/datasets/classification.py
from ainxt.data import Dataset
from typing import List, Tuple
import numpy as np
class SimpleImageDataset(Dataset):
"""A simple dataset that loads images from a folder."""
def __init__(self, path: str, image_size: int = 224):
"""
Initialize the dataset.
Args:
path: Path to the folder containing images
image_size: Size to resize images to (square)
"""
self.path = path
self.image_size = image_size
self.images = self._load_images()
def _load_images(self) -> List[np.ndarray]:
"""Load images from the path."""
# For this tutorial, we'll create dummy data
# In real code, you'd load actual images here
print(f"Loading images from {self.path}")
# Create 100 dummy images
return [np.random.rand(self.image_size, self.image_size, 3) for _ in range(100)]
def __len__(self) -> int:
"""Return the number of images in the dataset."""
return len(self.images)
def __getitem__(self, idx: int) -> Tuple[np.ndarray, int]:
"""
Get an image and its label.
Args:
idx: Index of the image to retrieve
Returns:
Tuple of (image, label)
"""
# Return image and dummy label
return self.images[idx], idx % 10 # 10 classes
# Also create a function constructor (alternative to class)
def load_from_csv(csv_path: str, image_column: str = "path") -> Dataset:
"""
Load dataset from a CSV file.
Args:
csv_path: Path to CSV file
image_column: Name of column containing image paths
Returns:
A dataset instance
"""
print(f"Loading from CSV: {csv_path}, column: {image_column}")
# For simplicity, return our simple dataset
return SimpleImageDataset(path="dummy_path")
Key Points:
- Your dataset inherits from ainxt.data.Dataset
- It must implement __len__ and __getitem__ methods
- You can create datasets as classes OR functions
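Before wiring the class into a factory, you can sanity-check it directly (using the dummy-data implementation above, so any path works):

ds = SimpleImageDataset(path="/tmp/images", image_size=64)
print(len(ds))             # 100
image, label = ds[0]
print(image.shape, label)  # (64, 64, 3) 0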
Step 2: Create Your First Model
Now let's create a simple model:
# my_vision/models/classification.py
from ainxt.models import Model
import numpy as np
from typing import Any, Dict
class SimpleClassifier(Model):
"""A simple classifier for demonstration."""
def __init__(self, num_classes: int = 10, input_size: int = 224):
"""
Initialize the classifier.
Args:
num_classes: Number of output classes
input_size: Expected input image size
"""
self.num_classes = num_classes
self.input_size = input_size
# In real code, initialize your neural network here
print(f"Created classifier: {num_classes} classes, {input_size}x{input_size} input")
def predict(self, image: np.ndarray) -> Dict[str, Any]:
"""
Make a prediction on an image.
Args:
image: Input image as numpy array
Returns:
Dictionary containing predictions
"""
# Dummy prediction - in real code, run through neural network
probs = np.random.rand(self.num_classes)
probs = probs / probs.sum() # Normalize to sum to 1
return {
"probabilities": probs,
"class": int(np.argmax(probs)),
"confidence": float(np.max(probs))
}
# Function constructor alternative
def create_resnet(num_classes: int = 1000, pretrained: bool = False) -> Model:
"""
Create a ResNet model.
Args:
num_classes: Number of output classes
pretrained: Whether to load pretrained weights
Returns:
A ResNet model instance
"""
print(f"Creating ResNet: {num_classes} classes, pretrained={pretrained}")
# For demo, return simple classifier
return SimpleClassifier(num_classes=num_classes)
Key Points:
- Models inherit from ainxt.models.Model
- Must implement the predict method
- Can be defined as classes or factory functions
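As with the dataset, you can try the model standalone before involving any factory (predictions are random for now, per the dummy implementation above):

import numpy as np

model = SimpleClassifier(num_classes=10)
result = model.predict(np.random.rand(224, 224, 3))
print(result["class"], f"{result['confidence']:.2f}")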
Step 3: Create Your First Parser
Parsers transform configuration into Python objects. Let's create an augmentation parser:
# my_vision/parsers/augmentation.py
from ainxt.factory import Factory
from typing import Any
import numpy as np
# First, create the augmentation functions/classes
class RandomFlip:
"""Randomly flip images horizontally."""
def __init__(self, probability: float = 0.5):
"""
Args:
probability: Probability of flipping (0.0 to 1.0)
"""
self.probability = probability
print(f"RandomFlip created with p={probability}")
def __call__(self, image: np.ndarray) -> np.ndarray:
"""Apply the augmentation."""
if np.random.random() < self.probability:
return np.fliplr(image)
return image
class RandomRotation:
"""Randomly rotate images."""
def __init__(self, degrees: float = 15.0):
"""
Args:
degrees: Maximum rotation in degrees
"""
self.degrees = degrees
print(f"RandomRotation created with degrees={degrees}")
def __call__(self, image: np.ndarray) -> np.ndarray:
"""Apply the augmentation."""
# Simplified - in real code, use proper rotation
angle = np.random.uniform(-self.degrees, self.degrees)
print(f"Would rotate by {angle} degrees")
return image
# Now create the parser Factory
AUGMENTERS = Factory()
# Register augmentation classes
# Format: (task, name) -> constructor
AUGMENTERS.register(task="image", name="flip", constructor=RandomFlip)
AUGMENTERS.register(task="image", name="rotate", constructor=RandomRotation)
# You can also register for any task (None = wildcard).
# Note: the registered constructor must *return* the augmenter, so wrap
# the identity function in a zero-argument factory:
AUGMENTERS.register(task=None, name="identity", constructor=lambda: (lambda image: image))
print("Augmentation parsers registered!")
Key Points:
- Create a Factory() object to hold your parsers
- Register constructors with (task, name) pairs
- The Factory will use these to transform configuration
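To make the parsing step concrete, here is a hand-rolled equivalent of what happens when the factory meets a config dict. This is an illustration of the (task, name) -> constructor idea only, not the aiNXT API:

config = {"task": "image", "name": "flip", "probability": 0.5}
registry = {
    ("image", "flip"): RandomFlip,
    ("image", "rotate"): RandomRotation,
}
# Pop the lookup keys, pass the remaining entries as constructor kwargs
constructor = registry[(config.pop("task"), config.pop("name"))]
augmenter = constructor(**config)  # RandomFlip(probability=0.5)
flipped = augmenter(np.random.rand(224, 224, 3))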
Step 4: Create a Loader
Loaders automatically find components in your modules:
# my_vision/serving/singletons.py
from ainxt.factory import Factory, Loader
from ainxt.data import Dataset
from ainxt.models import Model
# Define tasks your project supports
TASKS = ["classification", "detection"]
# Create loaders for datasets
dataset_loader = Loader(
template="my_vision.data.datasets.{task}", # Where to look
tasks=TASKS, # Which tasks
kind=Dataset, # What type to find
strict=True # Only from this module
)
# Create loaders for models
model_loader = Loader(
template="my_vision.models.{task}",
tasks=TASKS,
kind=Model,
strict=True
)
# Create factories from loaders
DATASETS = Factory(dataset_loader)
MODELS = Factory(model_loader)
# Import and register parsers
from my_vision.parsers.augmentation import AUGMENTERS
# Create parsers dictionary
PARSERS = {
"augmentation": AUGMENTERS,
"augmenter": AUGMENTERS, # Alternative name
}
print("Singletons initialized!")
Key Points:
- Loaders use templates with {task} placeholders
- They automatically scan modules for matching types
- Factories combine loaders with additional functionality
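To demystify the scanning step, here is a minimal hand-rolled sketch of the idea — an illustration only, not the actual aiNXT implementation:

import importlib
import inspect

def scan(template: str, tasks: list, kind: type) -> dict:
    """Map (task, lowercased class name) to each subclass of `kind` found."""
    found = {}
    for task in tasks:
        module = importlib.import_module(template.format(task=task))
        for name, obj in inspect.getmembers(module, inspect.isclass):
            if issubclass(obj, kind) and obj is not kind:
                found[(task, name.lower())] = obj
    return found

# scan("my_vision.data.datasets.{task}", ["classification"], Dataset)
# -> {("classification", "simpleimagedataset"): SimpleImageDataset}
# strict=True in the real Loader additionally filters out classes that were
# merely imported into a module rather than defined there.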
Step 5: Create the Context
The Context ties everything together:
# context.py
from ainxt.scripts.context import Context
from ainxt.serving.serialization import ainxtJSONEncoder, ainxtJSONDecoder
from my_vision.serving.singletons import DATASETS, MODELS, PARSERS
import numpy as np
# For this example, we'll use numpy arrays as our data type
DataType = np.ndarray
# Create the context
CONTEXT = Context[DataType](
encoder=ainxtJSONEncoder(), # For serialization
decoder=ainxtJSONDecoder(), # For deserialization
dataset_builder=DATASETS, # Our dataset factory
model_builder=MODELS, # Our model factory
parsers=PARSERS # Our parsers
)
print("Context created successfully!")
# Helper functions for ease of use
def load_dataset(config):
"""Load a dataset from configuration."""
return CONTEXT.load_dataset(config)
def load_model(config):
"""Load a model from configuration."""
return CONTEXT.load_model(config)
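With the Context in place, loading a component takes one call. For example, using the same shape of config dict you'll write in Step 6:

# Build a model directly from a config dict
model = load_model({
    "task": "classification",
    "name": "simpleclassifier",
    "num_classes": 10,
    "input_size": 224,
})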
Step 6: Create a Configuration File
Now let's create a configuration that uses all our components:
# config/experiments/simple_classifier.yaml
# Dataset configuration
dataset:
task: classification
name: simpleimagedataset # Lowercase class name
path: /path/to/images
image_size: 224
# This triggers our augmentation parser!
augmentation:
task: image
name: flip
probability: 0.5
# Model configuration
model:
task: classification
name: simpleclassifier
num_classes: 10
input_size: 224
# Training configuration
training:
epochs: 10
batch_size: 32
learning_rate: 0.001
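It can help to peek at what the Context's parsers actually receive. Loading the file with PyYAML shows that, before parsing, the augmentation entry is still a plain dictionary:

import yaml

with open("config/experiments/simple_classifier.yaml") as f:
    config = yaml.safe_load(f)

# The AUGMENTERS parser will later turn this dict into a RandomFlip instance
print(config["dataset"]["augmentation"])
# {'task': 'image', 'name': 'flip', 'probability': 0.5}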
Step 7: Put It All Together
Let's create a training script that uses everything:
# train.py
import yaml
from context import CONTEXT
def main():
"""Main training function."""
# Load configuration
with open("config/experiments/simple_classifier.yaml", "r") as f:
config = yaml.safe_load(f)
print("\n=== Loading Dataset ===")
# The Context handles everything:
# 1. Parses the augmentation config into an object
# 2. Finds the right dataset class using the loader
# 3. Creates the dataset with parsed configuration
dataset = CONTEXT.load_dataset(config["dataset"])
print(f"Dataset loaded: {len(dataset)} samples")
print("\n=== Loading Model ===")
model = CONTEXT.load_model(config["model"])
print("Model loaded successfully")
print("\n=== Training ===")
# Simple training loop
for epoch in range(config["training"]["epochs"]):
print(f"Epoch {epoch + 1}/{config['training']['epochs']}")
# Get a sample from dataset
image, label = dataset[0]
# Make prediction
prediction = model.predict(image)
print(f" Predicted class: {prediction['class']}, "
f"confidence: {prediction['confidence']:.2f}")
print("\nTraining complete!")
if __name__ == "__main__":
main()
Step 8: Run Your System
To run the complete system, execute the training script from the project root:
python train.py
Expected output:
Augmentation parsers registered!
Singletons initialized!
Context created successfully!
=== Loading Dataset ===
RandomFlip created with p=0.5
Loading images from /path/to/images
Dataset loaded: 100 samples
=== Loading Model ===
Created classifier: 10 classes, 224x224 input
Model loaded successfully
=== Training ===
Epoch 1/10
Predicted class: 3, confidence: 0.18
...
Training complete!
Advanced: Adding Dataset Decorators
Let's add a decorator that normalizes images:
# my_vision/data/decorators.py
from ainxt.factory import Factory
import numpy as np
def normalize(dataset, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """
    Decorator that normalizes dataset images.
    Args:
        dataset: The dataset to wrap
        mean: Per-channel mean values for normalization
        std: Per-channel standard deviation values
    Returns:
        Wrapped dataset with normalization
    """
    print(f"Adding normalization: mean={mean}, std={std}")
    mean = np.asarray(mean)
    std = np.asarray(std)

    class NormalizedDataset:
        """Wraps the dataset and normalizes each image on access.

        (In a real project you might subclass ainxt.data.Dataset here.)
        """

        def __len__(self):
            return len(dataset)

        def __getitem__(self, idx):
            image, label = dataset[idx]
            return (image - mean) / std, label

    # Note: assigning to dataset.__getitem__ on the instance would NOT work,
    # because Python looks up special methods on the type, not the instance.
    # Wrapping the dataset is the safe way to intercept item access.
    return NormalizedDataset()
# Register as decorator
DECORATORS = Factory()
DECORATORS.register(task=None, name="normalize", constructor=normalize)
Update your singletons:
# my_vision/serving/singletons.py (addition)
from my_vision.data.decorators import DECORATORS
# Add decorators to dataset factory
DATASETS.register_decorator(DECORATORS)
Now you can use it in config:
dataset:
task: classification
name: simpleimagedataset
path: /path/to/images
# This triggers the decorator!
normalize:
mean: [0.5, 0.5, 0.5]
std: [0.5, 0.5, 0.5]
Troubleshooting Common Issues
Issue 1: "No constructor found"
Error: KeyError: No constructor found for (classification, mymodel)
Solution:
- Check that your class/function name matches
- Ensure the module path in the Loader template is correct
- Verify the task name matches
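A quick check for the naming rule — loaders register classes under their lowercased names, and the config name must match that key exactly. If your class is, say, MyModel:

print("MyModel".lower())  # "mymodel" — use this string in the config "name" field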
Issue 2: "Module not found"
Error: ModuleNotFoundError: No module named 'my_vision'
Solution:
- Add your project root to Python path
- Install your package: pip install -e .
- Check import statements
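If you don't want to install the package while iterating, one common workaround (a sketch, assuming train.py lives at the project root as in the setup above) is to extend sys.path:

# Top of train.py: make my_vision importable without installing the package
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parent))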
Issue 3: Parser not working
Symptom: Configuration stays as dictionary instead of becoming object
Solution:
- Ensure the parser key in the config matches the registered parser name
- Check that the value is a dictionary (not a string or list)
- Verify the parser is registered in the PARSERS dictionary
Issue 4: Loader not finding classes
Symptom: Factory is empty even though classes exist
Solution:
- Check that classes inherit from the correct base (Dataset, Model)
- Ensure strict=False if importing from other modules
- Verify the module path in template is correct
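To verify the inheritance requirement, a two-line check in a Python shell suffices:

from ainxt.data import Dataset
from my_vision.data.datasets.classification import SimpleImageDataset

# The loader only collects subclasses of the requested kind
print(issubclass(SimpleImageDataset, Dataset))  # should print True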
Best Practices
1. Naming Conventions
# Use clear, descriptive names
class ImageClassificationDataset(Dataset):  # Good
    ...

class ICD(Dataset):  # Too abbreviated
    ...
# Loaders use lowercase by default
# "ImageClassificationDataset" becomes "imageclassificationdataset"
# Use ignore_suffixes to clean this up:
loader = Loader(
template="...",
ignore_suffixes=["Dataset", "Model"]
)
# Now "ImageClassificationDataset" becomes "imageclassification"
2. Organization
# Group related components
my_project/
├── data/
│   ├── datasets/        # All datasets here
│   └── augmentation/    # All augmentation here
├── models/
│   ├── classification/  # Task-specific models
│   └── detection/
└── parsers/             # All parsers in one place
3. Configuration Structure
# Use clear hierarchy
experiment:
data:
dataset: ...
preprocessing: ...
model:
architecture: ...
hyperparameters: ...
training:
optimizer: ...
scheduler: ...
4. Type Hints
Always use type hints for clarity:
def create_model(
num_classes: int,
pretrained: bool = False
) -> Model: # Clear return type
"""Create a model."""
pass
Next Steps
You've now built a complete aiNXT system! Here's what you can explore next:
- Add more components: Create additional datasets, models, and parsers
- Use existing components: Integrate with torchvision, scikit-learn, etc.
- Create custom metrics: Build evaluation metrics as parsers
- Add visualization: Create visualization parsers for plotting
- Build a CLI: Use your Context in command-line tools
Summary
In this tutorial, you learned how to:
- ✅ Create datasets and models following aiNXT patterns
- ✅ Build parsers that transform configuration into objects
- ✅ Set up loaders that automatically discover components
- ✅ Create factories that manage object creation
- ✅ Use Context to orchestrate everything
- ✅ Write configuration files that leverage the system
- ✅ Debug common issues
The power of this system is that once set up, you can:
- Define new experiments entirely in YAML
- Share configurations between projects
- Automatically discover new components
- Transform any configuration key into a Python object
Happy building with aiNXT!