Tutorial: Creating Custom Components with aiNXT

Introduction

In this hands-on tutorial, we'll build a complete image classification system from scratch using aiNXT's Factory system. You'll learn how to:

  1. Create custom parsers for configuration handling
  2. Set up loaders to discover your components
  3. Build factories that tie everything together
  4. Use Context to orchestrate the system

Time required: 30-45 minutes
Prerequisites: Basic Python knowledge (classes, functions, dictionaries)


Project Setup

Let's create a simple project structure:

my_vision_project/
├── config/
│   └── experiments/
│       └── simple_classifier.yaml
├── my_vision/
│   ├── __init__.py
│   ├── data/
│   │   ├── __init__.py
│   │   └── datasets/
│   │       ├── __init__.py
│   │       └── classification.py
│   ├── models/
│   │   ├── __init__.py
│   │   └── classification.py
│   ├── parsers/
│   │   ├── __init__.py
│   │   └── augmentation.py
│   └── serving/
│       ├── __init__.py
│       └── singletons.py
├── context.py
└── train.py

Step 1: Create Your First Dataset

Let's start with a simple dataset class:

# my_vision/data/datasets/classification.py

from ainxt.data import Dataset
from typing import List, Tuple
import numpy as np

class SimpleImageDataset(Dataset):
    """A simple dataset that loads images from a folder."""

    def __init__(self, path: str, image_size: int = 224):
        """
        Initialize the dataset.

        Args:
            path: Path to the folder containing images
            image_size: Size to resize images to (square)
        """
        self.path = path
        self.image_size = image_size
        self.images = self._load_images()

    def _load_images(self) -> List[np.ndarray]:
        """Load images from the path."""
        # For this tutorial, we'll create dummy data
        # In real code, you'd load actual images here
        print(f"Loading images from {self.path}")
        # Create 100 dummy images
        return [np.random.rand(self.image_size, self.image_size, 3) for _ in range(100)]

    def __len__(self) -> int:
        """Return the number of images in the dataset."""
        return len(self.images)

    def __getitem__(self, idx: int) -> Tuple[np.ndarray, int]:
        """
        Get an image and its label.

        Args:
            idx: Index of the image to retrieve

        Returns:
            Tuple of (image, label)
        """
        # Return image and dummy label
        return self.images[idx], idx % 10  # 10 classes

# Also create a function constructor (alternative to class)
def load_from_csv(csv_path: str, image_column: str = "path") -> Dataset:
    """
    Load dataset from a CSV file.

    Args:
        csv_path: Path to CSV file
        image_column: Name of column containing image paths

    Returns:
        A dataset instance
    """
    print(f"Loading from CSV: {csv_path}, column: {image_column}")
    # For simplicity, return our simple dataset
    return SimpleImageDataset(path="dummy_path")

Key Points:
  - Your dataset inherits from ainxt.data.Dataset
  - It must implement the __len__ and __getitem__ methods
  - You can create datasets as classes OR functions (see the quick usage check below)
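
To sanity-check the dataset outside the factory system, you can instantiate it directly. The path below is a placeholder; the tutorial class only generates dummy data anyway:

from my_vision.data.datasets.classification import SimpleImageDataset

dataset = SimpleImageDataset(path="/path/to/images", image_size=224)
print(len(dataset))        # 100 dummy images
image, label = dataset[0]
print(image.shape, label)  # (224, 224, 3) 0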


Step 2: Create Your First Model

Now let's create a simple model:

# my_vision/models/classification.py

from ainxt.models import Model
import numpy as np
from typing import Any, Dict

class SimpleClassifier(Model):
    """A simple classifier for demonstration."""

    def __init__(self, num_classes: int = 10, input_size: int = 224):
        """
        Initialize the classifier.

        Args:
            num_classes: Number of output classes
            input_size: Expected input image size
        """
        self.num_classes = num_classes
        self.input_size = input_size
        # In real code, initialize your neural network here
        print(f"Created classifier: {num_classes} classes, {input_size}x{input_size} input")

    def predict(self, image: np.ndarray) -> Dict[str, Any]:
        """
        Make a prediction on an image.

        Args:
            image: Input image as numpy array

        Returns:
            Dictionary containing predictions
        """
        # Dummy prediction - in real code, run through neural network
        probs = np.random.rand(self.num_classes)
        probs = probs / probs.sum()  # Normalize to sum to 1

        return {
            "probabilities": probs,
            "class": int(np.argmax(probs)),
            "confidence": float(np.max(probs))
        }

# Function constructor alternative
def create_resnet(num_classes: int = 1000, pretrained: bool = False) -> Model:
    """
    Create a ResNet model.

    Args:
        num_classes: Number of output classes
        pretrained: Whether to load pretrained weights

    Returns:
        A ResNet model instance
    """
    print(f"Creating ResNet: {num_classes} classes, pretrained={pretrained}")
    # For demo, return simple classifier
    return SimpleClassifier(num_classes=num_classes)

Key Points:
  - Models inherit from ainxt.models.Model
  - They must implement the predict method
  - They can be defined as classes or factory functions (a quick standalone check follows)
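
As with the dataset, you can try the model on its own before wiring it into a factory:

import numpy as np
from my_vision.models.classification import SimpleClassifier

model = SimpleClassifier(num_classes=10, input_size=224)
prediction = model.predict(np.random.rand(224, 224, 3))
print(prediction["class"], round(prediction["confidence"], 2))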


Step 3: Create Your First Parser

Parsers transform configuration into Python objects. Let's create an augmentation parser:

# my_vision/parsers/augmentation.py

from ainxt.factory import Factory
from typing import Any
import numpy as np

# First, create the augmentation functions/classes

class RandomFlip:
    """Randomly flip images horizontally."""

    def __init__(self, probability: float = 0.5):
        """
        Args:
            probability: Probability of flipping (0.0 to 1.0)
        """
        self.probability = probability
        print(f"RandomFlip created with p={probability}")

    def __call__(self, image: np.ndarray) -> np.ndarray:
        """Apply the augmentation."""
        if np.random.random() < self.probability:
            return np.fliplr(image)
        return image

class RandomRotation:
    """Randomly rotate images."""

    def __init__(self, degrees: float = 15.0):
        """
        Args:
            degrees: Maximum rotation in degrees
        """
        self.degrees = degrees
        print(f"RandomRotation created with degrees={degrees}")

    def __call__(self, image: np.ndarray) -> np.ndarray:
        """Apply the augmentation."""
        # Simplified - in real code, use proper rotation
        angle = np.random.uniform(-self.degrees, self.degrees)
        print(f"Would rotate by {angle} degrees")
        return image

# Now create the parser Factory

AUGMENTERS = Factory()

# Register augmentation classes
# Format: (task, name) -> constructor
AUGMENTERS.register(task="image", name="flip", constructor=RandomFlip)
AUGMENTERS.register(task="image", name="rotate", constructor=RandomRotation)

# You can also register for any task (None = wildcard)
# The constructor is what builds the augmenter, so it returns a pass-through callable
AUGMENTERS.register(task=None, name="identity", constructor=lambda: (lambda image: image))

print("Augmentation parsers registered!")

Key Points:
  - Create a Factory() object to hold your parsers
  - Register constructors with (task, name) pairs
  - The Factory uses these to transform configuration into objects (a simplified sketch of that lookup follows)
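
Conceptually, the factory turns a configuration dictionary into an object by looking up the constructor registered under (task, name) and calling it with the remaining keys. The snippet below is only an illustration of that idea, not aiNXT's actual Factory internals:

# Simplified illustration of (task, name) -> constructor lookup (not the real aiNXT code)
registry = {("image", "flip"): RandomFlip, ("image", "rotate"): RandomRotation}

def build(config: dict):
    config = dict(config)                          # don't mutate the caller's dict
    key = (config.pop("task"), config.pop("name"))
    constructor = registry[key]
    return constructor(**config)                   # remaining keys become keyword arguments

augmenter = build({"task": "image", "name": "flip", "probability": 0.5})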


Step 4: Create a Loader

Loaders automatically find components in your modules:

# my_vision/serving/singletons.py

from ainxt.factory import Factory, Loader
from ainxt.data import Dataset
from ainxt.models import Model
from ainxt.serving import (
    create_dataset_factory,
    create_model_factory,
    create_parsers
)

# Define tasks your project supports
TASKS = ["classification", "detection"]

# Create loaders for datasets
dataset_loader = Loader(
    template="my_vision.data.datasets.{task}",  # Where to look
    tasks=TASKS,                                 # Which tasks
    kind=Dataset,                                 # What type to find
    strict=True                                   # Only from this module
)

# Create loaders for models
model_loader = Loader(
    template="my_vision.models.{task}",
    tasks=TASKS,
    kind=Model,
    strict=True
)

# Create factories from loaders
DATASETS = Factory(dataset_loader)
MODELS = Factory(model_loader)

# Import and register parsers
from my_vision.parsers.augmentation import AUGMENTERS

# Create parsers dictionary
PARSERS = {
    "augmentation": AUGMENTERS,
    "augmenter": AUGMENTERS,  # Alternative name
}

print("Singletons initialized!")

Key Points:
  - Loaders use templates with {task} placeholders
  - They automatically scan modules for matching types
  - Factories combine loaders with additional functionality (a simplified sketch of the discovery step follows)
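
To make the template mechanism concrete, here is roughly what a loader does for each task: fill in the module path, import it, and collect the objects of the requested kind. This is a simplified sketch of the idea (it ignores options such as strict and ignore_suffixes), not the actual aiNXT Loader:

# Simplified sketch of loader-style discovery (not the real aiNXT Loader)
import importlib
import inspect

def discover(template: str, task: str, kind: type) -> dict:
    module = importlib.import_module(template.format(task=task))
    found = {}
    for name, obj in vars(module).items():
        if inspect.isclass(obj) and issubclass(obj, kind) and obj is not kind:
            found[(task, name.lower())] = obj
    return found

# discover("my_vision.data.datasets.{task}", "classification", Dataset)
# -> {("classification", "simpleimagedataset"): SimpleImageDataset}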


Step 5: Create the Context

The Context ties everything together:

# context.py

from ainxt.scripts.context import Context
from ainxt.serving.serialization import ainxtJSONEncoder, ainxtJSONDecoder
from my_vision.serving.singletons import DATASETS, MODELS, PARSERS
import numpy as np

# For this example, we'll use numpy arrays as our data type
DataType = np.ndarray

# Create the context
CONTEXT = Context[DataType](
    encoder=ainxtJSONEncoder(),      # For serialization
    decoder=ainxtJSONDecoder(),      # For deserialization
    dataset_builder=DATASETS,        # Our dataset factory
    model_builder=MODELS,            # Our model factory
    parsers=PARSERS                  # Our parsers
)

print("Context created successfully!")

# Helper functions for ease of use
def load_dataset(config):
    """Load a dataset from configuration."""
    return CONTEXT.load_dataset(config)

def load_model(config):
    """Load a model from configuration."""
    return CONTEXT.load_model(config)

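With the context in place you can already load components from plain dictionaries, without a YAML file. A quick smoke test (hypothetical script name; it assumes my_vision is importable, e.g. installed with pip install -e .):

# quick_check.py -- hypothetical helper script, run from the project root
from context import load_dataset, load_model

dataset = load_dataset({
    "task": "classification",
    "name": "simpleimagedataset",
    "path": "/path/to/images",
})
model = load_model({"task": "classification", "name": "simpleclassifier", "num_classes": 10})
print(len(dataset), model.predict(dataset[0][0])["class"])
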
Step 6: Create a Configuration File

Now let's create a configuration that uses all our components:

# config/experiments/simple_classifier.yaml

# Dataset configuration
dataset:
  task: classification
  name: simpleimagedataset  # Lowercase class name
  path: /path/to/images
  image_size: 224
  # This triggers our augmentation parser!
  augmentation:
    task: image
    name: flip
    probability: 0.5

# Model configuration
model:
  task: classification
  name: simpleclassifier
  num_classes: 10
  input_size: 224

# Training configuration
training:
  epochs: 10
  batch_size: 32
  learning_rate: 0.001

Step 7: Put It All Together

Let's create a training script that uses everything:

# train.py

import yaml
from context import CONTEXT

def main():
    """Main training function."""

    # Load configuration
    with open("config/experiments/simple_classifier.yaml", "r") as f:
        config = yaml.safe_load(f)

    print("\n=== Loading Dataset ===")
    # The Context handles everything:
    # 1. Parses the augmentation config into an object
    # 2. Finds the right dataset class using the loader
    # 3. Creates the dataset with parsed configuration
    dataset = CONTEXT.load_dataset(config["dataset"])
    print(f"Dataset loaded: {len(dataset)} samples")

    print("\n=== Loading Model ===")
    model = CONTEXT.load_model(config["model"])
    print("Model loaded successfully")

    print("\n=== Training ===")
    # Simple demonstration loop (no real training happens with these dummy components)
    for epoch in range(config["training"]["epochs"]):
        print(f"Epoch {epoch + 1}/{config['training']['epochs']}")

        # Get a sample from dataset
        image, label = dataset[0]

        # Make prediction
        prediction = model.predict(image)
        print(f"  Predicted class: {prediction['class']}, "
              f"confidence: {prediction['confidence']:.2f}")

    print("\nTraining complete!")

if __name__ == "__main__":
    main()

Step 8: Run Your System

To run the complete system:

# From your project root
python train.py

Expected output:

Augmentation parsers registered!
Singletons initialized!
Context created successfully!

=== Loading Dataset ===
RandomFlip created with p=0.5
Loading images from /path/to/images
Dataset loaded: 100 samples

=== Loading Model ===
Created classifier: 10 classes, 224x224 input
Model loaded successfully

=== Training ===
Epoch 1/10
  Predicted class: 3, confidence: 0.18
...
Training complete!


Advanced: Adding Dataset Decorators

Let's add a decorator that normalizes images:

# my_vision/data/decorators.py

from ainxt.factory import Factory
import numpy as np

def normalize(dataset, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """
    Decorator that normalizes dataset images.

    Args:
        dataset: The dataset to wrap
        mean: Per-channel mean values for normalization
        std: Per-channel standard deviation values

    Returns:
        Wrapped dataset that applies normalization in __getitem__
    """
    print(f"Adding normalization: mean={mean}, std={std}")

    mean = np.asarray(mean)
    std = np.asarray(std)

    # Note: simply assigning dataset.__getitem__ on the instance would NOT work,
    # because Python looks up dunder methods on the class, not the instance.
    # Instead, wrap the dataset in a thin object that delegates to it.
    # (For simplicity the wrapper just duck-types __len__/__getitem__; in a real
    # project you may want it to derive from ainxt.data.Dataset.)
    class NormalizedDataset:
        """Applies (image - mean) / std on top of the wrapped dataset."""

        def __len__(self):
            return len(dataset)

        def __getitem__(self, idx):
            image, label = dataset[idx]
            return (image - mean) / std, label

    return NormalizedDataset()

# Register as decorator
DECORATORS = Factory()
DECORATORS.register(task=None, name="normalize", constructor=normalize)
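
You can check the decorator by hand before wiring it into the factory (dummy path, as in Step 1):

from my_vision.data.datasets.classification import SimpleImageDataset
from my_vision.data.decorators import normalize

wrapped = normalize(SimpleImageDataset(path="dummy_path"), mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
image, label = wrapped[0]
print(image.mean())  # roughly 0 for the dummy uniform data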

Update your singletons:

# my_vision/serving/singletons.py (addition)

from my_vision.data.decorators import DECORATORS

# Add decorators to dataset factory
DATASETS.register_decorator(DECORATORS)

Now you can use it in config:

dataset:
  task: classification
  name: simpleimagedataset
  path: /path/to/images
  # This triggers the decorator!
  normalize:
    mean: [0.5, 0.5, 0.5]
    std: [0.5, 0.5, 0.5]
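
Assuming the factory forwards the decorator's config keys as keyword arguments, the configuration above is roughly equivalent to:

# Rough equivalent of the YAML above (illustrative only)
dataset = SimpleImageDataset(path="/path/to/images")
dataset = normalize(dataset, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])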

Troubleshooting Common Issues

Issue 1: "No constructor found"

Error: KeyError: No constructor found for (classification, mymodel)

Solution:
  - Check that your class/function name matches the name in the configuration (lowercased by default)
  - Ensure the module path in the Loader template is correct
  - Verify the task name matches

Issue 2: "Module not found"

Error: ModuleNotFoundError: No module named 'my_vision'

Solution:
  - Add your project root to the Python path
  - Install your package: pip install -e .
  - Check your import statements

Issue 3: Parser not working

Symptom: Configuration stays as dictionary instead of becoming object

Solution:
  - Ensure the parser key in the config matches a registered parser name
  - Check that the value is a dictionary (not a string or list)
  - Verify the parser is registered in the PARSERS dictionary

Issue 4: Loader not finding classes

Symptom: Factory is empty even though classes exist

Solution:
  - Check that classes inherit from the correct base (Dataset, Model)
  - Use strict=False if components are imported from other modules
  - Verify the module path in the template is correct


Best Practices

1. Naming Conventions

# Use clear, descriptive names
class ImageClassificationDataset(Dataset):  # Good: clear and descriptive
    ...

class ICD(Dataset):  # Too abbreviated
    ...

# Loaders use lowercase by default
# "ImageClassificationDataset" becomes "imageclassificationdataset"
# Use ignore_suffixes to clean this up:
loader = Loader(
    template="...",
    ignore_suffixes=["Dataset", "Model"]
)
# Now "ImageClassificationDataset" becomes "imageclassification"
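
With the suffix stripped, configuration files can refer to the shorter name:

dataset:
  task: classification
  name: imageclassification   # instead of imageclassificationdataset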

2. Organization

# Group related components
my_project/
  data/
    datasets/      # All datasets here
    augmentation/  # All augmentation here
  models/
    classification/  # Task-specific models
    detection/
  parsers/        # All parsers in one place

3. Configuration Structure

# Use clear hierarchy
experiment:
  data:
    dataset: ...
    preprocessing: ...
  model:
    architecture: ...
    hyperparameters: ...
  training:
    optimizer: ...
    scheduler: ...

4. Type Hints

Always use type hints for clarity:

def create_model(
    num_classes: int,
    pretrained: bool = False
) -> Model:  # Clear return type
    """Create a model."""
    pass

Next Steps

You've now built a complete aiNXT system! Here's what you can explore next:

  1. Add more components: Create additional datasets, models, and parsers
  2. Use existing components: Integrate with torchvision, scikit-learn, etc.
  3. Create custom metrics: Build evaluation metrics as parsers
  4. Add visualization: Create visualization parsers for plotting
  5. Build a CLI: Use your Context in command-line tools

Summary

In this tutorial, you learned how to:
  - ✅ Create datasets and models following aiNXT patterns
  - ✅ Build parsers that transform configuration into objects
  - ✅ Set up loaders that automatically discover components
  - ✅ Create factories that manage object creation
  - ✅ Use Context to orchestrate everything
  - ✅ Write configuration files that leverage the system
  - ✅ Debug common issues

The power of this system is that once set up, you can:
  - Define new experiments entirely in YAML
  - Share configurations between projects
  - Automatically discover new components
  - Transform any configuration key into a Python object

Happy building with aiNXT!