Dataset Augmenters

Overview

Augmenters are data augmentation components that transform data on-the-fly during training. They work with aiNXT's Dataset system through the AugmentedDataset decorator (see Dataset Decorators).

Real-world analogy: An Augmenter is like a photo filter:

  - Input: Original image
  - Processing: Apply transformation (flip, rotate, blur)
  - Output: Modified image
  - Callable: Works like a function but configured like an object

Augmenters are classes that act like functions, providing flexible, stateful data transformations.


What ARE Augmenters?

Answer: Classes That Act Like Functions

Augmenters are classes that inherit from the Augmenter base class (ainxt/data/augmentation/augmenter.py:11). They have a special __call__ method which makes instances callable like functions.

Key characteristic: You create an instance once with configuration, then call it like a function many times.

# Create augmenter instance (configure once)
augmenter = FlipAugmenter(direction="horizontal")

# Call it like a function (use many times)
augmented_image1 = augmenter(image1)
augmented_image2 = augmenter(image2)
augmented_image3 = augmenter(image3)

The Augmenter Interface

Base Class: Three Levels of Augmentation

The Augmenter base class defines three levels of augmentation:

from abc import ABC, abstractmethod
from collections.abc import Sequence

from ainxt.data import Instance

class Augmenter(ABC):
    """Base class for all augmentation logic.

    Provides three levels of augmentation:
    1. augment_raw: Augment only raw data (no annotation changes)
    2. augment_instance: Augment data + annotations
    3. augment_batch: Augment with awareness of other items in batch
    """

    def __call__(self, x):
        """Automatically routes to correct method based on input type.

        - If x is raw data → calls augment_raw(x)
        - If x is Instance → calls augment_instance(x)
        - If x is Sequence[Instance] → calls augment_batch(x)
        """
        if isinstance(x, Instance):
            return self.augment_instance(x)
        if isinstance(x, Sequence):
            if not x:
                return x
            if isinstance(next(iter(x)), Instance):
                return self.augment_batch(x)
        return self.augment_raw(x)

    @abstractmethod
    def augment_raw(self, raw):
        """Augment raw data only (labels unchanged)."""
        raise NotImplementedError

    @abstractmethod
    def augment_instance(self, instance):
        """Augment Instance (data + annotations)."""
        raise NotImplementedError

    @abstractmethod
    def augment_batch(self, batch):
        """Augment batch (can use context from other instances)."""
        raise NotImplementedError

Why This Design?

  1. Automatic routing: Call augmenter(x) and it picks the right method (see the routing sketch after this list)
  2. Flexible input types: Works with raw data, Instances, or batches
  3. Annotation awareness: Can update labels when needed (e.g., bounding boxes after rotation)
  4. Batch context: Some augmentations need to see multiple examples (e.g., mixup, cutmix)
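
A minimal routing sketch, assuming the FlipAugmenter from Example 1 below and that Instance accepts the data/annotations/meta keywords used throughout this page; it only illustrates which method each input type reaches.

import numpy as np
from ainxt.data import Instance

augmenter = FlipAugmenter(direction="horizontal")  # defined in Example 1 below

image = np.zeros((4, 4, 3))
inst = Instance(data=image, annotations=[], meta={})

augmenter(image)         # raw array           -> augment_raw(image)
augmenter(inst)          # Instance            -> augment_instance(inst)
augmenter([inst, inst])  # Sequence[Instance]  -> augment_batch([inst, inst])

# Caveat: a list of raw arrays is a Sequence whose elements are not Instances,
# so __call__ falls through and passes the whole list to augment_raw.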

Creating Custom Augmenters

Example 1: Simple Image Flip

import numpy as np
from ainxt.data.augmentation import Augmenter
from ainxt.data import Instance

class FlipAugmenter(Augmenter):
    """Augmenter that flips images."""

    def __init__(self, direction="horizontal", probability=1.0):
        """Initialize flip augmenter.

        Args:
            direction: "horizontal" or "vertical"
            probability: Chance of applying flip (0.0 to 1.0)
        """
        self.direction = direction
        self.probability = probability

    def augment_raw(self, raw: np.ndarray) -> np.ndarray:
        """Flip raw image array.

        Args:
            raw: Image as numpy array (H, W, C)

        Returns:
            Flipped image
        """
        if np.random.random() > self.probability:
            return raw  # No flip

        if self.direction == "horizontal":
            return np.fliplr(raw)
        else:
            return np.flipud(raw)

    def augment_instance(self, instance: Instance) -> Instance:
        """Flip Instance (image + annotations).

        For classification, annotations don't change.
        For object detection, would need to flip bounding boxes.

        Args:
            instance: Instance with image data

        Returns:
            Instance with flipped image
        """
        # Flip the image
        flipped_data = self.augment_raw(instance.data)

        # Create new instance with flipped data
        # (annotations unchanged for classification)
        return Instance(
            data=flipped_data,
            annotations=instance.annotations,
            meta=instance.meta
        )

    def augment_batch(self, batch: list[Instance]) -> list[Instance]:
        """Flip each instance in batch independently.

        Args:
            batch: List of instances

        Returns:
            List of flipped instances
        """
        return [self.augment_instance(inst) for inst in batch]


# Usage
augmenter = FlipAugmenter(direction="horizontal", probability=0.5)

# On raw data
flipped_image = augmenter(image_array)

# On Instance
flipped_instance = augmenter(instance)

# On batch
flipped_batch = augmenter([inst1, inst2, inst3])

Example 2: Rotation with Annotation Updates

import numpy as np
from ainxt.data.augmentation import Augmenter
from ainxt.data import Instance
from scipy.ndimage import rotate

class RotateAugmenter(Augmenter):
    """Rotate images and update bounding box annotations."""

    def __init__(self, max_degrees=15, probability=1.0):
        self.max_degrees = max_degrees
        self.probability = probability

    def augment_raw(self, raw: np.ndarray) -> np.ndarray:
        """Rotate raw image."""
        if np.random.random() > self.probability:
            return raw

        angle = np.random.uniform(-self.max_degrees, self.max_degrees)
        return rotate(raw, angle, reshape=False, mode='constant')

    def augment_instance(self, instance: Instance) -> Instance:
        """Rotate image AND update bounding boxes.

        This is where augment_instance differs from augment_raw:
        we need to rotate the bounding box coordinates too!
        """
        if np.random.random() > self.probability:
            return instance

        # Get rotation angle
        angle = np.random.uniform(-self.max_degrees, self.max_degrees)

        # Rotate image
        rotated_data = rotate(instance.data, angle, reshape=False)

        # Rotate bounding boxes (if present)
        rotated_annotations = self._rotate_bboxes(
            instance.annotations,
            angle,
            instance.data.shape
        )

        return Instance(
            data=rotated_data,
            annotations=rotated_annotations,
            meta=instance.meta
        )

    def _rotate_bboxes(self, annotations, angle, image_shape):
        """Rotate bounding box coordinates."""
        # Implementation depends on annotation format
        # This is where annotation awareness is crucial!
        ...

    def augment_batch(self, batch: list[Instance]) -> list[Instance]:
        """Rotate each instance independently."""
        return [self.augment_instance(inst) for inst in batch]
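
The _rotate_bboxes body is left as ... above because it depends entirely on the annotation format. As a rough sketch, assuming axis-aligned boxes stored as [x_min, y_min, x_max, y_max] in a list of dicts (a hypothetical format, not necessarily aiNXT's), one approach is to rotate each corner around the image centre with the same angle as the image and re-fit a clipped axis-aligned box:

import numpy as np

def rotate_bboxes(annotations, angle, image_shape):
    """Hypothetical sketch: rotate [x_min, y_min, x_max, y_max] boxes around
    the image centre and re-fit axis-aligned boxes to the rotated corners."""
    h, w = image_shape[:2]
    cx, cy = w / 2.0, h / 2.0
    theta = np.deg2rad(angle)
    # NOTE: image coordinates have y pointing down, so verify the angle sign
    # against the actual image rotation before trusting this sketch.
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

    rotated = []
    for ann in annotations:
        x_min, y_min, x_max, y_max = ann["bbox"]
        corners = np.array([[x_min, y_min], [x_max, y_min],
                            [x_max, y_max], [x_min, y_max]], dtype=float)
        new_corners = (corners - [cx, cy]) @ rot.T + [cx, cy]
        new_box = [max(0.0, new_corners[:, 0].min()),
                   max(0.0, new_corners[:, 1].min()),
                   min(float(w), new_corners[:, 0].max()),
                   min(float(h), new_corners[:, 1].max())]
        rotated.append({**ann, "bbox": new_box})
    return rotated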

Example 3: Batch-Aware Augmentation (Mixup)

import numpy as np
from ainxt.data.augmentation import Augmenter
from ainxt.data import Instance

class MixupAugmenter(Augmenter):
    """Mixup augmentation: blend pairs of images.

    This REQUIRES batch context, so augment_batch is the key method.
    """

    def __init__(self, alpha=0.2):
        """Initialize Mixup.

        Args:
            alpha: Beta distribution parameter for mixing strength
        """
        self.alpha = alpha

    def augment_raw(self, raw: np.ndarray) -> np.ndarray:
        """Mixup doesn't work on single raw images."""
        return raw  # No-op

    def augment_instance(self, instance: Instance) -> Instance:
        """Mixup doesn't work on single instances."""
        return instance  # No-op

    def augment_batch(self, batch: list[Instance]) -> list[Instance]:
        """Mix pairs of instances in the batch.

        This is where Mixup actually works - it needs pairs!
        """
        if len(batch) < 2:
            return batch

        mixed_batch = []

        for i in range(0, len(batch) - 1, 2):
            inst1 = batch[i]
            inst2 = batch[i + 1]

            # Sample mixing coefficient
            lam = np.random.beta(self.alpha, self.alpha)

            # Mix images
            mixed_data = lam * inst1.data + (1 - lam) * inst2.data

            # Mix labels (for classification)
            # This requires special handling in loss function
            mixed_instance = Instance(
                data=mixed_data,
                annotations=inst1.annotations,  # Keep first label
                meta={
                    "mixup_lambda": lam,
                    "mixup_pair": (inst1.meta.get("id"), inst2.meta.get("id"))
                }
            )

            mixed_batch.append(mixed_instance)

        return mixed_batch
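
The mixup_lambda stored in meta is what the loss computation later consumes. Note that the example above keeps only the first label plus the pair ids, so the second label has to be looked up separately. As a hedged, framework-agnostic sketch (not an aiNXT API), the usual recipe is a convex combination of the losses against both original labels:

import numpy as np

def mixup_cross_entropy(probs, label_a, label_b, lam):
    """Sketch: cross-entropy against both source labels, weighted by lambda."""
    eps = 1e-12
    return -(lam * np.log(probs[label_a] + eps)
             + (1.0 - lam) * np.log(probs[label_b] + eps))

# Example: predicted probabilities for a mixed sample whose sources had
# labels 0 and 2, mixed with lambda = 0.6
loss = mixup_cross_entropy(np.array([0.7, 0.2, 0.1]), label_a=0, label_b=2, lam=0.6)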

Real-World Example: Albumentations Integration

From DigitalNXT.Vision/vision/data/augmentation/classification.py:31:

import albumentations as A
import numpy as np
from ainxt.data.augmentation import RawAugmenter

class ImageClassificationAugmenter(RawAugmenter):
    """Augmenter using Albumentations library for image classification.

    RawAugmenter is a convenience base class for augmentations that
    only modify raw data (not annotations).
    """

    def augment_raw(self, raw: np.ndarray) -> np.ndarray:
        """Apply Albumentations pipeline to image.

        Args:
            raw: Image as numpy array (H, W, C)

        Returns:
            Augmented image
        """
        # Get augmentation pipeline
        aug = self.get_augmentation_pipeline()

        # Apply augmentations
        return aug(image=raw)["image"]

    def get_augmentation_pipeline(self) -> A.Compose:
        """Define augmentation pipeline."""
        return A.Compose([
            A.GaussianBlur(blur_limit=(3, 5), p=0.2),
            A.Rotate(limit=(-10, 10), border_mode=4, p=0.4),
            A.GaussNoise(var_limit=(0.6, 6.5), mean=0.0, p=0.4),
            A.CoarseDropout(
                min_holes=1, max_holes=2,
                min_height=12, max_height=27,
                min_width=12, max_width=27,
                p=0.25
            ),
            A.HorizontalFlip(p=0.3),
            A.VerticalFlip(p=0.1),
            A.ToGray(p=0.2)
        ])


# Register with factory
from ainxt.serving import AUGMENTERS

AUGMENTERS.register(
    task=None,  # Works for any task
    name="image_classification",
    constructor=ImageClassificationAugmenter
)

# Use in configuration
"""
dataset:
  name: imagenet
  path: /data/imagenet
  augmenters:  # ← Triggers AugmentedDataset decorator
    - name: image_classification
"""

How Augmenters Work with Datasets

The AugmentedDataset Decorator

Augmenters are used through the AugmentedDataset decorator:

# Manual usage
from ainxt.data.datasets.decorators import AugmentedDataset

base_dataset = ImageNetDataset(path="/data")

# Create augmenters
flip = FlipAugmenter(direction="horizontal")
rotate = RotateAugmenter(max_degrees=15)

# Wrap dataset with augmenters
augmented_dataset = AugmentedDataset(
    dataset=base_dataset,
    augmenters=[flip, rotate]
)

# Now when you iterate, augmentations are applied automatically
for instance in augmented_dataset:
    # instance.data has been flipped AND rotated
    print(instance.data.shape)

Configuration-Based Usage

The power comes from YAML configuration:

dataset:
  name: imagenet
  path: /data/imagenet
  augmenters:  # ← This triggers AugmentedDataset decorator!
    - name: flip
      direction: horizontal
      probability: 0.5
    - name: rotate
      max_degrees: 15
      probability: 0.8
    - name: image_classification  # Albumentations pipeline

Behind the scenes:

  1. Parser (AUGMENTERS) creates augmenter instances from config
  2. Decorator (AugmentedDataset) wraps dataset with those instances
  3. Iteration applies augmentations on-the-fly

See Dataset Decorators for complete details on this two-phase process.
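
As a purely conceptual sketch (not the actual AugmentedDataset implementation), the decorator's behaviour boils down to applying each configured augmenter in order while iterating:

class ConceptualAugmentedDataset:
    """Illustration only: wraps a dataset and applies augmenters on iteration."""

    def __init__(self, dataset, augmenters):
        self.dataset = dataset
        self.augmenters = augmenters

    def __iter__(self):
        for instance in self.dataset:
            for augmenter in self.augmenters:
                # __call__ routes Instance objects to augment_instance
                instance = augmenter(instance)
            yield instance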


Why Classes Instead of Functions?

1. Stateful Configuration

# Class: Store configuration
augmenter = FlipAugmenter(direction="horizontal", probability=0.5)
img1 = augmenter(image1)  # Uses stored config
img2 = augmenter(image2)  # Same config

# Function: Would need to pass params each time
img1 = flip(image1, direction="horizontal", probability=0.5)
img2 = flip(image2, direction="horizontal", probability=0.5)  # Repetitive!

2. Multiple Methods

# Class: Different methods for different input types
augmenter = FlipAugmenter()
augmenter.augment_raw(image)       # Raw numpy array
augmenter.augment_instance(inst)   # Instance object
augmenter.augment_batch(batch)     # List of instances

# Function: Would need separate functions
flip_raw(image)
flip_instance(inst)
flip_batch(batch)

3. Inheritance and Reusability

# Base class with common logic
class BaseImageAugmenter(Augmenter):
    def __init__(self, probability=1.0):
        self.probability = probability

    def should_augment(self):
        return np.random.random() < self.probability

# Inherit common logic
class FlipAugmenter(BaseImageAugmenter):
    def augment_raw(self, raw):
        if not self.should_augment():
            return raw
        return np.fliplr(raw)

class RotateAugmenter(BaseImageAugmenter):
    def augment_raw(self, raw):
        if not self.should_augment():
            return raw
        return rotate(raw, angle=15)

4. Callable Interface

# Create once, call many times like a function
augmenter = FlipAugmenter()

# Works like a function
result = augmenter(data)

# But has class benefits (state, methods, inheritance)
print(augmenter.direction)  # Access state
augmenter.probability = 0.8  # Modify state

Best Practices

1. Keep Augmenters Focused

# GOOD - single responsibility
class FlipAugmenter(Augmenter):
    """Only does flipping."""
    ...

class RotateAugmenter(Augmenter):
    """Only does rotation."""
    ...

# AVOID - doing too much
class MegaAugmenter(Augmenter):
    """Flips, rotates, blurs, crops, ..."""  # Too complex!
    ...

2. Make Them Composable

# GOOD - compose multiple small augmenters
augmenters = [
    FlipAugmenter(probability=0.5),
    RotateAugmenter(max_degrees=15, probability=0.5),
    BlurAugmenter(kernel_size=3, probability=0.3)
]

augmented_dataset = AugmentedDataset(dataset, augmenters=augmenters)

# AVOID - one giant augmenter
augmenter = MegaAugmenter(
    do_flip=True, flip_prob=0.5,
    do_rotate=True, rotate_prob=0.5,
    do_blur=True, blur_prob=0.3
)  # Hard to configure!

3. Update Annotations When Needed

# GOOD - update annotations for geometric transforms
def augment_instance(self, instance):
    rotated_image = rotate(instance.data, angle=15)
    rotated_bboxes = self._rotate_bboxes(instance.annotations, angle=15)
    return Instance(data=rotated_image, annotations=rotated_bboxes)

# AVOID - ignoring annotation changes
def augment_instance(self, instance):
    rotated_image = rotate(instance.data, angle=15)
    return Instance(data=rotated_image, annotations=instance.annotations)
    # BBoxes are now misaligned!

4. Add Probability Control

# GOOD - make augmentation stochastic
class BlurAugmenter(Augmenter):
    def __init__(self, kernel_size=3, probability=0.5):
        self.kernel_size = kernel_size
        self.probability = probability

    def augment_raw(self, raw):
        if np.random.random() > self.probability:
            return raw  # Skip augmentation
        return cv2.GaussianBlur(raw, (self.kernel_size, self.kernel_size), 0)

# AVOID - always applying
class BlurAugmenter(Augmenter):
    def augment_raw(self, raw):
        return cv2.GaussianBlur(raw, (3, 3), 0)  # Always blurs!

Summary

Augmenters provide flexible, stateful data augmentation:

  1. Classes that act like functions: Configured once, called many times
  2. Three-level interface: augment_raw, augment_instance, augment_batch
  3. Automatic routing: __call__ picks the right method
  4. Annotation awareness: Can update labels when needed
  5. Batch context: Support for augmentations like Mixup
  6. Composable: Chain multiple augmenters together
  7. Configuration-driven: Work with YAML through Parsers and Decorators

Integration with aiNXT:

  - Registered with AUGMENTERS factory (see Parsers)
  - Applied via AugmentedDataset decorator (see Dataset Decorators)
  - Used automatically during dataset iteration

See Also