
aiNXT

Welcome to the documentation for aiNXT, a standardized foundation library for building machine learning applications with consistent patterns for data handling, model training, and experiment tracking.

Overview

aiNXT provides a standardized foundation through abstract base classes and supporting functionality that enables you to:

  • Define Data Structures - Abstract base classes (Annotation, Instance, Dataset) for consistent data representation
  • Build ML Models - Abstract base classes (Model, TrainableModel) with standardized interfaces for training and prediction
  • Automate Workflows - Factory pattern system with configuration-driven object creation
  • Track Experiments - Integrated MLflow support for experiment tracking and artifact management
  • Deploy Models - Serialization utilities and serving infrastructure
  • Develop Locally - DevSpace environment with local MLflow + MinIO stack


Core Concepts

Building Blocks: Annotation, Instance & Dataset

aiNXT standardizes data handling through three abstract base classes:

  • Annotation: Represents labels and metadata for data points (e.g., classification labels, bounding boxes)
  • Instance: Combines raw data with its annotations - represents a single training/inference example
  • Dataset: Collection of instances with standardized iteration, batching, and splitting capabilities
# Example: Creating a custom dataset
from ainxt.data import Dataset, RawInstance, Annotation

annotation = Annotation(labels="1", meta={1: "Class A", 2: "Class B"})
instance = RawInstance(data=[1.2, 3.4, 5.6], annotations=[annotation])
# Your custom Dataset class inherits from Dataset and defines how to parse raw data
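
The Dataset base class itself is abstract, so a concrete dataset has to subclass it. The interface it requires is defined by the base class; the sketch below is purely illustrative and assumes a simple iterable collection (the class name InMemoryDataset and the dunder methods shown are assumptions, not the documented aiNXT interface).

# Example (illustrative sketch): a minimal in-memory dataset
# NOTE: __len__/__iter__ are assumed here; the actual abstract methods are
# defined by the aiNXT Dataset base class.
from ainxt.data import Dataset

class InMemoryDataset(Dataset):
    def __init__(self, instances):
        # Wrap an existing list of Instance objects.
        self._instances = list(instances)

    def __len__(self):
        return len(self._instances)

    def __iter__(self):
        return iter(self._instances)

dataset = InMemoryDataset([instance])  # reusing the instance created above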

Model Abstraction: Model, Prediction & Training

aiNXT provides abstract base classes for building ML models:

  • Model: Base class requiring predict(), save(), and load() methods
  • TrainableModel: Extends Model with a fit() method for training
  • Prediction: Standardized prediction objects with classifications/scores and metadata
# Example: Custom model implementation
from ainxt.models import TrainableModel, Prediction

class MyModel(TrainableModel):
    def fit(self, dataset):
        # Your training logic
        pass

    def predict(self, instance):
        # Returns list of Prediction objects
        return [Prediction(classification={...}, meta={...})]
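
In addition to fit() and predict(), the Model base class also requires save() and load(). Their exact signatures are defined by the base class; the snippet below is a minimal sketch, assuming path-based signatures and pickle serialization (both assumptions, not the documented aiNXT interface).

# Example (illustrative sketch): adding save()/load() to MyModel
# NOTE: the path-based signatures and pickle serialization are assumptions,
# not necessarily the aiNXT Model interface.
import pickle

from ainxt.models import TrainableModel

class MyModel(TrainableModel):
    # fit() and predict() as shown above ...

    def save(self, path):
        # Serialize the trained model object to disk.
        with open(path, "wb") as f:
            pickle.dump(self, f)

    @classmethod
    def load(cls, path):
        # Restore a previously saved model from disk.
        with open(path, "rb") as f:
            return pickle.load(f)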

Factory Pattern: Configuration-Driven Workflows

The factory system automates object creation from configuration files:

  • Builder: Maps (task, name) tuples to object constructors
  • Factory: Registry managing multiple builders
  • Context: Global container providing factories for datasets, models, metrics, and visualizations
# Example: Loading objects from configuration
from context import CONTEXT

# Load dataset from config file
dataset = CONTEXT.load_dataset(config.data)

# Load model from config file
model = CONTEXT.load_model(config.model)

# Train using configuration
model.fit(dataset, **config.training.params)
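
The snippet above relies on builders that were registered beforehand. To make the pattern concrete without relying on the aiNXT classes themselves, here is a standalone plain-Python sketch of how (task, name) keys can map to constructors and how a factory-style registry resolves them from a configuration (all names in it are hypothetical):

# Example (illustrative sketch, plain Python - not the aiNXT implementation)
# A "builder" is an entry mapping a (task, name) key to a constructor; the
# "factory" resolves that key from a config and calls the constructor.
REGISTRY = {}

def register(task, name, constructor):
    REGISTRY[(task, name)] = constructor

def build(task, config):
    constructor = REGISTRY[(task, config["name"])]
    return constructor(**config.get("params", {}))

class CsvDataset:
    def __init__(self, path):
        self.path = path

register("dataset", "csv", CsvDataset)
dataset = build("dataset", {"name": "csv", "params": {"path": "train.csv"}})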

Standardized Scripts

aiNXT includes production-ready scripts for common workflows:

  • Train Script (ainxt.scripts.training.train): Configuration-driven model training with MLflow logging
  • Evaluate Script (ainxt.scripts.evaluation.evaluate): Model evaluation with metrics and visualizations
  • Inference Script: Apply trained models to new data

All scripts use the same configuration-driven approach, making experiments reproducible and deployments consistent.
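
The exact configuration file format is not covered on this page; purely to illustrate the structure implied by the factory example above (config.data, config.model, config.training.params), a hypothetical configuration object could look like the following, where all field values are made up:

# Example (illustrative sketch): the shape of a configuration object as used by
# the standardized scripts. Field names beyond data/model/training.params and
# all values are hypothetical.
from types import SimpleNamespace

config = SimpleNamespace(
    data=SimpleNamespace(name="csv", params={"path": "train.csv"}),         # dataset section
    model=SimpleNamespace(name="my_model", params={}),                       # model section
    training=SimpleNamespace(params={"epochs": 10, "learning_rate": 1e-3}),  # training section
)

# The same config can then drive training, evaluation, and inference, which is
# what makes experiments reproducible across environments.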

Quick Start

# Setup environment
just install

# Start local development environment (MLflow + MinIO)
just dev-start

# Run tests
just test

# Start documentation server
just docs

Philosophy

aiNXT is designed as a foundation library, not a complete ML framework. It provides:

✅ Abstract base classes for data and models
✅ Factory patterns for configuration-driven workflows
✅ MLflow integration for experiment tracking
✅ Standardized scripts for training and evaluation
✅ Reusable components - concrete implementations (e.g., Seeds_Dataset) for common use cases

While primarily foundational, aiNXT includes non-abstract implementations of certain base classes. These serve as both examples and reusable components that other packages can leverage directly, promoting consistency and reducing duplication across the ML ecosystem.

Other packages build upon aiNXT to create domain-specific ML applications with their own concrete implementations of datasets, models, and workflows. The most prominent package that uses aiNXT is digitalNXT Vision.

Azure Databricks Integration

While aiNXT works in any Python environment (local, cloud, containers), it is primarily designed for Azure Databricks for the following reasons:

  1. Computational Power - Leverage Databricks clusters for distributed training and large-scale data processing
  2. Built-in MLflow - Native MLflow integration for seamless experiment tracking and model registry
  3. Unified Workflows - Run the same configuration-driven scripts locally (DevSpace) or on Databricks
  4. Azure Ecosystem - Integrated with Azure DevOps, Storage Accounts and other Azure services

The DevSpace environment (local MLflow + MinIO) mirrors the Databricks MLflow setup, enabling you to develop and test locally before deploying to Databricks production clusters.

Development Team

  • Laurens Reulink (Lead Data Scientist)
  • Sahar Hoseini (Data Scientist)