Machine learning with PyTorch

In this example, you'll build a complete CNN-based digit classifier. Along the way, you will:

Build production-ready ML pipelines using Dagster's asset-based architecture
Train and deploy CNN models with automated quality gates and rollback capabilities
Implement configurable training workflows that adapt across development and production environments
Create scalable inference services supporting both batch and real-time prediction scenarios
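To make the idea of an automated quality gate more concrete, here is a minimal sketch in plain Python. It is not the example project's code and uses neither Dagster nor PyTorch. The names `GateDecision` and `quality_gate` and the `0.95` accuracy threshold are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GateDecision:
    """Outcome of a (hypothetical) deployment quality gate."""

    deploy: bool
    reason: str


def quality_gate(
    candidate_accuracy: float,
    production_accuracy: Optional[float] = None,
    min_accuracy: float = 0.95,  # illustrative threshold, not from the project
) -> GateDecision:
    """Decide whether a newly trained model should replace the current one."""
    # Gate 1: the candidate must clear an absolute accuracy floor.
    if candidate_accuracy < min_accuracy:
        return GateDecision(False, "candidate is below the minimum accuracy")
    # Gate 2: the candidate must not be worse than the deployed model.
    # Rejecting here keeps the current model, which acts as a rollback.
    if production_accuracy is not None and candidate_accuracy < production_accuracy:
        return GateDecision(False, "candidate is worse than the production model")
    return GateDecision(True, "candidate passed all gates")
```

In a pipeline, a check like this would typically run between the evaluation step and the deployment step, so that a failed gate leaves the deployed model untouched.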

Prerequisites

To follow the steps in this guide, you'll need:

Basic Python knowledge
Python 3.10+ installed on your system. Refer to the Installation guide for more information.
Basic familiarity with machine learning concepts (neural networks, training/validation splits)
Understanding of PyTorch fundamentals (tensors, models, training loops)
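If you are unsure whether your interpreter meets the Python 3.10+ requirement, a short check like the following can help. This is a minimal sketch using only the standard library; the helper name `python_is_supported` is hypothetical.

```python
import sys

MINIMUM_PYTHON = (3, 10)  # taken from the prerequisites above


def python_is_supported(version=None, minimum=MINIMUM_PYTHON) -> bool:
    """Return True if `version` (default: the running interpreter) is new enough."""
    # Compare only (major, minor); patch releases do not matter here.
    current = tuple(version if version is not None else sys.version_info)[:2]
    return current >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {found}: {status}")
```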

Step 1: Set up your Dagster environment

First, set up the example Dagster project and its ML dependencies.

Clone the Dagster repo and navigate to the project:
```
git clone https://github.com/dagster-io/dagster.git
cd dagster/examples/docs_projects/project_ml
```
Install the required dependencies with uv:
```
uv sync
```
Activate the virtual environment:
- macOS: source .venv/bin/activate
- Windows: .venv\Scripts\activate

Step 2: Launch the Dagster webserver

To make sure Dagster and its dependencies were installed correctly, navigate to the project root directory and start the Dagster webserver:

dg dev
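Once the command is running, you can confirm that the webserver is answering requests. The sketch below probes a URL with the standard library only. It assumes the UI is served at `http://localhost:3000`, the usual default; adjust the URL if your setup differs. The helper name `webserver_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def webserver_is_up(url: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url` with a non-error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    # Assumed default address; change it if you started the server elsewhere.
    print("Dagster UI reachable:", webserver_is_up())
```

A result of `False` right after starting the server may only mean that it is still booting, so it is worth retrying after a few seconds.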

Next steps

Continue this example with data ingestion
