The easiest way to start building a Dagster project is with the dagster project CLI. This tool generates the files and folder structure you need to get started with Dagster quickly.
You can scaffold a new project using the default project skeleton, or start with one of the official Dagster examples.
To get started, you can run:
pip install dagster
dagster project scaffold --name my-dagster-project
dagster project scaffold generates a folder structure with a single Dagster code location and other files such as setup.py, giving you an empty project with everything already set up.
Here's a breakdown of the files and directories that are generated:
|my_dagster_project/||A Python package that contains your new Dagster code.|
|my_dagster_project_tests/||A Python package that contains tests for |
|README.md||A description and starter guide for your new Dagster project.|
|pyproject.toml||A file that specifies package core metadata in a static, tool-agnostic way.|
It includes a
|setup.py||A build script with Python package dependencies for your new project as a package.|
|setup.cfg||An ini file that contains option defaults for |
Inside of the
my_dagster_project/ directory, the following files and directories are generated:
Refer to the Code locations documentation to learn other ways to deploy and load your Dagster code.
|my_dagster_project/assets.py||A Python module that contains software-defined assets.|
Note: As your project grows, we recommend organizing assets into sub-packages or sub-modules. For example, you can put all analytics-related assets in a
The newly generated my-dagster-project directory is a fully functioning Python package and can be installed with pip. To install the package together with its Python dependencies, run:
pip install -e ".[dev]"
By using the -e flag, pip will install your code location as a Python package in "editable mode", so that local code changes apply automatically as you develop.
Then, start the Dagit web server:
Open http://localhost:3000 with your browser to see the project.
Now, you can start writing assets in
You can specify new Python dependencies in
Environment variables, which are key-value pairs configured outside your source code, allow you to dynamically modify application behavior depending on the environment.
Using environment variables, you can define various configuration options for your Dagster application and securely set up secrets. For example, instead of hard-coding database credentials, which is bad practice and cumbersome for development, you can use environment variables to supply user details. This allows you to parameterize your pipeline without modifying code or insecurely storing sensitive data.
Refer to the Using environment variables and secrets in Dagster code guide for more info and examples.
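To make the idea concrete, here is a minimal sketch in plain Python of reading credentials from the environment instead of hard-coding them. It uses only the standard library and is not Dagster's own configuration API, which the guide above covers. The function load_database_settings and the variable names DATABASE_USERNAME, DATABASE_PASSWORD, and DATABASE_HOST are hypothetical.

```python
import os


def load_database_settings(env=None):
    """Read database credentials from environment variables.

    The variable names used here are hypothetical examples.
    """
    # Allow a plain dict to be passed in (handy for tests); default to the
    # real process environment.
    env = os.environ if env is None else env

    required = ("DATABASE_USERNAME", "DATABASE_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        # Fail early with a clear message rather than connecting with bad values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    return {
        "username": env["DATABASE_USERNAME"],
        "password": env["DATABASE_PASSWORD"],
        # Optional setting with a development-friendly default.
        "host": env.get("DATABASE_HOST", "localhost"),
    }
```

Accepting the environment as an optional argument keeps the function easy to test, since no real secrets have to be exported in the shell.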
Tests can be added in the
my_dagster_project_tests directory and you can run tests using
Once your project is ready to move to production, check out our recommendation for Transitioning Data Pipelines from Development to Production.
Check out the following resources to learn more about deployment options: