Create a New Project

This section shows you how to create a new Dagster project and organize your files as your jobs grow. Dagster comes with a convenient CLI command for generating a project skeleton, but you can also choose to organize your files differently as your project evolves.

If you're completely new to Dagster, we recommend that you visit our Tutorial to learn all the basic concepts of Dagster.

Generating a Project Skeleton

If you're just starting a new Dagster project, the CLI command dagster new-project will generate a project skeleton with boilerplate code for development and testing. If you have dagster installed in your Python environment, then you can run the following shell command to generate a Dagster project called PROJECT_NAME:

dagster new-project PROJECT_NAME
cd PROJECT_NAME

The newly generated PROJECT_NAME directory is in fact a fully functioning Python package and can be installed with pip. A workspace.yaml file is also created that tells Dagit to load your code from this package. See the Workspaces page for more information on how to tell Dagster how to load your code.

Here's a breakdown of the files and directories that are generated:

| File/Directory | Description |
| --- | --- |
| PROJECT_NAME/ | A Python package that contains code for your new Dagster repository |
| PROJECT_NAME_tests/ | A Python package that contains tests for PROJECT_NAME |
| workspace.yaml | A file that specifies the location of your code for Dagit and the Dagster CLI. Visit the Workspaces overview for more details. |
| README.md | A description and guide for your new code repository |
| setup.py | A build script with Python package dependencies for your new code repository |

Inside of the directory PROJECT_NAME/, the following files and directories are generated:

| File/Directory | Description |
| --- | --- |
| PROJECT_NAME/ops/ | A Python package that contains OpDefinitions, which represent individual units of computation |
| PROJECT_NAME/jobs/ | A Python package that contains JobDefinitions, which are built up from ops |
| PROJECT_NAME/schedules/ | A Python package that contains ScheduleDefinitions, to trigger recurring job runs based on time |
| PROJECT_NAME/sensors/ | A Python package that contains SensorDefinitions, to trigger job runs based on external state |
| PROJECT_NAME/repository.py | A Python module that contains a RepositoryDefinition, to specify which jobs, schedules, and sensors are available in your repository |

This file structure is a good starting point and suitable for most Dagster projects. As you build more and more jobs, you may eventually find your own way of structuring your code that works best for you.
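To confirm that a generated project matches the layout in the two tables above, a small script along these lines may help. It is only a sketch using the standard library: the helper names `expected_entries` and `missing_entries` and the placeholder project name `my_project` are hypothetical and not part of Dagster.

```python
from pathlib import Path


def expected_entries(project_name):
    """Relative paths listed in the two tables above (an illustrative list)."""
    pkg = Path(project_name)
    return [
        pkg,
        Path(f"{project_name}_tests"),
        Path("workspace.yaml"),
        Path("README.md"),
        Path("setup.py"),
        pkg / "ops",
        pkg / "jobs",
        pkg / "schedules",
        pkg / "sensors",
        pkg / "repository.py",
    ]


def missing_entries(project_root, project_name):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    return [
        p.as_posix()
        for p in expected_entries(project_name)
        if not (root / p).exists()
    ]


if __name__ == "__main__":
    # Run from inside the generated directory; "my_project" is a placeholder.
    print(missing_entries(".", "my_project"))
```

An empty list means every listed file and directory was found. If you later reorganize the project, the list in `expected_entries` would need to change with it.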

Local Development

  1. Install your repository as a Python package. By using the --editable flag, pip will install your repository in "editable mode" so that as you develop, local code changes will automatically apply.
     pip install --editable .
  2. Start the Dagit process. This will start a Dagit web server that, by default, is served on http://localhost:3000.
     dagit

     The Dagit process automatically uses the file workspace.yaml to find your repositories, from which Dagster will load your jobs, schedules, and sensors. To see how you can customize the Dagit process, run dagit --help.

  3. (Optional) If you want to enable Dagster Schedules or Sensors for your jobs, start the Dagster Daemon process in a different shell or terminal:
     dagster-daemon run

Once your Dagster Daemon process is running, you should be able to enable schedules and sensors for your Dagster jobs.
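If one of the steps above fails, it can be useful to check the environment first. The following sketch, which uses only the Python standard library, reports whether the project package is importable (that is, whether the editable install took effect) and whether the dagit and dagster-daemon commands are on your PATH. The function name `preflight` and the placeholder package name `my_project` are hypothetical.

```python
import importlib.util
import shutil


def preflight(package_name, tools=("dagit", "dagster-daemon")):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec(package_name) is None:
        problems.append(
            f"package {package_name!r} is not importable; "
            "try: pip install --editable ."
        )
    # shutil.which returns None when a command is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"command {tool!r} was not found on PATH")
    return problems


if __name__ == "__main__":
    # "my_project" is a placeholder for your PROJECT_NAME.
    for problem in preflight("my_project"):
        print(problem)
```

This only checks that the package and commands can be found. It does not verify that Dagit can actually load your repository.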

Local Testing

Once you have created a new Dagster repository with the CLI command dagster new-project, you can find tests in PROJECT_NAME_tests, where PROJECT_NAME is the name of your project. You can run all of your tests with the following command:

pytest PROJECT_NAME_tests

As you create Dagster ops and jobs, add tests in PROJECT_NAME_tests/ to check that your code behaves as desired and does not break over time.
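As a rough illustration of the kind of test that could live in PROJECT_NAME_tests/, the sketch below keeps some computation in a plain Python function and tests it directly with a pytest-style test function. The names `clean_names` and `test_clean_names` are hypothetical and not part of the generated skeleton. In a real project, logic like this could be wrapped in an op, and Dagster's own testing helpers are covered in the Testing tutorial rather than here.

```python
def clean_names(names):
    """Strip whitespace, drop empty entries, and title-case each name."""
    return [n.strip().title() for n in names if n.strip()]


def test_clean_names():
    # pytest collects functions whose names start with "test_".
    assert clean_names([" ada ", "", "grace hopper"]) == ["Ada", "Grace Hopper"]


if __name__ == "__main__":
    # Allows running the file directly, without pytest.
    test_clean_names()
```

Keeping the computation in a plain function is one possible design choice: it lets the logic be tested without any orchestration machinery, while job-level behavior can be covered by separate tests.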

For hints on how to write tests for ops and jobs in Dagster, see our documentation tutorial on Testing.

Deployment

Once your Dagster project is ready, visit the Deployment Guides to learn how to run Dagster in production environments such as Docker, Kubernetes, or AWS EC2.