Before you can start developing, you need to tell Dagster how to find the Python code containing your assets and jobs. There are a few ways to do this, each outlined below.
Note: If you're using an example Dagster project, or you created your project with the dagster CLI, you can run the dagster dev command from the project's folder to load the project code.
Dagster can load a file directly as a code location. In the following example, we use the -f argument to supply the name of the file:
dagster dev -f my_file.py
This command loads the definitions in my_file.py as a code location in the current Python environment.
You can also include multiple files at a time, where each file will be loaded as a code location:
dagster dev -f my_file.py -f my_second_file.py
Dagster can also load Python modules as code locations. With this approach, Dagster loads the Definitions object assigned to a top-level variable in the module's root __init__.py file. Because this style of development eliminates an entire class of Python import errors, we strongly recommend it for Dagster projects deployed to production.
In the following example, we use the -m argument to supply the name of the module:
dagster dev -m your_module_name
This command loads the Definitions object held in a top-level variable of the named module's root __init__.py file, using the current Python environment.
You can also include multiple modules at a time, where each module will be loaded as a code location:
dagster dev -m your_module_name -m your_second_module
To load definitions without supplying command line arguments, you can use the pyproject.toml file. This file, included in all Dagster example projects, contains a tool.dagster section with a module_name variable:
[tool.dagster]
module_name = "your_module_name"  # name of project's Python module
code_location_name = "your_code_location_name"  # optional, name of code location to display in the Dagster UI
When this section is defined, you can run the following command, with no additional arguments, in the same directory as the pyproject.toml file:
dagster dev
When running dagster dev, you may see log output that looks like this:
Using temporary directory /Users/rhendricks/tmpqs_fk8_5 for storage.
This indicates that any runs or materialized assets created during your session won't be persisted once the session ends. This can be useful when using Dagster for temporary local development or testing, when you don't care about the results being persisted.
To designate a more permanent home for your runs and assets, you can set the DAGSTER_HOME environment variable to a folder on your filesystem. Dagster will then use the specified folder for storage on all subsequent runs of dagster dev.
You can optionally use a dagster.yaml file to configure your Dagster instance - for example, to configure run concurrency limits or specify that runs should be stored in a Postgres database instead of on the filesystem.
If the DAGSTER_HOME environment variable is set, dagster dev will look for a dagster.yaml file in the DAGSTER_HOME folder. If DAGSTER_HOME is not set, dagster dev will look for that file in the folder where the command was run.
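The lookup rule above can be sketched in a few lines of Python. This is a simplified model of the behavior as described here, not Dagster's actual implementation, and the function name expected_dagster_yaml_path is hypothetical. It can be handy for checking which dagster.yaml you would expect to be picked up before starting dagster dev.

```python
import os
from pathlib import Path
from typing import Mapping


def expected_dagster_yaml_path(env: Mapping[str, str], cwd: Path) -> Path:
    """Return where dagster.yaml is expected, per the rule described above.

    Hypothetical helper: uses DAGSTER_HOME when it is set (and non-empty),
    otherwise the folder the command is run from.
    """
    home = env.get("DAGSTER_HOME")
    base = Path(home) if home else cwd
    return base / "dagster.yaml"


# Check against the real environment and current working directory.
print(expected_dagster_yaml_path(os.environ, Path.cwd()))
```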
dagster dev is primarily useful for running Dagster for local development and testing. It isn't suitable for the demands of most production deployments. Most importantly, dagster dev does not include authentication or web security. Additionally, in a production deployment, you might want to run multiple webserver replicas, have zero downtime continuous deployment of your code, or set up your Dagster daemon to automatically restart if it crashes.