To complete this tutorial, you'll need:
To install Python and pip. This tutorial assumes some familiarity with Python, but you should be able to follow along even if you're coming from a different programming language. To check whether Python and pip (Python's package manager) are already installed in your environment, or to install them, follow the instructions here.
Dagster supports Python 3.8+.
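If you'd rather check the requirement from Python itself, the following is a minimal sketch. The helper names meets_minimum and pip_is_available are made up for this example and are not part of Dagster; the only fact taken from this page is the 3.8 minimum.

```python
import importlib.util
import sys

# Minimum version taken from "Dagster supports Python 3.8+".
MINIMUM_PYTHON = (3, 8)


def meets_minimum(version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def pip_is_available():
    """Return True if this interpreter can locate the pip package."""
    return importlib.util.find_spec("pip") is not None


if __name__ == "__main__":
    print("Python version OK:", meets_minimum(sys.version_info))
    print("pip available:", pip_is_available())
```

Run it with the same interpreter you plan to use for the tutorial, since each Python installation has its own version and its own pip.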
To install Dagster. Run the following command from your terminal:
pip install dagster
To verify that you successfully installed Dagster, let's create your first Dagster project! You'll use the Dagster scaffolding command to give you an empty Dagster project that you can run locally.
To create the project, run:
dagster project from-example --example tutorial
After running this command, you should see a new directory called tutorial in your current directory. This directory contains the files that make up your Dagster project. Next, you'll install the Python dependencies you'll be using during the tutorial.
cd tutorial
pip install -e ".[dev]"
This command also installs packages that aren't necessary for every Dagster project but are used for this tutorial. You don't have to read up on them, but if you're curious:
requests will be used to download data from the internet
pandas is a popular library for working with tabular data
matplotlib makes it easy to make charts in Python
dagster_duckdb manages how Dagster can read and write to DuckDB, an in-process data warehouse similar to SQLite, that you'll use for this tutorial
dagster_duckdb_pandas allows loading data from DuckDB into Pandas DataFrames
Faker is a library for generating fake data
The -e flag installs the project in editable mode, which means that most changes you make in your Dagster project will automatically be applied. The main exceptions are when you're adding new assets or installing additional dependencies.
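To confirm that the tutorial's dependencies were installed, you could check that each one can be found by your interpreter. This is a rough sketch: the function name missing_modules is invented for the example, and the list assumes each package's import name matches the names above (Faker is imported as faker).

```python
import importlib.util

# Import names for the packages this tutorial installs.
# Assumption: Faker is imported under the lowercase name `faker`.
TUTORIAL_MODULES = [
    "requests",
    "pandas",
    "matplotlib",
    "dagster_duckdb",
    "dagster_duckdb_pandas",
    "faker",
]


def missing_modules(names):
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(TUTORIAL_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All tutorial dependencies were found.")
```

The check only locates the modules without importing them, so it runs quickly and won't surface errors that would only appear at import time.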
To verify that it worked and that you can run Dagster locally, run:
dagster dev
Navigate to localhost:3000 in your browser. You should see the Dagster UI. This command will run Dagster until you're ready to stop it. To stop the long-running process, press Control+C in the terminal where the process is running.
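If the page doesn't load, it can help to check whether anything is listening on port 3000 at all. The sketch below uses only the standard library; port_is_open is a hypothetical helper, and it only tells you that some process accepted a connection, not that the process is Dagster.

```python
import socket


def port_is_open(host="localhost", port=3000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:3000.")
    else:
        print("Nothing is listening on localhost:3000. Is the Dagster process still running?")
```

Run this from a second terminal while the long-running process is active in the first.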
If you'd like an explanation of what files were made and why, refer to the Creating a new Dagster project guide.
Now that you've created your first Dagster project, you're ready to start writing your own pipeline by creating your first asset.