dbt (dagster_dbt)¶
This library provides a Dagster integration with dbt (data build tool), created by Fishtown Analytics.
CLI¶
-
dagster_dbt.
dbt_cli_compile
(*args, **kwargs)[source]¶ This solid executes
dbt compile
via the dbt CLI.
-
dagster_dbt.
dbt_cli_run_operation
(*args, **kwargs)[source]¶ This solid executes
dbt run-operation
via the dbt CLI.
-
dagster_dbt.
dbt_cli_snapshot
(*args, **kwargs)[source]¶ This solid executes
dbt snapshot
via the dbt CLI.
-
dagster_dbt.
dbt_cli_snapshot_freshness
(*args, **kwargs)[source]¶ This solid executes
dbt source snapshot-freshness
via the dbt CLI.
-
class
dagster_dbt.
DbtCliOutput
[source]¶ The results of executing a dbt command, along with additional metadata about the dbt CLI process that was run.
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
If the executed dbt command is either run or test, then the .num_* attributes will contain non-None integer values. Otherwise, they will be None.
-
classmethod
from_dict
(d: Dict[str, Any]) → dagster_dbt.cli.types.DbtCliOutput[source]¶ Constructs an instance of DbtCliOutput from a dictionary.
- Parameters
d (Dict[str, Any]) – A dictionary with key-values to construct a DbtCliOutput.
- Returns
An instance of DbtCliOutput.
- Return type
DbtCliOutput
RPC¶
-
dagster_dbt.
create_dbt_rpc_run_sql_solid
(name: str, output_def: Optional[dagster.core.definitions.output.OutputDefinition] = None, **kwargs) → Callable[source]¶ This function is a factory which constructs a solid that will copy the results of a SQL query run within the context of a dbt project to a pandas DataFrame.
Any kwargs passed to this function will be passed along to the underlying @solid decorator. However, note that overriding config_schema, input_defs, and required_resource_keys is not allowed and will throw a DagsterInvalidDefinitionError.
If you would like to configure this solid with different config fields, consider using @composite_solid to wrap this solid.
- Parameters
name (str) – The name of this solid.
output_def (OutputDefinition, optional) – The OutputDefinition for the solid. This value should always be a representation of a pandas DataFrame. If not specified, the solid will default to an OutputDefinition named “df” with a DataFrame dagster type.
- Returns
Returns the constructed solid definition.
- Return type
-
dagster_dbt.
dbt_rpc_compile_sql
(*args, **kwargs)[source]¶ This solid sends the
dbt compile
command to a dbt RPC server and returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
-
dagster_dbt.
dbt_rpc_run
(*args, **kwargs)[source]¶ This solid sends the
dbt run
command to a dbt RPC server and returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
-
dagster_dbt.
dbt_rpc_run_and_wait
(*args, **kwargs)[source]¶ This solid sends the
dbt run
command to a dbt RPC server and returns the result of the executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
-
dagster_dbt.
dbt_rpc_run_operation
(*args, **kwargs)[source]¶ This solid sends the
dbt run-operation
command to a dbt RPC server and returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
-
dagster_dbt.
dbt_rpc_run_operation_and_wait
(*args, **kwargs)[source]¶ This solid sends the
dbt run-operation
command to a dbt RPC server and returns the result of the executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
-
dagster_dbt.
dbt_rpc_snapshot
(*args, **kwargs)[source]¶ This solid sends the
dbt snapshot
command to a dbt RPC server and returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
-
dagster_dbt.
dbt_rpc_snapshot_and_wait
(*args, **kwargs)[source]¶ This solid sends the
dbt snapshot
command to a dbt RPC server and returns the result of the executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
-
dagster_dbt.
dbt_rpc_snapshot_freshness
(*args, **kwargs)[source]¶ This solid sends the
dbt source snapshot-freshness
command to a dbt RPC server and returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
-
dagster_dbt.
dbt_rpc_snapshot_freshness_and_wait
(*args, **kwargs)[source]¶ This solid sends the
dbt source snapshot-freshness
command to a dbt RPC server and returns the result of the executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
-
dagster_dbt.
dbt_rpc_test
(*args, **kwargs)[source]¶ This solid sends the
dbt test
command to a dbt RPC server and returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
-
dagster_dbt.
dbt_rpc_test_and_wait
(*args, **kwargs)[source]¶ This solid sends the
dbt test
command to a dbt RPC server and returns the result of the executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
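The difference between the asynchronous solids (which return a request token) and the `*_and_wait` solids (which poll until the process completes) can be illustrated with a small polling loop. This is a minimal sketch, not the library's implementation: `poll_until_done` and `make_fake_poll` are hypothetical helpers, the fake poller stands in for a real dbt RPC server, and the `"state"` field with the values `"running"` and `"success"` is an assumption about the shape of a poll result.

```python
import time
from typing import Any, Callable, Dict, Iterable


def poll_until_done(
    poll: Callable[[str], Dict[str, Any]],
    request_token: str,
    interval_s: float = 0.0,
    max_polls: int = 100,
) -> Dict[str, Any]:
    """Call `poll` repeatedly until the reported state is no longer "running"."""
    for _ in range(max_polls):
        result = poll(request_token)
        # Assumption: each poll result carries a "state" field.
        if result.get("state") != "running":
            return result
        time.sleep(interval_s)
    raise TimeoutError(
        f"request {request_token} still running after {max_polls} polls"
    )


def make_fake_poll(states: Iterable[str]) -> Callable[[str], Dict[str, Any]]:
    """Hypothetical stand-in for an RPC server: reports one state per call."""
    remaining = iter(states)
    return lambda token: {"request_token": token, "state": next(remaining)}


# An asynchronous solid would hand back only the token ("abc123" here);
# a synchronous one would run a loop like this before returning the result.
fake_poll = make_fake_poll(["running", "running", "success"])
final = poll_until_done(fake_poll, request_token="abc123")
print(final["state"])
```

Bounding the loop with `max_polls` keeps a stuck dbt process from blocking the caller indefinitely; whether the real solids apply such a bound is not stated here.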
-
dagster_dbt.
dbt_rpc_resource
ResourceDefinition[source]¶ This resource defines a dbt RPC client.
To configure this resource, we recommend using the configured method.
Examples:
    from dagster import ModeDefinition, pipeline
    from dagster_dbt import dbt_rpc_resource

    custom_dbt_rpc_resource = dbt_rpc_resource.configured({"host": "80.80.80.80", "port": 8080})

    @pipeline(mode_defs=[ModeDefinition(resource_defs={"dbt_rpc": custom_dbt_rpc_resource})])
    def dbt_rpc_pipeline():
        # Run solids with `required_resource_keys={"dbt_rpc", ...}`.
        ...
-
dagster_dbt.
local_dbt_rpc_resource
ResourceDefinition¶ This resource defines a dbt RPC client for an RPC server running on 0.0.0.0:8580.
-
class
dagster_dbt.
DbtRpcClient
(host: str = '0.0.0.0', port: int = 8580, jsonrpc_version: str = '2.0', logger: Optional[Any] = None, **_)[source]¶ A client for a dbt RPC server.
If you need a dbt RPC server as a Dagster resource, we recommend that you use dbt_rpc_resource.
-
cli
(*, cli: str, **kwargs) → requests.models.Response[source]¶ Sends a request with CLI syntax to the dbt RPC server, and returns the response. For more details, see the dbt docs for running CLI commands via RPC.
- Parameters
cli (str) – a dbt command in CLI syntax.
- Returns
the HTTP response from the dbt RPC server.
- Return type
Response
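To make the request shape concrete, the sketch below builds the kind of JSON-RPC body that a CLI-syntax request might carry. It only constructs the payload and does not contact a server. `build_rpc_request` and `rpc_url` are hypothetical helpers, `my_model` is a made-up model name, and both the `cli_args` method name and the `/jsonrpc` path are assumptions about the dbt RPC server rather than details stated in this reference.

```python
import json
import uuid
from typing import Any, Dict, Optional


def build_rpc_request(
    method: str,
    params: Optional[Dict[str, Any]] = None,
    jsonrpc_version: str = "2.0",
) -> Dict[str, Any]:
    """Build a JSON-RPC request body with a fresh request id."""
    return {
        "jsonrpc": jsonrpc_version,
        "method": method,
        "id": str(uuid.uuid4()),
        "params": params or {},
    }


def rpc_url(host: str = "0.0.0.0", port: int = 8580) -> str:
    # Assumption: the server accepts requests at the /jsonrpc path.
    return f"http://{host}:{port}/jsonrpc"


# Assumption: CLI-style commands are sent with the method name "cli_args".
body = build_rpc_request("cli_args", {"cli": "run --models my_model"})
print(rpc_url())
print(json.dumps(body, indent=2))
```

The defaults mirror the `DbtRpcClient` constructor defaults shown above (`host='0.0.0.0'`, `port=8580`, `jsonrpc_version='2.0'`).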
-
compile
(*, models: List[str] = None, exclude: List[str] = None, **kwargs) → requests.models.Response[source]¶ Sends a request with the method
compile
to the dbt RPC server, and returns the response. For more details, see the dbt docs for compiling projects via RPC.
-
compile_sql
(*, sql: str, name: str) → requests.models.Response[source]¶ Sends a request with the method
compile_sql
to the dbt RPC server, and returns the response. For more details, see the dbt docs for compiling SQL via RPC.
-
generate_docs
(*, models: List[str] = None, exclude: List[str] = None, compile: bool = False, **kwargs) → requests.models.Response[source]¶ Sends a request with the method
docs.generate
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the RPC method docs.generate.
- Parameters
- Returns
the HTTP response from the dbt RPC server.
- Return type
Response
-
kill
(*, task_id: str) → requests.models.Response[source]¶ Sends a request with the method
kill
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the RPC method kill.
- Parameters
task_id (str) – the ID of the task to terminate.
- Returns
the HTTP response from the dbt RPC server.
- Return type
Response
-
property
logger
¶ A property for injecting a logger dependency.
- Type
Optional[Any]
-
poll
(*, request_token: str, logs: bool = False, logs_start: int = 0) → requests.models.Response[source]¶ Sends a request with the method
poll
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the RPC method poll.
-
ps
(*, completed: bool = False) → requests.models.Response[source]¶ Sends a request with the method
ps
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the RPC method ps.
- Parameters
completed (bool) – If True, then also return completed tasks. Defaults to False.
- Returns
the HTTP response from the dbt RPC server.
- Return type
Response
-
run
(*, models: List[str] = None, exclude: List[str] = None, **kwargs) → requests.models.Response[source]¶ Sends a request with the method
run
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the RPC method run.
-
run_operation
(*, macro: str, args: Optional[Dict[str, Any]] = None, **kwargs) → requests.models.Response[source]¶ Sends a request with the method
run-operation
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the command run-operation.
-
run_sql
(*, sql: str, name: str) → requests.models.Response[source]¶ Sends a request with the method
run_sql
to the dbt RPC server, and returns the response. For more details, see the dbt docs for running SQL via RPC.
-
seed
(*, show: bool = False, **kwargs) → requests.models.Response[source]¶ Sends a request with the method
seed
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the RPC method seed.
- Parameters
show (bool, optional) – If True, then show a sample of the seeded data in the response. Defaults to False.
- Returns
the HTTP response from the dbt RPC server.
- Return type
Response
-
snapshot
(*, select: List[str] = None, exclude: List[str] = None, **kwargs) → requests.models.Response[source]¶ Sends a request with the method
snapshot
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the command snapshot.
-
snapshot_freshness
(*, select: Optional[List[str]] = None, **kwargs) → requests.models.Response[source]¶ Sends a request with the method
snapshot-freshness
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the command source snapshot-freshness.
- Parameters
select (List[str], optional) – the models to include in calculating snapshot freshness.
- Returns
the HTTP response from the dbt RPC server.
- Return type
Response
-
status
()[source]¶ Sends a request with the method
status
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the RPC method status.
- Returns
the HTTP response from the dbt RPC server.
- Return type
Response
-
test
(*, models: List[str] = None, exclude: List[str] = None, data: bool = True, schema: bool = True, **kwargs) → requests.models.Response[source]¶ Sends a request with the method
test
to the dbt RPC server, and returns the response. For more details, see the dbt docs for the RPC method test.
- Parameters
- Returns
the HTTP response from the dbt RPC server.
- Return type
Response
-
-
class
dagster_dbt.
DbtRpcOutput
[source]¶ The output from executing a dbt command via the dbt RPC server.
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
-
classmethod
from_dict
(d: Dict[str, Any]) → dagster_dbt.rpc.types.DbtRpcOutput[source]¶ Constructs an instance of DbtRpcOutput from a dictionary.
- Parameters
d (Dict[str, Any]) – A dictionary with key-values to construct a DbtRpcOutput.
- Returns
An instance of DbtRpcOutput.
- Return type
DbtRpcOutput
Types¶
-
class
dagster_dbt.
DbtResult
[source]¶ The results of executing a dbt command.
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
-
results
¶ Details about each executed dbt node (model) in the run.
- Type
List[NodeResult]
-
-
class
dagster_dbt.
NodeResult
[source]¶ The result of executing a dbt node (model).
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
-
fail
¶ The fail field from the results of the executed dbt node.
- Type
Optional[Any]
-
warn
¶ The warn field from the results of the executed dbt node.
- Type
Optional[Any]
-
skip
¶ The skip field from the results of the executed dbt node.
- Type
Optional[Any]
-
step_timings
¶ The timings for each step in the executed dbt node (model).
- Type
List[StepTiming]
-
table
¶ Details about the table/view that is created from executing a run_sql command on a dbt RPC server.
- Type
Optional[Dict]
-
classmethod
from_dict
(d: Dict[str, Any]) → dagster_dbt.types.NodeResult[source]¶ Constructs an instance of NodeResult from a dictionary.
- Parameters
d (Dict[str, Any]) – A dictionary with key-values to construct a NodeResult.
- Returns
An instance of NodeResult.
- Return type
NodeResult
-
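The from_dict pattern used by the result classes can be illustrated with a small stand-in. This is only a sketch under stated assumptions: `NodeResultSketch` is a hypothetical class, not the library's `NodeResult`, and the dictionary keys are assumed to match the `fail`, `warn`, and `skip` attribute names listed above.

```python
from typing import Any, Dict, NamedTuple, Optional


class NodeResultSketch(NamedTuple):
    """Hypothetical stand-in, not the dagster_dbt NodeResult class."""

    fail: Optional[Any] = None
    warn: Optional[Any] = None
    skip: Optional[Any] = None

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "NodeResultSketch":
        # Read only the keys this sketch knows about; missing keys become None.
        return cls(fail=d.get("fail"), warn=d.get("warn"), skip=d.get("skip"))


# Assumed shape of one entry in dbt's JSON results; real output has more fields.
raw = {"fail": None, "warn": None, "skip": False, "unknown_key": 1}
node = NodeResultSketch.from_dict(raw)
print(node)
```

Building instances only through a classmethod like this keeps parsing of the dbt JSON in one place, which fits the note that users should not construct these classes directly.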
-
class
dagster_dbt.
StepTiming
[source]¶ The timing information of an executed step for a dbt node (model).
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
-
started_at
¶ An ISO string timestamp of when the step started executing.
- Type
-
completed_at
¶ An ISO string timestamp of when the step completed execution.
- Type
-
property
duration
¶ The execution duration of the step.
- Type
-
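As an illustration of how a duration can be derived from the two timestamps, the sketch below parses ISO strings and subtracts them. `StepTimingSketch` and `parse_iso` are hypothetical names, not the library's `StepTiming`; storing the timestamps as strings, the trailing `Z` suffix, and the `timedelta` return type are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


def parse_iso(ts: str) -> datetime:
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so normalize it to an explicit UTC offset first.
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    return datetime.fromisoformat(ts)


class StepTimingSketch(NamedTuple):
    """Hypothetical stand-in, not the dagster_dbt StepTiming class."""

    started_at: str
    completed_at: str

    @property
    def duration(self) -> timedelta:
        # Duration is derived on access rather than stored.
        return parse_iso(self.completed_at) - parse_iso(self.started_at)


timing = StepTimingSketch(
    started_at="2020-08-01T12:00:00.000000Z",
    completed_at="2020-08-01T12:00:02.500000Z",
)
print(timing.duration.total_seconds())
```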
Errors¶
-
exception
dagster_dbt.
DagsterDbtError
(description=None, metadata_entries=None)[source]¶ The base exception of the
dagster-dbt
library.
-
exception
dagster_dbt.
DagsterDbtCliRuntimeError
(description: str, logs: List[Dict[str, Any]], raw_output: str)[source]¶ Represents an error while executing a dbt CLI command.
-
exception
dagster_dbt.
DagsterDbtCliFatalRuntimeError
(logs: List[Dict[str, Any]], raw_output: str)[source]¶ Represents a fatal error in the dbt CLI (return code 2).
-
exception
dagster_dbt.
DagsterDbtCliHandledRuntimeError
(logs: List[Dict[str, Any]], raw_output: str)[source]¶ Represents a model error reported by the dbt CLI at runtime (return code 1).
-
exception
dagster_dbt.
DagsterDbtCliOutputsNotFoundError
(path: str)[source]¶ Represents a problem in finding the target/run_results.json artifact when executing a dbt CLI command.
For more details on target/run_results.json, see https://docs.getdbt.com/reference/dbt-artifacts#run_resultsjson.
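The return codes mentioned for DagsterDbtCliHandledRuntimeError (1) and DagsterDbtCliFatalRuntimeError (2) suggest a simple dispatch from exit status to exception type. The sketch below uses hypothetical stand-in classes rather than the real exceptions; the class hierarchy, the treatment of return code 0 as success, and the fallback for other codes are assumptions, not documented behavior.

```python
from typing import Optional


class DbtErrorSketch(Exception):
    """Hypothetical stand-in for the library's base exception."""


class CliRuntimeErrorSketch(DbtErrorSketch):
    """Stand-in for a generic error while executing a dbt CLI command."""


class CliHandledErrorSketch(CliRuntimeErrorSketch):
    """Stand-in for a model error reported at runtime (return code 1)."""


class CliFatalErrorSketch(CliRuntimeErrorSketch):
    """Stand-in for a fatal error in the dbt CLI (return code 2)."""


def error_for_return_code(return_code: int) -> Optional[DbtErrorSketch]:
    """Map a dbt CLI return code to an exception instance, or None on success."""
    if return_code == 0:  # Assumption: 0 means the command succeeded.
        return None
    if return_code == 1:
        return CliHandledErrorSketch("dbt reported a handled model error")
    if return_code == 2:
        return CliFatalErrorSketch("dbt reported a fatal error")
    # Assumption: any other code is surfaced as a generic runtime error.
    return CliRuntimeErrorSketch(f"unexpected dbt return code {return_code}")


for code in (0, 1, 2):
    print(code, type(error_for_return_code(code)).__name__)
```

Deriving every stand-in from one base class lets callers catch all dbt-related failures with a single `except` clause, which is the usual reason a library exposes a base exception such as DagsterDbtError.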