
Get started with the dbt Semantic Layer

The dbt Semantic Layer, powered by MetricFlow, simplifies defining and using critical business metrics. It centralizes metric definitions, eliminates duplicate coding, and ensures consistent self-service access to metrics in downstream tools.

MetricFlow, a powerful component of the dbt Semantic Layer, simplifies the creation and management of company metrics. It offers flexible abstractions, SQL query generation, and enables fast retrieval of metric datasets from a data platform.

Use this guide to fully experience the power of the universal dbt Semantic Layer. You'll take the following steps:

  • Create a semantic model
  • Define metrics
  • Test and query metrics
  • Run a production job
  • Set up dbt Semantic Layer
  • Connect and query API

MetricFlow allows you to define metrics in your dbt project and query them in dbt Cloud or dbt Core with MetricFlow commands.

However, to experience the power of the universal dbt Semantic Layer and query those metrics in downstream tools, you'll need a dbt Cloud Team or Enterprise account.

📹 Learn about the dbt Semantic Layer with on-demand video courses!

Explore our dbt Semantic Layer on-demand course to learn how to define and query metrics in your dbt project.

Additionally, dive into mini-courses for querying the dbt Semantic Layer in your favorite tools: Tableau (beta), Hex, and Mode.

Prerequisites

  • Have a dbt Cloud Team or Enterprise account. Both Multi-tenant and Single-tenant deployments are supported.
    • Note: Single-tenant accounts should contact their account representative for the necessary setup and enablement.
  • Have both your production and development environments running dbt version 1.6 or higher.
  • Use Snowflake, BigQuery, Databricks, or Redshift.
  • Create a successful run in the environment where you configure the Semantic Layer.
    • Note: The Semantic Layer currently supports querying from deployment environments only (a development querying experience is coming soon).
  • Set up the Semantic Layer API in the integrated tool to import metric definitions.
    • dbt Core or Developer accounts can define metrics but won't be able to dynamically query them.
  • Understand the key concepts of MetricFlow, which powers the latest dbt Semantic Layer.
  • Note that the dbt Semantic Layer doesn't yet support SSH tunneling for Postgres or Redshift connections. It also doesn't support single sign-on (SSO) for Semantic Layer production credentials; however, SSO is supported for development user accounts.
tip

New to dbt or metrics? Try our Jaffle shop example project to help you get started!

Create a semantic model

The following steps describe how to set up semantic models. Semantic models consist of entities, dimensions, and measures.

We highly recommend you read the overview of what a semantic model is before getting started. If you're working in the Jaffle shop example, delete the orders.yml config or remove the .yml extension so the file is ignored during parsing. We'll be rebuilding it step by step in this example.

If you're following the guide in your own project, pick a model that you want to build a semantic manifest from and fill in the config values accordingly.

  1. Create a new YAML config file for the orders model, such as orders.yml.

It's best practice to create semantic models in the /models/semantic_models directory in your project. Semantic models are nested under the semantic_models key. First, fill in the name and appropriate metadata, map it to a model in your dbt project, and specify model defaults. For now, default_agg_time_dimension is the only supported default.

semantic_models:
  # The name of the semantic model.
  - name: orders
    defaults:
      agg_time_dimension: ordered_at
    description: |
      Order fact table. This table is at the order grain with one row per order.
    # The name of the dbt model and schema
    model: ref('orders')
  2. Define your entities. These are the keys in your table that MetricFlow will use to join other semantic models. These are usually columns like customer_id, order_id, and so on.
    # Entities. These usually correspond to keys in the table.
    entities:
      - name: order_id
        type: primary
      - name: location
        type: foreign
        expr: location_id
      - name: customer
        type: foreign
        expr: customer_id
  3. Define your dimensions and measures. Dimensions are properties of the records in your table that are non-aggregatable. They provide categorical or time-based context to enrich metrics. Measures are the building blocks for creating metrics. They are numerical columns that MetricFlow aggregates to create metrics.
    # Measures. These are the aggregations on the columns in the table.
    measures:
      - name: order_total
        description: The total revenue for each order.
        agg: sum
      - name: order_count
        expr: 1
        agg: sum
      - name: tax_paid
        description: The total tax paid on each order.
        agg: sum
      - name: customers_with_orders
        description: Distinct count of customers placing orders.
        agg: count_distinct
        expr: customer_id
      - name: locations_with_orders
        description: Distinct count of locations with orders.
        expr: location_id
        agg: count_distinct
      - name: order_cost
        description: The cost for each order item. Cost is calculated as a sum of the supply cost for each order item.
        agg: sum
    # Dimensions. Either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
      - name: order_total_dim
        type: categorical
        expr: order_total
      - name: is_food_order
        type: categorical
      - name: is_drink_order
        type: categorical

Putting it all together, a complete semantic model configuration based on the orders model would look like the following example:

semantic_models:
  # The name of the semantic model.
  - name: orders
    defaults:
      agg_time_dimension: ordered_at
    description: |
      Order fact table. This table is at the order grain with one row per order.
    # The name of the dbt model and schema
    model: ref('orders')
    # Entities. These usually correspond to keys in the table.
    entities:
      - name: order_id
        type: primary
      - name: location
        type: foreign
        expr: location_id
      - name: customer
        type: foreign
        expr: customer_id
    # Measures. These are the aggregations on the columns in the table.
    measures:
      - name: order_total
        description: The total revenue for each order.
        agg: sum
      - name: order_count
        expr: 1
        agg: sum
      - name: tax_paid
        description: The total tax paid on each order.
        agg: sum
      - name: customers_with_orders
        description: Distinct count of customers placing orders.
        agg: count_distinct
        expr: customer_id
      - name: locations_with_orders
        description: Distinct count of locations with orders.
        expr: location_id
        agg: count_distinct
      - name: order_cost
        description: The cost for each order item. Cost is calculated as a sum of the supply cost for each order item.
        agg: sum
    # Dimensions. Either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
      - name: order_total_dim
        type: categorical
        expr: order_total
      - name: is_food_order
        type: categorical
      - name: is_drink_order
        type: categorical
tip

If you're familiar with writing SQL, you can think of dimensions as the columns you would group by and measures as the columns you would aggregate.

select
  metric_time_day,  -- time
  country,          -- categorical dimension
  sum(revenue_usd)  -- measure
from snowflake.fact_transactions  -- sql table
group by metric_time_day, country -- dimensions

Define metrics

Now that you've created your first semantic model, it's time to define your first metric! You can define metrics with the dbt Cloud IDE or command line.

MetricFlow supports different metric types like conversion, simple, ratio, cumulative, and derived. It's recommended that you read the metrics overview docs before getting started.

  1. You can define metrics in the same YAML files as your semantic models or create a new file. If you want to create your metrics in a new file, create another directory called /models/metrics. The file structure for metrics can become more complex from here if you need to further organize your metrics, for example, by data source or business line.

  2. The example metric we'll create is a simple metric that refers directly to the order_total measure, which will be implemented as a sum() function in SQL. Again, if you're working in the Jaffle shop sandbox, we recommend deleting the original orders.yml file, or removing the .yml extension so it's ignored during parsing. We'll be rebuilding the order_total metric from scratch. If you're working in your own project, create a simple metric like the one below using one of the measures you created in the previous step.

metrics:
  - name: order_total
    description: Sum of total order amount. Includes tax + revenue.
    type: simple
    label: Order Total
    type_params:
      measure: order_total
  3. Save your code, and in the next section, you'll validate your configs before committing them to your repository.

To continue building out your metrics based on your organization's needs, refer to Build your metrics for detailed info on how to define different metric types and semantic models.
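For example, beyond the simple type, a ratio metric divides one metric by another. The following is a minimal sketch rather than part of this guide's build-out: it assumes you've also defined an order_count metric on top of the order_count measure from the earlier semantic model (that metric is hypothetical here).

metrics:
  - name: average_order_value
    description: Average revenue per order.
    type: ratio
    label: Average Order Value
    type_params:
      numerator: order_total    # the simple metric defined above
      denominator: order_count  # hypothetical metric built on the order_count measure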

Test and query metrics

This section explains how you can test and run MetricFlow commands with dbt Cloud or dbt Core (dbt Cloud IDE support coming soon). dbt Cloud IDE users can skip to Run a production job to run a model.

Testing and querying metrics in the dbt Cloud IDE is currently not supported

Support for running MetricFlow commands in the dbt Cloud IDE is not available but is coming soon.

You can use the Preview or Compile buttons in the IDE to run semantic validations and make sure your metrics are defined. Alternatively, you can run commands with the dbt Cloud CLI or with SQL client tools like DataGrip, DBeaver, or RazorSQL.

This section is for people using the dbt Cloud CLI (support for dbt Cloud IDE is coming soon). With dbt Cloud:

  • You can run MetricFlow commands after installing the dbt Cloud CLI. They're integrated with dbt Cloud so you can use them immediately.
  • Your account will automatically manage version control for you.

To get started:

  1. Make sure you've installed the dbt Cloud CLI.

  2. Navigate to your dbt project directory.

  3. Run a dbt command, such as dbt parse, dbt run, dbt compile, or dbt build. If you don't, you'll receive an error message that begins with: "ensure that you've ran an artifacts...."

    • MetricFlow builds a semantic graph and generates a semantic_manifest.json file in dbt Cloud, which is stored in the /target directory. If using the Jaffle shop example, run dbt seed && dbt run to ensure the required data is in your data platform before proceeding.
  4. Run dbt sl --help to confirm you have MetricFlow installed and that you can view the available commands.

  5. Run dbt sl query --metrics <metric_name> --group-by <dimension_name> to query the metrics and dimensions. For example, dbt sl query --metrics order_total --group-by metric_time

  6. Verify that the metric values are what you expect. To further understand how the metric is being generated, you can view the generated SQL by adding --compile to the command (see the consolidated example after this list).

  7. Commit and merge the code changes that contain the metric definitions.
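Putting the CLI steps together, a typical session for the Jaffle shop example might look like the following sketch (the commands are the ones from the steps above):

dbt seed && dbt run   # load the example data and build the models
dbt parse             # generate target/semantic_manifest.json
dbt sl --help         # confirm MetricFlow commands are available
dbt sl query --metrics order_total --group-by metric_time             # query a metric
dbt sl query --metrics order_total --group-by metric_time --compile   # view the generated SQL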

To streamline your metric querying process, you can connect to the dbt Semantic Layer APIs to access your metrics programmatically. For SQL syntax, refer to Querying the API for metric metadata.
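For instance, tools that connect over JDBC submit metric queries using a templated syntax along these lines. This sketch reuses the order_total metric and metric_time dimension from this guide; the authoritative syntax is in the dbt Semantic Layer APIs docs:

select *
from {{
    semantic_layer.query(
        metrics=['order_total'],
        group_by=['metric_time']
    )
}}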

Run a production job

Once you’ve defined metrics in your dbt project, you can perform a job run in your deployment environment in dbt Cloud to materialize your metrics. Currently, the dbt Semantic Layer supports deployment environments only.

  1. Select Deploy from the top navigation bar.
  2. Select Jobs to rerun the job with the most recent code in the deployment environment.
  3. Your metric should appear as a red node in the dbt Cloud IDE and dbt directed acyclic graph (DAG).
DAG with metrics appearing as a red node
What’s happening internally?
- Merging the code into your main branch allows dbt Cloud to pull those changes and build the definition into the manifest produced by the run.
- Re-running the job in the deployment environment materializes the models that the metrics depend on in the data platform. It also makes sure that the manifest is up to date.
- The Semantic Layer APIs pull in the most recent manifest and enable your integration to extract metadata from it.

Set up dbt Semantic Layer

You can set up the dbt Semantic Layer in dbt Cloud at the environment and project level. Before you begin:

  • You must be part of the Owner group and have the correct license and permissions to configure the Semantic Layer:
    • Enterprise plan: a Developer license with Account Admin permissions, or an Owner with a Developer license and Project Creator, Database Admin, or Admin permissions.
    • Team plan: an Owner with a Developer license.
  • You must have a successful run in your new environment.
tip

If you've configured the legacy Semantic Layer, it has been deprecated. dbt Labs strongly recommends that you upgrade to dbt version 1.6 or higher to use the latest dbt Semantic Layer. Refer to the dedicated migration guide for details.

  1. In dbt Cloud, create a new deployment environment or use an existing environment on dbt 1.6 or higher.

    • Note: Only deployment environments are currently supported (a development experience is coming soon).
  2. Navigate to Account Settings and select the specific project you want to enable the Semantic Layer for.

  3. In the Project Details page, navigate to the Semantic Layer section, and select Configure Semantic Layer.

Semantic Layer section in the Project Details page
  4. In the Set Up Semantic Layer Configuration page, enter the credentials you want the Semantic Layer to use specific to your data platform.

    • Use credentials with minimal privileges. The Semantic Layer requires only read access to the schema(s) containing the dbt models used in your semantic models for downstream applications.
    • Note: Environment variables, such as {{env_var('DBT_WAREHOUSE')}}, aren't supported in the dbt Semantic Layer yet. You must use the actual credentials.
Enter the credentials you want the Semantic Layer to use specific to your data platform, and select the deployment environment.
  5. Select the deployment environment you want for the Semantic Layer and click Save.

  6. After saving it, you'll be provided with the connection information that allows you to connect to downstream tools. If your tool supports JDBC, save the JDBC URL or individual components (like environment ID and host); an illustrative URL appears after these steps. If it uses the GraphQL API, save the GraphQL API host information instead.

After configuring, you'll be provided with the connection details to connect to your downstream tools.
  7. Save and copy your environment ID, service token, and host, which you'll need to connect to downstream tools. For more info on how to integrate with partner integrations, refer to Available integrations.

  8. Return to the Project Details page, then select Generate Service Token. You will need Semantic Layer Only and Metadata Only service token permissions.
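For reference, the JDBC URL generally follows the Arrow Flight SQL format sketched below. The host shown is illustrative (use the host from your own connection details), and the placeholders stand in for your environment ID and service token:

jdbc:arrow-flight-sql://semantic-layer.cloud.getdbt.com:443?environmentId=<environment-id>&token=<service-token>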

Great job, you've configured the Semantic Layer! 🎉

Connect and query API

You can query your metrics in a JDBC-enabled tool or use existing first-class integrations with the dbt Semantic Layer.

  • You must have a dbt Cloud Team or Enterprise account. Both Multi-tenant and Single-tenant deployments are supported.
    • Single-tenant accounts should contact their account representative for the necessary setup and enablement.
  • To learn how to use the JDBC or GraphQL API and what tools you can query it with, refer to dbt Semantic Layer APIs.
  • To learn more about the sophisticated integrations that connect to the dbt Semantic Layer, refer to Available integrations. You can also use Exports to expose tables of metrics and dimensions in your data platform and create a custom integration with tools such as Power BI and more.

Next steps
