Prerequisites

Several prerequisites must be in place before you start the installation. Some additional prerequisites are needed only if you want to enable optional features of dbt Cloud.

Required

License

Each dbt Cloud installation requires a license file. To obtain one, contact sales.

Backend Database

dbt Cloud uses PostgreSQL as its backend database. The application bundle can provision an embedded Postgres instance, but this is not recommended for production use. You should configure an external Postgres instance with your cloud provider to enable enhanced backup and monitoring capability. When you configure the application using the configuration UI, you will provide credentials that dbt Cloud can use to connect to this Postgres instance.

Supported versions include 9.6, 10, and 11. The database should be configured with at least 50GB of storage.

dbt Cloud supports authentication to the database by password authentication only. Other PostgreSQL authentication methods are not currently supported.
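Before entering database credentials in the configuration UI, it can help to confirm that the external instance meets the requirements above. The following is a minimal sketch, not part of dbt Cloud: the function names are hypothetical, and it assumes you have already obtained the server version string (for example from `SHOW server_version;`) for a released PostgreSQL version.

```python
from urllib.parse import quote

# Major versions listed as supported in this section.
SUPPORTED_MAJOR_VERSIONS = {"9.6", "10", "11"}


def major_version(server_version: str) -> str:
    """Return the major version for a string such as '9.6.24' or '11.5 (Debian ...)'.

    Before PostgreSQL 10 the major version has two components (9.6);
    from 10 onward it is the first component only.
    """
    parts = server_version.strip().split()[0].split(".")
    if int(parts[0]) >= 10:
        return parts[0]
    return ".".join(parts[:2])


def is_supported(server_version: str) -> bool:
    """True if the server's major version is one of the supported versions."""
    return major_version(server_version) in SUPPORTED_MAJOR_VERSIONS


def connection_uri(host: str, port: int, database: str, user: str, password: str) -> str:
    """Build a libpq-style URI for password authentication.

    The user, password, and database name are percent-encoded so that
    characters such as '@' or '/' do not break the URI.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )
```

The resulting URI can be passed to any PostgreSQL client that accepts connection URIs to test the password login before installation.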

For major cloud providers, it is recommended to use the PostgreSQL PaaS offered by that cloud provider. These PaaS offerings include monitoring, security, and backups, which should be configured consistent with customer data policies.

The dbt Cloud application can be rolled back at any time by restoring a PostgreSQL backup.

Amazon Web Services

Customers hosting dbt Cloud on AWS should use RDS for PostgreSQL. The RDS instance should be a db.t3.medium or larger.

Google Cloud Platform

Customers hosting dbt Cloud on Google Cloud Platform should use Cloud SQL for PostgreSQL. The instance should be a db-n1-standard-1 or larger.

Microsoft Azure

Customers hosting dbt Cloud on Microsoft Azure should use Azure Database for PostgreSQL. The instance should have at least 2 vCores and 4GB of memory.

Object Storage

dbt Cloud supports S3 (and S3-compatible APIs) or Azure Blob Storage as its object storage solution for logs and run artifacts. You will need to provide two buckets (S3) or containers (Azure) to dbt Cloud: one for the logs generated by your dbt runs, and another for all other artifacts generated by those runs.

On AWS, you can authenticate to S3 with either an instance profile or an AWS keypair. Provision the instance profile or keypair with the following IAM permissions before creating the instance and installing the application.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObjectAcl",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:HeadBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<logs-bucket-name>",
        "arn:aws:s3:::<logs-bucket-name>/*",
        "arn:aws:s3:::<artifacts-bucket-name>",
        "arn:aws:s3:::<artifacts-bucket-name>/*"
      ]
    }
  ]
}
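If you manage several installations, you may prefer to generate the policy above rather than edit the placeholders by hand. This is a small sketch under that assumption; the function name and the example bucket names are hypothetical, and the actions are copied from the policy above.

```python
import json

# Actions copied from the policy shown above.
S3_ACTIONS = [
    "s3:PutObject",
    "s3:GetObjectAcl",
    "s3:GetObject",
    "s3:ListBucket",
    "s3:HeadBucket",
]


def build_policy(logs_bucket: str, artifacts_bucket: str) -> dict:
    """Return the IAM policy document for the two dbt Cloud buckets."""
    resources = []
    for bucket in (logs_bucket, artifacts_bucket):
        resources.append(f"arn:aws:s3:::{bucket}")    # the bucket itself
        resources.append(f"arn:aws:s3:::{bucket}/*")  # every object in it
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": S3_ACTIONS, "Resource": resources}
        ],
    }


if __name__ == "__main__":
    # Example bucket names are placeholders; substitute your own.
    policy = build_policy("example-dbt-logs", "example-dbt-artifacts")
    print(json.dumps(policy, indent=2))
```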

For Azure, you will need the full connection string for each of the two containers.

Google Cloud Storage

For Google Cloud Storage, you will need to use the S3-compatible API. After provisioning the two buckets, you will need to generate an HMAC key (https://cloud.google.com/storage/docs/authentication/hmackeys) that has access to the two buckets. You will use this keypair as your S3 Access Key and Access Secret in the Configuration console. When you go through the configuration process, under S3 Region, pick the GCS region (or multi-region alias). For S3 Endpoint, you should use https://storage.googleapis.com.
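To check that the HMAC key can reach both buckets through the S3-compatible endpoint before you enter it in the Configuration console, you could use any S3 client. The sketch below assumes the third-party boto3 library is installed; the function names and bucket names are hypothetical and are not part of dbt Cloud.

```python
GCS_S3_ENDPOINT = "https://storage.googleapis.com"


def gcs_s3_client_kwargs(hmac_access_id: str, hmac_secret: str, region: str) -> dict:
    """Keyword arguments for boto3.client() that target the GCS S3-compatible API."""
    return {
        "service_name": "s3",
        "endpoint_url": GCS_S3_ENDPOINT,
        "region_name": region,  # the GCS region or multi-region alias
        "aws_access_key_id": hmac_access_id,
        "aws_secret_access_key": hmac_secret,
    }


def check_buckets(kwargs: dict, buckets: list) -> None:
    """Raise an error if any bucket is unreachable with the given HMAC key."""
    import boto3  # imported lazily; assumes boto3 is installed

    client = boto3.client(**kwargs)
    for bucket in buckets:
        client.head_bucket(Bucket=bucket)
```

For example, `check_buckets(gcs_s3_client_kwargs(access_id, secret, "us"), ["example-dbt-logs", "example-dbt-artifacts"])` raises if either bucket cannot be reached with that key.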

SSL/TLS Certificate

Your dbt Cloud instance should live at a domain name controlled by your organization (e.g. dbt.<your-organization>.com). To secure users' connections to the dbt Cloud application, you will need to provision a TLS certificate and upload it to the dbt Cloud instance.

For installations into an existing Kubernetes cluster, SSL termination should be handled by an external load balancer or a customer-managed proxy service. After installation, there will be a service of type ClusterIP called api-gateway in the provided namespace. You must define a custom ingress or service that connects your load balancer to the dbt Cloud application.
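One way to connect a load balancer to the api-gateway service is a Kubernetes Ingress. The sketch below builds such a manifest as JSON, which `kubectl apply -f` accepts. Only the service name api-gateway comes from this document; the Ingress name, namespace, hostname, service port, and TLS secret name are assumptions, so check the actual port with `kubectl get service api-gateway` in your namespace and adapt the manifest to your ingress controller.

```python
import json


def api_gateway_ingress(namespace: str, hostname: str, service_port: int, tls_secret: str) -> dict:
    """Build a networking.k8s.io/v1 Ingress that routes a hostname to api-gateway."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": "dbt-cloud", "namespace": namespace},  # name is hypothetical
        "spec": {
            # TLS is terminated at the ingress controller using this secret.
            "tls": [{"hosts": [hostname], "secretName": tls_secret}],
            "rules": [
                {
                    "host": hostname,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": "api-gateway",
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # All argument values here are placeholders, including the port.
    manifest = api_gateway_ingress("dbt-cloud", "dbt.example.com", 80, "dbt-cloud-tls")
    print(json.dumps(manifest, indent=2))
```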

Storage Classes (Existing Kubernetes Only)

dbt Cloud requires a StorageClass that supports creating PersistentVolumeClaims in ReadWriteMany mode. A number of ready-made storage provisioners for custom storage classes are available here: https://github.com/kubernetes-incubator/external-storage

We recommend using NFS.
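To confirm that a StorageClass can satisfy ReadWriteMany claims before installing, you could create a small test claim and check that it binds. The sketch below generates such a claim as JSON for `kubectl apply -f`; the claim name, namespace, storage class name, and size are all placeholders rather than values dbt Cloud requires.

```python
import json


def rwx_test_claim(namespace: str, storage_class: str, size: str = "1Gi") -> dict:
    """Build a throwaway PersistentVolumeClaim that requests ReadWriteMany access."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "rwx-check", "namespace": namespace},  # hypothetical name
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "nfs" is a placeholder for the name of your own StorageClass.
    print(json.dumps(rwx_test_claim("dbt-cloud", "nfs"), indent=2))
```

Depending on the StorageClass's volume binding mode, the claim may stay Pending until a pod mounts it, so a Pending status alone does not necessarily indicate a problem. Delete the test claim once you have checked it.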

Optional

Auth0 (included in License)

dbt Cloud uses Auth0 to provide SSO integration with a variety of platforms. If you opt to use an SSO provider such as Okta, your license file will include a set of credentials specific to your installation to enable the Auth0 integration.

Email over SMTP

dbt Cloud has the ability to send emails over SMTP. The emails that dbt Cloud can send include:

  • Post-run notifications: your users can get notified when runs finish.
  • Email invitations: if your instance supports login via email/password, you can send email invitations that will allow users to log in.
  • Password resets: if your instance supports login via email/password, users can reset their own passwords.

To configure SMTP, you will provide a set of SMTP credentials in the configuration console. These credentials include:

  • SMTP Host: the hostname of your SMTP server.
  • SMTP Port: the port of your SMTP server (often 25 or 465).
  • SMTP Username: the username to use when logging into the SMTP server. For some platforms, this will be someone's actual username. For others, you can generate a service account username.
  • SMTP Password: the password for the given username.
  • System "From" Email Address: The "From" email address that dbt Cloud will use. Typically this will be a "no-reply" or system address. If you'd like to enable your users to reply to these emails, you can change the "From" address accordingly. Make sure that you have configured your SMTP server to allow outbound email coming from this "From" address.
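Before entering these values in the configuration console, you may want to confirm that the credentials and "From" address are accepted by your SMTP server. The following is a minimal sketch using Python's standard smtplib; the function names and the message text are hypothetical, and it assumes port 465 uses implicit TLS while other ports offer STARTTLS when available, which may not match every server.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(from_addr: str, to_addr: str) -> EmailMessage:
    """Build a short plain-text message from the system "From" address."""
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = "dbt Cloud SMTP test"
    msg.set_content("This message checks the SMTP settings intended for dbt Cloud.")
    return msg


def send_test_message(host: str, port: int, username: str, password: str,
                      from_addr: str, to_addr: str) -> None:
    """Log in with the given credentials and send one test message."""
    msg = build_test_message(from_addr, to_addr)
    if port == 465:
        # Assumption: port 465 expects TLS from the first byte.
        with smtplib.SMTP_SSL(host, port, timeout=10) as server:
            server.login(username, password)
            server.send_message(msg)
    else:
        with smtplib.SMTP(host, port, timeout=10) as server:
            server.ehlo()
            # Upgrade to TLS only if the server advertises STARTTLS.
            if server.has_extn("starttls"):
                server.starttls()
            server.login(username, password)
            server.send_message(msg)
```

If the server rejects the "From" address, `send_message` raises an smtplib exception, which suggests that outbound mail from that address is not yet allowed.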

GitHub

dbt Cloud can integrate with your GitHub, GitHub Enterprise, or GitHub Enterprise Server installation to enable interacting with GitHub Pull Requests, as well as automatically permissioning your environments via GitHub's API. To set this up, you will need to be an Owner on your GitHub organization.

First, log into GitHub and navigate to your organization's settings page. Click GitHub Apps (not OAuth Apps!), and then click New GitHub App to start configuring a new GitHub app for your dbt Cloud instance.

Enter the following values, replacing <hostname> with the hostname of your dbt Cloud installation.

  • GitHub App name: dbt Cloud
  • Homepage URL: https://www.getdbt.com
  • User authorization callback URL: https://<hostname>/complete/github
  • Setup URL: https://<hostname>/#/profile/integrations
  • Webhook URL: https://<hostname>/api/v1/webhooks/github/
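Since every instance-specific value above is derived from the hostname, a small helper can produce them without copy-and-paste mistakes. This is an illustrative sketch; the function name and example hostname are hypothetical, and the paths are taken from the list above.

```python
def github_app_urls(hostname: str) -> dict:
    """Return the GitHub App form values for a dbt Cloud hostname."""
    base = f"https://{hostname}"
    return {
        "GitHub App name": "dbt Cloud",
        "Homepage URL": "https://www.getdbt.com",
        "User authorization callback URL": f"{base}/complete/github",
        "Setup URL": f"{base}/#/profile/integrations",
        "Webhook URL": f"{base}/api/v1/webhooks/github/",
    }


if __name__ == "__main__":
    # dbt.example.com is a placeholder hostname.
    for field, value in github_app_urls("dbt.example.com").items():
        print(f"{field}: {value}")
```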

Under Permissions, select the following:

  • Checks: Read & write
  • Repository contents: Read & write
  • Repository metadata: Read-only
  • Pull requests: Read-only
  • Commit statuses: Read & write

Under Subscribe to events, select the following:

  • Pull request
  • Push

Save the new application.

During dbt Cloud setup, the installer will need to provide credentials for this application to the dbt Cloud configuration console.

They will need the following:

  • the base URL of your GitHub Enterprise installation, e.g. github.mycompany.com
  • the scheme-included API URL of your GitHub Enterprise installation, usually https://github.mycompany.com/api/v3
  • the App ID, Client ID, and Client Secret from the "About" page of the newly created GitHub app
  • the Configuration URL for your GitHub app: right click "Install app" on the "About" page, then click "Copy Link Location" to get the Configuration URL
  • the Install URL: right click "Public page" on the "About" page, then click "Copy Link Location" to get the Install URL
  • the Private Key PEM: on the "About" page of the newly created GitHub app, scroll down to the bottom of the page. Under "Private Keys," click "Generate a Private Key." Download the key and provide it to the installer.

Slack

dbt Cloud can send Slack notifications when runs finish. You can provide a set of Slack credentials for dbt Cloud to use to interact with a Slack app you create.

Datadog

dbt Cloud can be configured to send logs and metrics to your Datadog account. This requires configuration and installation of a Datadog agent.

Google SSO

If you run dbt Cloud in GCP with Private Google Access enabled, you can configure SSO using either the default googleapis.com URL or a custom one.

Okta SSO

dbt Cloud can be configured to integrate with Okta SSO by providing a public SAML certificate and the corresponding private key. The key pair can be generated using an openssl command such as the following.

openssl req -new -x509 -days 3652 -nodes -out saml.crt -keyout saml.key
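After running the command above, you can confirm that saml.crt and saml.key belong together before providing them. The sketch below uses Python's standard ssl module, which refuses to load a certificate with a private key that does not match it; the function name is hypothetical, and the check assumes an unencrypted key, as produced by the -nodes option.

```python
import ssl


def key_matches_cert(cert_path: str, key_path: str) -> bool:
    """Return True if the private key file matches the certificate file.

    Returns False if the pair does not match or if either file
    cannot be read or parsed.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except OSError:
        # ssl.SSLError and FileNotFoundError are both subclasses of OSError.
        return False
    return True


if __name__ == "__main__":
    print(key_matches_cert("saml.crt", "saml.key"))
```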