ITNEXT

ITNEXT is a platform for IT developers & software engineers to share knowledge, connect, collaborate, learn and experience next-gen technologies.

dbt and Amazon Redshift Serverless: Serverless Lakehouse Data Modeling

Gary A. Stafford
Published in ITNEXT · 7 min read · Aug 23, 2022

[ Post updated on 2022-12-30 to reflect the latest version of dbt ]

AWS announced the general availability (GA) of Amazon Redshift Serverless on July 12, 2022. Amazon Redshift Serverless allows data analysts, developers, and data scientists to run and scale analytics without having to provision and manage data warehouse clusters. You may find specific use cases where Amazon Redshift Serverless is a more suitable analytics tool than provisioned Amazon Redshift. The good news is that dbt (data build tool) is compatible with both provisioned Amazon Redshift and Amazon Redshift Serverless.

In this post, we will explore the use of dbt, developed by dbt Labs, to transform data in an AWS-based data lakehouse built with Amazon Redshift Serverless, Amazon Redshift Spectrum, AWS Glue Data Catalog, and Amazon S3.

High-level architecture demonstrated for this blog post

Namespaces and Workgroups

As outlined in the AWS documentation, there are many differences between provisioned and serverless Redshift. One of the more significant differences is the concept of namespaces and workgroups with Redshift Serverless instead of clusters with provisioned Redshift. According to AWS, “Amazon Redshift Serverless doesn’t have the concept of a cluster. Instead, you have a workgroup, which contains the compute resources available to process workloads, and the namespace, which contains the associated database resources, snapshots, encryption keys, and users.”
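To make the split between compute and database resources concrete, here is a minimal Python sketch of how a workgroup and its namespace might feed a dbt connection target. It is an illustration only, not part of the original post: the class names, the helper `dbt_target`, and all example values (workgroup name, namespace name, account ID, user) are hypothetical, and the endpoint pattern is an assumption about how Redshift Serverless workgroup endpoints are typically formed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Namespace:
    """Database-side resources (databases, users, snapshots, encryption keys)."""
    name: str
    db_name: str


@dataclass(frozen=True)
class Workgroup:
    """Compute-side resources, associated with a namespace."""
    name: str
    namespace: Namespace
    account_id: str
    region: str
    port: int = 5439  # default Redshift port

    @property
    def host(self) -> str:
        # Assumed endpoint pattern:
        # <workgroup>.<account-id>.<region>.redshift-serverless.amazonaws.com
        return (
            f"{self.name}.{self.account_id}.{self.region}"
            ".redshift-serverless.amazonaws.com"
        )


def dbt_target(workgroup: Workgroup, user: str, schema: str) -> dict:
    """Build a dict shaped like a dbt-redshift profile target (password omitted)."""
    return {
        "type": "redshift",
        "host": workgroup.host,                 # compute comes from the workgroup
        "port": workgroup.port,
        "dbname": workgroup.namespace.db_name,  # database comes from the namespace
        "user": user,
        "schema": schema,
    }


# Hypothetical example values, for illustration only
ns = Namespace(name="example-namespace", db_name="dev")
wg = Workgroup(
    name="example-workgroup",
    namespace=ns,
    account_id="123456789012",
    region="us-east-1",
)
target = dbt_target(wg, user="example_user", schema="public")
print(target["host"])
```

The point of the sketch is that the connection host is derived from the workgroup, while the database name belongs to the namespace; there is no cluster identifier anywhere in the target.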

Personally, I feel the dissimilar hierarchical relationships within data services confuse many users: databases within database instances (e.g., Amazon RDS), databases within database clusters (e.g., Amazon Aurora and provisioned Redshift), databases within instances within clusters (e.g., Amazon DocumentDB), and now databases within namespaces associated with workgroups in Amazon Redshift Serverless.

Pricing

Another significant difference is how Amazon Redshift Serverless is billed. Unlike provisioned Redshift, billed per…

Written by Gary A. Stafford

Area Principal Solutions Architect @ AWS | 10x AWS Certified Pro | Polyglot Developer | DataOps | GenAI | Technology consultant, writer, and speaker