Open Source Data Lake Table Formats: Evaluating Current Interest and Rate of Adoption

Measuring the current levels of interest and potential adoption rates of leading data lake table formats using commonly available metrics

Gary A. Stafford
17 min read · Feb 12, 2022

This post examines the current levels of interest and potential adoption rates for three popular data lake table formats: Apache Hudi™, Apache Iceberg™, and Delta Lake™. Using publicly available data, it offers an unbiased review of analytics community involvement, project activity, commercial support, and third-party vendor integration. Understanding these metrics is critical to an organization’s decision to adopt a data lake table format. Confidence that an open-source project or commercial product has sufficient backing, longevity, and a robust user base must be part of any product selection criteria.

Image copyright: peshkov (123rf.com)

Prelude: Big Data and Analytics Market

According to Pitchbook, US venture capital-backed companies raised $329.6 billion…

Gary A. Stafford

Area Principal Solutions Architect @ AWS | 10x AWS Certified Pro | Polyglot Developer | DataOps | GenAI | Technology consultant, writer, and speaker