Working with binary-encoded Apache Avro files and Avro schemas using Apache Avro Tools

Introduction

Amazon S3 Select, a feature of Amazon Simple Storage Service (Amazon S3), has been generally available since April 2018. According to documentation, “with Amazon S3 Select, you can use simple structured query language (SQL) statements to filter the contents of an Amazon S3 object and retrieve just the subset of…


Using a registry to decouple schemas from messages in an event streaming analytics architecture

In the last post, Getting Started with Spark Structured Streaming and Kafka on AWS using Amazon MSK and Amazon EMR, we learned about Apache Spark and Spark Structured Streaming on Amazon EMR (fka Amazon Elastic MapReduce) with Amazon Managed Streaming for Apache Kafka (Amazon MSK). We consumed messages from and…


Exploring Apache Spark with Apache Kafka using both batch queries and Spark Structured Streaming

Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. Using Structured Streaming, you can express your streaming computation the same way you would express a batch computation on static data. In this post, we will learn how to use Apache Spark and Spark…


Import data from Amazon RDS into Amazon S3 using Amazon MSK, Apache Kafka Connect, Debezium, Apicurio Registry, and Amazon EKS

In the last post, Hydrating a Data Lake using Query-based CDC with Apache Kafka Connect and Kubernetes on AWS, we utilized Kafka Connect to export data from an Amazon RDS for PostgreSQL relational database and import the data into a data lake built on Amazon Simple Storage Service (Amazon S3)…


Import data from an Amazon RDS database into an Amazon S3-based data lake using Amazon EKS, Amazon MSK, and Apache Kafka Connect

A data lake, according to AWS, is a centralized repository that allows you to store all your structured and unstructured data at any scale. Data is collected from multiple sources and moved into the data lake. …


Securely decoupling Go-based microservices on Amazon EKS using Amazon MSK with IRSA, SASL/SCRAM, and data encryption

As organizations scale and mature, they frequently endeavor to move away from a monolithic application architecture toward a distributed, microservices-based paradigm. As part of this transition, organizations regularly embrace modern programming languages and frameworks, adopt containerization, acquire a preference for open-source software components, and opt for asynchronous event-driven communication models…


Observing a gRPC-based Kubernetes application using Jaeger, Zipkin, Prometheus, Grafana, and Kiali on Amazon EKS running Istio service mesh

In the previous two-part post, Kubernetes-based Microservice Observability with Istio Service Mesh, we explored a set of popular open source observability tools easily integrated with the Istio service mesh. The tools included Jaeger and Zipkin for distributed transaction monitoring, Prometheus for metrics collection and alerting, Grafana for metrics querying, visualization…


Automatically build and push Docker images to Docker Hub using GitHub Actions

According to GitHub, GitHub Actions allows you to automate, customize, and execute your software development workflows right in your repository. You can discover, create, and share actions to perform any job you would like, including continuous integration (CI) and continuous deployment (CD), and combine actions in a completely customized workflow.


Observing a distributed system using Jaeger, Prometheus, Grafana, Kiali, and Fluent Bit on Amazon EKS with Istio Service Mesh

In part two of this two-part post, we will continue to explore the set of popular open-source observability tools easily integrated with the Istio service mesh. While these tools are not a part of Istio, they are essential to making the most of Istio’s observability features. The tools include Jaeger


Observing a distributed system using Jaeger, Prometheus, Grafana, Kiali, and Fluent Bit on Amazon EKS with Istio Service Mesh

This two-part post explores a set of popular open-source observability tools easily integrated with the Istio service mesh. While these tools are not a part of Istio, they are essential to making the most of Istio’s observability features. The tools include Jaeger and Zipkin for distributed transaction monitoring, Prometheus for…

Gary A. Stafford

AWS Senior Solutions Architect | AWS Certified Pro | Polyglot Developer | Data Analytics | DataOps | DevOps

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store