Getting Started with PySpark for Big Data Analytics using Jupyter Notebooks and Jupyter Docker Stacks

Gary A. Stafford
16 min readNov 22, 2018

An updated version of this popular post is published in Towards Data Science: Getting Started with Data Analytics using Jupyter Notebooks, PySpark, and Docker

Introduction

There is little question, big data analytics, data science, artificial intelligence (AI), and machine learning (ML), a subcategory of AI, have all experienced a tremendous surge in popularity over the last few years. Behind the hype curves and marketing buzz…

--

--

Gary A. Stafford

Area Principal Solutions Architect @ AWS | 10x AWS Certified Pro | Polyglot Developer | DataOps | GenAI | Technology consultant, writer, and speaker