
Title: Intro to Amazon EMR - Big Data Tutorial using Spark

Edit* Make sure you encrypt your Spark script as you upload it to S3 (timestamp: 13:42). There's a small typo in line 41 of the code; it should be "add_argument".

Intro
Today we're going to talk about a popular tool in Data Engineering. Amazon EMR is an industry-leading big data platform. It's a really mature service, developed way back in 2009, and it draws heavily on the Apache Hadoop project. EMR is used for processing terabytes' worth of data and for training machine learning models. In this tutorial, we'll dive deep into EMR's architecture, run a live demo of how to trigger jobs using Steps, and demonstrate how to use Spark to extract data from Amazon S3. Hope you enjoy this one!

Timestamps ⏰
0:00 Intro
1:16 Overview of Amazon EMR
5:10 Create filesystem, VPC, and configure EMR cluster
9:04 Writing our Spark script
13:42 3 ways to Trigger Steps in EMR
18:32 SSH into Resource Manager in YARN
19:50 Enable EMR managed auto-scaling
20:57 Summary

Notes from video 📝

Who am I? 🙋🏻‍♂️
I'm Jay. I love making videos about travel, self-help and tech. I currently work in New York City as a data engineer, but I grew up in Malaysia and lived in the UK when I was 19. Back then, I had no idea what life was about, moving to so many places and navigating a career in tech. Today, I've learned a lot and wanna share my perspective through filmmaking.

Socials 📱
instagram:

Sub Count: 4,539
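For readers following the "Writing our Spark script" and "3 ways to Trigger Steps in EMR" chapters, here is a minimal sketch of how a script's command-line arguments and an EMR step definition might fit together. It is not the code from the video: the argument names (--data_source, --output_uri), the step name, and the S3 paths are all assumptions chosen for illustration. The step dictionary follows the shape that boto3's add_job_flow_steps accepts, with command-runner.jar launching spark-submit.

```python
import argparse


def parse_args(argv=None):
    """Parse the S3 locations that the EMR step passes to the script."""
    parser = argparse.ArgumentParser(description="Hypothetical Spark job for EMR")
    # Note the method name: add_argument (the typo mentioned in the edit note).
    # Both argument names below are assumptions, not taken from the video.
    parser.add_argument("--data_source", required=True, help="Input S3 URI")
    parser.add_argument("--output_uri", required=True, help="Output S3 URI")
    return parser.parse_args(argv)


def build_step(script_uri, args):
    """Build one step definition for an EMR cluster.

    The returned dict can be passed in the Steps list of boto3's
    emr_client.add_job_flow_steps(JobFlowId=..., Steps=[...]).
    """
    return {
        "Name": "spark-job",  # hypothetical step name
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # command-runner.jar lets a step run commands such as spark-submit
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,
                "--data_source", args.data_source,
                "--output_uri", args.output_uri,
            ],
        },
    }
```

Inside the Spark script itself, the parsed values would typically be handed to the reader and writer, for example spark.read.csv(args.data_source) and df.write.parquet(args.output_uri). The same Args list is what you would supply when adding the step through the console or the AWS CLI.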

