DEV 360 - Introduction to Apache Spark (Spark v1.6)

About this Course

This introductory course enables developers to build simple Spark applications for Apache Spark v1.6. It introduces the benefits of Spark for developing big data processing applications, shows how to load and inspect data using the Spark interactive shell, and walks through building a standalone application.

What's Covered

1: Introduction to Apache Spark

Course Lessons:
  • Describe the Features of Apache Spark
  • Define Apache Spark Components

Lab Activities:
  • No labs

2: Load and Inspect Data in Spark

Course Lessons:
  • Describe the Different Data Sources and Formats in Spark
  • Create and Use Resilient Distributed Datasets (RDD)
  • Apply Operations to RDDs
  • Cache Intermediate RDD
  • Create and Use DataFrames

Lab Activities:
  • Load and Inspect Auction Data
  • Load and Inspect Data in DataFrames

3: Build a Simple Spark Application

Course Lessons:
  • Define the Lifecycle of a Spark Program
  • Define the Function of SparkContext
  • Define Different Ways to Run a Spark Application
  • Run Your Spark Application

Lab Activities:
  • Build a Simple Spark Application
  • Package the Spark Application
  • Launch the Application

Get Certified

This course is part of the preparation for the MapR Certified Spark Developer (MCSD) certification exam.

Prerequisites

  • Completion of ESS 100 - 102, and ESS 360
  • Basic Hadoop knowledge and intermediate Linux knowledge
  • Experience using a text editor such as vi
  • Terminal program installed; familiarity with command-line utilities such as mv, cp, ssh, grep, cd, and useradd
  • Knowledge of functional programming with Scala or Python, and experience with SQL

Curriculum

  • Lesson 1 - Introduction to Apache Spark
  • Quiz 1
  • Lesson 2 - Load and Inspect Data
  • Quiz 2
  • Lesson 3 - Build a Simple Apache Spark Application
  • Quiz 3
  • Course Materials
  • Slide Guide (Transcript)
  • Lab Guide
