DEV 362 - Create Data Pipelines Using Apache Spark

Not currently available

About this Course

DEV 362 describes the benefits of the Apache Spark unified platform and how to build data pipeline applications using Spark Streaming, Spark SQL, Spark GraphX, and MLlib. The concepts are taught using scenarios in Scala that also form the basis of the hands-on labs.

Prerequisites

  • Completion of ESS 100, ESS 101, and ESS 360
  • Basic Hadoop knowledge and intermediate Linux knowledge
  • Experience using a text editor such as vi
  • Terminal program installed; familiarity with commands such as mv, cp, ssh, grep, cd, and useradd
  • Knowledge of functional programming with Scala or Python, and experience with SQL

Certification

This course is part of the preparation for the MapR Certified Spark Developer (MCSD) certification exam.

Syllabus

Lesson 7  Introduction to Apache Spark Data Pipelines

  • Identify Spark Unified Stack Components
  • List Benefits of Apache Spark Unified Stack Over Hadoop Ecosystem
  • Describe Spark Data Pipeline Use Cases

Lesson 8  Create an Apache Spark Streaming Application

  • Describe Spark Streaming Architecture
  • Create DStreams and a Spark Streaming Application
  • Apply Operations on DStreams
  • Define Windowed Operations
  • Describe How Streaming Applications are Fault-Tolerant
  • Lab 8: Create a Spark Streaming Application

Lesson 9  Use Apache Spark GraphX to Analyze Flight Data

  • Describe GraphX
  • Define Regular, Directed, and Property Graphs
  • Create a Property Graph
  • Perform Operations on Graphs
  • Lab 9: Use Apache Spark GraphX

Lesson 10  Use Apache Spark MLlib

  • Describe Apache Spark MLlib
  • Describe the Machine Learning Techniques
  • Use Collaborative Filtering to Predict User Choice
  • Lab 10: Use Apache Spark MLlib to Make Recommendations

Curriculum

  • Lesson 7 - Introduction to Apache Spark Data Pipelines
  • Quiz 7
  • Lesson 8 - Create Data Pipelines With Apache Spark
  • Quiz 8
  • Lesson 9 - Use Apache Spark GraphX
  • Quiz 9
  • Lesson 10 - Use Apache Spark MLlib
  • Quiz 10
