DA 450 - Apache Pig Essentials

This introductory Apache Pig course, aimed at data analysts, data scientists, and SQL programmers, covers how to use Pig to analyze structured data without writing MapReduce code.

About this Course

DA 450 – Apache Pig Essentials is an introductory-level course designed for data analysts and developers. The course begins with a review of data pipeline tools, then covers how to load and manipulate relations in Pig.

Together with DA 440 – Apache Hive Essentials, this course covers how to use Pig and Hive as part of a single data flow in a Hadoop cluster.

This on-demand course is designed to fit your schedule. Each lesson and quiz takes approximately 30 to 45 minutes to complete. Lab activities take additional time, which varies with your system.

Prerequisites

Required:

  • Completion of the on-demand course ESS 100 - Big Data Essentials
  • Completion of the on-demand course ESS 101 - Apache Hadoop Essentials
  • Completion of the on-demand course ESS 102 - MapR Converged Data Platform Essentials

Recommended:

  • Basic Hadoop knowledge

Certification

The courses in this curriculum prepare you for the MapR Certified Data Analyst (MCDA) certification exam.

Syllabus

Lesson 1: Pig in the Hadoop Ecosystem
  • Use cases of Pig
  • Lab 1.1: Connect to the Grunt Shell
  • Steps in the data pipeline
  • Data types used in Pig
Lesson 2: Extract, Transform, and Load Data
  • Load data into relations
  • Lab 2.1: Load Data into Pig Relations
  • Debug Pig scripts
  • Lab 2.2: Examine Pig Relations
  • Perform simple manipulations
  • Lab 2.3: Basic Data Manipulations
  • Save relations as files
  • Lab 2.4: Store Data
Lesson 3: Manipulate Data
  • Subset relations
  • Lab 3.1: Load and Filter Relations
  • Combine relations
  • Lab 3.2: Transform and Join Relations
  • Use UDFs on relations
  • Lab 3.3: Explore Data

Curriculum

  • Lesson 1: Apache Pig in the Hadoop Ecosystem
  • Quiz 1
  • Lesson 2: Extract, Transform, and Load Data with Apache Pig
  • Quiz 2
  • Lesson 3: Manipulate Data with Apache Pig
  • Quiz 3
  • Course Materials
  • Slide Guide (Transcript)
  • Lab Guide
  • Join MapR Community Discussions
