DA 440 - Apache Hive Essentials

This introductory Apache Hive course is aimed at data analysts, data scientists, and SQL programmers. It covers how to use Hive to query structured data without writing MapReduce code.

About this Course

DA 440 is an introductory-level course designed for data analysts and developers. You will learn how Apache Hive fits in the Hadoop ecosystem, how to create and load tables in Hive, and how to query data using the Hive Query Language.

Together with DA 450 - Apache Pig Essentials, this course covers how to use Pig and Hive as part of a single data flow in a Hadoop cluster. The course begins with a review of SQL-on-Hadoop tools, then covers how to create, load, query, and manipulate tables in Hive.

This on-demand course is designed to fit your schedule. Each lesson and quiz takes approximately 30 to 45 minutes to complete. Lab activities take additional time, which varies with your system.

Prerequisites

Required:

  • Completion of the on-demand course ESS 100 - Big Data Essentials
  • Completion of the on-demand course ESS 101 - Apache Hadoop Essentials
  • Completion of the on-demand course ESS 102 - MapR Converged Data Platform Essentials

Recommended:

  • Basic Hadoop knowledge

Certification

The courses in this curriculum prepare you for the MapR Certified Data Analyst (MCDA) certification exam.

Syllabus

Lesson 1 - Hive in the Hadoop Ecosystem

  • Hive Use Cases
  • Lab 1.1: Connect to the Hive CLI
  • Steps in the data pipeline
  • Hive in the Hadoop Ecosystem
  • Data types used with Hive
  • Lab 1.4: Cast Data

Lesson 2 - Create and Load Data

  • Create databases and internal tables
  • Lab 2.1a: Create a Database
  • Lab 2.1b: Create a Simple Table
  • Create external tables and partitioned tables
  • Lab 2.2: Create Partitioned and External Tables
  • Load data into tables and databases
  • Lab 2.3: Load Data into Tables
  • Alter and drop tables
  • Lab 2.4: Examine Databases and Tables

Lesson 3 - Query and Manipulate Data

  • Query, sort, and filter data
  • Lab 3.1: Query Data with SELECT
  • Manipulate data with user-defined functions
  • Lab 3.2: Query Data with UDFs
  • Combine and store tables
  • Lab 3.3: Combine and Store Data

Curriculum

  • Lesson 1: Apache Hive in the Hadoop Ecosystem
  • Quiz 1
  • Lesson 2: Create and Load Tables in Apache Hive
  • Quiz 2
  • Lesson 3: Manipulate and Query Tables in Apache Hive
  • Quiz 3
  • Course Materials
  • Slide Guide (Transcript)
  • Lab Guide
  • Join MapR Community Discussions
