DA 440 - Query and Store Data with Apache Hive

About this Course

This course begins with a review of SQL-on-Hadoop tools and shows where Apache Hive fits in the Hadoop ecosystem. It then covers how to create, load, query, and manipulate tables in Hive, so that you can query structured data with the Hive Query Language without writing MapReduce code. Taken together with DA 450 - Transform Data with Apache Pig, it shows how to use Pig and Hive as part of a single data flow in a Hadoop cluster.

What's Covered

1: Hive in the Hadoop Ecosystem

Course Lessons:

  • Hive Use Cases
  • Steps in the Data Pipeline
  • Hive in the Hadoop Ecosystem
  • Data Types Used With Hive

Lab Activities:

  • Connect to the Hive CLI
  • Cast Data

2: Create and Load Data

Course Lessons:

  • About Physical Query Plans
  • Create Databases and Internal Tables
  • Create External Tables and Partitioned Tables
  • Load Data Into Tables and Databases
  • Alter and Drop Tables

Lab Activities:

  • Create a Database
  • Create a Simple Table
  • Create Partitioned and External Tables
  • Load Data Into Tables
  • Examine Databases and Tables

3: Query and Store Data

Course Lessons:

  • Query, Sort, and Filter Data
  • Manipulate Data With User-defined Functions
  • Combine and Store Tables

Lab Activities:

  • Query Data With SELECT
  • Query Data With UDFs
  • Combine and Store Data

Get Certified

This course is part of the preparation for the MapR Certified Data Analyst (MCDA) certification exam.

Prerequisites

  • Completion of ESS 100 - 102
  • Linux skills, including familiarity with commands such as ls, cd, cp, and su
  • Beginning to intermediate proficiency with SQL
  • Basic Hadoop knowledge

Curriculum

  • Lesson 1 - Apache Hive in the Hadoop Ecosystem
  • Lesson 2 - Create and Load Data in Apache Hive
  • Quiz 2
  • Lesson 3 - Query Data in Apache Hive
  • Quiz 3
  • Course Materials
  • Slide Guide (Transcript)
  • Lab Guide
