DA 440 - Apache Hive Essentials
DA 440 is an introductory-level course designed for data analysts and developers. You will learn how Apache Hive fits in the Hadoop ecosystem, how to create and load tables in Hive, and how to query data using the Hive Query Language.
Together with DA 450 - Apache Pig Essentials, this course covers how to use Pig and Hive as part of a single data flow in a Hadoop cluster. The course begins with a review of SQL-on-Hadoop tools, then covers how to create, load, query, and manipulate tables in Hive.
This on-demand course is designed to be flexible to fit your schedule. Each lesson and quiz takes approximately 30 to 45 minutes to complete.
- Option 1: Complete the course in one session of approximately 90 to 135 minutes
- Option 2: Complete the course over three days, at 30 to 45 minutes per day
Lab activities take additional time, which varies with your system.
Lesson 1: Hive in the Hadoop Ecosystem
- Use cases of Hive
- Steps in the data pipeline
Lesson 2: Create and Load Data
- Create databases, internal tables, external tables, and partitioned tables
- Learn about data types and casting in Hive
- Load data into tables and databases
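As a rough preview of the Lesson 2 topics, the sketch below builds a few HiveQL statements as plain text in Python. It only renders strings and does not connect to Hive. The helper functions, table names, column names, and paths are all hypothetical, chosen for illustration; the course labs may use different ones.

```python
# Hypothetical helpers: they render HiveQL as text and never contact a cluster.
def create_table_ddl(name, columns, partitions=None, location=None):
    """Return a HiveQL CREATE TABLE statement.

    columns / partitions: lists of (column_name, hive_type) pairs.
    location: if given, the table is declared EXTERNAL at that path.
    """
    def render(pairs):
        return ", ".join(f"{col} {typ}" for col, typ in pairs)

    parts = ["CREATE EXTERNAL TABLE" if location else "CREATE TABLE"]
    parts.append(f"IF NOT EXISTS {name} ({render(columns)})")
    if partitions:
        parts.append(f"PARTITIONED BY ({render(partitions)})")
    parts.append("ROW FORMAT DELIMITED FIELDS TERMINATED BY ','")
    if location:
        parts.append(f"LOCATION '{location}'")
    return " ".join(parts) + ";"


def load_data_stmt(path, table, partition=None):
    """Return a HiveQL LOAD DATA statement, optionally targeting one partition."""
    stmt = f"LOAD DATA INPATH '{path}' INTO TABLE {table}"
    if partition:
        spec = ", ".join(f"{key}='{value}'" for key, value in partition.items())
        stmt += f" PARTITION ({spec})"
    return stmt + ";"


# Made-up names and paths, for illustration only.
database = "CREATE DATABASE IF NOT EXISTS demo;"
internal = create_table_ddl("sales", [("id", "INT"), ("amount", "DOUBLE")])
external = create_table_ddl("logs", [("line", "STRING")], location="/data/logs")
partitioned = create_table_ddl(
    "events", [("id", "INT"), ("amount", "STRING")], partitions=[("dt", "STRING")]
)
load = load_data_stmt("/data/staging/events.csv", "events", {"dt": "2020-01-01"})
# Casting: read the STRING column as a DOUBLE at query time.
cast_query = "SELECT id, CAST(amount AS DOUBLE) AS amount FROM events;"

for statement in (database, internal, external, partitioned, load, cast_query):
    print(statement)
```

The printed statements could be pasted into a Hive shell. The main difference between the internal and external forms is that dropping an external table leaves the files at its location in place.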
Lesson 3: Query and Manipulate Data
- Query, sort, and filter data
- Manipulate data with user-defined functions
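To give a feel for the Lesson 3 topics without a cluster, the sketch below uses Python's built-in SQLite module as a stand-in for Hive. The SELECT, WHERE, and ORDER BY clauses shown have the same shape in HiveQL, but the user-defined function mechanism differs: Hive UDFs are typically written in Java and registered in the Hive session, whereas here a Python function is registered with SQLite. The table, its rows, and the `with_tax` function are made up for illustration.

```python
import sqlite3

# SQLite stands in for Hive so the example runs anywhere;
# the table and its rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("north", 200.0), ("south", 45.0)],
)


# Hypothetical user-defined function: adds a 10% markup to an amount.
def with_tax(amount):
    return round(amount * 1.1, 2)


# Register the function under a SQL name, taking one argument.
conn.create_function("with_tax", 1, with_tax)

# Filter with WHERE, sort with ORDER BY, and apply the UDF in the SELECT list.
rows = conn.execute(
    "SELECT region, with_tax(amount) FROM sales "
    "WHERE amount > 50 ORDER BY amount DESC"
).fetchall()

for region, taxed in rows:
    print(region, taxed)
```

The query keeps only rows with an amount above 50, orders them from largest to smallest, and returns the UDF result alongside each region.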
Prerequisites
- Familiarity with a command-line interface, such as a Unix shell
- Familiarity with relational databases (RDBMS) and SQL
- Access to, and the ability to use, a laptop with an internet connection and a terminal program installed (such as Terminal on macOS or PuTTY on Windows)
- Completion of the on-demand courses ESS 100 and ESS 101