DA 410 - Apache Drill Essentials

This introductory Apache Drill course, aimed at data analysts, data scientists, and SQL programmers, covers how to use Drill to explore known or unknown data without writing code.

About this course

This introductory Apache Drill course, aimed at data analysts, data scientists, and SQL programmers, covers how to use Drill to explore known or unknown data without writing code. You will write SQL queries on a variety of data types, including structured data in a Hive table, semi-structured data in HBase or MapR-DB, and complex data file types such as Parquet and JSON.

This on-demand course is designed to be flexible to fit your schedule. Each lesson and quiz takes approximately 30 to 45 minutes to complete.

  • Option 1: Complete the course in one session of approximately 60 to 90 minutes
  • Option 2: Complete the course over two days, at 30 to 45 minutes per day

Lab activities take additional time and vary based on your system.

Syllabus

Lesson 1 – SQL Queries

  • Perform familiar SQL queries with Drill on structured content
  • Perform familiar SQL queries on semi-structured content
  • Join structured and semi-structured content in a single query
  • Explore unknown data with Drill Explorer

Lesson 2 – Self-Describing Data

  • Define self-describing data
  • Determine how Drill discovers the schema of data
  • Use Drill Explorer to explore unknown data and determine its structure so that you can query it
  • Create a view and visualize the view with BI tools

Lab Exercises

  • Familiar SQL queries on structured Hive data
  • Familiar SQL queries on complex data
    • Query Parquet data
    • Query JSON data
    • A single query that joins Hive, HBase and JSON
  • Explore Multiple Data Sources with the Drill Explorer
    • Drill Explorer Interface
    • Data sources
    • Discover data schema
    • Preview data
    • Save a view

Prerequisites for Success

Required:

  • Completion of ADM 200 - Cluster Administration: Install a MapR Cluster
  • Basic to intermediate Linux knowledge, including
    • The ability to use a text editor, such as vi
    • Familiarity with basic command-line commands such as mv, cp, ssh, grep, cd, and useradd
  • Access to, and the ability to use, a laptop with a browser and a terminal program installed (such as Terminal on the Mac, or PuTTY on Windows)
  • Access to a MapR cluster (such as one installed as part of ADM 200)

Recommended:

  • Completion of the on-demand course HDE 100 - Hadoop Essentials
  • Completion of the on-demand course HDE 110 - MapR Distribution Essentials

Optional:

  • Basic Hadoop knowledge

Curriculum

  • Get Started
  • Lesson 1 - SQL Queries
  • Lesson 2 - Query Self-Describing Data
  • DA 410 Quiz
  • Course Materials
  • Slide Guide (Transcript)
  • Lab Guide
  • Lab Files
  • Apache Drill Sandbox
  • Lab Environment Connection Guide
  • Join course discussions in the MapR Academy Community
