Building Data Lakes on AWS

The Building Data Lakes on AWS (ANBDLK) course is designed for participants who want to learn how to design, build, and manage a data lake on AWS. Participants learn to build an operational data lake that supports the analysis of both structured and unstructured data, and study the components and capabilities of the services involved. They use AWS Lake Formation to create a data lake, AWS Glue to create a data catalog, and Amazon Athena to analyze the data. This course helps prepare for the AWS Certified Data Analytics – Specialty certification.

Course Objectives

Below is a summary of the main objectives of the Building Data Lakes on AWS (ANBDLK) course:

  1. Learn to design, build, and manage an operational data lake on AWS that supports analysis of both structured and unstructured data.
  2. Gain knowledge of the components and capabilities of AWS services involved in building a data lake.
  3. Use AWS Lake Formation to create a data lake and to simplify its setup, security, and management.
  4. Use AWS Glue to create a data catalog, making it easier to discover and prepare data for analysis.
  5. Use Amazon Athena to analyze data, enabling you to run SQL queries directly on data stored in the data lake.
  6. Understand best practices for implementing data security and governance policies within your AWS data lake environment to ensure data integrity and compliance.
  7. Learn techniques for integrating streaming data sources into your data lake using AWS services like Amazon Kinesis, ensuring real-time data availability for analytics.
  8. Explore advanced analytics capabilities on AWS, including integration with machine learning services like Amazon SageMaker, to derive deeper insights and predictions from your data lake.

Course Certification

This course helps you prepare to take the AWS Certified Data Analytics – Specialty exam.

Course Outline

Module 1: Introduction to data lakes

  • Describe the value of data lakes
  • Compare data lakes and data warehouses
  • Describe the components of a data lake
  • Recognize common architectures built on data lakes

Module 2: Data ingestion, cataloging, and preparation

  • Describe the relationship between data lake storage and data ingestion
  • Describe AWS Glue crawlers and how they are used to create a data catalog
  • Identify data formatting, partitioning, and compression for efficient storage and query
  • Lab 1: Set up a simple data lake
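
As an illustration of the partitioning topic in Module 2, the sketch below builds Hive-style partitioned object keys (path segments of the form key=value), a layout that query engines such as Amazon Athena can use to skip data outside the requested partitions. It is a minimal sketch, not course material: the table name, file name, and helper function names are hypothetical, and no AWS service is called.

```python
from datetime import date


def partitioned_key(table: str, event_date: date, file_name: str) -> str:
    """Build a Hive-style partitioned object key (key=value path segments)."""
    return (
        f"{table}/"
        f"year={event_date.year:04d}/"
        f"month={event_date.month:02d}/"
        f"day={event_date.day:02d}/"
        f"{file_name}"
    )


def partition_values(key: str) -> dict[str, str]:
    """Recover the partition columns encoded in an object key."""
    return dict(
        segment.split("=", 1) for segment in key.split("/") if "=" in segment
    )


# Hypothetical table and file names, used only for illustration.
example_key = partitioned_key("sales", date(2024, 3, 7), "part-0000.parquet")
print(example_key)
print(partition_values(example_key))
```

With this layout, a query that filters on year, month, or day only needs to read the objects under the matching prefixes rather than the whole table.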

Module 3: Data processing and analytics

  • Recognize how data processing applies to a data lake
  • Use AWS Glue to process data within a data lake
  • Describe how to use Amazon Athena to analyze data in a data lake

Module 4: Building a data lake with AWS Lake Formation

  • Describe the features and benefits of AWS Lake Formation
  • Use AWS Lake Formation to create a data lake
  • Understand the AWS Lake Formation security model
  • Lab 2: Build a data lake using AWS Lake Formation

Module 5: Additional Lake Formation configurations

  • Automate AWS Lake Formation using blueprints and workflows
  • Apply security and access controls to AWS Lake Formation
  • Match records with AWS Lake Formation FindMatches
  • Visualize data with Amazon QuickSight
  • Lab 3: Automate data lake creation using AWS Lake Formation blueprints
  • Lab 4: Data visualization using Amazon QuickSight

Module 6: Architecture and course review

Course Mode

Instructor-led remote live classroom training.

Trainers

Trainers are AWS-accredited instructors who are also certified in other IT technologies, with years of practical experience in the field and in training.

Lab Topology

For all delivery types, participants can remotely access the equipment and real systems in our laboratories or directly in international data centers, 24/7. Each participant can implement various configurations, immediately applying the theory learned. Below are some scenarios drawn from laboratory activities.

Course Details

Course Prerequisites

  • Attendance at the AWS Technical Essentials course and the Data Analytics Fundamentals course is recommended.

Course Duration

Intensive duration: 1 day.

Course Frequency

Course duration: 1 day (9:00 to 17:00). Ask about other attendance options.

Course Date

  • Building Data Lakes on AWS (Intensive Format) – On Request – 9:00 – 17:00

Steps to Enroll

To enroll, request to be contacted via the following link, call the office at the international number +355 45 301 313, or send a request by email to info@hadartraining.com.