Building Data Lakes on AWS
The Building Data Lakes on AWS (ANBDLK) course is designed for participants who want to learn how to design, build, and manage a data lake on AWS. Participants learn how to build an operational data lake that supports the analysis of both structured and unstructured data, and they study the components and capabilities of the services involved. They use AWS Lake Formation to create a data lake, AWS Glue to create a data catalog, and Amazon Athena to analyze the data. This course helps you prepare for the AWS Certified Data Analytics – Specialty certification.
Course Objectives
Below is a summary of the main objectives of the Building Data Lakes on AWS (ANBDLK) course:
- Learn to design, build, and manage an operational data lake on AWS that supports analysis of both structured and unstructured data.
- Gain knowledge of the components and capabilities of AWS services involved in building a data lake.
- Use AWS Lake Formation to set up, secure, and manage a data lake.
- Use AWS Glue to create a data catalog, making it easier to discover and prepare data for analysis.
- Use Amazon Athena to analyze data, enabling you to run SQL queries directly on data stored in the data lake.
- Data Security and Governance: Understand best practices for implementing data security and governance policies within your AWS data lake environment to ensure data integrity and compliance.
- Data Integration and Streaming: Learn techniques for integrating streaming data sources into your data lake using AWS services like Amazon Kinesis, ensuring real-time data availability for analytics.
- Advanced Analytics and Machine Learning: Explore advanced analytics capabilities on AWS, including integration with machine learning services like Amazon SageMaker, to derive deeper insights and predictions from your data lake.
Course Certification
This course helps you prepare to take the AWS Certified Data Analytics – Specialty exam.
Course Outline
Module 1: Introduction to data lakes
- Describe the value of data lakes
- Compare data lakes and data warehouses
- Describe the components of a data lake
- Recognize common architectures built on data lakes
Module 2: Data ingestion, cataloging, and preparation
- Describe the relationship between data lake storage and data ingestion
- Describe AWS Glue crawlers and how they are used to create a data catalog
- Identify data formatting, partitioning, and compression for efficient storage and query
- Lab 1: Set up a simple data lake
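As an illustration of the partitioning topic in Module 2, the sketch below builds object keys in the Hive-style `key=value` layout, which AWS Glue crawlers can register as table partitions. It is a minimal example, not course material: the function name `partitioned_key`, the `raw` prefix, the `sales` table, and the file name are all hypothetical.

```python
from datetime import date


def partitioned_key(prefix: str, table: str, day: date, filename: str) -> str:
    """Build a Hive-style partitioned object key (key=value path segments).

    All names here are illustrative assumptions; adapt them to your own lake.
    """
    return (
        f"{prefix.strip('/')}/{table}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
        f"{filename}"
    )


if __name__ == "__main__":
    # Example: a Parquet file for 7 March 2024 in a hypothetical "sales" table.
    print(partitioned_key("raw", "sales", date(2024, 3, 7), "part-0000.parquet"))
```

Zero-padding the month and day keeps keys in chronological order when listed, and a query that filters on these partition columns only has to read the matching prefixes.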
Module 3: Data processing and analytics
- Recognize how data processing applies to a data lake
- Use AWS Glue to process data within a data lake
- Describe how to use Amazon Athena to analyze data in a data lake
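As an illustration of the Athena topic in Module 3, the sketch below assembles the request parameters for the boto3 Athena `start_query_execution` call. It only builds the dictionary and does not contact AWS. The helper `build_athena_request`, the database and table names, the results bucket, and the assumption that `year` is a string-typed partition column are all hypothetical.

```python
def build_athena_request(
    database: str, table: str, year: int, output_location: str
) -> dict:
    """Return keyword arguments for boto3's Athena start_query_execution.

    Assumes a table partitioned by a string-typed `year` column (hypothetical).
    """
    # int() guards against anything other than a number ending up in the SQL.
    query = (
        f'SELECT COUNT(*) AS row_count FROM "{table}" '
        f"WHERE year = '{int(year)}'"
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


if __name__ == "__main__":
    request = build_athena_request(
        "lake_db", "sales", 2024, "s3://example-bucket/athena-results/"
    )
    print(request["QueryString"])
    # With AWS credentials configured, the request could be submitted with:
    #   boto3.client("athena").start_query_execution(**request)
```

Filtering on the partition column lets Athena skip data outside the requested year, which reduces the amount of data scanned.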
Module 4: Building a data lake with AWS Lake Formation
- Describe the features and benefits of AWS Lake Formation
- Use AWS Lake Formation to create a data lake
- Understand the AWS Lake Formation security model
- Lab 2: Build a data lake using AWS Lake Formation
Module 5: Additional Lake Formation configurations
- Automate AWS Lake Formation using blueprints and workflows
- Apply security and access controls to AWS Lake Formation
- Match records with AWS Lake Formation FindMatches
- Visualize data with Amazon QuickSight
- Lab 3: Automate data lake creation using AWS Lake Formation blueprints
- Lab 4: Data visualization using Amazon QuickSight
Module 6: Architecture and course review
Course Mode
Instructor-Led Remote Live Classroom Training
Trainers
Trainers are AWS-accredited instructors who are also certified in other IT technologies, with years of practical experience in the field and in training.
Lab Topology
For all types of delivery, participants can remotely access the equipment and live systems in our laboratories or in international data centers, 24/7. Each participant can implement various configurations, immediately applying the theory learned. Below are some scenarios drawn from laboratory activities.
Course Details
Course Prerequisites
- Attendance at the AWS Technical Essentials course and the Data Analytics Fundamentals course is recommended.
Course Duration
Intensive duration: 1 day
Course Frequency
Course duration: 1 day (9:00 to 17:00). Ask about other attendance options.
Course Date
- Building Data Lakes on AWS (Intensive Format) – On Request – 9:00 – 17:00
Steps to Enroll
To enroll, request a callback using the following link, call the office at the international number +355 45 301 313, or send a request by email to info@hadartraining.com.