Ways to Build Bronze Layer
2 Scenarios
50 Minutes

Industry
general
Skills
approach
data-understanding
data-storage
data-quality
batch-etl
data-wrangling
Tools
databricks
Learning Objectives
Learn the best practices for creating a Bronze Layer for raw customer data.
Understand the difference between INSERT INTO and INSERT OVERWRITE.
Understand the limitations of manual file processing.
Learn the benefits of using COPY INTO with Directory Listing.
Overview
This module covers common methods for loading data into Delta tables when building a Bronze Layer for raw data storage. We will explore INSERT INTO, which appends rows to a table, and INSERT OVERWRITE, which replaces the table's existing contents. We will look at how the two differ, where they fall short when files are processed manually, and what a more efficient alternative looks like.
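The difference between the two statements can be sketched in a few lines of Databricks SQL. This is only an illustration: the table name bronze_customers, the staged batch customers_batch, and the column list are hypothetical and do not come from the module's scenarios.

```sql
-- Hypothetical Bronze table for raw customer data.
CREATE TABLE IF NOT EXISTS bronze_customers (
  customer_id INT,
  name        STRING,
  email       STRING
) USING DELTA;

-- INSERT INTO appends the selected rows to whatever is already in the table.
-- customers_batch is an assumed table or view holding one staged batch.
INSERT INTO bronze_customers
SELECT customer_id, name, email FROM customers_batch;

-- Running the same statement again appends the same rows a second time,
-- so the table now holds duplicates of that batch.
INSERT INTO bronze_customers
SELECT customer_id, name, email FROM customers_batch;

-- INSERT OVERWRITE replaces the table's current contents with the query result,
-- so only the rows from customers_batch remain.
INSERT OVERWRITE bronze_customers
SELECT customer_id, name, email FROM customers_batch;
```

In both cases the person running the statements has to keep track of which batches have already been loaded, which is the manual step the rest of the module aims to remove.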
Additionally, we will cover how to streamline data ingestion using the COPY INTO command with Directory Listing. This approach automates file scanning and loading: it removes the manual step of tracking files and ensures that only files that have not already been loaded are processed. Mastering these techniques is crucial for efficient, scalable, and clean data ingestion in Databricks, especially when handling large-scale datasets.
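As a rough sketch of what this looks like in practice, the statement below loads CSV files from a landing directory into a Bronze table. The table name, the path /mnt/landing/customers/, the column names, and the CSV format are all assumptions made for illustration; the module's scenarios may use different ones.

```sql
-- Hypothetical Bronze table; COPY INTO requires the target table to exist.
CREATE TABLE IF NOT EXISTS bronze_customers (
  customer_id INT,
  name        STRING,
  email       STRING
) USING DELTA;

-- COPY INTO lists the files under the source directory and loads them.
-- The path is an assumed landing location for raw customer files.
COPY INTO bronze_customers
FROM (
  -- Cast explicitly, because CSV columns are read as strings by default.
  SELECT CAST(customer_id AS INT) AS customer_id, name, email
  FROM '/mnt/landing/customers/'
)
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true');
```

Re-running the same statement after new files arrive should load only those new files, because COPY INTO skips files it has already loaded into the target table.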
Prerequisites
- Basic knowledge of Delta Lake
- Understanding of SQL
- Basic understanding of ETL/ELT