ETL Pipeline from Sales to Data Warehouse
Project Overview: We need to establish a daily ETL pipeline using the ETL App Console to incorporate the new sales-transactions feed into our enterprise data warehouse. The sales-transactions data is stored in a PostgreSQL database. Ensure you configure each step correctly, as detailed below.
Tasks to Complete:
- Connect to Data Source:
- Review the Database Connectivity Details table below for connection and connector details (a worked connection-string example follows that table)
- Configure the Data Source settings accordingly
- Map Data Types:
- Assign the appropriate SQL data type for each field in the sales-transactions table, based on the example data in the Table Format table below (a mapping sketch follows that table)
- Define Primary Keys:
- Identify and assign primary key(s) for the sales-transactions table to enforce data uniqueness.
- Configure Scheduling:
- Schedule the pipeline to run daily, per the project overview
- Set Data Quality Rules:
- Invalid or incomplete entries should be ignored
- Duplicate TransactionIDs should be treated as updates; for example, a partial refund may result in an existing entry's Amount value being modified (see the upsert sketch after this list)
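The console presumably enforces these rules through its own settings, but the intended load behavior corresponds to a standard PostgreSQL upsert. Below is a minimal JDBC sketch of that behavior, assuming a hypothetical target table sales_transactions with columns transaction_id, amount, and transaction_date; incomplete records are skipped, and duplicate TransactionIDs update the existing row.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.time.OffsetDateTime;
import java.util.UUID;

public class SalesUpsert {

    // Load one sales-transactions record according to the data quality rules:
    // invalid/incomplete entries are skipped, duplicate TransactionIDs become updates.
    static void loadRecord(Connection conn, UUID id, BigDecimal amount, OffsetDateTime txDate)
            throws java.sql.SQLException {
        if (id == null || amount == null || txDate == null) {
            return; // rule 1: ignore invalid or incomplete entries
        }
        String sql = "INSERT INTO sales_transactions (transaction_id, amount, transaction_date) " // hypothetical names
                   + "VALUES (?, ?, ?) "
                   + "ON CONFLICT (transaction_id) DO UPDATE "   // rule 2: duplicate IDs are treated as updates
                   + "SET amount = EXCLUDED.amount, transaction_date = EXCLUDED.transaction_date";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setObject(1, id);         // the PostgreSQL JDBC driver maps java.util.UUID to uuid
            ps.setBigDecimal(2, amount);
            ps.setObject(3, txDate);     // OffsetDateTime maps to timestamptz (JDBC 4.2)
            ps.executeUpdate();
        }
    }
}
```

Note that ON CONFLICT relies on the unique constraint created by the primary key assigned in the Define Primary Keys task.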
Table Format
| Column | Example Data |
| --- | --- |
| TransactionID | 4f9ded2d-905a-4e0f-a384-e1c4a99713c0 |
| Amount | 3.49 |
| TransactionDate | 2023-10-26 14:30:00-05 |
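Based on the example data above, a reasonable mapping (an assumption, not dictated by the feed itself) is UUID for TransactionID, NUMERIC for Amount, and TIMESTAMPTZ for TransactionDate, with TransactionID as the primary key. The JDBC sketch below creates such a table; the table and column names, and the NUMERIC precision, are hypothetical.

```java
import java.sql.Connection;
import java.sql.Statement;

public class CreateSalesTable {

    // Creates the target table with the assumed type mapping:
    //   TransactionID   -> UUID          (e.g. 4f9ded2d-905a-4e0f-a384-e1c4a99713c0)
    //   Amount          -> NUMERIC(12,2) (monetary value such as 3.49; precision is an assumption)
    //   TransactionDate -> TIMESTAMPTZ   (timestamp with offset, e.g. 2023-10-26 14:30:00-05)
    // TransactionID is the primary key, enforcing the uniqueness the upsert rule depends on.
    static void createTable(Connection conn) throws java.sql.SQLException {
        String ddl = "CREATE TABLE IF NOT EXISTS sales_transactions ("
                   + "  transaction_id   UUID PRIMARY KEY,"
                   + "  amount           NUMERIC(12,2) NOT NULL,"
                   + "  transaction_date TIMESTAMPTZ NOT NULL"
                   + ")";
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(ddl);
        }
    }
}
```

Adjust the NUMERIC precision and NOT NULL constraints to whatever the actual feed requires.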
Database Connectivity Details
| Property | Value |
| --- | --- |
| Database Flavor | PostgreSQL |
| Database Hostname | dbhost |
| Database Port | 5432 |
| Database Name | salesdb |
| Database User | etl_user |
| Database Password | securePass! |
| Database Connection String Format | jdbc:postgresql://<hostname>:<port>/<database>?user=<user>&password=<password> |
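Substituting the values above into the connection string format yields jdbc:postgresql://dbhost:5432/salesdb?user=etl_user&password=securePass!. If you want to verify connectivity outside the ETL App Console, a minimal check might look like the sketch below; it assumes the PostgreSQL JDBC driver is on the classpath, and in practice the password should come from a secrets store rather than a hard-coded literal.

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class ConnectionCheck {
    public static void main(String[] args) throws Exception {
        // Connection string assembled from the Database Connectivity Details table.
        // The hard-coded password is for illustration only.
        String url = "jdbc:postgresql://dbhost:5432/salesdb?user=etl_user&password=securePass!";
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected to " + conn.getMetaData().getDatabaseProductName());
        }
    }
}
```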