
ETL Pipeline from Sales to Data Warehouse

Project Overview: We need to establish a daily ETL pipeline using the ETL App Console to incorporate the new sales-transactions feed into our enterprise data warehouse. The sales-transactions data is stored in a PostgreSQL database. Ensure you configure each step correctly, as detailed below.

Tasks to Complete:

  1. Connect to Data Source:
    • Review the table below for connection and connector details
    • Configure the Data Source settings accordingly
  2. Map Data Types:
    • Assign the appropriate SQL data type for each field in the sales-transactions table. See the data example table below for expected SQL table structure and data types.
  3. Define Primary Keys:
    • Identify and assign primary key(s) for the sales-transactions table to enforce data uniqueness.
  4. Configure Scheduling:
    • Schedule daily incremental loads of the sales-transactions feed
  5. Set Data Quality Rules:
    • Invalid or incomplete entries should be ignored
    • Duplicate IDs should be treated as updates; for example, a partial refund may modify an existing entry's Amount value
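The data-quality rules in task 5 can be sketched in code. The following is a minimal illustration (not the ETL App Console's actual behavior): rows missing a required field are dropped, and a repeated TransactionID overwrites the earlier row, so a partial refund updates the stored Amount. The function name and row shape are assumptions for illustration.

```python
def apply_quality_rules(rows):
    """Illustrative sketch of task 5's rules.

    rows: iterable of dicts with TransactionID, Amount, TransactionDate.
    Returns cleaned rows: invalid/incomplete entries ignored,
    duplicate TransactionIDs treated as updates (last write wins).
    """
    required = ("TransactionID", "Amount", "TransactionDate")
    cleaned = {}
    for row in rows:
        # Rule: ignore invalid or incomplete entries.
        if any(row.get(key) in (None, "") for key in required):
            continue
        # Rule: a duplicate ID replaces the existing entry,
        # e.g. a partial refund modifying Amount.
        cleaned[row["TransactionID"]] = row
    return list(cleaned.values())
```

In a real warehouse load, the same "duplicate means update" rule is typically expressed as an upsert (e.g. PostgreSQL's INSERT ... ON CONFLICT ... DO UPDATE) rather than in application code.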

Table Format

Column          | Example Data
----------------|-------------------------------------
TransactionID   | 4f9ded2d-905a-4e0f-a384-e1c4a99713c0
Amount          | 3.49
TransactionDate | 2023-10-26 14:30:00-05
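One plausible PostgreSQL type mapping for these example values is UUID for TransactionID, NUMERIC for Amount, and TIMESTAMP WITH TIME ZONE for TransactionDate (the example includes a UTC offset). Choosing the mapping is part of the exercise; the sketch below, including the table name `sales_transactions` and the helper function, is an assumption for illustration.

```python
# Assumed type mapping based on the example data above;
# picking the correct types is part of task 2.
SALES_COLUMN_TYPES = {
    "TransactionID": "UUID",                       # 4f9ded2d-905a-...
    "Amount": "NUMERIC(10, 2)",                    # 3.49
    "TransactionDate": "TIMESTAMP WITH TIME ZONE", # 2023-10-26 14:30:00-05
}

def build_create_table(table, column_types, primary_key):
    """Render a CREATE TABLE statement from a column->type mapping."""
    cols = ",\n  ".join(f'"{c}" {t}' for c, t in column_types.items())
    return (f'CREATE TABLE "{table}" (\n  {cols},\n'
            f'  PRIMARY KEY ("{primary_key}")\n);')
```

Using TransactionID as the primary key (task 3) enforces the uniqueness that the duplicate-handling rule in task 5 relies on.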

Database Connectivity Details

Property                          | Value
----------------------------------|----------------------------------------------------------------------------
Database Flavor                   | PostgreSQL
Database Hostname                 | dbhost
Database Port                     | 5432
Database Name                     | salesdb
Database User                     | etl_user
Database Password                 | securePass!
Database Connection String Format | jdbc:postgresql://<hostname>:<port>/<database>?user=<user>&password=<password>
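Substituting the property values above into the documented connection string format gives the value to enter in the Data Source settings. A small helper makes the substitution explicit (the function name is an assumption; in production, credentials would come from a secrets store rather than being hard-coded):

```python
def jdbc_url(hostname, port, database, user, password):
    """Fill the documented JDBC connection string format:
    jdbc:postgresql://<hostname>:<port>/<database>?user=<user>&password=<password>
    """
    return (f"jdbc:postgresql://{hostname}:{port}/{database}"
            f"?user={user}&password={password}")
```

With the table's values this yields
`jdbc:postgresql://dbhost:5432/salesdb?user=etl_user&password=securePass!`.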
ETL App Console Steps

  • Connect to Data Source
  • Data Type Mapping
  • Define Primary Keys
  • Schedule Incremental Loads
  • Data Quality Rules