AWS Certified Solutions Architect Associate SAA-C03 Practice Question
Your client intends to enhance query speed while controlling analytics costs for their substantial collection of comma-separated values files hosted on a cloud object storage service. They require a managed service to transform these files into a columnar storage format. Which service should they use to meet these needs with the least maintenance burden?
Serverless compute service
The fully managed ETL service offered by the cloud provider
Streaming data ingestion and transformation service
The correct answer is the provider's fully managed ETL (Extract, Transform, Load) service, which on AWS is AWS Glue. It runs the jobs needed to convert the dataset from CSV into a more efficient columnar format such as Parquet, and it provisions and scales the underlying resources itself, which reduces manual oversight and helps control the cost of large-scale transformation jobs. Amazon S3 is primarily a storage service and has no built-in data transformation capability. AWS Lambda can run custom code to convert data formats, but it is not tailored for ETL, so it carries a higher management overhead. Lastly, Kinesis Data Firehose is designed for real-time streaming data rather than batch transformation of files already in storage, which makes it unsuitable for this scenario.
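As a rough illustration of what "least maintenance burden" looks like in practice, the sketch below builds the request parameters one might pass to boto3's Glue `create_job` call for a CSV-to-Parquet job. It is a minimal sketch under stated assumptions, not a reference setup: the job name, role ARN, bucket paths, worker sizing, and the `--source_path` / `--target_path` job arguments are all hypothetical, and the conversion script stored at `script_s3_path` is assumed to exist already.

```python
def build_glue_job_definition(job_name, role_arn, script_s3_path,
                              source_path, target_path):
    """Return keyword arguments shaped for boto3's Glue `create_job` call."""
    return {
        "Name": job_name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",              # Spark-based ETL job type
            "ScriptLocation": script_s3_path,
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",               # assumed version for this sketch
        "WorkerType": "G.1X",               # assumed worker size
        "NumberOfWorkers": 2,
        # Hypothetical arguments that the conversion script would read.
        "DefaultArguments": {
            "--source_path": source_path,
            "--target_path": target_path,
        },
    }


def submit(job_definition):
    """Create and start the job. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_job(**job_definition)
    run = glue.start_job_run(JobName=job_definition["Name"])
    return run["JobRunId"]


# Hypothetical names throughout; nothing here contacts AWS.
job = build_glue_job_definition(
    job_name="csv-to-parquet",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_s3_path="s3://example-bucket/scripts/csv_to_parquet.py",
    source_path="s3://example-bucket/raw-csv/",
    target_path="s3://example-bucket/parquet/",
)
```

Calling `submit(job)` would register and start the job. There are no servers to patch and no cluster to size beyond the worker count, which is the point the explanation makes about management overhead.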