A data analyst has a table with records divided into three distinct groups. The analyst wants a representative sample that preserves each group’s approximate distribution. Which approach is best?
Best-fit an average of the table and omit smaller categories
Choose a large block of data from the largest group, then move on to other groups
Pick every tenth record across the entire table
Separate the groups and select a random subset from each based on its size
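Separating the groups and drawing a random subset from each in proportion to its size (proportional stratified sampling) keeps the sample's group distribution aligned with the original data. Picking every tenth record is systematic sampling; it can under-represent smaller groups when the table is sorted or follows a repeating pattern. Taking a large block from the largest group first over-weights that group and risks imbalance in the smaller segments. Omitting smaller categories removes them from the sample entirely, so any differences they carry are lost.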
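As a rough illustration, proportional stratified sampling can be sketched in Python with only the standard library. The function name `stratified_sample`, the `fraction` parameter, and the group labels and sizes are assumptions made up for this example; they are not part of the question.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, key, fraction, seed=None):
    """Draw a random subset from each group, sized in proportion to the group.

    records  -- list of rows (any objects)
    key      -- function returning the group label for a row
    fraction -- share of each group to keep, between 0 and 1
    """
    rng = random.Random(seed)

    # Step 1: separate the records by group.
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)

    # Step 2: sample each group in proportion to its size.
    sample = []
    for members in groups.values():
        # Keep at least one row so that a small group is never dropped.
        k = min(len(members), max(1, round(len(members) * fraction)))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical table: three groups of 600, 300, and 100 records.
table = (
    [("A", i) for i in range(600)]
    + [("B", i) for i in range(300)]
    + [("C", i) for i in range(100)]
)

subset = stratified_sample(table, key=lambda row: row[0], fraction=0.1, seed=42)
counts = Counter(group for group, _ in subset)
print(counts)
```

With a 10% fraction, each group contributes about a tenth of its rows, so the 6:3:1 ratio of the hypothetical table carries over to the sample.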