CompTIA DA0-001 Dumps

Page: 1 / 31
Total 314 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

During data cleansing, an analyst conducts measures of central tendency on a data set. Which of the following data is the analyst attempting to identify?

Options:

A.

Duplicate

B.

Missing

C.

Outlying

D.

Invalid

Question 2

An analyst is reporting on the average income for a county and is reviewing the following data:

[Image not shown]

Which of the following is the reason the analyst would need to cleanse the data in this data set?

Options:

A.

Data completeness

B.

Data outliers

C.

Duplicate data

D.

Missing values

Question 3

An analyst notices changes in sales ratios when analyzing a quarterly report. Which of the following is the analyst conducting?

Options:

A.

A gap analysis

B.

A link analysis

C.

A trend analysis

D.

A statistical analysis

Question 4

Which of the following is used for calculations and pivot tables?

Options:

A.

IBM SPSS

B.

SAS

C.

Microsoft Excel

D.

Domo

Question 5

Which of the following would be considered non-personally identifiable information?

Options:

A.

Cell phone device name

B.

Customer’s name

C.

Government ID number

D.

Telephone number

Question 6

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

Options:

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Question 7

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

Options:

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question 8

Which of the following would a data analyst look for first if 100% participation is needed on survey results?

Options:

A.

Missing data

B.

Invalid data

C.

Redundant data

D.

Duplicate data

Question 9

An analyst is required to run a text analysis of data that is found in articles from a digital news outlet. Which of the following would be the BEST technique for the analyst to apply to acquire the data?

Options:

A.

Web scraping

B.

Sampling

C.

Data wrangling

D.

ETL

Question 10

Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)

Options:

A.

Data accuracy

B.

Data constraints

C.

Data attribute limitations

D.

Data bias

E.

Data consistency

F.

Data manipulation

Question 11

A data analyst received a large amount of third-party data that needs to be joined with in-house data files. After the data is joined, the analyst notices three columns all contain dates. Which of the following should the analyst do to maintain data consistency?

Options:

A.

Append all date columns and parse the strings.

B.

Impute all three date columns and then merge.

C.

Merge all date columns and unify the format.

D.

Separate the columns into a table and merge.

Question 12

An analyst is working with a data set that lists individuals' first and last names in separate columns. Which of the following processes should the analyst use to combine the first and last names into a single spreadsheet cell?

Options:

A.

Transpose

B.

Blend

C.

Concatenate

D.

Merge

Question 13

Which of the following is the best approach to use to gain a general understanding of a data set?

Options:

A.

Descriptive statistics

B.

Basic projections

C.

Gap analysis

D.

Trend analysis

Question 14

A company notifies its employees that emails will be automatically moved to a cloud-based server in 180 days. Which of the following describes this concept?

Options:

A.

Data deletion

B.

Data processing

C.

Data retention

D.

Data constraints

Question 15

An analyst is updating a customer contacts database with information obtained from a survey of new customers. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Join

B.

Append

C.

Transform

D.

Blend

Question 16

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

[Image not shown]

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 17

Given the information in the following tables:

[Image not shown]

Which of the following describes merging these tables to create a master file that includes all transactions for both online and in-store sales?

Options:

A.

Data audit

B.

Data completeness

C.

Data validation

D.

Data consolidation

Question 18

Which of the following is the correct data type for text?

Options:

A.

Boolean

B.

String

C.

Integer

D.

Float

Question 19

Given the table below:

[Image not shown]

Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?

Options:

A.

Name, one

B.

Gender, two

C.

Level, three

D.

Code, four

E.

Region, five

Question 20

Which of the following best describes how discrete data differs from continuous data?

Options:

A.

Discrete data cannot create a sloped line.

B.

Discrete data can only be a finite number of values.

C.

Discrete data can have decimal points.

D.

Discrete data applies only to numbers.

Question 21

Which of the following is the correct extension for a tab-delimited spreadsheet file?

Options:

A.

tap

B.

tar

C.

tsv

D.

az

Question 22

A sales manager wants quarterly sales reports broken down by unit and week. Which of the following data output lists includes the most necessary information?

Options:

A.

Order number, salesperson, date shipped, recipient address, and price

B.

Item name, salesperson, recipient address, shipping cost, and date shipped

C.

Item number, item name, salesperson, date sold, and price

D.

Item name, salesperson, price, shipping cost, and date shipped

Question 23

An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?

Options:

A.

Median

B.

Mean

C.

Mode

D.

Standard deviation
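
The effect of outliers on different summary measures can be illustrated with a minimal Python sketch. The income figures below are hypothetical and are not taken from the exam item; the last value is a deliberate outlier.

```python
from statistics import mean, median

# Hypothetical household incomes; the last value is a deliberate outlier.
incomes = [48_000, 52_000, 55_000, 61_000, 2_500_000]

typical_by_mean = mean(incomes)      # pulled upward by the outlier
typical_by_median = median(incomes)  # middle value, unaffected by how large the outlier is

print(typical_by_mean, typical_by_median)
```

In this made-up sample the mean lands far above four of the five incomes, while the median stays among them.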

Question 24

Which of the following data sampling methods involves dividing a population into subgroups by similar characteristics?

Options:

A.

Systematic

B.

Simple random

C.

Convenience

D.

Stratified

Question 25

When analyzing the values of two variables, you decide to convert both variables so they are on a scale of 0 to 1.

What term describes this action?

Options:

A.

Filtering.

B.

Normalization.

C.

Transposition.

D.

Aggregation.
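
Rescaling variables onto a common 0-to-1 scale can be sketched as below. This assumes simple min-max scaling, which is one common way to do it; the function name `min_max_scale` and the sample values are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no spread to rescale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

ages = [20, 35, 50]                 # hypothetical values
incomes = [30_000, 60_000, 90_000]  # hypothetical values
print(min_max_scale(ages))
print(min_max_scale(incomes))
```

After scaling, both variables span the same interval, so neither dominates a later comparison or average simply because of its units.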

Question 26

A data analyst received the information in the table below from a recently completed marketing campaign:

[Image not shown]

Which of the following is the total order conversion rate?

Options:

A.

13.2%

B.

14.8%

C.

22.3%

D.

85.2%

Question 27

A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:

[Image not shown]

Which of the following types of functions would be the most appropriate to use?

Options:

A.

Statistical

B.

Aggregate

C.

Logical

D.

Mathematical

Question 28

A data analyst reviews the following data set:

[Image not shown]

Which of the following is the range value?

Options:

A.

9

B.

10

C.

12

D.

13

Question 29

What category of data stewardship work is focused on ensuring that the organization respects the wishes of data subjects?

Options:

A.

Data quality.

B.

Data privacy.

C.

Data security.

D.

Regulatory compliance.

Question 30

A client has requested an analysis of all pet care items purchased by current customers and their social media connections in the past 12 months. Which of the following data analysis techniques would be the best choice given these requirements?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory data analysis

Question 31

Which of the following is an object associated with a table that sorts and stores table row data in a key-value pair?

Options:

A.

Foreign key

B.

Function

C.

Stored procedure

D.

Clustered index

Question 32

A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government the company deals with, some account information is not shown. Which of the following fields should be masked?

Options:

A.

Sales volume

B.

Start date

C.

Product name

D.

Customer name

Question 33

What R package makes it easy to work with dates?

Options:

A.

Lubridate.

B.

Datemath.

C.

Stringr.

D.

ggplot.

Question 34

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power BI

B.

R

C.

SQL

D.

Tableau

Question 35

An organization would like to add a secondary email field to its customer database in order to enrich the customer profiles. Which of the following data manipulation techniques should the analyst use to add this information?

Options:

A.

Blend

B.

Merge

C.

Append

D.

Aggregate

Question 36

Analytics reports should follow corporate style guidelines.

Options:

A.

True.

B.

False.

Question 37

A data analyst is attempting to understand how ice cream consumption is affected by different attributes, such as cost, temperature, and income level. Which of the following regression analyses should the data analyst perform to understand this relationship?

Options:

A.

Logistic

B.

Ordinary least squares

C.

Cox

D.

Polynomial

Question 38

A data set for sales per month includes the following data:

[Image not shown]

Which of the following cleaning and profiling methods should be applied to the data set?

Options:

A.

Data outliers

B.

Invalid data

C.

Duplicate data

D.

Data type validation

Question 39

Joseph is interpreting a left-skewed distribution of test scores. Joe scored at the mean, Alfonso scored at the median, and Gaby scored at the end of the tail.

Who had the highest score?

Options:

A.

Joseph

B.

Joe

C.

Alfonso

D.

Gaby

Question 40

Joe, an analyst, tests the loading time on a dashboard he is preparing to go live and finds it is slower than he would like. Which of the following must occur to decrease the loading time?

Options:

A.

Deploy the dashboard to production.

B.

Change the field definitions.

C.

Update the dashboard subscribers.

D.

Optimize the dashboard.

Question 41

Which of the following is an example of a discrete data type?

Options:

A.

8in (20cm)

B.

5 kids

C.

2.5mi (4km)

D.

10.7lbs (4.9kg)

Question 42

Which of the following best describes an exploratory analysis?

Options:

A.

Involves the use of descriptive statistics to understand observations

B.

Involves analysis of exploring data sets for performance tracking

C.

Involves the testing of specific hypotheses

D.

Involves the use of arithmetic algebra to determine the distribution

Question 43

A data analyst is asked on the morning of April 9, 2020, to create a sales report that identifies sales year to date. The daily sales data is current through the end of the day. Which of the following date ranges should be on the report?

Options:

A.

January 1, 2020 to April 1, 2020

B.

January 1, 2020 to April 7, 2020

C.

January 1, 2020 to April 8, 2020

D.

January 1, 2020 to April 9, 2020

Question 44

Which of the following is most likely to be used as a data-mining ETL tool?

Options:

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Question 45

Given the below:

[Image not shown]

Which of the following numbers represents a Type I error?

Options:

A.

1

B.

2

C.

3

D.

4

Question 46

Which of the following file formats is best suited to start exploratory analysis within statistical software?

Options:

A.

CSV

B.

XLSM

C.

XML

D.

JSON

Question 47

A reporting analyst is creating a dashboard that shows the year-over-year performance for a sales organization. Which of the following is the best visual for the analyst to use to illustrate the organization's performance?

Options:

A.

Pie chart

B.

Scatter plot

C.

Heat map

D.

Line chart

Question 48

Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?

Options:

A.

A dashboard with a continuous data stream and saved searches

B.

A report of test scores by classroom, emailed to the superintendent at the end of the month

C.

A report of test scores with pie charts showing student performance

D.

A dashboard with a scheduled delivery, the ability to filter scores by school, and bar charts for comparison

Question 49

A data analyst has been asked to organize the table below in the following ways:

• By sales from high to low

• By state in alphabetical order

[Image not shown]

Which of the following functions will allow the data analyst to organize the table in this manner?

Options:

A.

Conditional formatting

B.

Grouping

C.

Filtering

D.

Sorting
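
The two orderings requested in the question can be sketched in Python as follows. The rows are hypothetical stand-ins, since the original table is not available.

```python
# Hypothetical rows; the table from the question is not available.
rows = [
    {"state": "Ohio", "sales": 1200},
    {"state": "Alabama", "sales": 3400},
    {"state": "Texas", "sales": 2100},
]

# Sales from high to low.
by_sales_desc = sorted(rows, key=lambda r: r["sales"], reverse=True)
# State in alphabetical order.
by_state_asc = sorted(rows, key=lambda r: r["state"])

print([r["state"] for r in by_sales_desc])
print([r["state"] for r in by_state_asc])
```

Both calls reorder the same rows without dropping or combining any of them.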

Question 50

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency.

D.

Complete a trend analysis to be included in the report.

Question 51

Given the customer table below:

[Image not shown]

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question 52

Emma is working in a data warehouse and finds a finance fact table links to an organization dimension, which in turn links to a currency dimension that is not linked to the fact table.

What type of design pattern is the data warehouse using?

Options:

A.

Star.

B.

Sun.

C.

Snowflake.

D.

Comet.

Question 53

A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?

Options:

A.

2018 goal data

B.

2018 actual revenue

C.

2019 goal data

D.

2019 commission plan

Question 54

Which of the following is a characteristic of a relational database?

Options:

A.

It utilizes key-value pairs.

B.

It has undefined fields.

C.

It is structured in nature.

D.

It uses minimal memory.

Question 55

A data analyst has been asked to create one table that has each employee's first name, last name, sales, and address. The sales and addresses are listed in the tables below:

[Image not shown]

Which of the following steps should the analyst take to create the table?

Options:

A.

Transpose the first name and last name in both tables. Use lookup to pull the address field from Table 2 into Table 1.

B.

Use lookup with the first name or last name to pull the address field from Table 2 into Table 1.

C.

Use the append formula in both tables for the first name and last name. Use lookup to pull the address field from Table 2 into Table 1.

D.

Create a column that concatenates the first name and last name in each table. Use concatenate and lookup to bring the address field into Table 1.
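
Combining a concatenated name key with a lookup can be sketched as below. The rows and the helper `full_name` are hypothetical illustrations, not the tables from the question.

```python
# Hypothetical stand-ins for Table 1 (sales) and Table 2 (addresses).
sales_rows = [
    {"first": "Ana", "last": "Lee", "sales": 500},
    {"first": "Ana", "last": "Cruz", "sales": 300},
]
address_rows = [
    {"first": "Ana", "last": "Cruz", "address": "12 Oak St"},
    {"first": "Ana", "last": "Lee", "address": "9 Elm Ave"},
]

def full_name(row):
    # Concatenate first and last name into a single lookup key.
    return f"{row['first']} {row['last']}"

# Build the lookup once, keyed on the concatenated name.
address_by_name = {full_name(r): r["address"] for r in address_rows}

# Attach the looked-up address to each sales row.
combined = [
    {**r, "address": address_by_name.get(full_name(r))}
    for r in sales_rows
]
print(combined)
```

In this sample, a lookup on first name alone would be ambiguous because both people are named Ana; the concatenated key tells them apart.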

Question 56

[Image not shown]

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 57

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

Options:

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Question 58

Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)

Options:

A.

Mean

B.

Minimum

C.

Mode

D.

Variance

E.

Correlation

F.

Maximum

Question 59

An analyst develops an IT document and needs to describe the technical terms used in the document. Which of the following is where the analyst should include descriptions of the technical terms?

Options:

A.

Glossary

B.

System diagram

C.

User requirements

D.

Index

Question 60

Which of the following roles is responsible for ensuring an organization's data quality, security, privacy, and regulatory compliance?

Options:

A.

Data owner.

B.

Data steward.

C.

Data custodian.

D.

Data processor.

Question 61

Samantha needs to share a list of her organization's top 50 customers with the VP of sales.

She would like to include the name of the customer, the business they represent, their contact information, and their total sales over the past year.

The VP does not have any specialized analytics skills or software but would like to make some personal notes on the dataset.

What would be the best tool for Samantha to use to share this information?

Options:

A.

Power BI.

B.

Microsoft Excel.

C.

Minitab.

D.

SAS.

Question 62

A company wants to know how its customers interact with an e-commerce website based on clicks over items. Which of the following is the primary requirement for this report?

Options:

A.

Data content

B.

Frequency

C.

Filtering

D.

Views

Question 63

Under which of the following circumstances should the null hypothesis be accepted when α = 0.05?

Options:

A.

When p is 0.00003

B.

When p is 0.001

C.

When p is 0.04

D.

When p is 0.06
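
The usual decision rule compares the p-value with the significance level. A minimal sketch follows, assuming the convention that the null hypothesis is rejected when p is at or below the level of 0.05 given in the question; the function name `reject_null` is hypothetical.

```python
ALPHA = 0.05  # significance level from the question

def reject_null(p_value, alpha=ALPHA):
    """Return True when the p-value is small enough to reject the null hypothesis."""
    return p_value <= alpha

for p in (0.00003, 0.001, 0.04, 0.06):
    decision = "reject" if reject_null(p) else "fail to reject"
    print(f"p = {p}: {decision} the null hypothesis")
```

Statistics texts generally say "fail to reject" rather than "accept" the null hypothesis, since a large p-value does not prove the null is true.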

Question 64

The senior management team at a company receives a detailed sales report at the end of each quarter. The report is several pages long and includes data from dozens of offices across the country. The team wants a better way to get a quick snapshot of what is included in the report. Which of the following modifications would best meet this requirement?

Options:

A.

Modifying documentation elements to include reference data sources

B.

Modifying the font size and style so important data points are more visible

C.

Modifying the report to include a summary section with observations and insights

D.

Modifying the report layout so it is easier to follow and understand

Question 65

Which of the following data types best describe 4Ac1? (Select two).

Options:

A.

Alphanumeric

B.

Symbolic

C.

Numeric

D.

Float

E.

Boolean

F.

String

Question 66

A customer list from a financial services company is shown below:

[Image not shown]

A data analyst wants to create a likely-to-buy score on a scale from 0 to 100, based on an average of the three numerical variables: number of credit cards, age, and income. Which of the following should the analyst do to the variables to ensure they all have the same weight in the score calculation?

Options:

A.

Recode the variables.

B.

Calculate the percentiles of the variables.

C.

Calculate the standard deviations of the variables.

D.

Normalize the variables.

Question 67

A data analyst needs to create a dashboard using the company's yearly revenue data sets. Which of the following would be the best way to plot the information to show the top-performing region?

Options:

A.

A line chart

B.

A waterfall chart

C.

A heat map

D.

A stacked bar chart

Question 68

Which one of the following is NOT a common data integration tool?

Options:

A.

XSS

B.

ELT

C.

ETL

D.

APIs

Question 69

Which of the following query statements would be used when filtering data in a relational database management system? (Select two).

Options:

A.

ORDER BY

B.

HAVING

C.

WHERE

D.

SELECT

E.

INSERT

F.

GROUP BY
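
How the clauses listed above behave in one query can be sketched with Python's built-in `sqlite3` module. The `sales` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 50), ("East", 80), ("West", 20), ("West", -5), ("North", 200)],
)

# WHERE filters individual rows before grouping;
# HAVING filters the groups produced by GROUP BY;
# ORDER BY only arranges the rows that remain.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 0
    GROUP BY region
    HAVING SUM(amount) > 100
    ORDER BY region
"""
result = conn.execute(query).fetchall()
print(result)
conn.close()
```

Here the negative West row is dropped by WHERE, and the West group is then dropped by HAVING because its remaining total is too small.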

Question 70

Which of the following is an example of PII?

Options:

A.

Age

B.

Name

C.

Ethnicity

D.

Gender

Question 71

Which of the following describes the method of sampling in which elements of data are selected randomly from each of the small subgroups within a population?

Options:

A.

Simple random

B.

Cluster

C.

Systematic

D.

Stratified
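
Drawing a random selection from every subgroup of a population, an approach commonly called stratified sampling, can be sketched as below. The function `stratified_sample`, its parameters, and the population are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, per_stratum, seed=0):
    """Draw up to `per_stratum` random records from each subgroup."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    groups = defaultdict(list)
    for record in records:
        groups[stratum_of(record)].append(record)
    sample = []
    for members in groups.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population of (person_id, age_band) pairs.
population = [(i, "under_30" if i % 3 else "30_plus") for i in range(30)]
picked = stratified_sample(population, stratum_of=lambda r: r[1], per_stratum=4)
print(picked)
```

Every subgroup contributes to the sample, which is what separates this approach from a single simple random draw over the whole population.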

Question 72

A data analyst is helping a retail store categorize its customers into five different groups based on the following information:

• How recently the customers made purchases

• How frequently the customers made purchases

• How much the customers spent

Given the following information:

[Image not shown]

Which of the following would be most important for the analysis?

Options:

A.

Customer_ID, Channel, Order_Date

B.

Customer_ID, Territory, Amount

C.

Customer_ID, Order_Date, Amount

D.

Customer_ID, Quantity, Amount

Question 73

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table:

[Image not shown]

In-store Transactions:

[Image not shown]

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows
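
The difference in row counts between the two join types can be sketched with Python's built-in `sqlite3` module. The tables below are small hypothetical stand-ins, not the tables from the question, so the counts will not match the answer options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: customer 3 has no transactions, and
# transaction customer_id 4 has no matching customer.
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE transactions (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cal');
    INSERT INTO transactions VALUES (1, 10), (1, 25), (2, 40), (4, 15);
""")

# INNER JOIN keeps only customers with at least one matching transaction.
inner_rows = conn.execute("""
    SELECT c.customer_id, t.amount
    FROM customers AS c
    INNER JOIN transactions AS t ON t.customer_id = c.customer_id
""").fetchall()

# LEFT JOIN keeps every customer; unmatched ones get NULL for the amount.
left_rows = conn.execute("""
    SELECT c.customer_id, t.amount
    FROM customers AS c
    LEFT JOIN transactions AS t ON t.customer_id = c.customer_id
""").fetchall()

print(len(inner_rows), len(left_rows))
conn.close()
```

With the customer table on the left, a LEFT JOIN can never return fewer rows than the INNER JOIN on the same key.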

Question 74

Which of the following data protection methods provides confidentiality for data in transit?

Options:

A.

De-identification

B.

Encryption

C.

Masking

D.

Anonymization

Question 75

Which of the following types of analysis is used when comparing last week's sales to the previous week's sales?

Options:

A.

Trend analysis

B.

Exploratory analysis

C.

Prescriptive analysis

D.

Link analysis

Question 76

Which of the following is a KPI metric for tracking sales performance?

Options:

A.

Order status percentage

B.

Customer acquisition percentage

C.

Gross profit percentage

D.

Click-through rate percentage

Question 77

A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:

Options:

A.

transactional schema.

B.

star schema.

C.

non-relational schema.

D.

snowflake schema.

Question 78

Given the data below:

[Image not shown]

In which of the following file formats is the data presented?

Options:

A.

XLS

B.

CSV

C.

RTF

D.

XML

Question 79

Which of the following actions should be taken when transmitting data to mitigate the chance of a data leak occurring? (Choose two.)

Options:

A.

Data identification

B.

Data processing

C.

Data Reporting

D.

Data encryption

E.

Data masking

F.

Data removal

Question 80

A data analyst is developing a data dictionary that aligns with a company's data management processes and policies. Which of the following best describes what should be included in the data dictionary?

Options:

A.

Information containing the links to business data

B.

Information explaining the business methodologies

C.

Information containing definitions of the business data

D.

Information describing the data analysis phases

Question 81

A database administrator is required to mask certain table columns containing PII in order to comply with the company privacy policy. Which of the following are the most likely types of information the administrator should mask? (Select two).

Options:

A.

Government-issued ID

B.

Address

C.

Order ID

D.

Order date

E.

Customer ID

F.

Referral number

Question 82

A sales team wants visibility of current sales numbers, pipeline, and team performance. The team would also like to see calculations of individuals’ earned commissions and projected commissions based on sales, but they want that information to be kept confidential. Which of the following would be the BEST way to provide this visibility?

Options:

A.

Create a dashboard displaying a data refresh date so users know the current sales numbers and configure permissions to control access.

B.

Create a dashboard for sales numbers, pipeline, and team and individual performance for the management team.

C.

Create a dashboard with filters for the overall team, individuals, and management. Users can filter to see the data they want.

D.

Create a dashboard with views for team, individuals, and management. Configure permissions to control access.

Question 83

A research analyst collects ten data points from 1,000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?

Options:

A.

NoSQL

B.

Flat file

C.

JSON

D.

Relational database

Question 84

Which of the following differentiates a flat text file from other data types?

Options:

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Question 85

Which of the following is a process that is used during data integration to collect, blend, and load data?

Options:

A.

MDM

B.

ETL

C.

OLTP

D.

BI

Question 86

A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:

[Image not shown]

Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?

Options:

A.

Date

B.

Mathematical

C.

Logical

D.

Aggregate
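
Deriving the flag from a condition on Quantity_sold can be sketched in Python as below. The salesperson names and quantities are hypothetical; only the 1,000,000 cutoff comes from the question.

```python
THRESHOLD = 1_000_000  # cutoff stated in the question

# Hypothetical salespeople and quantities; the original table is not available.
quantity_sold = {"Avery": 1_250_000, "Blake": 1_000_000, "Casey": 640_000}

# A conditional test assigns "Yes" only when the quantity is strictly above the cutoff.
promotion_flag = {
    name: "Yes" if qty > THRESHOLD else "No"
    for name, qty in quantity_sold.items()
}
print(promotion_flag)
```

A value of exactly 1,000,000 is not "above" the cutoff, so it is flagged "No" in this sketch.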

Question 87

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

[Image not shown]

Which of the following types of charts should be considered?

Options:

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question 88

You would like to measure how well an organization is achieving its goals.

What type of analysis should you perform?

Options:

A.

Performance analysis.

B.

Outlier analysis.

C.

Predictive analysis.

D.

Trend analysis.

Question 89

An analyst is reviewing the following data:

Car ID    Speed
1231      55
5664      36
5644      18
6505      67
5464      36
6456      38

Which of the following should the analyst include in the measures of central tendency for speed?

Options:

A.

Mode = 38 Range = 31 Mean = 42.5

B.

Range = 49 Max = 67 Min = 18

C.

Mode = 36 Max = 67 Min = 18

D.

Mode = 36 Median = 37 Mean = 41.5
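
The summary measures for the speed column can be checked with Python's `statistics` module. This sketch assumes each row of the listing is a four-digit car ID followed by a two-digit speed.

```python
from statistics import mean, median, mode

# Speeds read from the listing, assuming a four-digit car ID
# followed by a two-digit speed on each row.
speeds = [55, 36, 18, 67, 36, 38]

summary = {
    "mean": round(mean(speeds), 2),
    "median": median(speeds),
    "mode": mode(speeds),
    "range": max(speeds) - min(speeds),  # a measure of spread, not of central tendency
}
print(summary)
```

Mean, median, and mode describe the center of the data, while range, minimum, and maximum describe its spread and extremes.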

Question 90

Given the image below:

[Image not shown]

Which of the following file formats is depicted?

Options:

A.

JSON

B.

CSV

C.

XML

D.

HTML

Question 91

An analyst is building a new dashboard for a user. After an initial conversation with the user, the analyst created a mock-up of the dashboard. Which of the following best explains why the analyst created the mock-up?

Options:

A.

To identify the dimensions and measures

B.

To send to the client after deploying the dashboard to production

C.

To confirm important details before dashboard development begins

D.

To receive client approval for the final dashboard design

Question 92

Which of the following should an analyst do to best summarize the data on a data set?

Options:

A.

Filtering

B.

Aggregation

C.

Sorting

D.

Concatenation

Question 93

Given the following tables:

[Image not shown]

Which of the following will be the dimensions from a FULL JOIN of the tables above?

Options:

A.

Two rows and three columns

B.

Three rows and four columns

C.

Four rows and two columns

D.

Four rows and four columns

Question 94

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv
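
Pulling text out of scraped markup can be sketched with Python's standard `html.parser` module. The class `ParagraphText` and the sample page are hypothetical.

```python
from html.parser import HTMLParser

class ParagraphText(HTMLParser):
    """Collect the text found inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        # Keep text only while inside a paragraph, including nested inline tags.
        if self.in_paragraph:
            self.paragraphs[-1] += data

# Hypothetical scraped page content.
scraped = "<html><body><h1>News</h1><p>First story.</p><p>Second <b>story</b>.</p></body></html>"
parser = ParagraphText()
parser.feed(scraped)
parser.close()
print(parser.paragraphs)
```

Scraped pages arrive as markup, so the tags have to be separated from the content before any text analysis can run.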
