CompTIA DA0-001 Dumps

Page: 1 / 40
Total 396 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

A site reliability team wants to monitor the stability of their website so they can proactively diagnose issues when they occur. Which of the following deliverables would best suit their needs?

Options:

A.

A self-serve dashboard of website performance that updates in real time

B.

A weekly log report of site visits and user actions

C.

A portal that is refreshed daily and reports errors classified by type

D.

A daily summary email indicating website outages for the previous day

Question 2

A data analyst is performing a data merge within a spreadsheet using the tables below:

The analyst is attempting to pull the addresses from Table 2 into Table 1 using the last names and is receiving an error message. Which of the following steps can the analyst perform to fix the error?

Options:

A.

Use concatenate to combine the tables.

B.

Ensure the formula is pulling from right to left.

C.

Sort the data by the last name field.

D.

Review the spelling and data type.

Question 3

Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)

Options:

A.

Data accuracy

B.

Data constraints

C.

Data attribute limitations

D.

Data bias

E.

Data consistency

F.

Data manipulation

Question 4

Which of the following contains alphanumeric values?

Options:

A.

10.1Ε²

B.

13.6

C.

1347

D.

A3J7

Question 5

Which of the following best describes an exploratory analysis?

Options:

A.

Involves the use of descriptive statistics to understand observations

B.

Involves analysis of exploring data sets for performance tracking

C.

Involves the testing of specific hypotheses

D.

Involves the use of arithmetic algebra to determine the distribution

Question 6

A data analyst is setting up a data dashboard to monitor several ETL data streams to ensure that data is complete for later analysis. Which of the following audiences should the analyst target for this dashboard?

Options:

A.

Executives

B.

The management team

C.

Technical experts

D.

External vendors

Question 7

A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the most efficient way to deliver this report?

Options:

A.

A workbook with multiple tabs for each region

B.

A daily email with snapshots of regional summaries

C.

A static report with a different page for every filtered view

D.

A dashboard with filters at the top that the user can toggle

Question 8

Which of the following data cleansing issues will be fixed when a DISTINCT function is applied?

Options:

A.

Missing data

B.

Duplicate data

C.

Redundant data

D.

Invalid data
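
As a rough illustration of what a DISTINCT function does, the sketch below loads a few rows into an in-memory SQLite table and compares a plain SELECT with SELECT DISTINCT. The table name `visits`, its columns, and the sample rows are made up for the example.

```python
import sqlite3

# Hypothetical sample data: the row ("alice", "US") appears twice.
rows = [("alice", "US"), ("bob", "CA"), ("alice", "US"), ("carol", "UK")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (name TEXT, country TEXT)")
conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)

# A plain SELECT returns every stored row, including the repeat.
all_rows = conn.execute("SELECT name, country FROM visits").fetchall()

# SELECT DISTINCT collapses rows whose selected columns are identical.
distinct_rows = conn.execute(
    "SELECT DISTINCT name, country FROM visits ORDER BY name"
).fetchall()

print(len(all_rows), len(distinct_rows))
conn.close()
```

Note that DISTINCT only compares the selected columns; it does not fill in missing values or correct invalid ones.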

Question 9

Which of the following types of dashboards should a business intelligence engineer develop in order to provide information about failed data pipelines?

Options:

A.

Referencing

B.

Strategic

C.

Operational

D.

Technical

Question 10

An analyst is currently working on a ticket to revamp a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?

Options:

A.

Talk to the group that made the request to determine the desired goal.

B.

Make changes to a frequently used report that is already in production.

C.

Build an additional dashboard with fewer views tailored toward each specific team.

D.

Develop a more streamlined dashboard to roll out by the next delivery date.

Question 11

For which of the following test statistics would a low value imply a potentially meaningful result?

Options:

A.

Chi-squared

B.

p-value

C.

t-test

D.

F-test

Question 12

Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?

Options:

A.

Logical

B.

Date

C.

Aggregate

D.

System

Question 13

Which of the following is the best technique for transferring data from one database to another with some data manipulation?

Options:

A.

Application programming interfaces

B.

Delta load

C.

Extract, transform, load

D.

Export/import

Question 14

Which one of the following values will appear first if they are sorted in descending order?

Options:

A.

Aaron.

B.

Molly.

C.

Xavier.

D.

Adam.
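
Descending order for text runs from Z toward A. A minimal Python sketch, using the four names from the options with their trailing periods dropped:

```python
# Names from the answer options (trailing periods removed).
names = ["Aaron", "Molly", "Xavier", "Adam"]

# reverse=True sorts from the largest value to the smallest,
# which for capitalized names means Z toward A.
descending = sorted(names, reverse=True)
print(descending)
```

Python compares strings character by character, so "Adam" sorts ahead of "Aaron" in descending order because "d" comes after "a".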

Question 15

Exhibit.

Which of the following logical statements results in Table B?

A)

B)

C)

D)

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question 16

A company’s marketing department wants to do a promotional campaign next month. A data analyst on the team has been asked to perform customer segmentation, looking at how recently a customer bought the product, at what frequency, and at what value. Which of the following types of analysis would this practice be considered?

Options:

A.

Prescriptive

B.

Trend

C.

Gap

D.

Cluster

Question 17

A data analyst is developing a data dictionary that aligns with a company's data management processes and policies. Which of the following best describes what should be included in the data dictionary?

Options:

A.

Information containing the links to business data

B.

Information explaining the business methodologies

C.

Information containing definitions of the business data

D.

Information describing the data analysis phases

Question 18

Given the below:

Which of the following numbers represents a Type I error?

Options:

A.

1

B.

2

C.

3

D.

4

Question 19

A data analyst needs to present the results of an online marketing campaign to the marketing manager. The manager wants to see the most important KPIs and measure the return on marketing investment. Which of the following should the data analyst use to BEST communicate this information to the manager?

Options:

A.

A real-time monitor that allows the manager to view performance the day the campaign was launched

B.

A self-service dashboard that allows the manager to look at the company’s annual budget performance

C.

A spreadsheet of the raw data from all marketing campaigns and channels

D.

A summary with statistics, conclusions, and recommendations from the data analyst

Question 20

A database administrator is required to mask certain table columns containing PII in order to comply with the company privacy policy. Which of the following are the most likely types of information the administrator should mask? (Select two).

Options:

A.

Government-issued ID

B.

Address

C.

Order ID

D.

Order date

E.

Customer ID

F.

Referral number

Question 21

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency.

D.

Complete a trend analysis to be included in the report.

Question 22

Which of the following is a relational database?

Options:

A.

SQL

B.

Excel

C.

JSON

D.

NoSQL

Question 23

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table -

In-store Transactions –

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows

Question 24

An analyst is updating a customer contacts database with information obtained from a survey of new customers. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Join

B.

Append

C.

Transform

D.

Blend

Question 25

Which of the following values is the range (a measure of dispersion) of the scores of ten students in a test?

The scores of ten students in a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.

Options:

A.

90

B.

60

C.

70

D.

80
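
The range is the largest value minus the smallest. A minimal Python sketch using the ten scores listed in the question (the variable names are illustrative):

```python
# Scores taken from the question text.
scores = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]

# Range = maximum value minus minimum value.
score_range = max(scores) - min(scores)
print(score_range)
```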

Question 26

A stakeholder wants to see daily sales targets organized in a dashboard by country, state, city, and ZIP Code. Which of the following delivery considerations must a data analyst take into account when creating the dashboard?

Options:

A.

Variable formatting

B.

Drill-down capability

C.

Saved searches

D.

Access permissions

Question 27

A junior web developer is developing a new application where users can upload short videos. The first task is to create a homepage that shows the headline "Upload Your Short Videos" and a clickable button that says "upload now".

Which of the following HTML commands would help the developer to complete the task successfully?

Options:

A.

< span >Upload Your Short Videos< /span >< button >upload now< /button >

B.

< p >Upload Your Short Videos< /p >< p >upload now< /p >

C.

< h1 >Upload Your Short Videos< /h1 >< button >upload now< /button >

D.

< h1 >Upload Your Short Videos< /h1 >< h1 >upload now< /h1 >

Question 28

A healthcare data analyst notices that one data set in the column for BloodPressure contains several outliers that need to be replaced with meaningful values. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Recode

B.

Impute

C.

Append

D.

Reduction

Question 29

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 30

A sales manager wants quarterly sales reports broken down by unit and week. Which of the following data output lists includes the most necessary information?

Options:

A.

Order number, salesperson, date shipped, recipient address, and price

B.

Item name, salesperson, recipient address, shipping cost, and date shipped

C.

Item number, item name, salesperson, date sold, and price

D.

Item name, salesperson, price, shipping cost, and date shipped

Question 31

Which of the following is the best description of the term "data governance"?

Options:

A.

Data governance governs the development of a data visualization dashboard in an organization.

B.

Data governance is the policy that protects against data breaches by cybercriminals.

C.

Data governance is the process of analyzing, manipulating, and reporting data in an organization.

D.

Data governance is the availability, usability, integrity, and security of data in an enterprise.

Question 32

A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?

Options:

A.

Static

B.

Real-time

C.

Self-service

D.

Dynamic

Question 33

Which of the following database types is the best to use for transactional SQL?

Options:

A.

Snowflake schema

B.

Hierarchical

C.

Relational

D.

Star schema

Question 34

A data analyst needs to observe the relationship between two numeric variables and identify the clustering pattern as well as the outliers. Which of the following visualizations should the analyst use?

Options:

A.

Heat map

B.

Tree map

C.

Scatter plot

D.

Stacked chart

Question 35

An analyst needs to join two data sets that compare vehicle weights. One data set is in pounds, and the other has various units of measure. Which of the following should the analyst do first to the data prior to any type of join?

Options:

A.

Blend

B.

Reduce

C.

Concatenate

D.

Normalize

Question 36

An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?

Options:

A.

Data merge

B.

Data append

C.

Data blending

D.

Data imputation

Question 37

Which of the following techniques should an analyst use to analyze a data set to get a snapshot of basic measures of central tendency?

Options:

A.

Forecasting

B.

Trend analysis

C.

Gap analysis

D.

Descriptive statistics

Question 38

An analyst is reviewing the following data:

Car ID    Speed
1231      55
5664      36
5644      18
6505      67
5464      36
6456      38

Which of the following should the analyst include in the measures of central tendency for speed?

Options:

A.

Mode = 38 Range = 31 Mean = 42.5

B.

Range = 49 Max = 67 Min = 18

C.

Mode = 36 Max = 67 Min = 18

D.

Mode = 36 Median = 37 Mean = 41.5

Question 39

An analyst wants to check the progress and performance regarding the number of customers an organization served in the last six years. Which of the following represents the type of analysis the analyst should perform?

Options:

A.

Correlation analysis

B.

Trend analysis

C.

Regression analysis

D.

Descriptive analysis

Question 40

Which of the following can be used to translate data into another form so it can only be read by a user who has a key or a password?

Options:

A.

Data encryption.

B.

Data transmission.

C.

Data protection.

D.

Data masking.

Question 41

A collections manager has a team calling customers who are past due on their accounts in an attempt to collect payments. The manager receives the call list in the form of a printed report that is generated by the accounting department at the beginning of each week. Consequently, the collections team calls some customers who have made payments in the time since the report was last printed. Which of the following reporting enhancements could the accounting department implement to best reduce the number of calls on current accounts?

Options:

A.

Modify the date range on the report.

B.

Include a time stamp on the report.

C.

Increase the frequency of report generation.

D.

Add a report run date to the report.

Question 42

A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?

Options:

A.

A self-service report

B.

A research report

C.

An ad hoc report

D.

An operational report

Question 43

An analyst needs to determine the appropriate data type for the following sample data:

sample data collected:

Which of the following data types should be used for this data?

Options:

A.

Text

B.

Float

C.

Alphanumeric

D.

Numeric

Question 44

Which of the following data types must be used when working with variables that require classification into two or more groups before analysis?

Options:

A.

Discrete

B.

Numerical

C.

Alphanumeric

D.

Categorical

Question 45

Which of the following types of data manipulation functions should a data analyst use to implement a YES/NO condition in a spreadsheet?

Options:

A.

Text

B.

Statistical

C.

Financial

D.

Logical

Question 46

A data analyst needs to write a SQL query measuring last month's website visits and distribute a summary report to the marketing team. Which of the following is the analyst creating?

Options:

A.

Date range

B.

Distribution list

C.

Data content

D.

Report view

Question 47

A data analyst received a large amount of third-party data that needs to be joined with in-house data files. After the data is joined, the analyst notices three columns all contain dates. Which of the following should the analyst do to maintain data consistency?

Options:

A.

Append all date columns and parse the strings.

B.

Impute all three date columns and then merge.

C.

Merge all date columns and unify the format.

D.

Separate the columns into a table and merge.

Question 48

A data set was recorded using multimedia technology. Which of the following is a necessary step on the way to interpretation?

Options:

A.

Structural equation modeling

B.

Transcription

C.

Sequential analysis

D.

Sampling

Question 49

An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?

Options:

A.

Derived

B.

Categorical

C.

Continuous

D.

Control

Question 50

A data analyst who works for a government agency is required to obtain the average income of citizens. The list of citizens is given in the following table:

A value for one citizen's income is missing. Which of the following approaches should the data analyst take to solve this issue?

Options:

A.

Replace the missing value with the average of the rest of the unemployed citizens.

B.

Insert the value 0 into the field with the missing value.

C.

Impute the mean of the other citizens' incomes into the field with the missing value.

D.

Exclude employed citizens from the analysis.

Question 51

Samantha needs to share a list of her organization's top 50 customers with the VP of sales.

She would like to include the name of the customer, the business they represent, their contact information, and their total sales over the past year.

The VP does not have any specialized analytics skills or software but would like to make some personal notes on the dataset.

What would be the best tool for Samantha to use to share this information?

Options:

A.

Power BI.

B.

Microsoft Excel.

C.

Minitab.

D.

SAS.

Question 52

A data architect is designing a data solution for a retail clothing store chain. Each store has a database that tracks sales transactions. The data architect needs to create a summary table that will be used for a senior executive dashboard. The summary table should not contain duplicate store information. Which of the following should the data architect create?

Options:

A.

A check constraint

B.

A primary key

C.

A foreign key

D.

A unique constraint

Question 53

Which of the following is the best reason for removing data outliers?

Options:

A.

Data varies significantly from others.

B.

Data is redundant in the table.

C.

Data is duplicated in the whole range.

D.

Data is missing from the table.

Question 54

Which of the following is the correct extension for a tab-delimited spreadsheet file?

Options:

A.

tap

B.

tar

C.

sv

D.

az

Question 55

A gambler thinks that a coin is fair and is equally likely to turn up heads or tails when the coin is flipped. Which of the following tests should the gambler use to test this hypothesis?

Options:

A.

t-test

B.

Chi-squared test

C.

Rank sum test

D.

Ratio test
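
For readers who want to see how a goodness-of-fit check on a coin might be computed, here is a minimal sketch in plain Python. The flip counts (60 heads, 40 tails) are hypothetical and not part of the question; the statistic is the usual sum of (observed - expected)^2 / expected.

```python
import math

# Hypothetical experiment: 100 flips, 60 heads and 40 tails.
observed = {"heads": 60, "tails": 40}
total = sum(observed.values())
expected = total / 2  # a fair coin predicts an even split

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((count - expected) ** 2 / expected for count in observed.values())

# Two outcomes give 1 degree of freedom, and for 1 degree of freedom the
# upper-tail probability has the closed form erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 3), round(p_value, 4))
```

A small p-value suggests that the observed split would be unlikely if the coin were fair.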

Question 56

An analyst needs to conduct a quick analysis. Which of the following is the FIRST step the analyst should perform with the data?

Options:

A.

Conduct an exploratory analysis and use descriptive statistics.

B.

Conduct a trend analysis and use a scatter chart.

C.

Conduct a link analysis and illustrate the connection points.

D.

Conduct an initial analysis and use a Pareto chart.

Question 57

A data engineer needs to store data that can be natively used by an API. Which of the following should the engineer use to best accomplish this task?

Options:

A.

HTML

B.

JSON

C.

ZIF

D.

CSS

Question 58

A data analyst is working for a shipping company and calculating the volume of boxes according to the following formula:

volume = height × width × depth.

Which of the following variable types describes volume?

Options:

A.

Derived

B.

Normalized

C.

Concatenated

D.

Aggregated

Question 59

A data analyst is building a closed won quarter-over-quarter report for the sales team. Which of the following will be needed to complete this request?

Options:

A.

The report create date and closed dollar amount

B.

The closed won quarter and the closed dollar amount

C.

The segment and closed dollar amount

D.

The closed won year and sales leader name

Question 60

An analyst wants to combine two data sets into a single spreadsheet. Column names from the first spreadsheet are listed in rows in the second spreadsheet. Which of the following is the first step the analyst should take to combine the data sets?

Options:

A.

Blend

B.

Merge

C.

Concatenate

D.

Transpose
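
To make the row/column mismatch concrete, the sketch below flips a small table so that values stored down the rows become column headers. The sample data is invented for illustration.

```python
# Hypothetical second spreadsheet: each row starts with what should be
# a column name, followed by that field's values.
rows = [
    ["name", "Ann", "Raj"],
    ["age", 30, 41],
]

# zip(*rows) pairs up the i-th element of every row, which swaps
# rows and columns.
transposed = [list(column) for column in zip(*rows)]
print(transposed)
```

After the flip, the first row holds the headers ("name", "age") and each later row is one record, so the layout matches a sheet whose field names run across the top.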

Question 61

A data analyst wants to create "Income Categories" that would be calculated based on the existing variable "Income". The "Income Categories" would be as follows:

Income category 1: less than $1.

Income category 2: more than $1 and less than $20,000.

Income category 3: more than $20,001 and less than $40,000.

Income category 4: more than $40,001.

Which of the following data manipulation techniques should the data analyst use to create "Income Categories"?

Options:

A.

Data merge

B.

Derived variables

C.

Data blending

D.

Data append
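
A new field computed from an existing one can be sketched as a small function. The version below follows the four bands in the question. Because the stated bands leave small gaps (for example, exactly $1 or $20,000.50), this sketch assumes each band above category 1 is closed on its upper limit so that every value lands somewhere; the function name is illustrative.

```python
def income_category(income: float) -> int:
    """Return the income category (1-4) for a given income.

    Assumption: category 1 is anything below $1, and the remaining bands
    are closed on their upper limit ($20,000 and $40,000) so no value
    falls into a gap.
    """
    if income < 1:
        return 1
    if income <= 20_000:
        return 2
    if income <= 40_000:
        return 3
    return 4


# Hypothetical incomes, one per band.
incomes = [0, 15_000, 35_000, 90_000]
categories = [income_category(value) for value in incomes]
print(categories)
```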

Question 62

A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government the does il with, some account information is not shown. Which of the following fields should be masked?

Options:

A.

Sales volume

B.

Start date

C.

Product name

D.

Customer name

Question 63

What R package makes it easy to work with dates?

Options:

A.

Lubridate.

B.

Datemath.

C.

Stringr.

D.

ggplot.

Question 64

A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?

Options:

A.

2018 goal data

B.

2018 actual revenue

C.

2019 goal data

D.

2019 commission plan

Question 65

The total values in this month's revenue report are twice as much as last month's. Which of the following most likely occurred during the ETL process?

Options:

A.

The data cleansing processes failed to execute.

B.

The database connectivity failed.

C.

The report included the previous month's data.

D.

The data normalization processes failed.

Question 66

A data analyst needs to create a data visualization that aids in understanding the cumulative impact of sequentially introduced values that are positive or negative. Which of the following data visualization methods should the analyst use?

Options:

A.

A bubble chart

B.

A waterfall chart

C.

A scatter plot

D.

A line chart

Question 67

Which of the following data types would a telephone number formatted as XXX-XXX-XXXX be considered?

Options:

A.

Numeric

B.

Date

C.

Float

D.

Text

Question 68

Which of the following terms best describes a situation in which a rating scale does not conform to previously agreed-upon requirements?

Options:

A.

Specification mismatch

B.

Incorrect sampling

C.

Data corruption

D.

Redundancy

Question 69

Which of the following roles is responsible for ensuring an organization's data quality, security, privacy, and regulatory compliance?

Options:

A.

Data owner.

B.

Data steward.

C.

Data custodian.

D.

Data processor.

Question 70

A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:

Which of the following types of functions would be the most appropriate to use?

Options:

A.

Statistical

B.

Aggregate

C.

Logical

D.

Mathematical

Question 71

An analyst reviews the following table:

Which of the following data types is represented in the values in the RefNo column?

Options:

A.

Numeric

B.

Real Number

C.

Currency

D.

Alphanumeric

Question 72

You have two database tables that you would like to join together using a foreign key relationship.

What term best describes this action?

Options:

A.

Blending.

B.

Appending.

C.

Mixing.

D.

Merging.

Question 73

The number of phone calls that the call center receives in a day is an example of:

Options:

A.

continuous data.

B.

categorical data.

C.

ordinal data.

D.

discrete data.

Question 74

Which of the following best describes a business analytics tool with interactive visualization and business capabilities and an interface that is simple enough for end users to create their own reports and dashboards?

Options:

A.

Python

B.

R

C.

Microsoft Power BI

D.

SAS

Question 75

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

Options:

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed
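
As a sketch of what query parameterization looks like in practice, the example below uses Python's built-in `sqlite3` module with a `?` placeholder. The `customers` table, its rows, and the `find_by_name` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Raj")])


def find_by_name(name):
    # "?" is a bound parameter: the value is passed separately from the
    # SQL text, so it is compared as data and never executed as SQL.
    query = "SELECT id, name FROM customers WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


normal = find_by_name("Ann")
# A classic injection attempt is treated as an (unmatched) literal name.
injected = find_by_name("Ann' OR '1'='1")
print(normal, injected)
```

Had the same input been spliced into the SQL string directly, the `OR '1'='1'` fragment would have become part of the statement and matched every row.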

Question 76

Given the following:

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

Options:

A.

Fill in the missing cost where it is null.

B.

Separate the table into two tables and create a primary key.

C.

Replace the extended cost field with a calculated field.

D.

Correct the dates so they have the same format.

Question 77

Which one of the following is NOT a common data integration tool?

Options:

A.

XSS

B.

ELT

C.

ETL

D.

APIs

Question 78

Which of the following types of analyses is best to use when tracking sales revenue against quarterly targets?

Options:

A.

Trend

B.

Performance

C.

Link

D.

Scope

Question 79

Which of the following is an example of PII?

Options:

A.

Age

B.

Name

C.

Ethnicity

D.

Gender

Question 80

Which of the following activities occurs during the ETL process?

Options:

A.

Reviewing and addressing missing values

B.

Creating a dashboard

C.

Inserting a pivot table and pivot chart

D.

Multiplying unique data

Question 81

Which of the following is the correct data type for text?

Options:

A.

Boolean

B.

String

C.

Integer

D.

Float

Question 82

What category of data stewardship work is focused on ensuring that the organization respects the wishes of data subjects?

Options:

A.

Data quality.

B.

Data privacy.

C.

Data security.

D.

Regulatory compliance.

Question 83

Which of the following is a characteristic of a relational database?

Options:

A.

It utilizes key-value pairs.

B.

It has undefined fields.

C.

It is structured in nature.

D.

It uses minimal memory.

Question 84

Which of the following statistical methods requires two or more categorical variables?

Options:

A.

Simple linear regression

B.

Chi-squared test

C.

Z-test

D.

Two-sample t-test

Question 85

Which of the following is concatenate typically used to combine?

Options:

A.

Rows

B.

Columns

C.

Tables

D.

Databases

Question 86

Which of the following differentiates a flat text file from other data types?

Options:

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Question 87

Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?

Options:

A.

SAS

B.

SQL

C.

Python

D.

R

Question 88

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered to best display the data?

Options:

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and the percentage of new customers data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question 89

An analyst is training a new coworker on the importance of data governance and is focusing on security requirements. Which of the following should the analyst include in the training?

(Select two).

Options:

A.

Data masking

B.

Data encryption

C.

Data parallelism

D.

Data inclusiveness

E.

Data exclusiveness

F.

Data openness

Question 90

Given the following customer and order tables:

Which of the following describes the number of rows and columns of data that would be present after performing an INNER JOIN of the tables?

Options:

A.

Five rows, eight columns

B.

Seven rows, eight columns

C.

Eight rows, seven columns

D.

Nine rows, five columns

Question 91

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 92

Given the table below:

Which of the following variable types BEST describes the “Year” column?

Options:

A.

Numeric

B.

Date

C.

Alphanumeric

D.

Text

Question 93

An analyst wants to include a graph in a quarterly sales report that shows the comparison between two quantitative variables. Which of the following visual diagrams can the analyst use to most effectively represent this relationship?

Options:

A.

Bar graph

B.

Heat map

C.

Pie chart

D.

Histogram

Question 94

Which of the following defines the policies and procedures for managing the master data?

Options:

A.

Data administration

B.

Data stewardship

C.

Data ownership

D.

Data governance

Question 95

Given the following table:

Date of visit    Age    Gender
6/1/22           30     Male
6/15/22          65F    Fem.
6/19/2022        24     M

Which of the following describes the data quality issues with the age data?

Options:

A.

Completeness

B.

Consistency

C.

Accuracy

D.

Manipulation

Question 96

What role in data governance is typically responsible for day-to-day oversight of data use?

Options:

A.

Data processors.

B.

Data custodians.

C.

Data owners.

D.

Data stewards.

Question 97

The duration of a phone call in milliseconds is an example of:

Options:

A.

ordinal data.

B.

nominal data.

C.

boolean data.

D.

continuous data.

Question 98

Which of the following is the best approach to use to gain a general understanding of a data set?

Options:

A.

Descriptive statistics

B.

Basic projections

C.

Gap analysis

D.

Trend analysis

Question 99

Which of the following types of analysis is used when comparing last week's sales to the previous week's sales?

Options:

A.

Trend analysis

B.

Exploratory analysis

C.

Prescriptive analysis

D.

Link analysis

Question 100

A data analyst is creating a dashboard and trying to identify the type of information that should be included. Which of the following should the analyst consider first?

Options:

A.

Data refresh rate

B.

Consumer types

C.

Access permissions

D.

Data sources and attributes

Question 101

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question 102

A data analyst needs to perform a full outer join of a customer's orders using the tables below:

Which of the following is the mean of the order quantity?

Options:

A.

73.5

B.

76.5

C.

78.8

D.

81.5

Question 103

Which of the following query statements would be used when filtering data in a relational database management system? (Select two).

Options:

A.

ORDER BY

B.

HAVING

C.

WHERE

D.

SELECT

E.

INSERT

F.

GROUP BY
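
The difference between filtering individual rows and filtering aggregated groups is easier to see in a running query. The sketch below uses an in-memory SQLite database; the `orders` table and its values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 100), ("east", 50), ("west", 20), ("west", 10), ("north", 500)],
)

# WHERE removes individual rows before grouping (the 10-unit order is dropped);
# HAVING removes whole groups after aggregation (west totals only 20).
result = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM orders
    WHERE amount >= 20
    GROUP BY region
    HAVING SUM(amount) > 100
    ORDER BY region
    """
).fetchall()

print(result)
conn.close()
```

ORDER BY and GROUP BY change how the rows are arranged or summarized, but neither of them removes rows from the result.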

Question 104

Which of the following would be the best way to identify multicollinear attributes in a data set?

Options:

A.

Correlation coefficient

B.

Chi-squared test

C.

Two-sample f-test

D.

Two-way ANOVA

Question 105

Which of the following is an example of structured data?

Options:

A.

A credit card number

B.

An email

C.

A photo

D.

Social media correspondence

Question 106

Given the following athlete workout data (with inconsistent units or formats for time/distance), which of the following best describes the data quality issue?

Options:

A.

Duplicate data

B.

Data outlier

C.

Data inconsistency

D.

Invalid data

Question 107

Given the following tables:

Which of the following will be the dimensions from a FULL JOIN of the tables above?

Options:

A.

Two rows and three columns

B.

Three rows and four columns

C.

Four rows and two columns

D.

Four rows and four columns

Question 108

Given the customer table below:

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question 109

Which of the following are reasons to conduct data cleansing? (Select two).

Options:

A.

To perform web scraping

B.

To track KPIs

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends

Question 110

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power BI

B.

R

C.

SQL

D.

Tableau

Question 111

Encryption is a mechanism for protecting data.

When should encryption be applied to data?

Choose the best answer.

Options:

A.

When data is at rest.

B.

When data is at rest or in transit.

C.

When data is in transit.

D.

When data is at rest, unless you are using local storage.

Question 112

Which of the following data protection methods provides confidentiality for data in transit?

Options:

A.

De-identification

B.

Encryption

C.

Masking

D.

Anonymization

Question 113

Given the diagram below:

Which of the following data schemas is shown?

Options:

A.

Key-value pairs

B.

Online transactional processing

C.

Data Lake

D.

Relational database

Question 114

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the mean height for the five dogs?

Options:

A.

394mm

B.

405mm

C.

493mm

D.

504mm
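
The mean is the sum of the values divided by how many there are. A minimal Python sketch using the five heights from the question:

```python
import statistics

# Heights in millimeters, taken from the question text.
heights_mm = [300, 430, 170, 470, 600]

# Mean = sum of the values divided by the number of values.
mean_by_hand = sum(heights_mm) / len(heights_mm)

# The standard library gives the same result.
mean_height = statistics.mean(heights_mm)
print(mean_by_hand, mean_height)
```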

Question 115

Which of the following best describes a difference between JSON and XML?

Options:

A.

JSON is quicker to read and write.

B.

JSON has to use an end tag.

C.

JSON strings are longer.

D.

JSON is much more difficult to parse.

Question 116

A sales team wants visibility of current sales numbers, pipeline, and team performance. The team would also like to see calculations of individuals’ earned commissions and projected commissions based on sales, but they want that information to be kept confidential. Which of the following would be the BEST way to provide this visibility?

Options:

A.

Create a dashboard displaying a data refresh date so users know the current sales numbers and configure permissions to control access.

B.

Create a dashboard for sales numbers, pipeline, and team and individual performance for the management team.

C.

Create a dashboard with filters for the overall team, individuals, and management. Users can filter to see the data they want.

D.

Create a dashboard with views for team, individuals, and management. Configure permissions to control access.

Question 117

A customer's telephone number is in the format 123-456-7890. Which of the following data types is used for the phone number?

Options:

A.

Boolean

B.

Date

C.

Text

D.

Number

Question 118

A data analyst is working with a data set and would like to combine two fields into a single field. Which of the following data manipulation techniques should the analyst use?

Options:

A.

Data merge

B.

Transpose

C.

Data append

D.

Concatenation
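
Combining two fields into one can be sketched in a couple of lines of Python. The record and its field names below are hypothetical.

```python
# Hypothetical record with separate first- and last-name fields.
record = {"first_name": "Ada", "last_name": "Lovelace"}

# Join the two fields into a single new field, with a space as separator.
record["full_name"] = record["first_name"] + " " + record["last_name"]
print(record["full_name"])
```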
