EMC D-DS-FN-23 Dumps

Page: 1 / 6
Total 59 questions

Dell Data Science Foundations Questions and Answers

Question 1

In addition to quantitative and technical skills, what is a key aspect of the profile of a data scientist?

Options:

A. Project management and administrative skills

B. Proficient in Microsoft Project and Excel

C. Skeptical and critical thinking

D. Accounting and regulatory skills

Question 2

What metrics are used to help calculate relevance in text analysis?

Options:

A. TF and R square

B. IDF and information gain

C. Information gain and confidence interval

D. TF and IDF
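
For readers who want to see how term frequency (TF) and inverse document frequency (IDF) are typically combined into a relevance weight, here is a minimal sketch. The function name `tf_idf`, the whitespace tokenizer, the unsmoothed logarithm, and the three sample documents are all illustrative assumptions; real libraries usually add smoothing and normalization.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document (assumes non-empty documents)."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # TF = share of the document taken by the term; IDF = log(N / df).
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical corpus: "data" occurs in two documents, "big" in only one.
docs = ["big data analytics", "data science foundations", "time series analysis"]
scores = tf_idf(docs)
```

In this toy corpus the rarer term "big" receives a higher weight in the first document than the more common term "data", which is the behavior the two metrics are meant to produce together.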

Question 3

In time series analysis, what function is examined to identify the order of the autoregressive component of an ARIMA model?

Options:

A. Logistic function

B. Lognormal distribution function

C. Partial autocorrelation function

D. Normal distribution function

Question 4

What are categorized as cluster and workflow management tools for Hadoop?

Options:

A. Flume, Sqoop, and Storm

B. Drill, Hive, and HBase

C. Spark, Tez, and Cassandra

D. Ambari, Oozie, and Zookeeper

Question 5

How should project results be communicated to executives and the project sponsor?

Options:

A. Focus on business outcomes and benefits

B. Demonstrate your technical prowess to establish credibility

C. Provide model performance visualizations

D. Emphasize coding details and technical requirements

Question 6

Refer to the exhibit, which shows pairwise counts for items purchased together.

Consider the following association rule: Milk -> Eggs

What is the value of the lift?

Options:

A. 1.18

B. 0.264

C. 120

D. 70.81
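
The exhibit with the pairwise counts is not reproduced on this page, so the question cannot be worked through from the text alone. As a hedged illustration of the general calculation, lift for a rule X -> Y is usually defined as support(X and Y) divided by support(X) times support(Y). The function name `lift` and every count below are hypothetical and are not taken from the exhibit.

```python
def lift(n_both, n_x, n_y, n_total):
    """Lift of the rule X -> Y computed from raw transaction counts."""
    support_xy = n_both / n_total   # share of transactions containing X and Y
    support_x = n_x / n_total       # share of transactions containing X
    support_y = n_y / n_total       # share of transactions containing Y
    return support_xy / (support_x * support_y)

# Hypothetical counts: 100 transactions, 50 with X, 40 with Y, 30 with both.
value = lift(n_both=30, n_x=50, n_y=40, n_total=100)  # approximately 1.5
```

A lift above 1 suggests the two items appear together more often than independence would predict; a lift near 1 suggests no association.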

Question 7

Refer to the exhibit.

To predict whether or not a customer will renew their annual property insurance policy, an insurance company built and operationalized a naïve Bayes classification model. In the model, there are two class labels, renewal and non-renewal, that are assigned to each customer based on their attributes.

A subset of the key attributes, their values, and corresponding conditional probabilities are provided in the exhibit.

A customer has the following attributes:

● Age is greater than 65 years

● Owns their own home

● Renewal month is August

If 20% of customers do not renew the policy every year, what is the score for a renewal in the naïve Bayes model for the customer described above?

Options:

A. 0.0022

B. 0.0027

C. 0.0270

D. 0.0216
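
The exhibit with the conditional probabilities is not reproduced on this page. As a sketch of how a naïve Bayes score is generally formed, the class prior is multiplied by the conditional probability of each observed attribute value given that class. Only the 0.8 renewal prior below follows from the question (20% of customers do not renew); the function name `naive_bayes_score` and the three conditional probabilities are hypothetical placeholders, not the exhibit's values.

```python
from math import prod

def naive_bayes_score(prior, conditionals):
    """Unnormalized naive Bayes score: P(class) * product of P(attribute | class)."""
    return prior * prod(conditionals)

# Prior from the question: 1 - 0.20 = 0.80 of customers renew.
renewal_prior = 0.8
# Hypothetical P(age > 65 | renewal), P(owns home | renewal), P(August | renewal).
hypothetical_conditionals = [0.3, 0.5, 0.1]

score = naive_bayes_score(renewal_prior, hypothetical_conditionals)
```

The same function would be called once per class label, and the label with the larger score would be assigned to the customer.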

Question 8

MapReduce is designed to process data in which way?

Options:

A. A few large files split into blocks processed in parallel across multiple machines

B. Many small files processed serially on one machine

C. A few large files split into blocks processed serially on one machine

D. Many small files processed in parallel across multiple machines

Question 9

You have been given the task of improving your organization's sales force compensation. As a result of a study, your team decides to classify personnel as follows:

● Did not meet quota

● Met quota

● Exceeded 150% of quota

In which data analytics lifecycle phase should you define these categories for analysis purposes?

Options:

A. Model building

B. Communicate results

C. Operationalize

D. Model planning

Question 10

When should you consider using multinomial logistic regression over binary logistic regression?

Options:

A. Dependent variable is continuous or dichotomous

B. Dependent variable is continuous or categorical

C. Dependent variable has more than two categories

D. Dependent variable is continuous only

Question 11

After which phase of the data analytics lifecycle should you determine if the model needs any recalibration?

Options:

A. Model planning

B. Data preparation

C. Discovery

D. Operationalize

Question 12

What action occurs during feature selection in the model building phase of the data analytics lifecycle?

Options:

A. Create new combinations of attributes

B. Overfit the model to improve prediction accuracy

C. Identify the most useful input variables

D. Select a superset of variables to shorten training times

Question 13

What is the similarity between the matrix and array data structures in R?

Options:

A. Both structures can contain only integers

B. Both structures can only contain one data type

C. Both structures can store multiple data types

D. Both structures must be 2-dimensional

Question 14

In K-means clustering, what is a graph of the WSS versus the value of K used to help determine?

Options:

A. Optimal distance between clusters

B. Average distance between observations

C. Optimal number of clusters

D. Average distance between clusters
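
To make the WSS-versus-K graph concrete, the sketch below computes the within-cluster sum of squares (WSS) for several values of K on a small one-dimensional data set. The function name `kmeans_wss`, the deterministic starting centroids, and the sample data are illustrative assumptions; a basic Lloyd-style loop like this can also settle in a local optimum on other data.

```python
def kmeans_wss(points, k, iterations=20):
    """Run a basic 1-D k-means and return the within-cluster sum of squares."""
    pts = sorted(points)
    # Deterministic initialization: spread the starting centroids over the sorted data.
    centroids = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in pts:
            # Assign each point to its nearest centroid.
            nearest = min(range(k), key=lambda j: (p - centroids[j]) ** 2)
            clusters[nearest].append(p)
        # Move each centroid to its cluster mean; keep the old one if the cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    # WSS: squared distance from every point to its closest centroid.
    return sum(min((p - c) ** 2 for c in centroids) for p in pts)

# Hypothetical data with three visible groups near 1, 5, and 9.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.0, 9.2, 8.8]
wss_by_k = {k: kmeans_wss(data, k) for k in range(1, 5)}
```

Plotting `wss_by_k` against K for this data shows a steep drop up to K = 3 and only a marginal improvement afterwards; that bend in the curve is commonly called the elbow.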

Question 15

Refer to the exhibit.

What is the approximate R-squared value for a linear regression model fitted to the data associated with this scatterplot?

Options:

A. 4

B. 0.96

C. 0.25

D. 16
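
The scatterplot itself is not reproduced on this page. As background, R-squared for a fitted line is one minus the ratio of the residual sum of squares to the total sum of squares, so for an ordinary least-squares fit with an intercept it lies between 0 and 1. The sketch below fits a simple linear regression by least squares and computes that ratio; the function name `r_squared` and the sample points are hypothetical and unrelated to the exhibit.

```python
def r_squared(xs, ys):
    """R-squared of an ordinary least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope and intercept for y = intercept + slope * x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical points lying close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(xs, ys)
```

Points that hug a straight line give a value close to 1, while a diffuse cloud with no linear trend gives a value close to 0.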

Question 16

When using association rules, what is an itemset?

Options:

A. Set of continuous variables that are linked

B. Set of discrete variables that are linked

C. Support

D. Confidence

Question 17

What are three built-in data types in the R programming language?

Options:

A. Boolean, integer, and character

B. Boolean, table, and character

C. Boolean, table, and integer

D. List, array, and integer
