Snowflake Real Dumps Practice Exam Questions by Dumpswrap

SnowPro Advanced: Data Scientist Certification Exam Questions and Answers

Question 1

Secure Data Sharing does not let you share which of the following objects in a database in your account with other Snowflake accounts?

Options:

Sequences

Tables

External tables

Secure UDFs

Question 2

Mark the incorrect statement regarding Python UDFs.

Options:

Python UDFs can contain both new code and calls to existing packages

For each row passed to a UDF, the UDF returns either a scalar (i.e. single) value or, if defined as a table function, a set of rows.

A UDF also gives you a way to encapsulate functionality so that you can call it repeatedly from multiple places in code

A scalar function (UDF) returns a tabular value for each input row

Question 3

You previously trained a model using a training dataset. You want to detect any data drift in the new data collected since the model was trained.

What should you do?

Options:

Create a new dataset using the new data and a timestamp column and create a data drift monitor that uses the training dataset as a baseline and the new dataset as a target.

Create a new version of the dataset using only the new data and retrain the model.

Add the new data to the existing dataset and enable Application Insights for the service where the model is deployed.

Retrain the model on the training dataset after correcting data outliers; there is no need to introduce new data.

Question 4

All Snowpark ML modeling and preprocessing classes are in the ________ namespace?

Options:

snowpark.ml.modeling

snowflake.sklearn.modeling

snowflake.scikit.modeling

snowflake.ml.modeling

Question 5

What can a Snowflake data scientist do in the Snowflake Marketplace as a provider?

Options:

Publish listings for free-to-use datasets to generate interest and new opportunities among the Snowflake customer base.

Publish listings for datasets that can be customized for the consumer.

Share live datasets securely and in real-time without creating copies of the data or imposing data integration tasks on the consumer.

Eliminate the costs of building and maintaining APIs and data pipelines to deliver data to customers.

Question 6

Which of the following metrics are used to evaluate classification models?

Options:

Area under the ROC curve

F1 score

Confusion matrix

All of the above

Answer:

Explanation:

Evaluation metrics are tied to machine learning tasks, and classification and regression each have their own metrics. Some metrics, like precision-recall, are useful for multiple tasks. Classification and regression are examples of supervised learning, which constitutes a majority of machine learning applications. Evaluating a model with several metrics lets us improve its overall predictive power before we roll it out for production on unseen data. Relying on accuracy alone, without a proper evaluation using different metrics, can cause problems when the model is deployed on unseen data and may result in poor predictions.

Classification metrics are evaluation measures used to assess the performance of a classification model. Common metrics include accuracy (proportion of correct predictions), precision (true positives over total predicted positives), recall (true positives over total actual positives), F1 score (harmonic mean of precision and recall), and area under the receiver operating characteristic curve (AUC-ROC).

Confusion Matrix

A confusion matrix is a performance measurement for machine learning classification problems where the output can be two or more classes. It is a table with combinations of predicted and actual values.

It is extremely useful for computing recall, precision, and accuracy, and for building AUC-ROC curves.

The four commonly used metrics for evaluating classifier performance are:

1. Accuracy: The proportion of correct predictions out of the total predictions.

2. Precision: The proportion of true positive predictions out of the total positive predictions (precision = true positives / (true positives + false positives)).

3. Recall (Sensitivity or True Positive Rate): The proportion of true positive predictions out of the total actual positive instances (recall = true positives / (true positives + false negatives)).

4. F1 Score: The harmonic mean of precision and recall, providing a balance between the two metrics (F1 score = 2 * ((precision * recall) / (precision + recall))).

These metrics help assess the classifier’s effectiveness in correctly classifying instances of different classes.
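The four formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the label lists `y_true` and `y_pred` and the helper names `confusion_counts` and `classification_metrics` are hypothetical, and the sketch assumes a binary problem where 1 marks the positive class.

```python
# Hypothetical labels for illustration only (1 = positive class, 0 = negative class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]


def confusion_counts(actual, predicted):
    """Count the four cells of a binary confusion matrix."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    return tp, tn, fp, fn


def classification_metrics(actual, predicted):
    """Apply the accuracy, precision, recall, and F1 formulas from the text."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = classification_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))
print(metrics)
```

With these sample labels the confusion matrix holds 4 true positives, 4 true negatives, 1 false positive, and 1 false negative, so all four metrics come out at roughly 0.8.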

Understanding how well a machine learning model will perform on unseen data is the main purpose behind working with these evaluation metrics. Metrics like accuracy, precision, and recall are good ways to evaluate classification models on balanced datasets, but if the data is imbalanced, other methods like ROC/AUC perform better in evaluating model performance.

A ROC curve is not a single number but a whole curve that provides nuanced details about the behavior of the classifier. This also makes it hard to quickly compare many ROC curves to each other.
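To make the difference between the curve and a single summary number concrete, here is a minimal sketch that builds the ROC points by sweeping a decision threshold and then reduces them to one AUC value with the trapezoidal rule. The scores, labels, and the helper names `roc_points` and `auc` are hypothetical, and the sketch assumes the data contains at least one positive and one negative example.

```python
# Hypothetical true labels and classifier scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]


def roc_points(actual, predicted_scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(predicted_scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, predicted_scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, predicted_scores) if s >= threshold and a == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve via the trapezoidal rule over consecutive points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


curve = roc_points(labels, scores)
print(curve)       # the whole curve: one point per threshold
print(auc(curve))  # a single number summarizing the curve
```

The list of points carries the per-threshold behavior of the classifier, while the AUC collapses it to one value that is easier to compare across models.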

Question 7

Which of the following is not a type of window function in Snowflake?

Options:

Rank-related functions.

Window frame functions.

Aggregation window functions.

Association functions.

Question 8

What is the formula for measuring skewness in a dataset?

Options:

MEAN - MEDIAN

MODE - MEDIAN

(3(MEAN - MEDIAN))/ STANDARD DEVIATION

(MEAN - MODE)/ STANDARD DEVIATION

Question 9

Which Python method can a data scientist use to remove duplicates?

Options:

remove_duplicates()

duplicates()

drop_duplicates()

clean_duplicates()

Question 10

To return the contents of a DataFrame as a pandas DataFrame, which of the following methods can be used in the Snowpark API?

Options:

REPLACE_TO_PANDAS

SNOWPARK_TO_PANDAS

CONVERT_TO_PANDAS

TO_PANDAS

Question 11

Which of the following types of visualization are used for data exploration in data science?

Options:

Heat Maps

Newton AI

Feature Distribution by Class

2D-Density Plots

Sand Visualization

Question 12

There are a couple of different types of classification tasks in machine learning. Choose the classification type that best categorizes the application tasks below.

· To detect whether email is spam or not

· To determine whether or not a patient has a certain disease in medicine.

· To determine whether or not quality specifications were met when it comes to QA (Quality Assurance).

Options:

Multi-Label Classification

Multi-Class Classification

Binary Classification

Logistic Regression

Answer:

Explanation:

Supervised machine learning algorithms can be broadly classified into regression and classification algorithms. Regression algorithms predict continuous values; to predict categorical values, we need classification algorithms.

What is the Classification Algorithm?

The classification algorithm is a supervised learning technique used to identify the category of new observations on the basis of training data. In classification, a program learns from the given dataset or observations and then classifies new observations into a number of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog. Classes can also be called targets, labels, or categories.

Unlike regression, the output variable of classification is a category, not a value, such as "Green or Blue" or "fruit or animal". Because classification is a supervised learning technique, it takes labeled input data, meaning each input comes with its corresponding output.

In a classification algorithm, the input variable (x) is mapped to a discrete output (y).

y=f(x), where y = categorical output

A well-known example of an ML classification algorithm is an email spam detector.

The main goal of the Classification algorithm is to identify the category of a given dataset, and these algorithms are mainly used to predict the output for the categorical data.

The algorithm that implements classification on a dataset is known as a classifier. There are two types of classifiers:

Binary Classifier: If the classification problem has only two possible outcomes, it is called a binary classifier.

Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.

Multi-class Classifier: If a classification problem has more than two outcomes, it is called a multi-class classifier.

Examples: classification of types of crops, classification of types of music.
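The distinction between the two classifier types comes down to how many distinct outcomes the target has. As a minimal sketch, the hypothetical helper `classification_type` below inspects a list of labels and names the kind of problem; the sample label lists are made up for illustration.

```python
def classification_type(labels):
    """Name the classification problem based on the number of distinct outcomes."""
    distinct = set(labels)
    if len(distinct) < 2:
        raise ValueError("a classification target needs at least two distinct classes")
    # Exactly two outcomes is binary; anything more is multi-class.
    return "binary" if len(distinct) == 2 else "multi-class"


# Hypothetical targets for illustration only.
spam_labels = ["spam", "not spam", "spam", "not spam"]
crop_labels = ["wheat", "rice", "maize", "rice"]

print(classification_type(spam_labels))
print(classification_type(crop_labels))
```

The spam labels have two outcomes and the crop labels have three, matching the binary and multi-class examples given above.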

Binary classification in machine learning refers to the type of classification where there are two class labels, one normal and one abnormal. Some examples of binary classification use:

· To detect whether email is spam or not

· To determine whether or not a patient has a certain disease in medicine.

· To determine whether or not quality specifications were met when it comes to QA (Quality Assurance).

For example, the normal class label would be that a patient has the disease, and the abnormal class label would be that they do not, or vice-versa.

As with every other type of classification, a binary classifier is only as good as its dataset; in other words, the more training data it has, the better it performs.

Question 13

Which type of Python UDFs let you define Python functions that receive batches of input rows as Pandas DataFrames and return batches of results as Pandas arrays or Series?

Options:

MPP Python UDFs

Scalar Python UDFs

Vectorized Python UDFs

Hybrid Python UDFs

Question 14

Which type of machine learning do data scientists generally use for solving classification and regression problems?

Options:

Supervised

Unsupervised

Reinforcement Learning

Instructor Learning

Regression Learning

Question 15

Which of the following is a Python-based web application framework for visualizing data and analyzing results in a more efficient and flexible way?

Options:

StreamBI

Streamlit

Streamsets

Rapter

Question 16

Consider a data frame df with columns ['A', 'B', 'C', 'D'] and rows ['r1', 'r2', 'r3']. What does the expression df[lambda x : x.index.str.endswith('3')] do?

Options:

Returns the row name r3

Results in Error

Returns the third column

Filters the row labelled r3
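The expression in this question can be tried directly in pandas. The sketch below is a minimal reproduction, assuming pandas is installed; the cell values are made up, since the question only specifies the row and column labels. When a callable is passed inside `[]`, pandas calls it with the data frame and uses the returned boolean array as a row mask.

```python
import pandas as pd

# Hypothetical cell values; only the labels come from the question.
df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    index=["r1", "r2", "r3"],
    columns=["A", "B", "C", "D"],
)

# The callable receives df and returns one boolean per row label.
mask = df.index.str.endswith("3")
result = df[lambda x: x.index.str.endswith("3")]

print(mask.tolist())
print(result)
```

The result is still a data frame with all four columns; only the rows whose label ends in '3' are kept.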

Question 17

Which of the following additional metadata columns does a stream contain that can be used to build efficient data science pipelines that transform only new or modified data?

Options:

METADATA$ACTION

METADATA$FILE_ID

METADATA$ISUPDATE

METADATA$DELETE

METADATA$ROW_ID

Question 18

Which learning methodology applies the conditional probability of all the variables with respect to the dependent variable?

Options:

Reinforcement learning

Unsupervised learning

Artificial learning

Supervised learning

Question 19

Which one is not a type of feature scaling?

Options:

Economy Scaling

Min-Max Scaling

Standard Scaling

Robust Scaling


Snowflake DSA-C02 Dumps
