
Databricks Databricks-Machine-Learning-Professional Dumps

Databricks Certified Machine Learning Professional Questions and Answers

Question 1

A data scientist would like to enable MLflow Autologging for all machine learning libraries used in a notebook. They want to ensure that MLflow Autologging is used no matter what version of the Databricks Runtime for Machine Learning is used to run the notebook and no matter what workspace-wide configurations are selected in the Admin Console.

Which of the following lines of code can they use to accomplish this task?

Options:

A.

mlflow.sklearn.autolog()

B.

mlflow.spark.autolog()

C.

spark.conf.set("autologging", True)

D.

It is not possible to automatically log MLflow runs.

E.

mlflow.autolog()

Question 2

A machine learning engineer wants to view all of the active MLflow Model Registry Webhooks for a specific model.

They are using the following code block:

[Code block image not preserved in the source]

Which of the following changes does the machine learning engineer need to make to this code block so it will successfully accomplish the task?

Options:

A.

There are no necessary changes

B.

Replace list with view in the endpoint URL

C.

Replace POST with GET in the call to http request

D.

Replace list with webhooks in the endpoint URL

E.

Replace POST with PUT in the call to http request

Question 3

A machine learning engineer wants to log and deploy a model as an MLflow pyfunc model. They have custom preprocessing that needs to be completed on feature variables prior to fitting the model or computing predictions using that model. They decide to wrap this preprocessing in a custom model class ModelWithPreprocess, where the preprocessing is performed when calling fit and when calling predict. They then log the fitted model of the ModelWithPreprocess class as a pyfunc model.

Which of the following is a benefit of this approach when loading the logged pyfunc model for downstream deployment?

Options:

A.

The pyfunc model can be used to deploy models in a parallelizable fashion

B.

The same preprocessing logic will automatically be applied when calling fit

C.

The same preprocessing logic will automatically be applied when calling predict

D.

This approach has no impact when loading the logged pyfunc model for downstream deployment

E.

There is no longer a need for pipeline-like machine learning objects
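The wrapper class described in this question can be pictured with a minimal sketch. It uses plain Python only and does not call MLflow; the scaling step, the toy "model", and the attribute name ratio_ are hypothetical stand-ins chosen for illustration, not the exam's actual code.

```python
from statistics import mean


class ModelWithPreprocess:
    """Toy stand-in for the wrapper class described in the question."""

    def _preprocess(self, rows):
        # Hypothetical preprocessing: divide every feature value by 10.
        return [[value / 10 for value in row] for row in rows]

    def fit(self, rows, labels):
        # Preprocessing is applied before fitting.
        features = self._preprocess(rows)
        # Hypothetical "model": ratio of the mean label to the mean row sum.
        row_sums = [sum(row) for row in features]
        self.ratio_ = mean(labels) / mean(row_sums)
        return self

    def predict(self, rows):
        # The same preprocessing is applied before scoring.
        features = self._preprocess(rows)
        return [self.ratio_ * sum(row) for row in features]


model = ModelWithPreprocess().fit([[10, 20], [30, 40]], [3.0, 7.0])
predictions = model.predict([[10, 20]])
```

Because the preprocessing lives inside the class, callers pass raw feature rows to both fit and predict.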

Question 4

Which of the following is a probable response to identifying drift in a machine learning application?

Options:

A.

None of these responses

B.

Retraining and deploying a model on more recent data

C.

All of these responses

D.

Rebuilding the machine learning application with a new label variable

E.

Sunsetting the machine learning application

Question 5

A data scientist has developed a model to predict ice cream sales using the expected temperature and expected number of hours of sun in the day. However, the expected temperature is dropping beneath the range of the input variable on which the model was trained.

Which of the following types of drift is present in the above scenario?

Options:

A.

Label drift

B.

None of these

C.

Concept drift

D.

Prediction drift

E.

Feature drift
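The situation in this question, where an input falls outside the range seen during training, can be checked with a simple range comparison. The sketch below is an illustration only; the function name and the temperature values are hypothetical and do not come from the exam.

```python
def fraction_outside_training_range(training_values, incoming_values):
    """Return the share of incoming values outside the training min/max."""
    low, high = min(training_values), max(training_values)
    outside = [v for v in incoming_values if v < low or v > high]
    return len(outside) / len(incoming_values)


# Hypothetical expected temperatures in degrees Celsius.
training_temperatures = [18, 21, 25, 30, 33]
incoming_temperatures = [20, 15, 12, 9]

share = fraction_outside_training_range(
    training_temperatures, incoming_temperatures
)
```

A share well above zero suggests the model is being asked to score inputs it never saw while training.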

Question 6

Which of the following MLflow operations can be used to delete a model from the MLflow Model Registry?

Options:

A.

client.transition_model_version_stage

B.

client.delete_model_version

C.

client.update_registered_model

D.

client.delete_model

E.

client.delete_registered_model

Question 7

Which of the following MLflow operations can be used to automatically calculate and log a Shapley feature importance plot?

Options:

A.

mlflow.shap.log_explanation

B.

None of these operations can accomplish the task.

C.

mlflow.shap

D.

mlflow.log_figure

E.

client.log_artifact

Question 8

A data scientist has computed updated feature values for all primary key values stored in the Feature Store table features. In addition, feature values for some new primary key values have also been computed. The updated feature values are stored in the DataFrame features_df. They want to replace all data in features with the newly computed data.

Which of the following code blocks can they use to perform this task using the Feature Store Client fs?

[Code blocks A) through E) were images in the source and are not preserved]

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E

Question 9

Which of the following is an advantage of using the python_function (pyfunc) model flavor over the built-in library-specific model flavors?

Options:

A.

python_function provides no benefits over the built-in library-specific model flavors

B.

python_function can be used to deploy models in a parallelizable fashion

C.

python_function can be used to deploy models without worrying about which library was used to create the model

D.

python_function can be used to store models in an MLmodel file

E.

python_function can be used to deploy models without worrying about whether they are deployed in batch, streaming, or real-time environments

Question 10

A machine learning engineer needs to select a deployment strategy for a new machine learning application. The feature values are not available until the time of delivery, and results are needed exceedingly fast for one record at a time.

Which of the following deployment strategies can be used to meet these requirements?

Options:

A.

Edge/on-device

B.

Streaming

C.

None of these strategies will meet the requirements.

D.

Batch

E.

Real-time

Question 11

Which of the following is a simple statistic to monitor for categorical feature drift?

Options:

A.

Mode

B.

None of these

C.

Mode, number of unique values, and percentage of missing values

D.

Percentage of missing values

E.

Number of unique values
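The statistics named in the options above can each be computed for a categorical column in a few lines. This is a minimal sketch in plain Python, assuming None marks a missing value; the function name and the sample data are hypothetical.

```python
from collections import Counter


def categorical_summary(values):
    """Summarize a categorical column; None marks a missing value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "mode": counts.most_common(1)[0][0] if counts else None,
        "unique_values": len(counts),
        "pct_missing": 100.0 * (len(values) - len(present)) / len(values),
    }


# Hypothetical snapshots of the same feature at two points in time.
baseline = categorical_summary(["red", "blue", "red", "green", None])
current = categorical_summary(["blue", "blue", None, None, "red"])
```

Comparing the baseline and current summaries over successive time windows is one simple way to watch a categorical feature for change.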

Question 12

A machine learning engineer is manually refreshing a model in an existing machine learning pipeline. The pipeline uses the MLflow Model Registry model "project". The machine learning engineer would like to add a new version of the model to "project".

Which of the following MLflow operations can the machine learning engineer use to accomplish this task?

Options:

A.

mlflow.register_model

B.

MlflowClient.update_registered_model

C.

mlflow.add_model_version

D.

MlflowClient.get_model_version

E.

The machine learning engineer needs to create an entirely new MLflow Model Registry model

Question 13

A data scientist is utilizing MLflow to track their machine learning experiments. After completing a series of runs for the experiment with experiment ID exp_id, the data scientist wants to programmatically work with the experiment run data in a Spark DataFrame. They have an active MLflow Client client and an active Spark session spark.

Which of the following lines of code can be used to obtain run-level results for exp_id in a Spark DataFrame?

Options:

A.

client.list_run_infos(exp_id)

B.

spark.read.format("delta").load(exp_id)

C.

There is no way to programmatically return row-level results from an MLflow Experiment.

D.

mlflow.search_runs(exp_id)

E.

spark.read.format("mlflow-experiment").load(exp_id)

Question 14

A machine learning engineer is migrating a machine learning pipeline to use Databricks Machine Learning. They have programmatically identified the best run from an MLflow Experiment and stored its URI in the model_uri variable and its Run ID in the run_id variable. They have also determined that the model was logged with the name "model". Now, the machine learning engineer wants to register that model in the MLflow Model Registry with the name "best_model".

Which of the following lines of code can they use to register the model to the MLflow Model Registry?

Options:

A.

mlflow.register_model(model_uri, "best_model")

B.

mlflow.register_model(run_id, "best_model")

C.

mlflow.register_model(f"runs:/{run_id}/best_model", "model")

D.

mlflow.register_model(model_uri, "model")

E.

mlflow.register_model(f"runs:/{run_id}/model")

Question 15

A machine learning engineer is in the process of implementing a concept drift monitoring solution. They are planning to use the following steps:

1. Deploy a model to production and compute predicted values

2. Obtain the observed (actual) label values

3. _____

4. Run a statistical test to determine if there are changes over time

Which of the following should be completed as Step #3?

Options:

A.

Obtain the observed (actual) feature values

B.

Measure the latency of the prediction time

C.

Retrain the model

D.

None of these should be completed as Step #3

E.

Compute the evaluation metric using the observed and predicted values

Question 16

A data scientist wants to remove the star_rating column from the Delta table at the location path. To do this, they need to load in data and drop the star_rating column.

Which of the following code blocks accomplishes this task?

Options:

A.

spark.read.format("delta").load(path).drop("star_rating")

B.

spark.read.format("delta").table(path).drop("star_rating")

C.

Delta tables cannot be modified

D.

spark.read.table(path).drop("star_rating")

E.

spark.sql("SELECT * EXCEPT star_rating FROM path")

Question 17

Which of the following tools can assist in real-time deployments by packaging software with its own application, tools, and libraries?

Options:

A.

Cloud-based compute

B.

None of these tools

C.

REST APIs

D.

Containers

E.

Autoscaling clusters

Question 18

Which of the following describes concept drift?

Options:

A.

Concept drift is when there is a change in the distribution of an input variable

B.

Concept drift is when there is a change in the distribution of a target variable

C.

Concept drift is when there is a change in the relationship between input variables and target variables

D.

Concept drift is when there is a change in the distribution of the predicted target given by the model

E.

None of these describe Concept drift

Total 60 questions