Weekend Biggest Discount Flat 70% Offer - Ends in 0d 00h 00m 00s - Coupon code: 70diswrap

Microsoft DP-203 Dumps

Page: 1 / 36
Total 355 questions

Data Engineering on Microsoft Azure Questions and Answers

Question 1

You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

as

Options:

Question 2

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

Options:

A.

change feed

B.

soft delete

C.

time-based retention

D.

lifecycle management

Question 3

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

Options:

A.

Azure-SSIS integration runtime

B.

self-hosted integration runtime

C.

Azure integration runtime

Question 4

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area

NOTE: Each correct selection b worth one point.

as

Options:

Question 5

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.

Which three Transaction-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

as

Options:

Question 6

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction

dataset requirements.

What should you create?

Options:

A.

a table that has an IDENTITY property

B.

a system-versioned temporal table

C.

a user-defined SEQUENCE object

D.

a table that has a FOREIGN KEY constraint

Question 7

You need to design the partitions for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

as

Options:

Question 8

You need to implement an Azure Synapse Analytics database object for storing the sales transactions data. The solution must meet the sales transaction dataset requirements.

What solution must meet the sales transaction dataset requirements.

What should you do? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

as

Options:

Question 9

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

as

Options:

Question 10

You need to design a data storage structure for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

as

Options:

Question 11

You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns.

as

FactPurchase will have 1 million rows of data added daily and will contain three years of data.

Transact-SQL queries similar to the following query will be executed daily.

SELECT

SupplierKey, StockItemKey, COUNT(*)

FROM FactPurchase

WHERE DateKey >= 20210101

AND DateKey <= 20210131

GROUP By SupplierKey, StockItemKey

Which table distribution will minimize query times?

Options:

A.

round-robin

B.

replicated

C.

hash-distributed on DateKey

D.

hash-distributed on PurchaseKey

Question 12

You have a Microsoft Purview account. The Lineage view of a CSV file is shown in the following exhibit.

as

How is the data for the lineage populated?

Options:

A.

manually

B.

by scanning data stores

C.

by executing a Data Factory pipeline

Question 13

You need to implement an Azure Databricks cluster that automatically connects to Azure Data lake Storage Gen2 by using Azure Active Directory (Azure AD) integration. How should you configure the new clutter? To answer, select the appropriate options in the answers area. NOTE: Each correct selection is worth one point.

as

Options:

Question 14

You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day.

You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times.

What should you include in the solution?

Options:

A.

Partition by DateTime fields.

B.

Sink to Azure Queue storage.

C.

Include a watermark column.

D.

Use a JSON format for physical data storage.

Question 15

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data.

You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a hopping window that uses a hop size of 5 seconds and a window size 10 seconds.

Does this meet the goal?

Options:

A.

Yes

B.

No

Question 16

You have an Azure Data Factory pipeline named pipeline1 that includes a Copy activity named Copy1. Copy1 has the following configurations:

• The source of Copy1 is a table in an on-premises Microsoft SQL Server instance that is accessed by using a linked service connected via a self-hosted integration runtime.

• The sink of Copy1 uses a table in an Azure SQL database that is accessed by using a linked service connected via an Azure integration runtime.

You need to maximize the amount of compute resources available to Copy1. The solution must minimize administrative effort.

What should you do?

Options:

A.

Scale up the data flow runtime of the Azure integration runtime.

B.

Scale up the data flow runtime of the Azure integration runtime and scale out the self-hosted integration runtime.

C.

Scale out the self-hosted integration runtime.

Question 17

You have an Azure Data Factory pipeline shown the following exhibit.

as

The execution log for the first pipeline run is shown in the following exhibit.

as

The execution log for the second pipeline run is shown in the following exhibit.

as

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

as

Options:

Question 18

You have an Azure Factory instance named DF1 that contains a pipeline named PL1.PL1 includes a tumbling window trigger.

You create five clones of PL1. You configure each clone pipeline to use a different data source.

You need to ensure that the execution schedules of the clone pipeline match the execution schedule of PL1.

What should you do?

Options:

A.

Add a new trigger to each cloned pipeline

B.

Associate each cloned pipeline to an existing trigger.

C.

Create a tumbling window trigger dependency for the trigger of PL1.

D.

Modify the Concurrency setting of each pipeline.

Question 19

You have an Azure subscription that contains the resources shown in the following table.

as

You need to read the TSV files by using ad-hoc queries and the openrowset function The solution must assign a name and override the inferred data type of each column.

What should you include in the openrowset function?

Options:

A.

the with clause

B.

the rowsetoptions bulk option

C.

the datafiletype bulk option

D.

the DATA_source parameter

Question 20

You need to create a partitioned table in an Azure Synapse Analytics dedicated SQL pool.

How should you complete the Transact-SQL statement? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

as

Options:

Question 21

You have an Azure Synapse Analytics dedicated SQL pool.

You need to monitor the database for long-running queries and identify which queries are waiting on resources

Which dynamic manage ment view should you use for each requirement? To answer, select the appropriate options in the answer area.

NOTE; Each correct answer is worth one point.

as

Options:

Question 22

You are designing an Azure Data Lake Storage solution that will transform raw JSON files for use in an analytical workload.

You need to recommend a format for the transformed files. The solution must meet the following requirements:

Contain information about the data types of each column in the files.

Support querying a subset of columns in the files.

Support read-heavy analytical workloads.

Minimize the file size.

What should you recommend?

Options:

A.

JSON

B.

CSV

C.

Apache Avro

D.

Apache Parquet

Question 23

You have an Azure data factory.

You need to examine the pipeline failures from the last 180 flays.

What should you use?

Options:

A.

the Activity tog blade for the Data Factory resource

B.

Azure Data Factory activity runs in Azure Monitor

C.

Pipeline runs in the Azure Data Factory user experience

D.

the Resource health blade for the Data Factory resource

Question 24

You are planning a solution to aggregate streaming data that originates in Apache Kafka and is output to Azure Data Lake Storage Gen2. The developers who will implement the stream processing solution use Java, Which service should you recommend using to process the streaming data?

Options:

A.

Azure Data Factory

B.

Azure Stream Analytics

C.

Azure Databricks

D.

Azure Event Hubs

Question 25

You are developing an Azure Synapse Analytics pipeline that will include a mapping data flow named Dataflow1. Dataflow1 will read customer data from an external source and use a Type 1 slowly changing dimension (SCO) when loading the data into a table named DimCustomer1 in an Azure Synapse Analytics dedicated SQL pool.

You need to ensure that Dataflow1 can perform the following tasks:

* Detect whether the data of a given customer has changed in the DimCustomer table.

• Perform an upsert to the DimCustomer table.

Which type of transformation should you use for each task? To answer, select the appropriate options in the answer area

NOTE; Each correct selection is worth one point.

as

Options:

Question 26

You have an Azure Synapse Analytics workspace.

You plan to deploy a lake database by using a database template in Azure Synapse.

Which two elements ate included in the template? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point

Options:

A.

relationships

B.

table definitions

C.

table permissions

D.

linked services

E.

data formats

Question 27

You are implementing an Azure Stream Analytics solution to process event data from devices.

The devices output events when there is a fault and emit a repeat of the event every five seconds until the fault is resolved. The devices output a heartbeat event every five seconds after a previous event if there are no faults present.

A sample of the events is shown in the following table.

as

You need to calculate the uptime between the faults.

How should you complete the Stream Analytics SQL query? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

as

Options:

Question 28

You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool. You plan to deploy a solution that will analyze sales data and include the following:

• A table named Country that will contain 195 rows

• A table named Sales that will contain 100 million rows

• A query to identify total sales by country and customer from the past 30 days

You need to create the tables. The solution must maximize query performance.

How should you complete the script? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

as

Options:

Question 29

You are implementing a batch dataset in the Parquet format.

Data tiles will be produced by using Azure Data Factory and stored in Azure Data Lake Storage Gen2. The files will be consumed by an Azure Synapse Analytics serverless SQL pool.

You need to minimize storage costs for the solution.

What should you do?

Options:

A.

Store all the data as strings in the Parquet tiles.

B.

Use OPENROWEST to query the Parquet files.

C.

Create an external table mat contains a subset of columns from the Parquet files.

D.

Use Snappy compression for the files.

Question 30

You have an Azure event hub named retailhub that has 16 partitions. Transactions are posted to retailhub. Each transaction includes the transaction ID, the individual line items, and the payment details. The transaction ID is used as the partition key.

You are designing an Azure Stream Analytics job to identify potentially fraudulent transactions at a retail store. The job will use retailhub as the input. The job will output the transaction ID, the individual line items, the payment details, a fraud score, and a fraud indicator.

You plan to send the output to an Azure event hub named fraudhub.

You need to ensure that the fraud detection solution is highly scalable and processes transactions as quickly as possible.

How should you structure the output of the Stream Analytics job? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

as

Options:

Question 31

You plan to monitor an Azure data factory by using the Monitor & Manage app.

You need to identify the status and duration of activities that reference a table in a source database.

Which three actions should you perform in sequence? To answer, move the actions from the list of actions to the answer are and arrange them in the correct order.

as

Options:

Question 32

You have an Azure subscription that contains an Azure SQL database named SQLDB1 and an Azure Synapse Analytics dedicated SQL pool named Pool1.

You need to replicate data from SQLDB1 to Pool1. The solution must meet the following requirements:

• Minimize performance impact on SQLDB1.

• Support near-real-time (NRT) analytics.

• Minimize administrative effort.

What should you use?

Options:

A.

Azure Synapse Link

B.

Azure Private Link

C.

SQL Data Sync for Azure

D.

transactional replication in Microsoft SQL Server

Question 33

You are batch loading a table in an Azure Synapse Analytics dedicated SQL pool.

You need to load data from a staging table to the target table. The solution must ensure that if an error occurs while loading the data to the target table, all the inserts in that batch are undone.

How should you complete the Transact-SQL code? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE Each correct selection is worth one point.

as

Options:

Question 34

You configure monitoring for a Microsoft Azure SQL Data Warehouse implementation. The implementation uses PolyBase to load data from comma-separated value (CSV) files stored in Azure Data Lake Gen 2 using an external table.

Files with an invalid schema cause errors to occur.

You need to monitor for an invalid schema error.

For which error should you monitor?

Options:

A.

EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error[com.microsoft.polybase.client.KerberosSecureLogin] occurred while accessingexternal files.'

B.

EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [No FileSystem for scheme: wasbs] occurred while accessing external file.'

C.

Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11": for linked server "(null)", Query aborted- the maximum reject threshold (orows) was reached while regarding from an external source: 1 rows rejected out of total 1 rows processed.

D.

EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [Unable to instantiate LoginClass] occurredwhile accessing external files.'

Question 35

You are designing a highly available Azure Data Lake Storage solution that will include geo-zone-redundant storage (GZRS).

You need to monitor for replication delays that can affect the recovery point objective (RPO).

What should you include in the monitoring solution?

Options:

A.

availability

B.

Average Success E2E Latency

C.

5xx: Server Error errors

D.

Last Sync Time

Question 36

You have a SQL pool in Azure Synapse that contains a table named dbo.Customers. The table contains a column name Email.

You need to prevent nonadministrative users from seeing the full email addresses in the Email column. The users must see values in a format of aXXX@XXXX.com instead.

What should you do?

Options:

A.

From Microsoft SQL Server Management Studio, set an email mask on the Email column.

B.

From the Azure portal, set a mask on the Email column.

C.

From Microsoft SQL Server Management studio, grant the SELECT permission to the users for all the columns in the dbo.Customers table except Email.

D.

From the Azure portal, set a sensitivity classification of Confidential for the Email column.

Question 37

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

as

Options:

Question 38

What should you do to improve high availability of the real-time data processing solution?

Options:

A.

Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B.

Deploy a High Concurrency Databricks cluster.

C.

Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D.

Set Data Lake Storage to use geo-redundant storage (GRS).

Question 39

What should you recommend using to secure sensitive customer contact information?

Options:

A.

data labels

B.

column-level security

C.

row-level security

D.

Transparent Data Encryption (TDE)

Question 40

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

Options:

A.

a server-level virtual network rule

B.

a database-level virtual network rule

C.

a database-level firewall IP rule

D.

a server-level firewall IP rule

Page: 1 / 36
Total 355 questions