Databricks Real Dumps Practice Exam Questions by Dumpswrap

Databricks Certified Generative AI Engineer Associate Questions and Answers

Question 1

A Generative AI Engineer has created a RAG application which can help employees retrieve answers from an internal knowledge base, such as Confluence pages or Google Drive. The prototype application is now working with some positive feedback from internal company testers. Now the Generative AI Engineer wants to formally evaluate the system’s performance and understand where to focus their efforts to further improve the system.

How should the Generative AI Engineer evaluate the system?

Options:

Use cosine similarity score to comprehensively evaluate the quality of the final generated answers.

Curate a dataset that can test the retrieval and generation components of the system separately. Use MLflow’s built-in evaluation metrics to perform the evaluation on the retrieval and generation components.

Benchmark multiple LLMs with the same data and pick the best LLM for the job.

Use an LLM-as-a-judge to evaluate the quality of the final answers generated.

Question 2

A Generative AI Engineer is designing a RAG application for answering user questions on technical regulations as they learn a new sport.

What are the steps needed to build this RAG application and deploy it?

Options:

Ingest documents from a source –> Index the documents and save to Vector Search –> User submits queries against an LLM –> LLM retrieves relevant documents –> Evaluate model –> LLM generates a response –> Deploy it using Model Serving

Ingest documents from a source –> Index the documents and save to Vector Search –> User submits queries against an LLM –> LLM retrieves relevant documents –> LLM generates a response –> Evaluate model –> Deploy it using Model Serving

Ingest documents from a source –> Index the documents and save to Vector Search –> Evaluate model –> Deploy it using Model Serving

User submits queries against an LLM –> Ingest documents from a source –> Index the documents and save to Vector Search –> LLM retrieves relevant documents –> LLM generates a response –> Evaluate model –> Deploy it using Model Serving

Question 3

A Generative AI Engineer is helping a cinema extend its website's chatbot so that it can respond to questions about specific showtimes for movies currently playing at the user's local theater. The agent already receives the user's location from location services, and a Delta table is continually updated with the latest showtime information by location. They want to implement this new capability in their RAG application.

Which option will do this with the least effort and in the most performant way?

Options:

Create a Feature Serving Endpoint from a FeatureSpec that references an online store synced from the Delta table. Query the Feature Serving Endpoint as part of the agent logic / tool implementation.

Query the Delta table directly via a SQL query constructed from the user's input using a text-to-SQL LLM in the agent logic / tool implementation.

Write the Delta table contents to a text column, then embed those texts using an embedding model and store these in the vector index. Look up the information based on the embedding as part of the agent logic / tool implementation.

Set up a task in Databricks Workflows to write the information in the Delta table periodically to an external database such as MySQL and query the information from there as part of the agent logic / tool implementation.

Answer: A

Explanation:

The task is to extend a cinema chatbot to provide movie showtime information using a RAG application, leveraging user location and a continuously updated Delta table, with minimal effort and high performance. Let’s evaluate the options.

Option A: Create a Feature Serving Endpoint from a FeatureSpec that references an online store synced from the Delta table. Query the Feature Serving Endpoint as part of the agent logic / tool implementation

Databricks Feature Serving provides low-latency access to real-time data from Delta tables via an online store. Syncing the Delta table to a Feature Serving Endpoint allows the chatbot to query showtimes efficiently, integrating seamlessly into the RAG agent’s tool logic. This leverages Databricks’ native infrastructure, minimizing effort and ensuring performance.

Databricks Reference: "Feature Serving Endpoints provide real-time access to Delta table data with low latency, ideal for production systems" ("Databricks Feature Engineering Guide," 2023).

Option B: Query the Delta table directly via a SQL query constructed from the user's input using a text-to-SQL LLM in the agent logic / tool

Using a text-to-SQL LLM to generate queries adds complexity (e.g., ensuring accurate SQL generation) and latency (LLM inference + SQL execution). While feasible, it’s less performant and requires more effort than a pre-built serving solution.

Databricks Reference: "Direct SQL queries are flexible but may introduce overhead in real-time applications" ("Building LLM Applications with Databricks").

Option C: Write the Delta table contents to a text column, then embed those texts using an embedding model and store these in the vector index. Look up the information based on the embedding as part of the agent logic / tool implementation

Converting structured Delta table data (e.g., showtimes) into text, embedding it, and using vector search is inefficient for structured lookups. It’s effort-intensive (preprocessing, embedding) and less precise than direct queries, undermining performance.

Databricks Reference: "Vector search excels for unstructured data, not structured tabular lookups" ("Databricks Vector Search Documentation").

Option D: Set up a task in Databricks Workflows to write the information in the Delta table periodically to an external database such as MySQL and query the information from there as part of the agent logic / tool implementation

Exporting to an external database (e.g., MySQL) adds setup effort (workflow, external DB management) and latency (periodic updates vs. real-time). It’s less performant and more complex than using Databricks’ native tools.

Databricks Reference: "Avoid external systems when Delta tables provide real-time data natively" ("Databricks Workflows Guide").

Conclusion: Option A minimizes effort by using Databricks Feature Serving for real-time, low-latency access to the Delta table, ensuring high performance in a production-ready RAG chatbot.
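
To make the reasoning above concrete, the sketch below shows why a keyed lookup suits this task: the agent tool receives the user's location and fetches the matching rows by key, with no SQL generation or embedding step. It is an illustration only. The in-memory dictionary `ONLINE_STORE`, the functions `lookup_showtimes` and `format_for_prompt`, and the sample rows are hypothetical stand-ins for the online store and the Feature Serving Endpoint call, not Databricks APIs.

```python
from typing import Dict, List

# Hypothetical stand-in for the online store synced from the Delta table.
# Keys are location IDs; values are the latest showtime rows for that location.
ONLINE_STORE: Dict[str, List[dict]] = {
    "theater_001": [
        {"movie": "Example Movie A", "showtime": "18:30"},
        {"movie": "Example Movie B", "showtime": "21:00"},
    ],
    "theater_002": [
        {"movie": "Example Movie A", "showtime": "19:15"},
    ],
}


def lookup_showtimes(location_id: str) -> List[dict]:
    """Agent tool: return showtimes for a location via a key lookup.

    In a real deployment this function would call the serving endpoint
    instead of reading a local dictionary (assumption for illustration).
    """
    return ONLINE_STORE.get(location_id, [])


def format_for_prompt(rows: List[dict]) -> str:
    """Turn the looked-up rows into context text for the LLM prompt."""
    if not rows:
        return "No showtimes found for this location."
    return "\n".join(f"{r['movie']} at {r['showtime']}" for r in rows)


if __name__ == "__main__":
    print(format_for_prompt(lookup_showtimes("theater_001")))
```

Because the location is already known to the agent, the lookup needs no LLM call of its own; the LLM only sees the formatted rows as context.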

Question 4

A Generative AI Engineer is using the code below to test setting up a vector store:

Assuming they intend to use Databricks managed embeddings with the default embedding model, what should be the next logical function call?

Options:

vsc.get_index()

vsc.create_delta_sync_index()

vsc.create_direct_access_index()

vsc.similarity_search()

Question 5

A Generative AI Engineer developed an LLM application using the provisioned throughput Foundation Model API. Now that the application is ready to be deployed, they realize their volume of requests is not high enough to justify creating their own provisioned throughput endpoint. They want to choose a strategy that ensures the best cost-effectiveness for their application.

What strategy should the Generative AI Engineer use?

Options:

Switch to using External Models instead

Deploy the model using pay-per-token throughput as it comes with cost guarantees

Change to a model with a fewer number of parameters in order to reduce hardware constraint issues

Throttle the incoming batch of requests manually to avoid rate limiting issues

Question 6

A Generative AI Engineer is designing a chatbot for a gaming company that aims to engage users on its platform while its users play online video games.

Which metric would help them increase user engagement and retention for their platform?

Options:

Randomness

Diversity of responses

Lack of relevance

Repetition of responses

Question 7

A Generative AI Engineer has been asked to design an LLM-based application that accomplishes the following business objective: answer employee HR questions using HR PDF documentation.

Which set of high-level tasks should the Generative AI Engineer's system perform?

Options:

Calculate averaged embeddings for each HR document, compare embeddings to user query to find the best document. Pass the best document with the user query into an LLM with a large context window to generate a response to the employee.

Use an LLM to summarize HR documentation. Provide summaries of documentation and user query into an LLM with a large context window to generate a response to the user.

Create an interaction matrix of historical employee questions and HR documentation. Use ALS to factorize the matrix and create embeddings. Calculate the embeddings of new queries and use them to find the best HR documentation. Use an LLM to generate a response to the employee question based upon the documentation retrieved.

Split HR documentation into chunks and embed into a vector store. Use the employee question to retrieve best matched chunks of documentation, and use the LLM to generate a response to the employee based upon the documentation retrieved.

Question 8

A Generative AI Engineer at an automotive company would like to build a question-answering chatbot for customers to inquire about their vehicles. They have a database containing various documents of different vehicle makes, their hardware parts, and common maintenance information.

Which of the following components will NOT be useful in building such a chatbot?

Options:

Response-generating LLM

Invite users to submit long, rather than concise, questions

Vector database

Embedding model

Answer: B

Explanation:

The task involves building a question-answering chatbot for an automotive company using a database of vehicle-related documents. The chatbot must efficiently process customer inquiries and provide accurate responses. Let’s evaluate each component to determine which is not useful, per Databricks Generative AI Engineer principles.

Option A: Response-generating LLM

An LLM is essential for generating natural language responses to customer queries based on retrieved information. This is a core component of any chatbot.

Databricks Reference: "The response-generating LLM processes retrieved context to produce coherent answers" ("Building LLM Applications with Databricks," 2023).

Option B: Invite users to submit long, rather than concise, questions

Encouraging long questions is a user interaction design choice, not a technical component of the chatbot’s architecture. Moreover, long, verbose questions can complicate intent detection and retrieval, reducing efficiency and accuracy—counter to best practices for chatbot design. Concise questions are typically preferred for clarity and performance.

Databricks Reference: While not explicitly stated, Databricks’ "Generative AI Cookbook" emphasizes efficient query processing, implying that simpler, focused inputs improve LLM performance. Inviting long questions doesn’t align with this.

Option C: Vector database

A vector database stores embeddings of the vehicle documents, enabling fast retrieval of relevant information via semantic search. This is critical for a question-answering system with a large document corpus.

Databricks Reference: "Vector databases enable scalable retrieval of context from large datasets" ("Databricks Generative AI Engineer Guide").

Option D: Embedding model

An embedding model converts text (documents and queries) into vector representations for similarity search. It’s a foundational component for retrieval-augmented generation (RAG) in chatbots.

Databricks Reference: "Embedding models transform text into vectors, facilitating efficient matching of queries to documents" ("Building LLM-Powered Applications").

Conclusion: Option B is not a useful component in building the chatbot. It’s a user-facing suggestion rather than a technical building block, and it could even degrade performance by introducing unnecessary complexity. Options A, C, and D are all integral to a Databricks-aligned chatbot architecture.
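
As a rough illustration of how the three useful components fit together, here is a minimal, self-contained sketch. Everything in it is a toy stand-in chosen for illustration: `embed` is a bag-of-words counter rather than a real embedding model, `VectorStore` is an in-memory list rather than a vector database, and `generate_response` simply echoes the retrieved context where a real system would call an LLM. The sample maintenance sentences are made up.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding model: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Toy vector database: stores (embedding, text) pairs in a list."""

    def __init__(self) -> None:
        self._rows: List[Tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self._rows.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> List[str]:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def generate_response(question: str, context: List[str]) -> str:
    """Placeholder for the response-generating LLM: echoes the context."""
    return f"Q: {question}\nBased on: {' '.join(context)}"


# Made-up documents standing in for the vehicle maintenance database.
store = VectorStore()
store.add("Replace the engine oil filter every 10,000 km.")
store.add("The brake pads should be inspected at each service.")
store.add("Tire pressure for the sedan is 35 psi.")

if __name__ == "__main__":
    question = "How often should I replace the oil filter?"
    print(generate_response(question, store.search(question)))
```

Note that nothing in this flow benefits from longer user questions; extra words in the query only add terms that the retrieval step has to match against.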

Question 9

A Generative AI Engineer has built an LLM-based system that will automatically translate user text between two languages. They now want to benchmark multiple LLMs on this task and pick the best one. They have an evaluation set with known high-quality translation examples. They want to evaluate each LLM using the evaluation set with a performant metric.

Which metric should they choose for this evaluation?

Options:

ROUGE metric

BLEU metric

NDCG metric

RECALL metric

Question 10

A Generative AI Engineer is setting up a Databricks Vector Search that will look up news articles by topic within 10 days of the date specified. An example query might be "Tell me about monster truck news around January 5th 1992". They want to do this with the least amount of effort.

How can they set up their Vector Search index to support this use case?

Options:

Split articles by 10 day blocks and return the block closest to the query.

Include metadata columns for article date and topic to support metadata filtering.

Pass the query directly to the vector search index and return the best articles.

Create separate indexes by topic and add a classifier model to appropriately pick the best index.

Answer: B

Explanation:

The task is to set up a Databricks Vector Search index for news articles, supporting queries like “monster truck news around January 5th, 1992,” with minimal effort. The index must filter by topic and a 10-day date range. Let’s evaluate the options.

Option A: Split articles by 10-day blocks and return the block closest to the query

Pre-splitting articles into 10-day blocks requires significant preprocessing and index management (e.g., one index per block). It’s effort-intensive and inflexible for dynamic date ranges.

Databricks Reference: "Static partitioning increases setup complexity; metadata filtering is preferred" ("Databricks Vector Search Documentation").

Option B: Include metadata columns for article date and topic to support metadata filtering

Adding date and topic as metadata in the Vector Search index allows dynamic filtering (e.g., date ± 10 days, topic = “monster truck”) at query time. This leverages Databricks’ built-in metadata filtering, minimizing setup effort.

Databricks Reference: "Vector Search supports metadata filtering on columns like date or category for precise retrieval with minimal preprocessing" ("Vector Search Guide," 2023).

Option C: Pass the query directly to the vector search index and return the best articles

Passing the full query (e.g., “Tell me about monster truck news around January 5th, 1992”) to Vector Search relies solely on embeddings, ignoring structured filtering for date and topic. This risks inaccurate results without explicit range logic.

Databricks Reference: "Pure vector similarity may not handle temporal or categorical constraints effectively" ("Building LLM Applications with Databricks").

Option D: Create separate indexes by topic and add a classifier model to appropriately pick the best index

Separate indexes per topic plus a classifier model adds significant complexity (index creation, model training, maintenance), far exceeding “least effort.” It’s overkill for this use case.

Databricks Reference: "Multiple indexes increase overhead; single-index with metadata is simpler" ("Databricks Vector Search Documentation").

Conclusion: Option B is the simplest and most effective solution, using metadata filtering in a single Vector Search index to handle date ranges and topics, aligning with Databricks’ emphasis on efficient, low-effort setups.
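
The date-window logic behind metadata filtering can be sketched as follows. This is a minimal illustration under stated assumptions: the column names `topic` and `article_date`, the helper functions `build_filters` and `apply_filters`, and the sample rows are all hypothetical, and the "column operator" key style of the filter dictionary is an assumption that should be checked against the filter syntax of the index type in use. The filter is applied to an in-memory list here only to show which rows such a filter would keep.

```python
from datetime import date, timedelta
from typing import Dict, List


def build_filters(topic: str, target: date, window_days: int = 10) -> Dict[str, str]:
    """Build a metadata filter for a topic plus a date window around `target`.

    The "column operator" key style is an assumption for illustration.
    """
    start = target - timedelta(days=window_days)
    end = target + timedelta(days=window_days)
    return {
        "topic": topic,
        "article_date >=": start.isoformat(),
        "article_date <=": end.isoformat(),
    }


def apply_filters(rows: List[dict], filters: Dict[str, str]) -> List[dict]:
    """Apply the filter locally to show which rows it would keep.

    ISO-8601 date strings compare correctly as plain strings.
    """
    def matches(row: dict) -> bool:
        for key, value in filters.items():
            column, _, op = key.partition(" ")
            if op == ">=" and not row[column] >= value:
                return False
            if op == "<=" and not row[column] <= value:
                return False
            if op == "" and row[column] != value:
                return False
        return True

    return [row for row in rows if matches(row)]


# Made-up rows standing in for indexed articles and their metadata columns.
ARTICLES = [
    {"id": 1, "topic": "monster truck", "article_date": "1992-01-03"},
    {"id": 2, "topic": "monster truck", "article_date": "1992-03-20"},
    {"id": 3, "topic": "figure skating", "article_date": "1992-01-06"},
    {"id": 4, "topic": "monster truck", "article_date": "1991-12-26"},
]

if __name__ == "__main__":
    filters = build_filters("monster truck", date(1992, 1, 5))
    print([row["id"] for row in apply_filters(ARTICLES, filters)])
```

In practice the topic and date would first be extracted from the user's question (for example by the LLM), and the resulting filter would be passed alongside the query text in a single similarity search.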

Question 11

A Generative AI Engineer at an electronics company just deployed a RAG application for customers to ask questions about products that the company carries. However, they received feedback that the RAG response often returns information about an irrelevant product.

What can the engineer do to improve the relevance of the RAG’s response?

Options:

Assess the quality of the retrieved context

Implement caching for frequently asked questions

Use a different LLM to improve the generated response

Use a different semantic similarity search algorithm

Question 12

A Generative AI Engineer is tasked with developing a RAG application that will help a small internal group of experts at their company answer specific questions, augmented by an internal knowledge base. They want the best possible quality in the answers, and neither latency nor throughput is a huge concern given that the user group is small and they’re willing to wait for the best answer. The topics are sensitive in nature and the data is highly confidential, so, due to regulatory requirements, none of the information is allowed to be transmitted to third parties.

Which model meets all the Generative AI Engineer’s needs in this situation?

Options:

Dolly 1.5B

OpenAI GPT-4

BGE-large

Llama2-70B

Question 13

A Generative AI Engineer is building an LLM to generate article summaries in the form of a type of poem, such as a haiku, given the article content. However, the initial output from the LLM does not match the desired tone or style.

Which approach will NOT improve the LLM’s response to achieve the desired response?

Options:

Provide the LLM with a prompt that explicitly instructs it to generate text in the desired tone and style

Use a neutralizer to normalize the tone and style of the underlying documents

Include few-shot examples in the prompt to the LLM

Fine-tune the LLM on a dataset of desired tone and style

Question 14

After changing the response generating LLM in a RAG pipeline from GPT-4 to a model with a shorter context length that the company self-hosts, the Generative AI Engineer is getting the following error:

What TWO solutions should the Generative AI Engineer implement without changing the response generating model? (Choose two.)

Options:

Use a smaller embedding model to generate

Reduce the maximum output tokens of the new model

Decrease the chunk size of embedded documents

Reduce the number of records retrieved from the vector database

Retrain the response generating model using ALiBi

Question 15

A small and cost-conscious startup in the cancer research field wants to build a RAG application using Foundation Model APIs.

Which strategy would allow the startup to build a good-quality RAG application while being cost-conscious and able to cater to customer needs?

Options:

Limit the number of relevant documents available for the RAG application to retrieve from

Pick a smaller LLM that is domain-specific

Limit the number of queries a customer can send per day

Use the largest LLM possible because that gives the best performance for any general queries

Question 16

A company has a typical RAG-enabled, customer-facing chatbot on its website.

Select the correct sequence of components a user's questions will go through before the final output is returned. Use the diagram above for reference.

Options:

1.embedding model, 2.vector search, 3.context-augmented prompt, 4.response-generating LLM

1.context-augmented prompt, 2.vector search, 3.embedding model, 4.response-generating LLM

1.response-generating LLM, 2.vector search, 3.context-augmented prompt, 4.embedding model

1.response-generating LLM, 2.context-augmented prompt, 3.vector search, 4.embedding model

Question 17

A Generative AI Engineer is building a RAG application that will rely on context retrieved from source documents that are currently in PDF format. These PDFs can contain both text and images. They want to develop a solution using the fewest lines of code.

Which Python package should be used to extract the text from the source documents?

Options:

flask

beautifulsoup

unstructured

numpy

Question 18

A Generative AI Engineer is developing a chatbot designed to assist users with insurance-related queries. The chatbot is built on a large language model (LLM) and is conversational. However, to maintain the chatbot’s focus and to comply with company policy, it must not provide responses to questions about politics. Instead, when presented with political inquiries, the chatbot should respond with a standard message:

“Sorry, I cannot answer that. I am a chatbot that can only answer questions around insurance.”

Which framework type should be implemented to solve this?

Options:

Safety Guardrail

Security Guardrail

Contextual Guardrail

Compliance Guardrail
