Pass the Databricks Generative AI Engineer Databricks-Generative-AI-Engineer-Associate Questions and answers with Dumpstech

Viewing page 1 out of 3 pages

Viewing questions 1-10 out of questions

Questions # 1:

After changing the response generating LLM in a RAG pipeline from GPT-4 to a model with a shorter context length that the company self-hosts, the Generative AI Engineer is getting the following error:

[Image: error message; not reproduced in this text]

What TWO solutions should the Generative AI Engineer implement without changing the response generating model? (Choose two.)

Options:

Use a smaller embedding model to generate

Reduce the maximum output tokens of the new model

Decrease the chunk size of embedded documents

Reduce the number of records retrieved from the vector database

Retrain the response generating model using ALiBi

Questions # 2:

A Generative AI Engineer is setting up a Databricks Vector Search index that will look up news articles by topic within 10 days of the date specified. An example query might be "Tell me about monster truck news around January 5th 1992". They want to do this with the least amount of effort.

How can they set up their Vector Search index to support this use case?

Options:

Split articles by 10 day blocks and return the block closest to the query.

Include metadata columns for article date and topic to support metadata filtering.

Pass the query directly to the vector search index and return the best articles.

Create separate indexes by topic and add a classifier model to appropriately pick the best index.

Answer

B

Explanation

The task is to set up a Databricks Vector Search index for news articles, supporting queries like “monster truck news around January 5th, 1992,” with minimal effort. The index must filter by topic and a 10-day date range. Let’s evaluate the options.

Option A: Split articles by 10-day blocks and return the block closest to the query

Pre-splitting articles into 10-day blocks requires significant preprocessing and index management (e.g., one index per block). It’s effort-intensive and inflexible for dynamic date ranges.

Databricks Reference: "Static partitioning increases setup complexity; metadata filtering is preferred" ("Databricks Vector Search Documentation").

Option B: Include metadata columns for article date and topic to support metadata filtering

Adding date and topic as metadata in the Vector Search index allows dynamic filtering (e.g., date ± 10 days, topic = “monster truck”) at query time. This leverages Databricks’ built-in metadata filtering, minimizing setup effort.

Databricks Reference: "Vector Search supports metadata filtering on columns like date or category for precise retrieval with minimal preprocessing" ("Vector Search Guide," 2023).

Option C: Pass the query directly to the vector search index and return the best articles

Passing the full query (e.g., “Tell me about monster truck news around January 5th, 1992”) to Vector Search relies solely on embeddings, ignoring structured filtering for date and topic. This risks inaccurate results without explicit range logic.

Databricks Reference: "Pure vector similarity may not handle temporal or categorical constraints effectively" ("Building LLM Applications with Databricks").

Option D: Create separate indexes by topic and add a classifier model to appropriately pick the best index

Separate indexes per topic plus a classifier model adds significant complexity (index creation, model training, maintenance), far exceeding “least effort.” It’s overkill for this use case.

Databricks Reference: "Multiple indexes increase overhead; single-index with metadata is simpler" ("Databricks Vector Search Documentation").

Conclusion: Option B is the simplest and most effective solution, using metadata filtering in a single Vector Search index to handle date ranges and topics, aligning with Databricks’ emphasis on efficient, low-effort setups.
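
As a rough illustration of the metadata-filtering approach in Option B, the sketch below builds a filter for a topic and a ±10-day date window. The column names `topic` and `article_date` and the helper `build_date_topic_filter` are hypothetical, and the operator-in-key dictionary style is an assumption about the filter syntax accepted by the index; check the Vector Search documentation for the exact form your endpoint type expects.

```python
from datetime import date, timedelta


def build_date_topic_filter(center: date, topic: str, window_days: int = 10) -> dict:
    """Build a metadata filter for one topic and a +/- window_days date range.

    Column names ("topic", "article_date") are hypothetical; the
    "column >=" key style is an assumed filter syntax, not a confirmed one.
    """
    start = center - timedelta(days=window_days)
    end = center + timedelta(days=window_days)
    return {
        "topic": topic,
        "article_date >=": start.isoformat(),
        "article_date <=": end.isoformat(),
    }


# Filter for "monster truck news around January 5th 1992".
filters = build_date_topic_filter(date(1992, 1, 5), "monster trucks")
print(filters)

# In a real workspace this dict would typically be passed as the `filters`
# argument of the index's similarity search call, next to the query text.
```

The date and topic would still need to be extracted from the user's question before the filter is built; the index itself only needs the two metadata columns.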

Questions # 3:

All of the following are Python APIs used to query Databricks foundation models. When running in an interactive notebook, which of the following libraries does not automatically use the current session credentials?

Options:

OpenAI client

REST API via requests library

MLflow Deployments SDK

Databricks Python SDK

Questions # 4:

A Generative AI Engineer is creating an LLM-based application. The documents for its retriever have been chunked to a maximum of 512 tokens each. The Generative AI Engineer knows that cost and latency are more important than quality for this application. They have several context length levels to choose from.

Which will fulfill their need?

Options:

context length 514: smallest model is 0.44GB and embedding dimension 768

context length 2048: smallest model is 11GB and embedding dimension 2560

context length 32768: smallest model is 14GB and embedding dimension 4096

context length 512: smallest model is 0.13GB and embedding dimension 384

Questions # 5:

A Generative AI Engineer is building a system which will answer questions on the latest stock news articles.

Which will NOT help with ensuring the outputs are relevant to financial news?

Options:

Implement a comprehensive guardrail framework that includes policies for content filters tailored to the finance sector.

Increase the compute to improve processing speed of questions to allow greater relevancy analysis

Implement a profanity filter to screen out offensive language

Incorporate manual reviews to correct any problematic outputs prior to sending to the users

Questions # 6:

Which TWO chain components are required for building a basic LLM-enabled chat application that includes conversational capabilities, knowledge retrieval, and contextual memory?

Options:

(Q)

Vector Stores

Conversation Buffer Memory

External tools

Chat loaders

React Components

Answer

B, C

Explanation

Building a basic LLM-enabled chat application with conversational capabilities, knowledge retrieval, and contextual memory requires specific components that work together to process queries, maintain context, and retrieve relevant information. Databricks’ Generative AI Engineer documentation outlines key components for such systems, particularly in the context of frameworks like LangChain or Databricks’ MosaicML integrations. Let’s evaluate the required components:

Understanding the Requirements:

Conversational capabilities: The app must generate natural, coherent responses.

Knowledge retrieval: It must access external or domain-specific knowledge.

Contextual memory: It must remember prior interactions in the conversation.

Databricks Reference: "A typical LLM chat application includes a memory component to track conversation history and a retrieval mechanism to incorporate external knowledge" ("Databricks Generative AI Cookbook," 2023).

Evaluating the Options:

A. (Q): This appears incomplete or unclear (possibly a typo). Without further context, it’s not a valid component.

B. Vector Stores: These store embeddings of documents or knowledge bases, enabling semantic search and retrieval of relevant information for the LLM. This is critical for knowledge retrieval in a chat application.

Databricks Reference: "Vector stores, such as those integrated with Databricks’ Lakehouse, enable efficient retrieval of contextual data for LLMs" ("Building LLM Applications with Databricks").

C. Conversation Buffer Memory: This component stores the conversation history, allowing the LLM to maintain context across multiple turns. It’s essential for contextual memory.

Databricks Reference: "Conversation Buffer Memory tracks prior user inputs and LLM outputs, ensuring context-aware responses" ("Generative AI Engineer Guide").

D. External tools: These (e.g., APIs or calculators) enhance functionality but aren’t required for a basic chat app with the specified capabilities.

E. Chat loaders: These might refer to data loaders for chat logs, but they’re not a core chain component for conversational functionality or memory.

F. React Components: These relate to front-end UI development, not the LLM chain’s backend functionality.

Selecting the Two Required Components:

For knowledge retrieval, Vector Stores (B) are necessary to fetch relevant external data, a cornerstone of Databricks’ RAG-based chat systems.

For contextual memory, Conversation Buffer Memory (C) is required to maintain conversation history, ensuring coherent and context-aware responses.

While an LLM itself is implied as the core generator, the question asks for chain components beyond the model, making B and C the minimal yet sufficient pair for a basic application.

Conclusion: The two required chain components are B. Vector Stores and C. Conversation Buffer Memory, as they directly address knowledge retrieval and contextual memory, respectively, aligning with Databricks’ documented best practices for LLM-enabled chat applications.
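
To make the roles of the two components concrete, here is a toy, dependency-free sketch of how a vector store and a conversation buffer feed a prompt. It is not LangChain or Databricks code: `ToyVectorStore`, `ConversationBuffer`, `embed`, and `build_prompt` are hypothetical names, and the bag-of-words "embedding" stands in for a real embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real app would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Knowledge retrieval: returns the stored documents closest to a query."""

    def __init__(self, docs):
        self.docs = [(d, embed(d)) for d in docs]

    def search(self, query: str, k: int = 1):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda p: cosine(q, p[1]), reverse=True)
        return [d for d, _ in ranked[:k]]


class ConversationBuffer:
    """Contextual memory: keeps every prior turn verbatim."""

    def __init__(self):
        self.turns = []

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def render(self) -> str:
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


def build_prompt(store: ToyVectorStore, memory: ConversationBuffer, question: str) -> str:
    """Combine history and retrieved context into the prompt sent to the LLM."""
    context = "\n".join(store.search(question, k=1))
    return f"History:\n{memory.render()}\n\nContext:\n{context}\n\nQuestion: {question}"


store = ToyVectorStore(["monster truck rally results", "stock market closes higher"])
memory = ConversationBuffer()
memory.add("user", "hello")
memory.add("assistant", "hi, how can I help?")
print(build_prompt(store, memory, "who won the truck rally"))
```

The LLM call itself is left out on purpose; the point is only that retrieval and memory are separate inputs assembled into one prompt.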

Questions # 7:

An AI developer team wants to fine-tune an open-weight model to have exceptional performance on a code generation use case. They are trying to choose the best model to start with. They want to minimize model hosting costs and are using Hugging Face model cards and spaces to explore models. Which TWO model attributes and metrics should the team focus on to make their selection?

Options:

Big Code Models Leaderboard

Number of model parameters

MTEB Leaderboard

Chatbot Arena Leaderboard

Number of model downloads last month

Questions # 8:

A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries.

Which metric should they monitor for their customer service LLM application in production?

Options:

Number of customer inquiries processed per unit of time

Energy usage per query

Final perplexity scores for the training of the model

HuggingFace Leaderboard values for the base LLM

Questions # 9:

A Generative AI Engineer has already trained an LLM on Databricks and it is now ready to be deployed.

Which of the following steps correctly outlines the easiest process for deploying a model on Databricks?

Options:

Log the model as a pickle object, upload the object to Unity Catalog Volume, register it to Unity Catalog using MLflow, and start a serving endpoint

Log the model using MLflow during training, directly register the model to Unity Catalog using the MLflow API, and start a serving endpoint

Save the model along with its dependencies in a local directory, build the Docker image, and run the Docker container

Wrap the LLM’s prediction function into a Flask application and serve using Gunicorn

Questions # 10:

A Generative AI Engineer is building a RAG application that answers questions about internal documents for the company SnoPen AI.

The source documents may contain a significant amount of irrelevant content, such as advertisements, sports news, or entertainment news, or content about other companies.

Which approach is advisable when building a RAG application to achieve this goal of filtering irrelevant information?

Options:

Keep all articles because the RAG application needs to understand non-company content to avoid answering questions about them.

Include in the system prompt that any information it sees will be about SnoPen AI, even if no data filtering is performed.

Include in the system prompt that the application is not supposed to answer any questions unrelated to SnoPen AI.

Consolidate all SnoPen AI related documents into a single chunk in the vector database.
