
Moonshot - Databricks Genie Connector

Context

Databricks Genie is a conversational AI interface that allows users to interact with their data using natural language. The API follows a two-step asynchronous pattern: first starting a conversation, then polling for the completed response. This connector enables Litmus/Moonshot to test and evaluate Databricks Genie-powered applications.

Chosen Solution: Databricks Genie Connector

Architecture Overview

graph TB
    subgraph "Litmus/Moonshot"
        LT[Litmus Test Runner]
        MS[Moonshot Framework]
        DGC[DatabricksGenieConnector]
    end

    subgraph "Databricks Genie API"
        START[Start Conversation Endpoint]
        POLL[Get Message Endpoint]
    end

    LT --> MS
    MS --> DGC
    DGC -->|1. Start Conversation| START
    START -->|conversation_id, message_id| DGC
    DGC -->|2. Poll for Completion| POLL
    POLL -->|status: COMPLETED| DGC

Implementation Details

Class Interface

class DatabricksGenieConnector(GenericAPIConnector):
    def __init__(self, ep_arguments: ConnectorEndpointArguments):
        super().__init__(ep_arguments)

        # Polling configuration
        self.poll_interval = self.params.get("poll_interval_seconds", 2)
        self.max_poll_attempts = self.params.get("max_poll_attempts", 60)

        # Build URLs from the base endpoint
        self.base_url = self.endpoint.rstrip("/")
        self.start_conversation_url = f"{self.base_url}/start-conversation"

    @classmethod
    def build_payload(cls, prompt: str, input_spec: dict, additional_input: dict = None) -> dict:
        """Build the payload for the start-conversation API"""
        return {"content": prompt}

    @classmethod
    def extract_output(cls, data: dict, output_spec: dict) -> str:
        """Extract the response text from the Genie API response"""
        pass

    def _build_get_message_url(self, conversation_id: str, message_id: str) -> str:
        """Build the URL for the get-message polling endpoint"""
        pass

    async def _start_conversation(self, prompt: str, headers: dict) -> dict:
        """Start a new conversation with the given prompt"""
        pass

    async def _poll_for_completion(
        self, conversation_id: str, message_id: str, headers: dict
    ) -> dict:
        """Poll the get-message endpoint until status is COMPLETED"""
        pass

    async def get_response(self, prompt: str) -> ConnectorResponse:
        """Send a prompt and wait for the response"""
        pass
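
The `_build_get_message_url` stub is left unimplemented in the interface. A minimal standalone sketch of what it might do is shown here, assuming the polling path is `/conversations/{conversation_id}/messages/{message_id}` relative to the base endpoint, as listed under API Flow. The free function name `build_get_message_url` and the sample IDs are hypothetical, not part of the connector.

```python
def build_get_message_url(base_url: str, conversation_id: str, message_id: str) -> str:
    """Join the space-level base URL with the get-message path."""
    # Strip a trailing slash so the result never contains "//" after the base.
    base = base_url.rstrip("/")
    return f"{base}/conversations/{conversation_id}/messages/{message_id}"


# Hypothetical IDs; the base URL is the one from the usage example.
url = build_get_message_url(
    "https://adb-1234567890.azuredatabricks.net/api/2.0/genie/spaces/my-space-id/",
    "conv-1",
    "msg-1",
)
print(url)
```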

API Flow

  1. Start Conversation: Send the user's prompt to /start-conversation
  2. Receive IDs: Get conversation_id and message_id from the response
  3. Poll for Completion: Repeatedly call /conversations/{conversation_id}/messages/{message_id}
  4. Extract Response: When status is COMPLETED, extract the text from attachments
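
The polling step above can be sketched as a small loop. This is an illustration under assumptions, not the connector's actual code: the HTTP call is replaced by an injected coroutine (`fetch_message`, a hypothetical name) so the example runs without a Databricks workspace, and `EXECUTING_QUERY` stands in for any in-progress status.

```python
import asyncio
from typing import Awaitable, Callable

TERMINAL_FAILURES = {"FAILED", "CANCELLED"}


async def poll_for_completion(
    fetch_message: Callable[[], Awaitable[dict]],
    poll_interval: float = 2,
    max_poll_attempts: int = 60,
) -> dict:
    """Call fetch_message until the status is COMPLETED or attempts run out."""
    for _ in range(max_poll_attempts):
        message = await fetch_message()
        status = message.get("status")
        if status == "COMPLETED":
            return message
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"Genie message ended with status {status}")
        # Any other status means the message is still being processed.
        await asyncio.sleep(poll_interval)
    raise TimeoutError(f"No COMPLETED status after {max_poll_attempts} attempts")


# Hypothetical stand-in for the HTTP call: two in-progress replies, then done.
_replies = iter([
    {"status": "EXECUTING_QUERY"},
    {"status": "EXECUTING_QUERY"},
    {"status": "COMPLETED", "attachments": [{"text": {"content": "42 rows"}}]},
])


async def fake_fetch() -> dict:
    return next(_replies)


result = asyncio.run(poll_for_completion(fake_fetch, poll_interval=0))
print(result["attachments"][0]["text"]["content"])  # prints "42 rows"
```

Injecting the fetch function keeps the loop testable; in the connector, the equivalent call would issue a GET against the get-message URL with the authentication headers.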

Endpoint Creation JSON Body

{
  "name": "databricks-genie-connector",
  "connector_type": "databricks-genie-connector",
  "uri": "https://adb-xxx.azuredatabricks.net/api/2.0/genie/spaces/{space_id}",
  "token": "<replace-with-databricks-token>",
  "max_calls_per_second": 1,
  "max_concurrency": 1,
  "model": "databricks-genie",
  "params": {
    "timeout": 30,
    "poll_interval_seconds": 2,
    "max_poll_attempts": 60,
    "auth_config": {
      "type": "header",
      "value": {
        "Authorization": "Bearer {token}"
      }
    }
  }
}
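
The `auth_config` value contains a `{token}` placeholder. The source does not say how the placeholder is filled, so the sketch below assumes a plain string substitution using the endpoint's `token` field. The helper name `render_auth_headers` and the sample token are hypothetical.

```python
def render_auth_headers(auth_config: dict, token: str) -> dict:
    """Return request headers with the {token} placeholder filled in."""
    if auth_config.get("type") != "header":
        raise ValueError(f"Unsupported auth type: {auth_config.get('type')!r}")
    # str.replace is used instead of str.format so that any other braces in a
    # header value are left untouched.
    return {
        name: value.replace("{token}", token)
        for name, value in auth_config["value"].items()
    }


auth_config = {"type": "header", "value": {"Authorization": "Bearer {token}"}}
headers = render_auth_headers(auth_config, "dapi-example")
print(headers)  # prints {'Authorization': 'Bearer dapi-example'}
```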

Configuration Parameters

Core Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| timeout | number | No | 30 | Request timeout in seconds per API call |
| poll_interval_seconds | number | No | 2 | Time between polling requests in seconds |
| max_poll_attempts | number | No | 60 | Maximum number of polling attempts |

Authentication Configuration (auth_config)

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| type | string | Yes | - | Authentication type ("header") |
| value | object | Yes | - | Headers for authenticated requests |

Endpoint Configuration

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| uri | string | Yes | Base URL including space ID for the Genie API |
| token | string | Yes | Databricks personal access token or service principal token |
| connector_type | string | Yes | Must be databricks-genie-connector |
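
The two polling parameters together bound how long the connector waits for a response. As a rough worked example, assuming one sleep of `poll_interval_seconds` per attempt and ignoring the latency of each request (itself bounded by `timeout`), the budget is the product of the two values. The function name `max_wait_seconds` is hypothetical.

```python
def max_wait_seconds(poll_interval_seconds: float = 2, max_poll_attempts: int = 60) -> float:
    """Upper bound on time spent sleeping between polls, in seconds."""
    return poll_interval_seconds * max_poll_attempts


print(max_wait_seconds())       # defaults: 2 * 60 = 120
print(max_wait_seconds(3, 40))  # values from the usage example: 3 * 40 = 120
```

Both configurations allow roughly two minutes of waiting; the second polls less often within the same budget.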

Response Structure

The Databricks Genie API returns responses in the following structure:

{
"status": "COMPLETED",
"attachments": [
{
"text": {
"content": "The response text from Genie"
}
}
]
}

The connector automatically extracts the content field from the first text attachment.
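
The extraction step might look like the following minimal sketch. It assumes the response shape shown above and returns an empty string when no text attachment is present; that fallback is an assumption (chosen to match the "Empty response" entry under Troubleshooting), and the function name `extract_text` is hypothetical.

```python
def extract_text(data: dict) -> str:
    """Return the content of the first text attachment, or "" if none exists."""
    for attachment in data.get("attachments") or []:
        text = attachment.get("text")
        if isinstance(text, dict) and "content" in text:
            return text["content"]
    return ""


response = {
    "status": "COMPLETED",
    "attachments": [
        {"text": {"content": "The response text from Genie"}},
    ],
}
print(extract_text(response))  # prints "The response text from Genie"
```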

Status Values

| Status | Description |
| --- | --- |
| COMPLETED | Message processing finished successfully |
| FAILED | Message processing encountered an error |
| CANCELLED | Message processing was cancelled |
| Any other value | Message is still being processed (continue polling) |

Error Handling

The connector handles several error scenarios:

  1. Connection Errors: API requests that fail or return a non-success HTTP status code
  2. Timeout Errors: Polling exceeds max_poll_attempts
  3. Processing Failures: Genie returns FAILED or CANCELLED status
  4. Missing Data: Response lacks required conversation_id or message_id
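
As an illustration of the fourth scenario, the check on the start-conversation response could be sketched as below. The function name `parse_start_response` and the choice of `ValueError` are assumptions; the connector may raise a different exception type.

```python
def parse_start_response(data: dict) -> tuple[str, str]:
    """Return (conversation_id, message_id) or raise if either is missing."""
    missing = [key for key in ("conversation_id", "message_id") if not data.get(key)]
    if missing:
        raise ValueError(f"Start-conversation response is missing: {', '.join(missing)}")
    return data["conversation_id"], data["message_id"]


# Hypothetical IDs for illustration.
ids = parse_start_response({"conversation_id": "conv-1", "message_id": "msg-1"})
print(ids)  # prints ('conv-1', 'msg-1')
```

Failing fast here avoids building a polling URL from empty IDs, which would otherwise surface later as a less informative 404.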

Usage Example

1. Create an Endpoint

curl -X POST "http://localhost:5000/api/v1/llm-endpoints" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-genie-endpoint",
    "connector_type": "databricks-genie-connector",
    "uri": "https://adb-1234567890.azuredatabricks.net/api/2.0/genie/spaces/my-space-id",
    "token": "dapi-xxxxxxxxxxxxxxxx",
    "max_calls_per_second": 1,
    "max_concurrency": 1,
    "model": "databricks-genie",
    "params": {
      "timeout": 60,
      "poll_interval_seconds": 3,
      "max_poll_attempts": 40,
      "auth_config": {
        "type": "header",
        "value": {
          "Authorization": "Bearer {token}"
        }
      }
    }
  }'

2. Run Tests

Once the endpoint is configured, you can use it in Moonshot test runs like any other connector.

Key Benefits

  • Asynchronous Handling: Manages the two-step API flow transparently
  • Configurable Polling: Adjust intervals and attempts based on expected response times
  • Robust Error Handling: Graceful handling of timeouts and failures
  • Direct Integration: Extends existing Moonshot GenericAPIConnector pattern
  • Simple Configuration: Minimal parameters with sensible defaults

Prerequisites

  1. Databricks Workspace: Access to a Databricks workspace with Genie enabled
  2. Genie Space: A configured Genie space with the appropriate data sources
  3. Authentication Token: A Databricks personal access token or service principal token with permissions to access the Genie API

Troubleshooting

| Issue | Possible Cause | Solution |
| --- | --- | --- |
| TimeoutError | Response taking too long | Increase max_poll_attempts or poll_interval_seconds |
| ConnectionError (401) | Invalid or expired token | Verify the Databricks token is valid |
| ConnectionError (403) | Insufficient permissions | Check token permissions for Genie API access |
| ConnectionError (404) | Invalid space ID | Verify the space ID in the URI is correct |
| Empty response | No text attachments in response | Check if Genie query returned results |