Connect to your Databricks workspace to manage clusters, trigger jobs, run SQL on warehouses, and interact with model serving and vector search using plain language. Build reliable automations for data engineering and MLOps while keeping everything inside your Databricks environment.
Operate clusters, jobs, SQL warehouses, model serving endpoints, and vector search from one place to streamline data and ML operations.

How to Use MCP Nodes

What is Databricks MCP?

The Databricks MCP creates a customized node that understands Databricks resources like clusters, jobs, warehouses, serving endpoints, and vector indexes. You can use natural language to perform focused actions and get structured data back. Create once, then reuse the node across workflows for consistent, reliable automations.

What Can It Do for You?

  • Keep compute costs in check by listing, starting, and terminating clusters on demand
  • Orchestrate jobs, trigger new runs, and fetch run outputs for downstream steps
  • Run SQL on warehouses and return structured data for analytics or reporting
  • Query model serving endpoints and vector indexes as part of AI or retrieval workflows

Available Tools

| Tool | What It Does | Example Use |
| --- | --- | --- |
| Get Me | Get authenticated user information | “Return my user id, username, and workspace information as structured data” |
| List Clusters | List all pinned and active clusters, and all clusters terminated within the last 30 days | “List clusters and return name, cluster_id, state, and last_activity_time” |
| Terminate Cluster | Terminate a Spark cluster | “Given `cluster id`, terminate the cluster and return cluster_id and state” |
| Start Cluster | Start a terminated Spark cluster | “Given `cluster id`, start the cluster and return cluster_id and state” |
| List Jobs | Retrieve a list of jobs with automatic pagination support | “List jobs and return job_id, name, and schedule as structured data” |
| Manage Job Run | Cancel a running job or delete a non-active job run | “Given `run id`, cancel the run and return run_id, state, and life_cycle_state” |
| Run Job | Trigger a new job run and return the run_id | “Given `job id`, trigger a new run and return run_id and state” |
| Get Job Run Output | Retrieve the output and metadata of a single task run | “Given `run id`, return status, start_time, end_time, and result summaries as structured data” |
| Query Serving Endpoint | Query a serving endpoint with input data | “Query serving endpoint `endpoint name` with `request payload` and return predictions and metadata” |
| Query Vector Index | Query a vector index for similarity search | “Query vector index `index name` with `search text` and return top k results with id, score, and metadata” |
| Execute Sql | Execute a SQL statement and optionally await its results | “Execute `sql query` on warehouse `warehouse id` and return columns and rows as structured data” |
| List Warehouses | List all SQL warehouses that you have access to | “List SQL warehouses and return name, warehouse_id, state, and connection info” |
| List Serving Endpoints | List all model serving endpoints in the workspace | “List serving endpoints and return name, status, and route configuration” |
| List Vector Search Endpoints | List all vector search endpoints in the workspace | “List vector search endpoints and return name, status, and last_update_time” |
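If you want a feel for what these tools roughly correspond to underneath, here is a minimal sketch using the official Databricks Python SDK (`databricks-sdk`). This is an illustration under assumptions, not the node’s actual implementation: it assumes authentication via the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables, and the fields printed are examples rather than the node’s exact structured output.

```python
# Minimal sketch, assuming databricks-sdk is installed and
# DATABRICKS_HOST / DATABRICKS_TOKEN are set in the environment.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Get Me: authenticated user information
me = w.current_user.me()
print(me.id, me.user_name)

# List Clusters: name, id, and state for each cluster
for c in w.clusters.list():
    print(c.cluster_name, c.cluster_id, c.state)

# List Jobs: job_id, name, and schedule (the SDK paginates automatically)
for j in w.jobs.list():
    print(j.job_id, j.settings.name, j.settings.schedule)

# List Warehouses: name, id, and state for each SQL warehouse
for wh in w.warehouses.list():
    print(wh.name, wh.id, wh.state)
```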

How to Use

1. Create Your Databricks MCP Node

Go to your node library, search for Databricks, and click “Create a node with AI”.

2. Add Your Prompt

Drag the Databricks MCP node to your canvas and add your prompt in the text box.

3. Test Your Node

Run the node to see the results. If it works as expected, you’re all set! If you run into issues, check the troubleshooting tips below.

4. Save and Reuse

Once your Databricks MCP node is working, save it to your library. You can now use this customized node in any workflow.

Example Prompts

Here are some prompts that work well with Databricks MCP:

Cluster Inventory
List clusters and return structured data with name, cluster_id, state, creator_user_name, and last_activity_time

Start or Stop Compute
Given `cluster id`, start the cluster and return structured data with cluster_id and state

Run a Job
Given `job id`, trigger a new job run and return structured data with run_id and state

Fetch Job Output
Given `run id`, get the run output and return structured data with status, life_cycle_state, start_time, end_time, and result summaries

Execute SQL on a Warehouse
Run `sql query` on warehouse `warehouse id` and return structured data with column names and rows

Query a Serving Endpoint
Query serving endpoint `endpoint name` using input `request payload` and return structured data with predictions and model version
Start with a single focused action per node, such as listing jobs or starting a cluster. Pass IDs between nodes for actions that depend on prior results, then chain additional steps with nodes like Ask AI, Google Sheets Writer, or Router.
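If you’re curious what the “Execute SQL on a Warehouse” and “Query a Serving Endpoint” prompts above translate to at the API level, here is a hedged sketch using the Databricks Python SDK; the warehouse ID, query, endpoint name, and payload are all placeholders, and the node’s actual request handling may differ.

```python
# Hedged sketch: SQL execution and serving-endpoint queries via databricks-sdk.
# All IDs, names, and payloads below are placeholders.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Execute SQL on a warehouse, waiting up to 30 seconds for the result.
resp = w.statement_execution.execute_statement(
    warehouse_id="<warehouse id>",
    statement="SELECT 1 AS example_column",
    wait_timeout="30s",
)
# (A production script should check resp.status.state before reading results.)
columns = [col.name for col in resp.manifest.schema.columns]
rows = resp.result.data_array or []
print(columns, rows)

# Query a model serving endpoint with a record-style payload.
prediction = w.serving_endpoints.query(
    name="<endpoint name>",
    dataframe_records=[{"feature_a": 1.0, "feature_b": 2.0}],  # placeholder features
)
print(prediction.predictions)
```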

Troubleshooting

If your Databricks MCP node isn’t working as expected, try these best practices:

Keep Prompts Simple and Specific

  • Good: “List jobs and return job_id and name”
  • Needs Improvement: “List all clusters, start any that are terminated, then run `job name` and return its output”
While the second prompt might work, it’s more efficient to break it into separate nodes. Databricks MCP works best with focused, single-action prompts.

Match What Databricks Can Do

  • Good: “Run `sql query` on warehouse `warehouse id` and return columns and rows”
  • Needs Improvement: “Run a SQL query and create a Google Sheet with charts of the results”
Databricks MCP focuses on Databricks operations and data retrieval. For creating spreadsheets and charts, combine it with Google Sheets Writer and Ask AI nodes in your workflow.

Break Complex Tasks Into Steps

Instead of trying to do everything in one prompt (which can cause timeouts and complexity):
List clusters, start the ones named `cluster name pattern`, run job `job name` on each, then fetch each run's output and summarize it
Break this into smaller, focused nodes that each handle one task:
Step 1: Get Cluster IDs

List clusters matching `cluster name pattern` and return cluster_id and name

Step 2: Start Clusters

For each `cluster id`, start the cluster and return cluster_id and state

Step 3: Trigger Job

Given `job id`, trigger a new job run and return run_id and state

Step 4: Fetch Run Output

Given `run id`, return structured data with status, start_time, end_time, and result summaries
In your workflow, connect these nodes sequentially. The cluster IDs output from Step 1 become the input for Step 2, the job ID feeds Step 3, and the run IDs from Step 3 feed into Step 4.
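As a rough illustration of how those IDs flow from step to step, here is a sketch of the same chain written against the Databricks Python SDK; the `etl-` name pattern and the job ID are hypothetical placeholders, and in a real workflow each step would be its own node passing IDs forward rather than one script.

```python
# Hedged sketch of the four-step chain using the Databricks Python SDK.
# Placeholders: the "etl-" name pattern and job_id=123 are illustrative only.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Step 1: get cluster IDs matching a name pattern
cluster_ids = [
    c.cluster_id
    for c in w.clusters.list()
    if c.cluster_name and c.cluster_name.startswith("etl-")
]

# Step 2: start each cluster and wait until it is running
# (starting an already-running cluster will raise an error from the API)
for cid in cluster_ids:
    w.clusters.start(cluster_id=cid).result()

# Step 3: trigger a new job run; .result() waits for the run to finish
run = w.jobs.run_now(job_id=123).result()
print(run.run_id, run.state)

# Step 4: fetch the output of each task run in the job run
for task in run.tasks or []:
    output = w.jobs.get_run_output(run_id=task.run_id)
    print(task.task_key, output.logs or output.error)
```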

Focus on Data Retrieval

Databricks MCP excels at getting information from Databricks. For analysis or content generation, connect it to other nodes. Example:
  • Good prompt: “List serving endpoints and return name and status”
  • Needs Improvement: “List serving endpoints and write a recommendation plan for scaling them”
Use Ask AI node for analysis, summarization, and recommendations separately in your workflow. Keep Databricks prompts focused on retrieving or executing actions within Databricks and returning structured data.
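As an illustration of what “retrieval-only” looks like at the API level, here is a minimal sketch, assuming the Databricks Python SDK and standard environment-variable authentication; the shape of the returned records is an example, not the node’s exact schema.

```python
# Minimal sketch: list serving endpoints and return only structured fields.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

endpoints = [
    {
        "name": e.name,
        # e.state.ready is an enum like READY / NOT_READY when state is reported
        "status": e.state.ready.value if e.state and e.state.ready else None,
    }
    for e in w.serving_endpoints.list()
]
print(endpoints)
```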

Troubleshooting Node Creation

If you’re seeing empty outputs, open the chat interface in the node creation window (or, if you’ve already created the node, hover over it and click “Edit”) and prompt the AI to add debug logs and verify the API response. Specifically mention that you received empty outputs.

If the output isn’t what you expected, use the same chat interface to describe in detail what you expected versus what you received.

For quick fixes, first click “Fix with Gummie” or use the “Request changes” button in the node creation window. If multiple attempts fail, simplify your prompt or contact support.

MCP node creation often requires a few tweaks. Use the chat interface in the node creation window to refine filters, output fields, or pagination. The AI will adjust the node based on your feedback.

Need More Help?