🦀 databricks-kube-operator

A Kubernetes operator for Databricks

[!IMPORTANT] As of 2025/06 this project is archived. We recommend using Upjet to generate a Crossplane provider from the official Databricks Terraform Provider.

A kube-rs operator that enables GitOps-style management of Databricks resources. It supports the following APIs:

| API | CRD |
| --- | --- |
| Jobs 2.1 | DatabricksJob |
| Git Credentials 2.0 | GitCredential |
| Repos 2.0 | Repo |
| Secrets 2.0 | DatabricksSecretScope, DatabricksSecret |

Experimental, headed towards stable. See the GitHub project board for the roadmap. Contributions and feedback are welcome!

Read the docs

Quick Start

Looking for a more in-depth example? Read the tutorial.

Installation

Add the Helm repository and install the chart:

helm repo add mach https://mach-kernel.github.io/databricks-kube-operator
helm install databricks-kube-operator mach/databricks-kube-operator

Create a config map in the same namespace as the operator. To override the config map name, pass --set configMapName=my-custom-name to helm install:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: databricks-kube-operator
data:
  api_secret_name: databricks-api-secret
EOF
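
If you override the config map name, the name given to Helm and the `metadata.name` of the config map have to agree. A small sketch using only the chart and flag named above; `install_with_custom_configmap` is a hypothetical helper name, not part of the project:

```shell
# Hypothetical helper: install the chart with a custom config map name.
# $1 is the config map name the operator should look up.
install_with_custom_configmap() {
  helm install databricks-kube-operator mach/databricks-kube-operator \
    --set configMapName="$1"
}

# Example (requires the Helm repository added earlier):
#   install_with_custom_configmap my-custom-name
```

The config map you apply must then use the same value for `metadata.name`.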

Create a secret with your API URL and credentials:
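
The Secret's name has to match `api_secret_name` from the config map above. The following is a minimal sketch, not the project's canonical manifest: the key names `databricks_url` and `access_token`, the placeholder URL and token, and the helper name `render_api_secret` are all assumptions, so check the docs for the exact keys the operator reads.

```shell
# Placeholder values (assumptions): substitute your workspace URL and token.
DATABRICKS_URL='https://my-tenant.cloud.databricks.com/api'
ACCESS_TOKEN='my-access-token'

# Render a Secret manifest. Values under `data` must be base64-encoded.
# Key names databricks_url and access_token are assumed, not confirmed.
render_api_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: databricks-api-secret
type: Opaque
data:
  databricks_url: $(printf '%s' "$DATABRICKS_URL" | base64 | tr -d '\n')
  access_token: $(printf '%s' "$ACCESS_TOKEN" | base64 | tr -d '\n')
EOF
}

# Print the manifest for review. To apply it to the cluster:
#   render_api_secret | kubectl apply -f -
render_api_secret
```

Rendering first and applying second makes it easy to check the manifest before a real token reaches the cluster.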

Usage

See the examples directory for samples of Databricks CRDs. Resources that are created via Kubernetes are owned by the operator: your checked-in manifests are the source of truth.

Changes made by users in the Databricks webapp will be overwritten by the operator if drift is detected:

Look at jobs (allowed to be viewed by the operator's access token):
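
One way to list them is with kubectl. This sketch assumes the CRDs are installed and that the resource can be addressed by its lowercase kind, `databricksjob`; the exact resource name and API group are not stated here, and `list_databricks_jobs` is a hypothetical helper name.

```shell
# Hypothetical helper: list DatabricksJob resources in a namespace.
# $1 is the namespace (defaults to "default").
list_databricks_jobs() {
  kubectl get databricksjob --namespace "${1:-default}"
}

# Example:
#   list_databricks_jobs my-namespace
```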

A job's status key surfaces API information about the latest run. The status is polled every 60s:
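
To read only the status key of one job, a JSONPath output expression is one option. As a sketch, again assuming the resource is addressable as `databricksjob` (`show_job_status` is a hypothetical helper name):

```shell
# Hypothetical helper: print the .status key of a single DatabricksJob.
# $1 is the resource name, $2 the namespace (defaults to "default").
show_job_status() {
  kubectl get databricksjob "$1" --namespace "${2:-default}" \
    --output 'jsonpath={.status}'
}

# Example:
#   show_job_status my-job
```

Because the operator polls on an interval, the value shown can lag the Databricks API by up to a minute.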

Developers

Begin by creating the configmap as per the Helm instructions.

Generate and install the CRDs by running the crd_gen bin target:
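
A sketch of that step, assuming `crd_gen` writes the CRD manifests to standard output (if it writes files instead, apply those files). `install_crds` is a hypothetical helper name.

```shell
# Hypothetical helper: generate CRD manifests and apply them to the
# cluster selected by the current kubectl context.
# Assumes crd_gen prints YAML on stdout.
install_crds() {
  cargo run --bin crd_gen | kubectl apply -f -
}

# Example (run from the repository root):
#   install_crds
```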

The quickest way to test the operator is with a working minikube cluster:

Generating API Clients

The client is generated by openapi-generator and then lightly post-processed so that the models derive JsonSchema and some bugs in the generated code are fixed.

TODO: Manual client 'fixes'

Expand CRD macros

Deriving CustomResource uses macros to generate another struct. For this example, the output struct name would be DatabricksJob:

rust-analyzer shows squiggles when you use crds::databricks_job::DatabricksJob, so you may want to look at the generated code. To see what is generated with cargo-expand:
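
cargo-expand accepts a module path, so the expansion can be limited to one CRD module. A sketch, assuming cargo-expand is installed (`cargo install cargo-expand`) and that the module path matches the import above; `expand_crd` is a hypothetical helper name, and a crate with several targets may also need `--lib` or `--bin <name>`:

```shell
# Hypothetical helper: print the macro-expanded source of one module.
# $1 is the module path (defaults to the DatabricksJob CRD module,
# inferred from the crds::databricks_job::DatabricksJob import).
expand_crd() {
  cargo expand "${1:-crds::databricks_job}"
}

# Example:
#   expand_crd crds::databricks_job
```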

Adding a new CRD

Want to add support for a new API? Provided it has an OpenAPI definition, these are the steps. Look for existing examples in the codebase:

Running tests

Tests must be run with a single thread since we use a stateful singleton to 'mock' the state of a remote API. Eventually it would be nice to have integration tests targeting Databricks.
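
With Cargo's default test harness, the thread count is set by passing `--test-threads=1` after `--`, or through the `RUST_TEST_THREADS` environment variable. A minimal sketch (`run_tests` is a hypothetical helper name):

```shell
# Hypothetical helper: run the test suite on one thread so tests
# sharing the mock singleton do not interleave.
run_tests() {
  cargo test -- --test-threads=1
}

# Equivalent, using the environment variable instead of the flag:
#   RUST_TEST_THREADS=1 cargo test
```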

License

FOSSA Status
