Writing a Custom VCS Provider

This guide shows you how to implement custom VCS providers for Briefcase AI, enabling you to integrate any version control or data versioning system.

Overview

Briefcase AI uses a two-layer plugin architecture for VCS providers:

  1. VcsProvider trait — Simple interface for raw object I/O and versioning
  2. VcsStorageBackend<P: VcsProvider> — Generic adapter that implements StorageBackend on top of any VcsProvider, handling JSON serialization, query filtering, and batch flush

You only need to implement the VcsProvider trait (~150-250 lines). The adapter gives you full StorageBackend compliance for free.
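The layering can be illustrated with a deliberately simplified, synchronous sketch. The names `RawProvider`, `MemoryProvider`, and `SketchBackend` are hypothetical stand-ins, not Briefcase AI types. The real trait is async, and the real adapter performs the JSON serialization itself rather than accepting a pre-serialized string.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, synchronous stand-in for the provider layer: raw bytes only.
pub trait RawProvider {
    fn write_object(&self, path: &str, data: &[u8]) -> Result<(), String>;
    fn read_object(&self, path: &str) -> Result<Vec<u8>, String>;
}

/// Toy provider that keeps objects in memory.
#[derive(Default)]
pub struct MemoryProvider {
    objects: Mutex<HashMap<String, Vec<u8>>>,
}

impl RawProvider for MemoryProvider {
    fn write_object(&self, path: &str, data: &[u8]) -> Result<(), String> {
        self.objects
            .lock()
            .unwrap()
            .insert(path.to_string(), data.to_vec());
        Ok(())
    }

    fn read_object(&self, path: &str) -> Result<Vec<u8>, String> {
        self.objects
            .lock()
            .unwrap()
            .get(path)
            .cloned()
            .ok_or_else(|| format!("Object not found: {}", path))
    }
}

/// Hypothetical stand-in for the adapter layer: it owns path management
/// and works with any provider that can move bytes.
pub struct SketchBackend<P: RawProvider> {
    provider: P,
}

impl<P: RawProvider> SketchBackend<P> {
    pub fn new(provider: P) -> Self {
        Self { provider }
    }

    /// Stores an already-serialized JSON document under the snapshot path.
    pub fn save_snapshot(&self, id: &str, json: &str) -> Result<(), String> {
        let path = format!("snapshots/{}.json", id);
        self.provider.write_object(&path, json.as_bytes())
    }

    /// Reads the snapshot bytes back and decodes them as UTF-8 text.
    pub fn load_snapshot(&self, id: &str) -> Result<String, String> {
        let path = format!("snapshots/{}.json", id);
        let bytes = self.provider.read_object(&path)?;
        String::from_utf8(bytes).map_err(|e| e.to_string())
    }
}
```

The point of the split is that `SketchBackend` never needs to know where bytes end up, so swapping `MemoryProvider` for another provider requires no change to the typed layer.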

The VcsProvider Trait

All VCS providers implement this trait from crates/core/src/storage/vcs/provider.rs:

use async_trait::async_trait;
use crate::storage::StorageError;

#[async_trait]
pub trait VcsProvider: Send + Sync {
    /// Upload raw bytes to the given path.
    /// Path format: "snapshots/{id}.json" or "decisions/{id}.json".
    async fn write_object(&self, path: &str, data: &[u8]) -> Result<(), StorageError>;

    /// Download raw bytes from the given path.
    /// Returns StorageError::NotFound if the object does not exist.
    async fn read_object(&self, path: &str) -> Result<Vec<u8>, StorageError>;

    /// List all object paths under a prefix.
    /// e.g., list_objects("snapshots/") -> ["snapshots/abc.json", "snapshots/def.json"]
    async fn list_objects(&self, prefix: &str) -> Result<Vec<String>, StorageError>;

    /// Delete the object at the given path.
    /// Returns true if deleted, false if not found (no error on missing).
    async fn delete_object(&self, path: &str) -> Result<bool, StorageError>;

    /// Create a version (commit/tag/checkpoint) for the current state.
    /// Returns a version identifier (e.g., git commit SHA, Nessie hash).
    async fn create_version(&self, message: &str) -> Result<String, StorageError>;

    /// Optional: get metadata for a single object.
    /// Default returns an error; providers override if supported.
    async fn get_object_metadata(&self, _path: &str) -> Result<ObjectMetadata, StorageError> {
        Err(StorageError::IoError("not supported by this provider".into()))
    }

    /// Check connectivity and authentication.
    async fn health_check(&self) -> Result<bool, StorageError>;

    /// Human-readable provider name for logging.
    fn provider_name(&self) -> &'static str;

    /// Configuration summary for diagnostics.
    fn config_summary(&self) -> String;
}

Key design decisions:

  • Raw bytes, not typed objects: write_object/read_object work with &[u8] and Vec<u8>. JSON serialization of snapshots and decisions is handled by the VcsStorageBackend adapter.
  • StorageError: All methods use the existing StorageError enum from crate::storage, not a custom error type. This ensures seamless integration with the rest of the storage layer.
  • Path convention: The adapter uses snapshots/{id}.json and decisions/{id}.json paths. Your provider just needs to store and retrieve bytes at those paths.
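As a small illustration of the path convention, the helpers below build and parse the two path shapes named above. The function names are hypothetical and are not part of the Briefcase AI API; a provider normally treats paths as opaque keys and does not need them.

```rust
/// Build the snapshot path shape described above: "snapshots/{id}.json".
pub fn snapshot_path(id: &str) -> String {
    format!("snapshots/{}.json", id)
}

/// Build the decision path shape described above: "decisions/{id}.json".
pub fn decision_path(id: &str) -> String {
    format!("decisions/{}.json", id)
}

/// Recover the ID from a path such as "snapshots/abc.json".
/// Returns None when the path does not have the form `{prefix}{id}.json`.
pub fn id_from_path<'a>(path: &'a str, prefix: &str) -> Option<&'a str> {
    let id = path.strip_prefix(prefix)?.strip_suffix(".json")?;
    if id.is_empty() || id.contains('/') {
        None
    } else {
        Some(id)
    }
}
```

Helpers like these can be handy in provider tests, for example to check that `list_objects("snapshots/")` only returns paths of the expected shape.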

Step-by-Step Guide

Step 1: Create a Module

Create a new directory under crates/core/src/storage/vcs/:

mkdir -p crates/core/src/storage/vcs/myprovider
touch crates/core/src/storage/vcs/myprovider/mod.rs

Step 2: Implement the VcsProvider Trait

Here's a template for a REST API-based provider:

// crates/core/src/storage/vcs/myprovider/mod.rs

use async_trait::async_trait;
use reqwest::Client;
use super::provider::{VcsProvider, VcsProviderConfig, ObjectMetadata};
use super::super::StorageError;

pub struct MyProvider {
    client: Client,
    endpoint: String,
    token: String,
    repository: String,
    branch: String,
}

impl MyProvider {
    pub fn new(config: VcsProviderConfig) -> Result<Self, StorageError> {
        // The endpoint is the only required field; everything else has a default.
        let endpoint = config.endpoint.ok_or_else(|| {
            StorageError::ConnectionError(
                "BRIEFCASE_MYPROVIDER_ENDPOINT is required".into(),
            )
        })?;
        let token = config.token.unwrap_or_default();
        let repository = config.repository.unwrap_or_else(|| "default".into());
        let branch = config.branch.unwrap_or_else(|| "main".into());

        Ok(Self {
            client: Client::new(),
            endpoint,
            token,
            repository,
            branch,
        })
    }
}

#[async_trait]
impl VcsProvider for MyProvider {
    async fn write_object(&self, path: &str, data: &[u8]) -> Result<(), StorageError> {
        let url = format!(
            "{}/repos/{}/branches/{}/objects/{}",
            self.endpoint, self.repository, self.branch, path
        );
        self.client
            .put(&url)
            .bearer_auth(&self.token)
            .body(data.to_vec())
            .send()
            .await
            .map_err(|e| StorageError::ConnectionError(e.to_string()))?
            // Treat 4xx/5xx responses as failures instead of reporting success.
            .error_for_status()
            .map_err(|e| StorageError::IoError(e.to_string()))?;
        Ok(())
    }

    async fn read_object(&self, path: &str) -> Result<Vec<u8>, StorageError> {
        let url = format!(
            "{}/repos/{}/branches/{}/objects/{}",
            self.endpoint, self.repository, self.branch, path
        );
        let resp = self.client
            .get(&url)
            .bearer_auth(&self.token)
            .send()
            .await
            .map_err(|e| StorageError::ConnectionError(e.to_string()))?;

        // The trait contract requires NotFound for missing objects.
        if resp.status() == reqwest::StatusCode::NOT_FOUND {
            return Err(StorageError::NotFound(format!("Object not found: {}", path)));
        }

        // Any other non-success status is an error, not object content.
        resp.error_for_status()
            .map_err(|e| StorageError::IoError(e.to_string()))?
            .bytes()
            .await
            .map(|b| b.to_vec())
            .map_err(|e| StorageError::IoError(e.to_string()))
    }

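    async fn list_objects(&self, prefix: &str) -> Result<Vec<String>, StorageError> {
        let url = format!(
            "{}/repos/{}/branches/{}/objects",
            self.endpoint, self.repository, self.branch
        );
        let resp = self.client
            .get(&url)
            // Let reqwest URL-encode the prefix instead of interpolating it raw.
            .query(&[("prefix", prefix)])
            .bearer_auth(&self.token)
            .send()
            .await
            .map_err(|e| StorageError::ConnectionError(e.to_string()))?
            .error_for_status()
            .map_err(|e| StorageError::IoError(e.to_string()))?;

        // The response body is expected to be a JSON array of object paths.
        let items: Vec<String> = resp.json()
            .await
            .map_err(|e| StorageError::IoError(e.to_string()))?;
        Ok(items)
    }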

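    async fn delete_object(&self, path: &str) -> Result<bool, StorageError> {
        let url = format!(
            "{}/repos/{}/branches/{}/objects/{}",
            self.endpoint, self.repository, self.branch, path
        );
        let resp = self.client
            .delete(&url)
            .bearer_auth(&self.token)
            .send()
            .await
            .map_err(|e| StorageError::ConnectionError(e.to_string()))?;

        // A missing object is not an error: report false per the trait contract.
        if resp.status() == reqwest::StatusCode::NOT_FOUND {
            return Ok(false);
        }

        // Other failures (auth, server errors) must not be mistaken for "not found".
        resp.error_for_status()
            .map_err(|e| StorageError::IoError(e.to_string()))?;
        Ok(true)
    }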

    async fn create_version(&self, message: &str) -> Result<String, StorageError> {
        let url = format!(
            "{}/repos/{}/branches/{}/commits",
            self.endpoint, self.repository, self.branch
        );
        let resp = self.client
            .post(&url)
            .bearer_auth(&self.token)
            .json(&serde_json::json!({ "message": message }))
            .send()
            .await
            .map_err(|e| StorageError::ConnectionError(e.to_string()))?
            // Surface a rejected commit as an error rather than parsing an error body.
            .error_for_status()
            .map_err(|e| StorageError::IoError(e.to_string()))?;

        let body: serde_json::Value = resp.json()
            .await
            .map_err(|e| StorageError::IoError(e.to_string()))?;

        // The commit ID is expected in the "id" field of the response.
        body["id"].as_str()
            .map(|s| s.to_string())
            .ok_or_else(|| StorageError::IoError("Missing commit ID in response".into()))
    }

    async fn health_check(&self) -> Result<bool, StorageError> {
        let url = format!("{}/healthcheck", self.endpoint);
        // An unreachable endpoint reports false rather than an error.
        match self.client.get(&url).send().await {
            Ok(resp) => Ok(resp.status().is_success()),
            Err(_) => Ok(false),
        }
    }

    fn provider_name(&self) -> &'static str {
        "myprovider"
    }

    fn config_summary(&self) -> String {
        // The token is deliberately left out of the summary.
        format!(
            "MyProvider: endpoint={}, repo={}, branch={}",
            self.endpoint, self.repository, self.branch
        )
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_config_parsing() {
        let config = VcsProviderConfig::new("myprovider")
            .with_endpoint("https://my.example.com")
            .with_token("test-token")
            .with_repository("test-repo");

        let provider = MyProvider::new(config).unwrap();
        assert_eq!(provider.provider_name(), "myprovider");
    }

    #[test]
    fn test_missing_endpoint_errors() {
        let config = VcsProviderConfig::new("myprovider");
        let result = MyProvider::new(config);
        assert!(result.is_err());
    }
}

For a filesystem-based provider, see dvc/mod.rs, artivc/mod.rs, or gitlfs/mod.rs as examples. These use tokio::fs for async file I/O and std::process::Command (via spawn_blocking) for CLI operations.
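To give a rough idea of the filesystem shape without reproducing those modules, here is a minimal sketch that uses only the standard library. It is synchronous (`std::fs` rather than `tokio::fs`), the `FsStore` type is hypothetical, and it returns `std::io::Result` instead of `StorageError`; a real provider would wrap equivalent logic in the async trait methods and map the errors.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::PathBuf;

/// Hypothetical filesystem-backed object store rooted at a working directory.
pub struct FsStore {
    root: PathBuf,
}

impl FsStore {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// Writes bytes to `root/path`, creating parent directories as needed.
    pub fn write_object(&self, path: &str, data: &[u8]) -> std::io::Result<()> {
        let full = self.root.join(path);
        if let Some(parent) = full.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(full, data)
    }

    pub fn read_object(&self, path: &str) -> std::io::Result<Vec<u8>> {
        fs::read(self.root.join(path))
    }

    /// Lists files directly under a directory-style prefix such as "snapshots/".
    /// Assumes the prefix ends with '/'; a missing directory yields an empty list.
    pub fn list_objects(&self, prefix: &str) -> std::io::Result<Vec<String>> {
        let mut out = Vec::new();
        let entries = match fs::read_dir(self.root.join(prefix)) {
            Ok(entries) => entries,
            Err(e) if e.kind() == ErrorKind::NotFound => return Ok(out),
            Err(e) => return Err(e),
        };
        for entry in entries {
            let entry = entry?;
            if entry.file_type()?.is_file() {
                let name = entry.file_name().to_string_lossy().into_owned();
                out.push(format!("{}{}", prefix, name));
            }
        }
        out.sort(); // read_dir order is unspecified, so sort for stable output
        Ok(out)
    }

    /// Returns Ok(true) if a file was removed and Ok(false) if it did not exist.
    pub fn delete_object(&self, path: &str) -> std::io::Result<bool> {
        match fs::remove_file(self.root.join(path)) {
            Ok(()) => Ok(true),
            Err(e) if e.kind() == ErrorKind::NotFound => Ok(false),
            Err(e) => Err(e),
        }
    }
}
```

Note how `delete_object` distinguishes "not found" from other I/O failures, mirroring the trait contract of returning `false` rather than an error for a missing object.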

Step 3: Add Feature Flag

Update crates/core/Cargo.toml:

[features]
vcs-myprovider = ["vcs-storage"]

The vcs-storage base feature brings in async and networking dependencies.

Step 4: Register in the Factory

Update crates/core/src/storage/vcs/mod.rs:

// Add module declaration (feature-gated)
#[cfg(feature = "vcs-myprovider")]
pub mod myprovider;

// Add to create_vcs_provider() match arms
#[cfg(feature = "vcs-myprovider")]
"myprovider" => Ok(Box::new(myprovider::MyProvider::new(config)?)),

Also add to available_providers():

#[cfg(feature = "vcs-myprovider")]
providers.push("myprovider");

Step 5: Add Python SDK Client (Optional)

Create briefcase/integrations/vcs/myprovider/client.py:

from typing import Optional

from briefcase.integrations.vcs.base import VcsClientBase


class MyProviderClient(VcsClientBase):
    """Client for MyProvider VCS integration."""

    def __init__(self, endpoint: str, token: Optional[str] = None,
                 repository: str = "default", branch: str = "main",
                 briefcase_client=None):
        super().__init__(
            provider_name="myprovider",
            endpoint=endpoint,
            repository=repository,
            branch=branch,
            briefcase_client=briefcase_client
        )
        self.token = token

    def _build_headers(self):
        # Only send an Authorization header when a token is configured.
        headers = {}
        if self.token:
            headers["Authorization"] = f"Bearer {self.token}"
        return headers

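Update briefcase/integrations/vcs/__init__.py to add the conditional import. Because the class lives in myprovider/client.py, import it from that module (or re-export it from myprovider/__init__.py and keep the shorter path):

try:
    from briefcase.integrations.vcs.myprovider.client import MyProviderClient
    _myprovider_client = MyProviderClient
except ImportError:
    pass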

Configuration Pattern

All providers use VcsProviderConfig for configuration. This struct supports:

  1. Programmatic builder: VcsProviderConfig::new("nessie").with_endpoint("...").with_branch("main")
  2. Environment variables: VcsProviderConfig::from_env("nessie") reads BRIEFCASE_NESSIE_ENDPOINT, BRIEFCASE_NESSIE_TOKEN, etc.
  3. Extra fields: Provider-specific options via with_extra("key", "value") or auto-captured from env vars

Standard env var suffixes: ENDPOINT, ACCESS_KEY, SECRET_KEY, TOKEN, REPOSITORY, BRANCH. Any additional BRIEFCASE_{PROVIDER}_* variables are captured in the extra HashMap.
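The environment-variable convention can be sketched as follows. This is not the actual `VcsProviderConfig::from_env` implementation: `SketchConfig` and `config_from_vars` are hypothetical names, the function takes name/value pairs (for example from `std::env::vars()`) so it can be exercised without touching the process environment, and keying extras by the lowercased suffix is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical mirror of the standard fields plus the extra map.
#[derive(Debug, Default, PartialEq)]
pub struct SketchConfig {
    pub endpoint: Option<String>,
    pub access_key: Option<String>,
    pub secret_key: Option<String>,
    pub token: Option<String>,
    pub repository: Option<String>,
    pub branch: Option<String>,
    pub extra: HashMap<String, String>,
}

/// Builds a config from (name, value) pairs such as `std::env::vars()`.
/// Only variables named BRIEFCASE_{PROVIDER}_* are considered.
pub fn config_from_vars<I>(provider: &str, vars: I) -> SketchConfig
where
    I: IntoIterator<Item = (String, String)>,
{
    let prefix = format!("BRIEFCASE_{}_", provider.to_uppercase());
    let mut cfg = SketchConfig::default();
    for (name, value) in vars {
        let suffix = match name.strip_prefix(prefix.as_str()) {
            Some(s) if !s.is_empty() => s,
            _ => continue, // belongs to another provider or is unrelated
        };
        match suffix {
            "ENDPOINT" => cfg.endpoint = Some(value),
            "ACCESS_KEY" => cfg.access_key = Some(value),
            "SECRET_KEY" => cfg.secret_key = Some(value),
            "TOKEN" => cfg.token = Some(value),
            "REPOSITORY" => cfg.repository = Some(value),
            "BRANCH" => cfg.branch = Some(value),
            // Assumption: extras are keyed by the lowercased suffix.
            other => {
                cfg.extra.insert(other.to_lowercase(), value);
            }
        }
    }
    cfg
}
```

Under these assumptions, a variable such as `BRIEFCASE_NESSIE_TIMEOUT=30` would end up in `extra` while `BRIEFCASE_NESSIE_ENDPOINT` fills the typed `endpoint` field.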

How the Adapter Works

You don't need to understand the adapter to write a provider, but here's the architecture:

[Diagram: adapter flow from application storage calls into provider write and version operations]

VcsStorageBackend translates typed storage operations into provider-level object and version operations.

Flow summary:

  1. StorageBackend::save_snapshot receives typed snapshot data.
  2. VcsStorageBackend serializes it to JSON bytes and writes via provider APIs.
  3. Pending writes are tracked and finalized during flush() through provider version creation.

The adapter handles: JSON serialization/deserialization, path management, query filtering (matches_query for time ranges, tags, modules), pagination (offset/limit), batch pending writes, and flush-to-commit.
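The pending-write and flush-to-commit behaviour in the flow summary might look roughly like the sketch below. It is synchronous and uses hypothetical names (`VersionedStore`, `BatchingAdapter`, `CountingStore`); whether the real adapter skips version creation when nothing is pending is an assumption made here for illustration.

```rust
/// Hypothetical, synchronous stand-in for the two provider calls used here.
pub trait VersionedStore {
    fn write_object(&mut self, path: &str, data: &[u8]) -> Result<(), String>;
    fn create_version(&mut self, message: &str) -> Result<String, String>;
}

/// Sketch of an adapter that tracks written paths until the next flush.
pub struct BatchingAdapter<S: VersionedStore> {
    store: S,
    pending: Vec<String>,
}

impl<S: VersionedStore> BatchingAdapter<S> {
    pub fn new(store: S) -> Self {
        Self { store, pending: Vec::new() }
    }

    /// Writes the serialized snapshot and remembers its path as pending.
    pub fn save_snapshot(&mut self, id: &str, json: &str) -> Result<(), String> {
        let path = format!("snapshots/{}.json", id);
        self.store.write_object(&path, json.as_bytes())?;
        self.pending.push(path);
        Ok(())
    }

    /// Creates one version covering all pending writes.
    /// Returns Ok(None) when there is nothing to commit (an assumption).
    pub fn flush(&mut self) -> Result<Option<String>, String> {
        if self.pending.is_empty() {
            return Ok(None);
        }
        let message = format!("Flush {} object(s)", self.pending.len());
        let version = self.store.create_version(&message)?;
        self.pending.clear();
        Ok(Some(version))
    }
}

/// Toy store that counts writes and hands out sequential version IDs.
#[derive(Default)]
pub struct CountingStore {
    pub writes: usize,
    pub versions: usize,
}

impl VersionedStore for CountingStore {
    fn write_object(&mut self, _path: &str, _data: &[u8]) -> Result<(), String> {
        self.writes += 1;
        Ok(())
    }

    fn create_version(&mut self, _message: &str) -> Result<String, String> {
        self.versions += 1;
        Ok(format!("v{}", self.versions))
    }
}
```

In this sketch two saves followed by one flush produce a single version, which is the batching effect the adapter provides on top of a provider's `create_version`.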

Testing Checklist

  • Config parsing: Test MyProvider::new() with a valid config and with required fields missing
  • write/read round-trip: Write bytes, read them back, verify equality
  • list_objects: Write multiple objects, list with prefix, verify all returned
  • delete_object: Delete existing object (returns true), delete missing (returns false)
  • create_version: Verify a version ID string is returned
  • health_check: Test reachable endpoint (true) and unreachable (false, no error)
  • Feature flag isolation: cargo check --features vcs-myprovider compiles; cargo check (no features) still compiles
  • Error mapping: Test that provider errors map to correct StorageError variants

For HTTP-based providers, use wiremock::MockServer for realistic request/response testing. For filesystem-based providers, use tempfile::tempdir() for isolated I/O tests.
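For the error-mapping item, one way to make the mapping testable is to isolate it in a pure function. The sketch below uses a local `SketchError` enum that mirrors the three `StorageError` variants appearing in this guide; the specific status-to-variant choices (for example 401/403 to `ConnectionError`) are assumptions, not documented Briefcase AI behaviour.

```rust
/// Local mirror of the StorageError variants used in this guide.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    NotFound(String),
    ConnectionError(String),
    IoError(String),
}

/// Assumed mapping from an HTTP status code to an error variant.
pub fn check_status(status: u16, path: &str) -> Result<(), SketchError> {
    match status {
        200..=299 => Ok(()),
        404 => Err(SketchError::NotFound(format!("Object not found: {}", path))),
        401 | 403 => Err(SketchError::ConnectionError(format!(
            "authentication failed (HTTP {})",
            status
        ))),
        other => Err(SketchError::IoError(format!(
            "unexpected HTTP status {}",
            other
        ))),
    }
}
```

Keeping the mapping in one function means each branch can be covered by a plain unit test, with the mock-server tests reserved for request and response wiring.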
