How It Works

Architecture

HUSK is built on three components:

HUSK server (Bun + Hono) — HTTP API, MCP endpoint, admin UI, SQLite for structured data
Qdrant — vector database for semantic similarity search
Ollama — runs embedding models locally for converting text into vectors

When a memory is stored, HUSK embeds the text using Ollama, stores the metadata in SQLite, and upserts the vector into Qdrant. On search, HUSK embeds the query the same way, and Qdrant returns the most similar stored vectors.
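
The store and search paths can be illustrated with a small self-contained sketch. It is not HUSK's implementation: `toy_embed` is a hypothetical bag-of-words stand-in for the Ollama embedding call, a plain dictionary stands in for the Qdrant collection, and the `memories` table layout is assumed. Only the order of operations mirrors the description above.

```python
import math
import sqlite3
from collections import Counter


def toy_embed(text: str) -> dict[str, float]:
    """Hypothetical stand-in for the embedding model: a normalised word-count vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


class ToyMemoryStore:
    """Toy model of the pipeline: SQLite holds the text, a dict holds the vectors."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )
        # Stand-in for the vector database: memory id -> vector.
        self.vectors: dict[int, dict[str, float]] = {}

    def store(self, text: str) -> int:
        # 1. Persist the structured record, 2. embed, 3. upsert the vector by id.
        cur = self.db.execute("INSERT INTO memories (text) VALUES (?)", (text,))
        self.db.commit()
        memory_id = cur.lastrowid
        self.vectors[memory_id] = toy_embed(text)
        return memory_id

    def search(self, query: str, limit: int = 3) -> list[tuple[int, str, float]]:
        # Embed the query the same way, rank stored vectors by cosine similarity,
        # then map the best ids back to their text in SQLite.
        q = toy_embed(query)
        scored = sorted(
            (
                (sum(weight * vec.get(word, 0.0) for word, weight in q.items()), mid)
                for mid, vec in self.vectors.items()
            ),
            reverse=True,
        )
        results = []
        for score, mid in scored[:limit]:
            (text,) = self.db.execute(
                "SELECT text FROM memories WHERE id = ?", (mid,)
            ).fetchone()
            results.append((mid, text, score))
        return results
```

Because both vectors are normalised, the dot product in `search` is the cosine similarity. A real embedding model captures meaning rather than exact word overlap, so this toy only shows the shape of the flow, not the quality of the matches.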

Client-agnostic design

The server makes no assumptions about which client is sending data. The /ingest endpoint is a universal write API: any tool that can make an HTTP call can store memories. The MCP endpoint wraps the same logic for MCP-compatible clients.

This means HUSK works with:

Claude Code (via MCP or the plugin)
Any future MCP client
Custom scripts hitting the REST API directly

The plugin decides how to capture and retrieve context. The server just stores and searches.
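
As a rough illustration of a custom script using the write API, the sketch below builds a POST request to /ingest with Python's standard library. Only the /ingest path comes from this page. The base URL, the port, and the `text` and `source` field names are assumptions made for the example, so check the API reference for the real payload shape.

```python
import json
import urllib.request


def build_ingest_request(base_url: str, text: str, source: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to the /ingest endpoint."""
    # Hypothetical payload: the field names are illustrative, not HUSK's schema.
    payload = {"text": text, "source": source}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The base URL is an assumed local address; substitute your server's address.
req = build_ingest_request(
    "http://localhost:3000", "Prefer SQLite for local state", "my-script"
)

# Sending is one more call once a server is reachable:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Nothing here is specific to one client, which is the point of the design: any language with an HTTP library can do the same.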

Data flow

Store: Client sends text → HUSK embeds it via Ollama → stores in SQLite + Qdrant
Search: Client sends query → HUSK embeds query → Qdrant finds similar vectors → HUSK returns matched memories
Session tracking: The Claude Code plugin sends observations (prompts, tool usage, files modified) → HUSK records them → later sessions can review past work via session_context and get_session_detail
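
The session-tracking flow can be pictured as an append-only log of observations keyed by session. The sketch below is a minimal model under assumed names: the `observations` table, its columns, and the helper functions are hypothetical, and are not the schema behind session_context or get_session_detail.

```python
import sqlite3


def open_session_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical observations table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE observations ("
        "id INTEGER PRIMARY KEY, session_id TEXT NOT NULL, "
        "kind TEXT NOT NULL, detail TEXT NOT NULL)"
    )
    return db


def record_observation(db: sqlite3.Connection, session_id: str, kind: str, detail: str) -> None:
    """Append one observation (for example a prompt, a tool use, or a modified file)."""
    db.execute(
        "INSERT INTO observations (session_id, kind, detail) VALUES (?, ?, ?)",
        (session_id, kind, detail),
    )
    db.commit()


def session_detail(db: sqlite3.Connection, session_id: str) -> list[tuple[str, str]]:
    """Return one session's observations in the order they were recorded."""
    rows = db.execute(
        "SELECT kind, detail FROM observations WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return rows.fetchall()


def session_counts(db: sqlite3.Connection) -> dict[str, int]:
    """Return the number of observations per session, as a coarse overview."""
    rows = db.execute(
        "SELECT session_id, COUNT(*) FROM observations GROUP BY session_id"
    )
    return dict(rows.fetchall())
```

A later session could call something like `session_counts` to see which past sessions exist, then `session_detail` to replay one of them.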
