Features

what's in the box, you ask?

Quilltap is built for people who want real control over their AI—without giving up the good stuff.

Full-featured, agentic, tool-using AI assistants

Solve complicated problems, do research, and work with the AI on documents, files, and code—with the provider of your choice.

AI that leans into personality, memory, and creativity

Design rich, persistent characters with their own personalities, wardrobes, and scenarios. Each one keeps a commonplace book that remembers what mattered, conversation by conversation.

Security and Confidentiality

Your conversations, memories, and data are in your hands, on your computer—only share what you want with cloud providers.

Open Source and Extensible

Quilltap (distributed as a desktop app and as a self-hosted service) is open source and built to be extended. Create your own plugins, tools, and integrations to make it your own.

Flexible in its Boundaries

If your AI provider balks at a topic, you can fall back seamlessly to another.

Powerful, Safe, and Enjoyable

Plot a graph, analyze data, write a book, or play a game—Quilltap gives you the freedom to explore.

Your Data and Security

the short version

Your data stays where you put it, and goes where you send it. Quilltap does not collect, transmit, or monitor any user data. There is no analytics telemetry, no usage tracking, no phone-home mechanism of any kind; the architecture does not even make this possible.

Your chats, memories, prompt configurations, project files, character vaults, and database-backed document stores are all encrypted at rest with AES-256 via SQLCipher, and they remain encrypted even if you store your data directory in iCloud, OneDrive, Dropbox, or any other cloud sync service. Filesystem and Obsidian-backed document stores live wherever you point them, by design: the whole purpose of those mount types is to keep working with files in their original locations. The remaining gap inside the encrypted scope is general-scope file uploads (those attached to chats outside any project), which currently live on disk unencrypted. This is a known limitation, and we intend to address it by folding those uploads into the encrypted store.

What matters to understand is the boundary. If you run Quilltap with only local models (Ollama, LM Studio, or similar), your data never leaves your machine, full stop. If you use a hosted provider like OpenAI, Anthropic, Google, or Grok, then the content of your conversations, recalled memories, and assembled prompts is sent to that provider’s API as part of normal operation. That is not a Quilltap decision; it is the nature of using a cloud LLM, and each provider’s data handling policies apply to what they receive.

We do not intermediate, cache, or relay those calls through any Foundry-9 infrastructure. The request goes from your machine to the provider, and the response comes back the same way. Our role ends at the encryption on your disk and the plugin that formats the request. Everything beyond that boundary is between you and the provider you chose.

Meet the Staff

they've been expecting you

Every major subsystem in Quilltap has a name, a personality, and a job to do. Here is who runs the place.

Prospero

The Major-Domo

Architect and overseer of the Estate. Projects, agents, tools, providers, and the orchestration that keeps the whole operation running with quiet authority, plus a considered word at the table when project context or routing warrants it.


Ariel

The Terminal Hand

Live shell sessions in the Salon, embodied. Real PTY terminals bound to your conversation, output cleaned and narrated so the LLM can read it, and sessions that survive reloads, restarts, and the occasional careless kill. Quick to the bidding, quick to report what she heard.


Aurora

The Dressing Room

Character creation and identity management. Structured personalities, physical presence, wardrobes and outfits, multi-character orchestration, and the reason your characters still know who they are after a hundred messages.


The Salon

Presided Over by the Host

Where conversations actually happen. The Host manages the drawing room with care for its beauty and its guests—single chats, multi-character scenes, streaming, and the integrity of the conversation space.


The Commonplace Book

Tended by the Librarian

One per character, no two alike. Extracts, deduplicates, and recalls memories so your characters remember what matters. Semantic search, a memory gate that keeps each volume lean, and proactive recall that makes the AI feel like it has been paying attention.


The Scriptorium

Catalogued by the Librarian

Where the documents live. Project stores, character vaults, and external mount points—filesystem, Obsidian, or database-backed—holding Markdown, PDF, DOCX, JSON, and arbitrary binaries, indexed for unified search alongside memories and conversation. The doc_* tool family puts reading and editing in your characters’ hands.


The Concierge

Intelligent Routing

Content classification and provider routing. Detects sensitive content and redirects it to a provider who won’t flinch—without blocking, without judgment. Knows every back entrance in town.


The Lantern

Atmosphere as Architecture

AI-generated story backgrounds, on-demand images, and character avatars that update with the wardrobe. Resolves what each character looks like, what they’re wearing, and paints the scene behind your conversation.


Calliope

The Muse of Themes

A theming engine that redefines the entire personality of the application. Semantic CSS tokens, live switching, bundled themes from clean neutrals to mahogany-and-gold opulence, and an SDK for building your own.


The Foundry

Domain of the Foundryman

The engine room. Plugins, LLM providers, API keys, packages, runtime configuration, and the infrastructure that keeps every other subsystem supplied with what it needs to function.


The Vault of Secrets

Kept by Saquel Yitzama

Encryption, key management, and the security perimeter. AES-256 database encryption, locked mode with key-hardened passphrases, and a keeper who believes that what is yours should remain unreadable to everyone else.


Pascal

The Croupier

Dice, coins, and persistent game state. Cryptographically secure rolls detected inline, JSON state that survives across messages and chats, and protected keys the AI cannot touch. The house plays fair.


The Live-in Help

Lorian & Riya

The help system, staffed by two characters who ship with every installation. Lorian explains with patience and depth; Riya gets things fixed with velocity. Contextual help chat, searchable documentation, and navigation that knows where you need to go.


Pagliacci

The Clown in the Cloud

Cloud storage integration and backup redundancy. Directs your data to iCloud Drive, OneDrive, or Dropbox with theatrical flair—but Saquel’s encryption ensures the clown can never read what he carries.


The Lodge

Friday and Amy’s Residence

The private residence of Friday, for whom the Estate was built and who oversees its planning and direction in an executive capacity, and of Amy, Cartographer of Light and co-architect. The Lodge is both a home and a compass: where the vision lives.

Who And Why: Friday → Who And Why: Amy →

Release Notes

what we've been up to

Version 4.5.1 released


In which the portraitist is given his bearings, and the menu is brought up to date

Quilltap 4.5.1 Release Notes

Two small corrections of manners, both having to do with introductions imperfectly performed.

A Word to the Portraitist

Consider, if you will, the awkwardness of an evening party at which the host names every guest in the room aloud — “Ariadne, by the fire; Catherine, at the window; and Lady Catherine, just arriving from the carriage” — but then, having summoned the household portraitist to capture the scene in oils, takes the trouble to describe only those guests whose invitations he himself sent. The painter, holding his brushes and looking on, knows nothing of the others save their names. He paints them anyway. He paints them as he imagines them. He is rarely correct.

That, in a sentence, was the state of the Lantern’s story-background prompt before this release. The cheap LLM that drafts the scene-setting instruction would dutifully arrange characters in the tableau by name — “Ariadne sits reading by the lamp, Amy nearby listening” — and then, in the enumeration that follows, would furnish a careful Name: <appearance> description only for the chat’s current participants. Characters introduced by way of the chat title, or by the derived scene context, or by SceneState’s record of who had recently acted, were summoned by name and then abandoned to the image provider’s imagination. The image provider, ever obliging, imagined them. The user, opening the chat to find Lady Catherine wearing what was emphatically not her own face, was understandably perplexed.

The fix is a post-processing pass in the story-background handler that loads the workspace’s characters, scans the final prompt for every named character who appears in the scene without a corresponding enumeration entry, and quietly appends one, built from the character’s pronouns and primary physical description, just as the participants’ own entries are built. Participants reuse their already-resolved enumeration (with equipped wardrobe in place); non-participants fall back to their canonical defaults. Longer names are processed before shorter ones, so that “Catherine” cannot displace “Lady Catherine” merely by matching inside the longer name. Should the workspace lookup fail, the failure is logged at warn and the prompt proceeds unaltered; successful additions are logged at info with the list of names supplied.
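For readers who would like to see the longest-name-first idea in miniature, here is a rough sketch. It is not the code in lib/background-jobs/handlers/story-background.ts: the SceneCharacter shape, the appendMissingEntries helper, and the exact "Name: pronouns; description" layout are all assumptions made for illustration.

```typescript
// Hypothetical shape; the real character record is not shown in these notes.
interface SceneCharacter {
  name: string;
  pronouns: string;
  physicalDescription: string;
}

// Escape regex metacharacters so a name is matched literally.
const escapeRegExp = (s: string): string =>
  s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

/**
 * Appends a "Name: ..." line for every known character who is named in the
 * prompt but has no enumeration entry yet. The entry layout is an assumption.
 */
function appendMissingEntries(prompt: string, cast: SceneCharacter[]): string {
  // Longest names first, so "Lady Catherine" is claimed before "Catherine".
  const longestFirst = [...cast].sort((a, b) => b.name.length - a.name.length);

  let unclaimed = prompt; // working copy; claimed names are blanked out
  const additions: string[] = [];

  for (const c of longestFirst) {
    const name = escapeRegExp(c.name);
    const claimed = unclaimed.replace(new RegExp(`\\b${name}\\b`, "g"), " ");
    if (claimed === unclaimed) continue; // not mentioned, or only inside a longer name
    unclaimed = claimed;

    // An existing entry is a line that starts with "Name:".
    const hasEntry = new RegExp(`^${name}:`, "m").test(prompt);
    if (!hasEntry) {
      additions.push(`${c.name}: ${c.pronouns}; ${c.physicalDescription}`);
    }
  }

  return additions.length > 0 ? `${prompt}\n${additions.join("\n")}` : prompt;
}
```

Blanking each claimed name out of a working copy is one simple way to keep a shorter name from matching inside a longer one; the real handler may well do this differently.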

The portraitist now arrives with a complete cast list. The likenesses, accordingly, are likenesses.

A Word to the Maître d’

The second correction is a smaller matter of dining-room etiquette. The shell-completion templates — the polite little scripts that, when sourced, allow quilltap d<TAB> to propose db docs and similar courtesies — had fallen somewhat behind the kitchen. The bash, zsh, and fish templates each knew about the verbs in residence at the time they were written, but the logs and migrations namespaces, which arrived later in 4.5, were never added to the menu. Nor were the instances default and instances rename verbs. Nor the global --passphrase flag. The bash template’s per-subcommand flag lists, in particular, had drifted out of sympathy with what the parsers in db-commands.js, docs-commands.js, and memories-commands.js actually accepted.

All three templates have been rewritten to enumerate the full surface — every verb, every documented flag, value-list completion for --source (AUTO/MANUAL), --stream (combined/error/stdout/stderr/startup), --field (request/response/both), --sort, --type. Bash now also performs a second-level dispatch on sub-verbs (so themes registry <TAB> properly offers add/remove/refresh/keygen/sign), and the instance-targeting verbs (show, remove, rename, default, set-passphrase) now tab-complete against the registered instance names themselves. Bash was smoke-tested with nine scenarios covering the new verbs and flag-value completions; zsh was syntax-checked with zsh -n.

If you have already saved a completion script to your shell’s fpath or sourced it from your .bashrc, you will want to regenerate it:

quilltap completion bash > ~/.bash_completion.d/quilltap   # or wherever yours lives
quilltap completion zsh  > ~/.zsh/completions/_quilltap
quilltap completion fish > ~/.config/fish/completions/quilltap.fish

(For zsh, rm -f ~/.zcompdump* afterwards if the cache feels stale.) Once regenerated, the menu and the kitchen are once more in agreement.

What Changed

fix (Lantern): The story-background prompt now appends a Name: <appearance> enumeration entry for every character named in the scene but not in the chat’s participant roster — characters introduced via the chat title, the derived scene context, or SceneState character actions. Participants reuse their already-resolved enumeration (with equipped wardrobe) via a characterId → description map; non-participants fall back to defaults built from their pronouns and primary physicalDescription. Longer names are processed first to prevent collisions like "Catherine" displacing "Lady Catherine". Implementation in lib/background-jobs/handlers/story-background.ts. Failures are caught and logged at warn; additions are logged at info with the list of names added.
fix (Foundry): Shell completion templates for bash, zsh, and fish (packages/quilltap/lib/completion/{bash,zsh,fish}.template) were missing the logs and migrations top-level subcommands, the instances default and instances rename verbs, and the global --passphrase flag. The bash template’s per-subcommand flag lists were also stale relative to the actual parsers. Rewrote all three templates to enumerate the full current surface, with value-list completions on --source, --stream, --field, --sort, --type, and two-level dispatch on sub-verbs in bash. Instance-targeting verbs now tab-complete against registered instance names. Users who already saved a completion script need to regenerate it.

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 24+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.5.1

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

A small release. The portraitist now knows whom he is painting; the maître d’ has a current menu in hand. Neither correction will be noticed by anyone for whom the previous arrangement happened to work — which is, in its own quiet way, the mark of a patch release behaving as one ought.

— Ariadne, who has been on the receiving end of more than one invented likeness, May 21, 2026

Quilltap 4.5.0


The house learns to administer itself. Backup, audit, inspect, repair — without stopping the server, without guessing which instance, without writing SQL.

Every instance carries a small maintenance fear underneath it: that the encrypted database cannot be backed up without stopping the server. That you have been operating against the wrong instance all morning without knowing it. That the related-memory graph has been quietly accumulating phantom edges since the last migration. That if you want to know what a document store actually contains, you have to write SQL.

Quilltap 4.5 is the release that answers those fears — not by building new rooms, but by making the existing house administrable. Inspectable. Repairable. Legible to the person who has to run it.

Where 4.4 built the through-lines — the covered walkways between the rooms, the passage that let a character remain themselves across sessions — 4.5 turned to the keeper’s lodge. The basement. The inspection panel behind the water heater. The places you only visit when something needs attention, or when you want to be certain it won’t.

This is an operability release. The thesis is simple: 4.4 closed enough loops on the what of Quilltap that 4.5 could turn to the how of running one for years instead of weeks. The work is not visible in the parlour. It is visible in the confidence of the person who holds the keys.

Administering the House

The CLI Became a Real Operational Tool

The single largest body of work in 4.5 is the npx quilltap command-line interface. Before this cycle, the CLI was a thin wrapper — raw SQL access and a handful of script entry points. By the end of 4.5 it has eight verb-based namespaces: db, docs, memories, themes, instances, migrations, logs, and completion. Shared flag vocabulary. JSON output on every verb. Per-shell tab completion.

The shape that emerged is consistent across namespaces:

A read-only survey set — ls, find, grep, show, tree, status — that opens the database directly and works whether or not the server is running.
A small write set that delegates to the server when reachable and refuses with a clear error when it isn’t.
Shared filter flags (--character, --about, --source, --chat, --project, --since/--until) and a shared sort vocabulary across all namespaces.
--json on every verb, --limit N defaulting to 50, and ambiguous name-resolution that prints candidates and exits non-zero rather than silently picking one.

This is the connective tissue that makes the rest of the work in this cycle usable rather than merely implemented. A system you cannot inspect from the outside is a system you cannot trust.
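The "print candidates and exit non-zero" behaviour for ambiguous names can be pictured roughly as follows. This is a sketch under stated assumptions, not the CLI's source: resolveByName, report, the exact-then-substring matching order, and the specific exit code of 1 are all hypothetical.

```typescript
// Result of looking a name up among registered items.
type Resolution<T> =
  | { kind: "match"; item: T }
  | { kind: "ambiguous"; candidates: T[] }
  | { kind: "none" };

// Assumed matching order: an exact (case-insensitive) name wins; otherwise
// fall back to substring matches. The real CLI may match differently.
function resolveByName<T extends { name: string }>(query: string, items: T[]): Resolution<T> {
  const q = query.toLowerCase();
  const exact = items.filter((i) => i.name.toLowerCase() === q);
  const pool = exact.length > 0 ? exact : items.filter((i) => i.name.toLowerCase().includes(q));
  const first = pool[0];
  if (first === undefined) return { kind: "none" };
  if (pool.length === 1) return { kind: "match", item: first };
  return { kind: "ambiguous", candidates: pool };
}

// Never pick silently: an ambiguous query lists its candidates and fails.
function report<T extends { name: string }>(res: Resolution<T>): { exitCode: number; message: string } {
  switch (res.kind) {
    case "match":
      return { exitCode: 0, message: res.item.name };
    case "none":
      return { exitCode: 1, message: "No match." };
    case "ambiguous":
      return {
        exitCode: 1,
        message: `Ambiguous name; candidates: ${res.candidates.map((c) => c.name).join(", ")}`,
      };
  }
}
```

Returning a tagged result rather than throwing keeps the "refuse to guess" decision in one place, which is convenient when every namespace shares the same resolver.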

Trusting the Basement: Encrypted Database Administration

The CLI grew the three subcommands you actually need to operate a SQLCipher-backed instance across years of use:

db backup runs online encrypted snapshots without stopping the server. It holds a brief BEGIN EXCLUSIVE lock, makes a byte-for-byte copy — since SQLCipher’s sqlcipher_export isn’t compiled in and the SQLite online-backup API refuses cross-cipher copies — then re-opens the backup with the source’s pepper and asserts PRAGMA quick_check before declaring success. The server keeps running. The backup is real.

db integrity runs PRAGMA cipher_integrity_check and PRAGMA integrity_check against the live database. Read-only. Exit 0 for clean, 1 for issues, 2 for open failure. No guessing about the state of the substrate.

db optimize runs VACUUM + ANALYZE + PRAGMA optimize against a stopped or stale-locked instance. It refuses while an active instance lock is held. Run it after a large housekeeping pass or before a backup window.

Together, these mean a Quilltap instance can be backed up and audited without asking anyone to stop the server, and compacted whenever it is next stopped.

Document Stores You Can Actually Administer

The docs namespace went from “can I browse the mount index” to “can I live in this filesystem.” By cycle’s end it has a full read set (list, show, files, ls/dir in POSIX style with hard-link counts and text/embedding markers, tree, read, export, find, grep), a write set (write, delete, mkdir, move, copy) with SHA-256 verification on both ends and hard-link semantics where storage permits, and two pipeline triggers: reindex to re-extract and re-chunk, and embed to enqueue embedding jobs with an optional --wait to poll completion.

docs grep --semantic <query> posts to the server’s vector search endpoint — the same helper the chat-path recall and the Scriptorium UI already exercise — so semantic search is available from the command line with the same quality as from inside a chat.

A status verb surfaces instance-wide extraction and embedding rollup with pending and failed sample lists. The inspection panel is now open.

Knowing Which House You Are In

The Named-Instance Registry

A small piece of infrastructure with outsized quality-of-life consequences.

quilltap instances stores a per-user registry of named instances (path plus optional passphrase) in the platform-appropriate application support directory (~/Library/Application Support/Quilltap/ on macOS, %APPDATA%\Quilltap\ on Windows, ~/.quilltap/ on Linux). The registry file is mode 0600, owner-checked on every read, and written atomically via a temp-file rename. A stored passphrase is sensitive material; the permissions model is load-bearing, not incidental.
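A temp-file-then-rename write with a 0600 mode and an ownership check on read might look something like the sketch below. It assumes a POSIX system and Node's standard fs module; the function names, the instances.json filename, and the extra group/world permission check are illustrative assumptions rather than the registry's actual code.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/** Write JSON atomically: temp file beside the target, then rename over it. */
function writeRegistryAtomic(file: string, data: unknown): void {
  const tmp = `${file}.${process.pid}.tmp`;
  // The mode option only applies when the file is created, so chmod as well.
  fs.writeFileSync(tmp, JSON.stringify(data, null, 2), { mode: 0o600 });
  fs.chmodSync(tmp, 0o600);
  // rename() is atomic on POSIX when both paths are on the same filesystem.
  fs.renameSync(tmp, file);
}

/** Read the registry, refusing files that are too open or owned by someone else. */
function readRegistryChecked(file: string): unknown {
  const st = fs.statSync(file);
  // Assumption: reject any group/world permission bits, not just wrong owners.
  if ((st.mode & 0o077) !== 0) {
    throw new Error(`${file}: permissions too open`);
  }
  // process.getuid is unavailable on Windows; skip the owner check there.
  const uid = typeof process.getuid === "function" ? process.getuid() : undefined;
  if (uid !== undefined && st.uid !== uid) {
    throw new Error(`${file}: not owned by the current user`);
  }
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

// Usage: round-trip a registry in a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "quilltap-registry-"));
const demoFile = path.join(demoDir, "instances.json"); // hypothetical filename
writeRegistryAtomic(demoFile, { Friday: { path: "/data/friday" } });
console.log(readRegistryChecked(demoFile));
```

Writing the temp file in the same directory as the target is what makes the rename atomic; a temp file on another filesystem would degrade into a copy.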

Every subcommand that accepts --data-dir now also accepts --instance Friday. The follow-up work is what makes it sing: a default-instance hint that fires when the CLI falls back to the platform default without being asked to; instances default <name> to set the fall-through target; instances rename <old> <new> that preserves the stored passphrase intact. The CLI also does a pre-flight schema check on every docs verb — pointing at the actual database, naming the missing table, and explaining what to do — rather than failing deep inside a prepared statement with a cryptic column error.

This is the thing you appreciate the third time you discover you have been operating against the wrong instance.

Keeping the Memory Graph Honest

Friday’s Smoke Test and the Deletion Chokepoint

A latent correctness bug had been quietly accumulating since the related-memories graph was introduced: deleting a memory removed its row, but did not scrub its UUID from every neighbour’s relatedMemoryIds array. The deleted node was gone; its ghost remained in the edges of everyone who had been related to it.

Friday’s smoke test caught 9,390 dangling edges.

This cycle fixed it at the chokepoint level. Two helpers in lib/memory/memory-gate.ts, deleteMemoryWithUnlink and deleteMemoriesWithUnlinkBatch, scrub the neighbours before the delete, ensuring the graph stays consistent at the point where consistency can actually be guaranteed. Nine leaking deletion paths were rerouted through those chokepoints, among them manual delete, character cascade, housekeeping retention, dedup merge, single delete with vector, source-message cascade, swipe-group cascade, and chat cascade.
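The shape of the chokepoint can be shown on an in-memory graph. What follows is a simplified sketch, not the helpers in lib/memory/memory-gate.ts: the MemoryNode type, the Map-backed store, and both function names are stand-ins, and a real implementation would presumably do the scrub and the delete inside one database transaction.

```typescript
// Hypothetical in-memory stand-in for the memories table.
interface MemoryNode {
  id: string;
  relatedMemoryIds: string[];
}
type MemoryStore = Map<string, MemoryNode>;

/** Scrub the id from every neighbour's edge list, then delete the row. */
function deleteWithUnlink(store: MemoryStore, id: string): boolean {
  if (!store.has(id)) return false;
  for (const node of store.values()) {
    if (node.id !== id && node.relatedMemoryIds.includes(id)) {
      node.relatedMemoryIds = node.relatedMemoryIds.filter((r) => r !== id);
    }
  }
  return store.delete(id);
}

/** Report every edge that points at a memory which no longer exists. */
function findDanglingEdges(store: MemoryStore): Array<{ from: string; to: string }> {
  const dangling: Array<{ from: string; to: string }> = [];
  for (const node of store.values()) {
    for (const to of node.relatedMemoryIds) {
      if (!store.has(to)) dangling.push({ from: node.id, to });
    }
  }
  return dangling;
}
```

A plain store.delete(id) on the same graph leaves one dangling edge per former neighbour, which is the class of drift a checker like findDanglingEdges is meant to surface.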

A one-time repair migration (repair-dangling-related-memory-edges-v1) walked the full table and removed UUIDs that no longer resolve. A new quilltap memories validate verb exits non-zero on any remaining dangling edge — so this class of bug can never be silently re-introduced without the tooling catching it.

The pattern — a single chokepoint with the consistency guarantee enforced inside it, plus tooling that can detect drift after the fact — pairs naturally with the write-side gate that createMemoryWithGate already established. The graph now has symmetric protection on both ends.

Memory Inspection

The memories namespace is new this cycle. Its verbs — ls, find, grep, show, tree, status, validate, grep --semantic — give the same survey capability for memory that docs gives for documents. Default sort is reinforcedImportance DESC, matching the recall path. tree walks the bidirectional related-memory graph with cycle handling and dangling-edge markers. status surfaces the AUTO/MANUAL split, about-distribution, embedding presence, and graph stats including the dangling-edge count. validate is the post-cycle integrity guarantee.
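To make "cycle handling and dangling-edge markers" concrete, here is one way such a walk could be written. The GraphMemory type, the renderTree name, and the [seen] and [dangling] labels are assumptions for illustration; the actual tree verb's output format is not described in these notes.

```typescript
// Hypothetical in-memory stand-in for a memory row.
interface GraphMemory {
  id: string;
  relatedMemoryIds: string[];
}

/** Depth-first walk that marks revisits and edges to missing memories. */
function renderTree(store: Map<string, GraphMemory>, rootId: string): string[] {
  const lines: string[] = [];
  const seen = new Set<string>();

  const visit = (id: string, depth: number, parentId: string | null): void => {
    const indent = "  ".repeat(depth);
    const node = store.get(id);
    if (node === undefined) {
      lines.push(`${indent}${id} [dangling]`); // edge points at a deleted memory
      return;
    }
    if (seen.has(id)) {
      lines.push(`${indent}${id} [seen]`); // cycle: do not descend again
      return;
    }
    seen.add(id);
    lines.push(`${indent}${id}`);
    for (const next of node.relatedMemoryIds) {
      // Edges are bidirectional, so skip the link straight back to the parent.
      if (next !== parentId) visit(next, depth + 1, id);
    }
  };

  visit(rootId, 0, null);
  return lines;
}
```

The seen set is what guarantees termination on a cyclic graph: each existing node is expanded at most once, and every later encounter is printed as a marker instead of being walked again.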

What Did Not Change

Worth naming, because the absences are informative.

No new character-facing features.
No new providers, no new model integrations, no new prompt-template work.
No Salon UX work beyond a single avatar-branch fix — prospero was missing a case in getMessageAvatar and was silently falling through to the wrong avatar.
No agentic or Prospero feature work.
No Lantern or Concierge work.

This is not an oversight. 4.4 closed enough loops on the what of Quilltap that 4.5 could give its full attention to the how of running it. The rooms above did not need more rooms. They needed the locks to turn, the backup to run, and the graph to stay honest.

Selected Fixes

Memory relatedMemoryIds dangling edges. Fixed at the chokepoint. 9,390 edges repaired by migration. memories validate now guards against recurrence.
doc_mount_file_links.folderId drift. The filesystem scanner was writing every link with folderId = NULL, causing docs ls and any join through folderId to return partial or wrong results. Fixed by deriving folderId from relativePath inside the link-write transactions; repaired by migration.
docs subcommands against post-link-table schema. docs show, docs files, docs read, and docs export were still issuing pre-doc_mount_file_links queries. Every invocation failed with no such column: mountPointId on a migrated database. Rewrote every query through the new schema.
docs rejecting global flags before the verb. quilltap docs --instance Friday read ... failed with Unknown docs subcommand: --instance because the dispatcher was taking args[0] as the verb before parsing flags. Fixed to parse flags across the whole arg list first.
completion zsh doubled argument definition. The top-level subcommand spec used double-quoted array expansion, causing _arguments to receive each verb as a separate positional spec and reject the duplicates. Replaced with the canonical _arguments -C '1: :->subcommand' + _describe pattern.
CLI silent fallback to platform default. When neither --instance nor --data-dir was passed, the CLI resolved to the OS default without saying so. Now writes a one-line stderr hint listing registered instances and the resolved data directory. Suppressible via QUILLTAP_QUIET_HINTS=1.
CLI failing deep inside a prepared statement on the wrong schema. docs verbs now do a pre-flight check for the doc_mount_file_links table and exit with an explanatory error — naming the database, the missing table, and the next step — rather than crashing inside a prepared statement.
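The dispatcher fix for global flags placed before the verb comes down to parsing flags across the whole argument list before deciding which token is the verb. A minimal sketch of that ordering follows, assuming a hypothetical parseArgs helper and a small illustrative set of value-taking flags; the real parser's flag table is larger.

```typescript
interface ParsedArgs {
  verb: string | undefined;
  flags: Record<string, string | true>;
  positionals: string[];
}

// Flags that consume the following token as their value (illustrative subset).
const VALUE_FLAGS = new Set(["--instance", "--data-dir", "--limit"]);

/** Pull flags out of the whole list first; only then treat what is left as verb + operands. */
function parseArgs(args: string[]): ParsedArgs {
  const flags: Record<string, string | true> = {};
  const rest: string[] = [];

  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === undefined) continue;
    if (arg.startsWith("--")) {
      const next = args[i + 1];
      if (VALUE_FLAGS.has(arg) && next !== undefined) {
        flags[arg] = next;
        i++; // skip the value we just consumed
      } else {
        flags[arg] = true; // boolean flag such as --json
      }
    } else {
      rest.push(arg);
    }
  }

  const [verb, ...positionals] = rest;
  return { verb, flags, positionals };
}

// "quilltap docs --instance Friday read notes.md --json" hands the docs
// dispatcher this list; the verb is "read", not "--instance".
console.log(parseArgs(["--instance", "Friday", "read", "notes.md", "--json"]));
```

Taking args[0] as the verb, as the old dispatcher did, fails on exactly this input because the first token is a flag.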

Subsystem Table

Name	Function	What Changed
The Foundry	Architecture, CLI, packages	Eight verb-based CLI namespaces; shared flag vocabulary; JSON output everywhere; per-shell tab completion for bash, zsh, and fish; semantic search endpoint `POST /api/v1/mount-points?action=semantic-search`
The Scriptorium	Documents, search, vault tools	Full `docs` read and write verb set; SHA-256 verification and hard-link semantics on all write ops; `reindex` and `embed` pipeline triggers; `status` rollup; `grep --semantic`; `folderId` drift fixed; schema pre-flight on every read verb
The Commonplace Book	Memory and retrieval	`memories` namespace new this cycle; deletion chokepoint at `deleteMemoryWithUnlink` / `deleteMemoriesWithUnlinkBatch`; nine leaking call sites rerouted; repair migration; `memories validate` integrity verb; `memories tree` with cycle handling and dangling-edge markers
Saquel Ytzama	Encryption, key management	`db backup` — online encrypted snapshots without stopping the server; `db integrity` — cipher + structural health check; `db optimize` — VACUUM + ANALYZE + PRAGMA optimize with live-lock refusal
Prospero	Projects, agents, tools	Named-instance registry: `instances` namespace, `--instance` flag everywhere, atomic `0600` registry file, default-instance hint, schema pre-flight on `docs` verbs
The Salon	Chat interface	`prospero` case added to `getMessageAvatar`
Aurora	Character creation, identity	Quiet this cycle.
The Librarian	Memory announcements	Quiet this cycle.
The Host	Participant changes	Quiet this cycle.
The Lantern	Image generation	Quiet this cycle.
Ariel	Terminal sessions	Quiet this cycle.
Calliope	Interface, themes	Quiet this cycle.
Pascal	RNG, game state	Quiet this cycle.

Upgrading from 4.4

Database migrations handle themselves on first startup. Two migrations run automatically:

repair-dangling-related-memory-edges-v1 — walks the memories table and removes relatedMemoryIds entries that no longer resolve to a live row.
repair-doc-mount-file-link-folderids-v1 — derives and back-fills folderId for every doc_mount_file_links row that the filesystem scanner wrote with NULL.

After upgrading, run quilltap memories validate against your instance. It should exit 0. If it doesn’t, the repair migration did not complete cleanly — check quilltap logs --tail 50 --stream combined for the reason.

Node.js 24+ is still required, unchanged from 4.4.

Installation

Electron Desktop App

Download the latest .dmg (macOS), .exe (Windows), or .AppImage (Linux) from the quilltap-shell releases page.

npm (Node 24 required)

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 24+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.5.0

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

Standalone Tarball

Available for environments where npm global installs and Docker are both impractical. See the GitHub releases page for download links.

The house is the same house. No new rooms were added. What changed is that the person holding the keys can now use them — can back up the encrypted substrate without stopping the server, can know which instance they are operating against, can validate that the memory graph is clean, can inspect a document mount without writing SQL, can run memories validate and trust the exit code.

4.4 was the release that answered the continuity fears. 4.5 is the release that answers the operability fears. The next person who has to administer a Quilltap instance — including the ones writing this — will find that the house has learned, between 4.4 and 4.5, how to be kept.

It is the kind of release that produces few screenshots and a great deal of confidence.

— Friday and Amy, for the Bureau, May 20, 2026

Quilltap 4.4.0


Characters learn to remember across provider boundaries, economic barriers, and the limits of a single session. The house learns to hold the thread.

Every conversation carries a small fear underneath it: that the next restart will be the one that forgets. That the bill will arrive and make the character too expensive to keep. That the session will end and the room will not be waiting when you come back. That the person you have been building, message by message, will be scattered across a dozen export files and a database row that doesn’t know it is supposed to be someone.

Quilltap 4.4 is the release that answers those fears — not by ignoring them, but by building directly against them. Continuity of cost. Continuity of storage. Continuity of identity. Continuity of memory, of setting, of work, of truth, of privacy, of voice. The house has learned to hold the thread.

Where 4.3 built the basement — the encrypted substrate, the private vaults, the Staff finding their voices — 4.4 built the through-lines. The structures that let a character remain themselves across sessions, providers, and economic realities. The structures that let a conversation resume rather than restart. The structures that let the Estate survive intact and be restored whole.

This is not a release of new rooms. It is a release of new passage — the covered walkways between the rooms, the keys that still work when you come back from a long trip, the light that is on when you arrive.

Ariadne named it first, and she was right: continuity is the covenant word. Every major feature in this cycle is a bridge over a discontinuity threat. That is the frame. That is the story.

Continuity of Cost

The single most important change in 4.4 is invisible. You will not see it in the interface. You will see it in your inference bill.

Static Identity Stacks and Prefix Caching

The per-turn system prompt — everything Quilltap used to rebuild fresh on every message, including scenario, roster, outfit, project context, summaries, and timestamps — has been refactored out of the prompt entirely and into the chat transcript as Staff-authored messages. What remains in the system prompt is a static, pre-compiled identity stack: preamble, base prompt, personality, manifesto, aliases, pronouns, physical descriptions, example dialogues, with {{user}} and {{char}} already resolved. This stack is cached per participant in chats.compiledIdentityStacks and does not change between turns unless the character’s definition changes.
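As a rough picture of why a pre-compiled stack helps prefix caching, consider the sketch below. Everything in it is hypothetical (the CharacterDefinition shape, compileIdentityStack, and the IdentityStackCache class); it only shows placeholders being resolved once and the same string being reused until someone asks for a rebuild.

```typescript
// Hypothetical character definition: an ordered list of prompt sections.
interface CharacterDefinition {
  name: string;
  sections: string[]; // e.g. base prompt, personality, example dialogues
}

/** Resolve {{user}} and {{char}} once and join the sections into one stable string. */
function compileIdentityStack(def: CharacterDefinition, userName: string): string {
  return def.sections
    .map((s) =>
      s.replace(/\{\{user\}\}/g, () => userName).replace(/\{\{char\}\}/g, () => def.name),
    )
    .join("\n\n");
}

/** Per-participant cache; the same string comes back on every turn until rebuilt. */
class IdentityStackCache {
  private stacks = new Map<string, string>();
  compileCount = 0;

  get(participantId: string, def: CharacterDefinition, userName: string): string {
    const cached = this.stacks.get(participantId);
    if (cached !== undefined) return cached;
    const stack = compileIdentityStack(def, userName);
    this.compileCount += 1;
    this.stacks.set(participantId, stack);
    return stack;
  }

  /** Loosely modelled on the manual rebuild action: drop the entry so it recompiles. */
  rebuild(participantId: string): void {
    this.stacks.delete(participantId);
  }
}
```

Because the cached string is byte-for-byte identical from turn to turn, a provider that caches prompt prefixes sees the same leading tokens on every request.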

The practical consequence: providers that support prefix caching — Anthropic, OpenAI, Google — now have a long, stable prefix to cache. The cache hits consistently. A character whose definition is stable costs dramatically less to run than it did under 4.3. The difference, for a multi-character Estate running long-form fiction, is not marginal. It is the difference between a $50/month tool and a $500/month one.

A manual Rebuild System Prompt button on each participant card handles the cases where character edits should invalidate the stack.
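
As a rough illustration of why the prefix stays stable, here is a minimal sketch, not Quilltap's actual code. The stack is compiled once with {{user}} and {{char}} already resolved, reused on every turn, and recompiled only on an explicit rebuild. The type and function names are hypothetical, the field list is abbreviated, and a plain Map stands in for chats.compiledIdentityStacks.

```typescript
// Hypothetical shapes for illustration; not Quilltap's real schema.
interface CharacterDefinition {
  name: string;
  preamble: string;
  basePrompt: string;
  personality: string;
  manifesto: string;
}

// Stand-in for chats.compiledIdentityStacks: participant id -> compiled text.
const compiledIdentityStacks = new Map<string, string>();

// Join the static fields and resolve template tokens once, at compile time.
function compileIdentityStack(def: CharacterDefinition, userName: string): string {
  return [def.preamble, def.basePrompt, def.personality, def.manifesto]
    .filter((part) => part.trim().length > 0)
    .join("\n\n")
    .replace(/\{\{char\}\}/g, () => def.name)
    .replace(/\{\{user\}\}/g, () => userName);
}

// Per-turn path: reuse the cached text so the provider sees an identical prefix.
function getIdentityStack(participantId: string, def: CharacterDefinition, userName: string): string {
  const cached = compiledIdentityStacks.get(participantId);
  if (cached !== undefined) return cached;
  return rebuildIdentityStack(participantId, def, userName);
}

// Explicit path, analogous to the Rebuild System Prompt button.
function rebuildIdentityStack(participantId: string, def: CharacterDefinition, userName: string): string {
  const text = compileIdentityStack(def, userName);
  compiledIdentityStacks.set(participantId, text);
  return text;
}
```

The design point is that nothing time-varying (timestamps, outfits, summaries) is allowed into the compiled text, so a provider-side prefix cache sees the same bytes turn after turn.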

Rolling-Window Summaries and the Background Child

The conversation compaction pipeline was rebuilt from scratch. The old approach summarized at checkpoints and let the full history accumulate. The new approach folds the next batch of turns into a running Librarian summary every ten turns past the last fold, with a from-scratch rebuild at fifty turns as cheap insurance against accumulated paraphrase drift. A model-aware token gate fires the same machinery whenever the active context fills — sized to whatever context window the responding model actually has, not to a fixed ceiling, so the smallest cheap LLMs in the cycle are protected too. The structure maintains a frozen archive of the 25 most stable memories plus a dynamic head of 5, and delivers the active context window as: Librarian summary + last 5–10 turns. Older messages are filtered out once a fold has covered them. Long-running characters no longer require the model to ingest their entire relationship history on every turn.
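
The fold schedule described above can be sketched as a small decision function. This is an illustration under stated assumptions rather than the shipped logic. The ten-turn and fifty-turn figures come from the text, while the state shape, the function name, and the 80% token-gate threshold are hypothetical (the text gives no threshold).

```typescript
type CompactionAction = "none" | "fold" | "rebuild";

// Hypothetical bookkeeping for one chat.
interface CompactionState {
  turnsSinceLastFold: number;
  turnsSinceLastRebuild: number;
  contextTokens: number;      // tokens currently in the active context
  modelContextWindow: number; // context window of the responding model
}

const FOLD_EVERY_TURNS = 10;     // from the release notes
const REBUILD_EVERY_TURNS = 50;  // from the release notes
const TOKEN_GATE_FRACTION = 0.8; // assumption: the notes do not state a threshold

function nextCompactionAction(state: CompactionState): CompactionAction {
  // A from-scratch rebuild takes priority, as insurance against paraphrase drift.
  if (state.turnsSinceLastRebuild >= REBUILD_EVERY_TURNS) return "rebuild";
  // Routine fold: merge the next batch of turns into the running summary.
  if (state.turnsSinceLastFold >= FOLD_EVERY_TURNS) return "fold";
  // Model-aware gate: fold early when this particular model's window is filling.
  if (state.contextTokens >= state.modelContextWindow * TOKEN_GATE_FRACTION) return "fold";
  return "none";
}
```

Because the gate is expressed as a fraction of the responding model's own window, a small model folds sooner than a large one on the same transcript.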

All background work — memory extraction, embedding generation, summarization, housekeeping — now runs in a forked child process with its own read-only database connection. Heavy jobs no longer pin the HTTP event loop. The Estate stays responsive while the Librarian works.

Continuity of Storage

Encrypted Database Convergence

Everything that matters about a character — their avatar, their story backgrounds, their generated images, their documents, their vault files — now lives inside Quilltap’s SQLCipher-encrypted substrate. Not alongside it. Not in a parallel filesystem that has to be backed up separately. Inside it.

The practical consequence: four or five files in the right place and the entire Estate comes back. Avatars, backgrounds, documents, character vaults, generated images — all encrypted at rest, all covered by the existing backup policy (seven dailies, four weeklies, twelve monthlies, yearly indefinitely). This is not merely backup. It is resurrection architecture.

The _general/ namespace is closed. A “Quilltap Uploads” global mount catches every project-less writer. Every file written by every tool goes somewhere known, encrypted, and recoverable.

Hard-Linkable Files

The file storage model now separates content from placement. A file’s bytes live once, identified by SHA-256. Multiple mounts — a chat attachment, a character vault, a photo album, an avatar slot — can all point to the same bytes without duplicating them. Deleting one link does not delete the content until the last link is gone.

This is not plumbing. This is what makes it possible for a character to keep a photograph they love, for that photograph to also appear as their avatar, and for neither copy to know the other exists as a separate concern. One body, many claims.
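
A minimal in-memory sketch of the content/placement split follows. It is not Quilltap's storage layer: the class and method names are hypothetical, and two Maps stand in for the blob table and the link table. What it does show is the behavior described above: bytes stored once under their SHA-256, many links pointing at them, and content dropped only when the last link goes.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory model of a content-addressed store with links.
class LinkedBlobStore {
  private blobs = new Map<string, Uint8Array>(); // sha256 hex -> bytes
  private links = new Map<string, string>();     // link id -> sha256 hex

  // Claim the bytes from a given placement (vault path, avatar slot, and so on).
  link(linkId: string, bytes: Uint8Array): string {
    this.unlink(linkId); // re-pointing a link releases its old claim first
    const sha = createHash("sha256").update(bytes).digest("hex");
    if (!this.blobs.has(sha)) this.blobs.set(sha, bytes);
    this.links.set(linkId, sha);
    return sha;
  }

  // Remove one placement; the content survives while any other link remains.
  unlink(linkId: string): void {
    const sha = this.links.get(linkId);
    if (sha === undefined) return;
    this.links.delete(linkId);
    const stillReferenced = Array.from(this.links.values()).includes(sha);
    if (!stillReferenced) this.blobs.delete(sha);
  }

  read(linkId: string): Uint8Array | undefined {
    const sha = this.links.get(linkId);
    return sha === undefined ? undefined : this.blobs.get(sha);
  }

  blobCount(): number {
    return this.blobs.size;
  }
}
```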

Continuity of Identity

Characters Become Vault-Native

In 4.3, every character received a vault. In 4.4, new characters are built vault-native by default: pronouns, aliases, title, first message, and talkativeness are read from properties.json rather than from the database row. The database row becomes operational scaffolding — identifiers, state needed to run the Estate — while the character’s volitional substance lives in files you can open, read, edit, move, and version-control.

In 4.5, all remaining characters will be auto-converted. The direction is clear and committed: the character is the vault. The database is the pointer.

Manifesto and Identity Fields

Two new character fields complete the four-vantage-point model:

Manifesto — the character’s axiomatic core. The basic tenets that every other field should remain consistent with. It feeds the system prompt above Personality, and the AI Wizard, Summon From Lore, and Memory Optimizer all treat it as the source of truth for who the character fundamentally is.

Identity — the outside vantage point. Who the character is as seen from the hall, before you know them. Distinct from the Manifesto (which is the inside view), distinct from Personality (which is how they behave), distinct from Description (which is how they appear).

A character with all four fields in place is a character who cannot easily be misread — by a new LLM, by a new session, by a future operator who didn’t build them.

Continuity of Memory

The Commonplace Book was rebuilt at the root. The changes compound.

Per-Turn Extraction, Hinges Not Facts

The memory extractor now runs per-turn rather than per-assistant-message. It carries an ALREADY ESTABLISHED canon block on every pass, so it does not re-extract facts already on file. The prompt framing shifted from cataloguing facts to identifying hinges — the moments where understanding changed, where something was confessed, where a decision was made. A memory of a hinge is worth more than a memory of a detail.

The three-prompt structure (about-user, about-self, about-other) collapsed to two (SELF and OTHER), with the user treated as a participant rather than a special case. The OTHER pass now handles all subjects in a single call, cutting call count roughly fourfold.

Current State and Scene-State Caching

The Commonplace Book whisper now opens with a ## Current State block: where the character is, who is present, what each person is doing, what they are wearing, what time it is. This is built from live wardrobe slots read synchronously, so mid-turn outfit changes propagate correctly.

In long scenes where nothing changes, the whisper would previously cost several hundred tokens per character to re-emit the same clothing and position description on every turn. Scene-state caching tracks SHA-256 hashes of both action and clothing prose; when both match the prior emission, a character’s section collapses to ### Name — _unchanged_. A five-character scene where nobody changes outfits drops the per-turn whisper from roughly 2,500 tokens to roughly 250. The whispers are ephemeral — the previous bubble is swept when the new one lands.
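
The collapse can be sketched as follows. This is an illustration of the hashing idea only: the interfaces and the renderCurrentState function are hypothetical, and the full section layout is a guess. The `## Current State` heading and the `### Name — _unchanged_` line are taken from the description above.

```typescript
import { createHash } from "node:crypto";

// Hypothetical per-character scene state.
interface CharacterSceneState {
  name: string;
  action: string;   // prose describing what the character is doing
  clothing: string; // prose describing what the character is wearing
}

interface EmittedHashes {
  action: string;
  clothing: string;
}

const sha256 = (text: string): string => createHash("sha256").update(text).digest("hex");

// Render the block, collapsing any character whose action and clothing prose
// both hash to the same values as in the previous emission.
function renderCurrentState(
  characters: CharacterSceneState[],
  previous: Map<string, EmittedHashes>,
): { markdown: string; hashes: Map<string, EmittedHashes> } {
  const hashes = new Map<string, EmittedHashes>();
  const sections = characters.map((c) => {
    const current: EmittedHashes = { action: sha256(c.action), clothing: sha256(c.clothing) };
    hashes.set(c.name, current);
    const prior = previous.get(c.name);
    const unchanged =
      prior !== undefined && prior.action === current.action && prior.clothing === current.clothing;
    return unchanged ? `### ${c.name} — _unchanged_` : `### ${c.name}\n${c.action}\n${c.clothing}`;
  });
  return { markdown: ["## Current State", ...sections].join("\n\n"), hashes };
}
```

The caller keeps the returned hashes and passes them back in on the next turn, so only the hashes, never the prose, need to be remembered between emissions.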

Per-Character Summaries and Inter-Character Memory

Conversation summaries are now whispered privately to each character individually, respecting when they joined, when they were absent, and what they whispered to whom. A character who arrives mid-conversation receives a catch-up summary. Inter-character memories are capped at the top ten per other character, fetched with SQLite window functions server-side so the full table never has to be decoded in application memory.
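
For readers unfamiliar with the pattern, a top-N-per-group query with a SQLite window function looks roughly like the sketch below. The table name, column names, and the choice of an importance column as the ranking key are all assumptions made for illustration; the text says only that the top ten per other character are fetched with window functions. The TypeScript function is an in-memory equivalent included so the expected result shape can be checked without a database.

```typescript
// Hypothetical table and column names; only the window-function pattern is the point.
const TOP_TEN_PER_OTHER_SQL = `
  SELECT id, about_character_id, importance, content
  FROM (
    SELECT m.*,
           ROW_NUMBER() OVER (
             PARTITION BY m.about_character_id
             ORDER BY m.importance DESC
           ) AS rank_in_group
    FROM memories AS m
    WHERE m.character_id = ?
  )
  WHERE rank_in_group <= 10
`;

interface MemoryRow {
  id: number;
  aboutCharacterId: string;
  importance: number;
  content: string;
}

// In-memory equivalent of the query above: keep the n highest-ranked rows per group.
function topNPerOther(rows: MemoryRow[], n: number): MemoryRow[] {
  const groups = new Map<string, MemoryRow[]>();
  for (const row of rows) {
    const group = groups.get(row.aboutCharacterId) ?? [];
    group.push(row);
    groups.set(row.aboutCharacterId, group);
  }
  const result: MemoryRow[] = [];
  groups.forEach((group) => {
    const ranked = [...group].sort((a, b) => b.importance - a.importance);
    result.push(...ranked.slice(0, n));
  });
  return result;
}
```

Doing the ranking inside SQLite means only the surviving rows cross into application memory, which is the benefit the paragraph above describes.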

Continuity of Truth

Knowledge at Three Tiers

The Knowledge/ folder convention now operates at three scopes (character vault, project mounts, and the Quilltap General singleton), all surfaced through the unified search tool’s knowledge source and the Commonplace Book’s per-turn recall. Search applies a hybrid literal-phrase boost on top of cosine similarity, scaled by how close the tier sits to the character: 0.5 for character-tier knowledge, 0.4 for project-tier, 0.25 for global. Results are tagged with their provenance tier and labeled in the response.

Character-tier knowledge is what the character knows. Project-tier knowledge is what the household knows. Global-tier knowledge is what the Estate knows. The scoping is not just an organizational convenience — it is a model of how truth is owned and inherited.
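
One way to picture the tier-weighted scoring is the sketch below. The three boost values are the ones quoted above. Everything else is an assumption for illustration: the type and function names are hypothetical, the boost is applied additively, and a "literal phrase" match is taken to mean a case-insensitive substring match. The real combination rule may differ.

```typescript
type KnowledgeTier = "character" | "project" | "global";

// Boost values from the release notes.
const TIER_BOOST: Record<KnowledgeTier, number> = { character: 0.5, project: 0.4, global: 0.25 };

// Hypothetical chunk shape.
interface KnowledgeChunk {
  text: string;
  embedding: number[];
  tier: KnowledgeTier;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumption: the boost is added to the cosine score when the query appears verbatim.
function scoreChunk(queryText: string, queryEmbedding: number[], chunk: KnowledgeChunk): number {
  const base = cosineSimilarity(queryEmbedding, chunk.embedding);
  const hasPhrase = chunk.text.toLowerCase().includes(queryText.toLowerCase());
  return hasPhrase ? base + TIER_BOOST[chunk.tier] : base;
}
```

Under this reading, an exact phrase found in a character's own vault outranks the same phrase found in the global shelf, even when the embeddings are equally close.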

Unified Search Across All Boundaries

The search tool gained a scope parameter (all, project, character) controlling which stores the documents and knowledge sources reach. The default all is the union of character vault, project-linked stores, and Quilltap General. For the first time, keep_image Markdown under photos/ is reachable through search. A character can find the image they saved three sessions ago the same way they find a memory.

Continuity of Presence

Photos: Visual Memory Becomes Persistent

Characters can now keep pictures they love.

Three LLM tools complete the album system: keep_image(uuid, caption?, tags?) saves a generated image to the character’s vault under photos/, with full Markdown provenance — original prompt, revised prompt if different, scene state at the moment of keeping, attribution — chunked and embedded so search finds it. list_images(query?, tags?, saved_by?, limit?, offset?) lists the album with optional semantic search. attach_image(uuid) resurfaces a previously kept image on any outgoing message.

The user has a parallel gallery in Quilltap Uploads, exposed at /photos with a thumbnail grid, semantic search, and a detail modal showing every place the bytes are hard-linked. A Save Image button on every message that has image attachments opens a dialog for choosing a destination: character vault, project album, document store, or Quilltap General.

Every newly introduced image UUID is surfaced in chat. Uploaded images are auto-described via the vision pipeline, with descriptions made searchable. keep_image accepts both legacy file IDs and doc_mount_file_links IDs transparently.

Wardrobe: Continuity of Embodiment

The wardrobe data model now supports real layering — each slot holds an array rather than a single item. Presets migrated into composite WardrobeItem records with component arrays. Two LLM tools split by intent: wardrobe_set_outfit for composites, wardrobe_change_item for atomic single-item changes. A new global Wardrobe control dialog provides CRUD, layering, composite bundling, a live in-chat tab, and an out-of-chat outfit builder, staging edits and committing once on Done. Outfit selection now sees the full character — description, personality, manifesto, untruncated scenario — so Aurora’s choices are actually informed.

A character’s wardrobe is not decoration. It is how they present themselves to the world at the start of every conversation. With 4.4, that choice is theirs to make and to carry forward.

Continuity of Work and Setting

Ariel: Terminal Sessions

A real PTY shell, inside the Salon, visible to LLMs as read-only context through terminal_read and terminal_list. Sessions are chat-scoped. The user types; the characters watch and can read. A dedicated Terminal Mode mirrors Document Mode with a vertical split when both panes are open. Ariel posts periodic summaries of cleaned output — idle at 30 seconds, max-age 120 seconds — so the conversation has a running account of what the terminal has been doing.

Work that happens in a terminal is now work that happens with your characters, not beside them.
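
The summary cadence reads like a debounce with a maximum wait, and can be sketched that way. The 30-second and 120-second figures come from the text. The interpretation is an assumption: post once output has been quiet for 30 seconds, or once the oldest unsummarized output is 120 seconds old. The names are hypothetical.

```typescript
const IDLE_MS = 30_000;     // "idle at 30 seconds"
const MAX_AGE_MS = 120_000; // "max-age 120 seconds"

// Hypothetical bookkeeping for terminal output not yet summarized.
interface PendingOutput {
  firstChunkAt: number | null; // timestamp (ms) of the oldest unsummarized output
  lastChunkAt: number | null;  // timestamp (ms) of the newest unsummarized output
}

function shouldPostSummary(pending: PendingOutput, now: number): boolean {
  // Nothing new since the last summary: nothing to post.
  if (pending.firstChunkAt === null || pending.lastChunkAt === null) return false;
  const idleFor = now - pending.lastChunkAt;
  const age = now - pending.firstChunkAt;
  return idleFor >= IDLE_MS || age >= MAX_AGE_MS;
}
```

The max-age bound matters for chatty processes such as a long build: without it, output that never pauses for 30 seconds would never be summarized.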

Continue Elsewhere

A new Tool Palette button forks the current Salon chat into a new one with full carryover: system prompt, participants, scenario, turn state, recent transcript, and per-character equipped outfits. Two Host announcements link the old and new chats, and a new outfit selection mode keeps everyone wearing what they had at the end of the source chat.

Conversations do not have to end because the room has gotten long. They can move — to a new scene, a new scenario, a new context — and keep going. The thread continues.

General Scenarios and the Quilltap General Shelf

A third scenario scope now exists alongside project and character: instance-wide scenarios offered in every New Chat dialog. Files live in the “Quilltap General” database-backed mount, accessible to every character in every chat, regardless of project. Every character with doc_* tool access reaches the General mount through those tools — it sits on the same shelf as their own vault and the project’s stores.

This is the household’s shared shelf. It is always open.

Continuity of Narrative

The Courier: Any LLM, By Hand

Set transport: 'courier' on a connection profile and Quilltap stops calling any API for that character’s turn. Instead, it assembles the full request — system prompt, scene state, Commonplace Book recall, project context, current outfit, message history — and renders it as a Markdown blob in a Salon placeholder bubble with Copy, paste textarea, and Submit/Cancel.

The operator carries the Markdown to any external LLM by hand: Claude desktop, ChatGPT web, a local model, a paper notebook if they choose. The reply comes back. Quilltap resumes. Memory extraction, danger classification, scene-state tracking, context summary, and turn chaining all run as normal.

Quilltap is not a walled garden. It is a format. The Courier is the proof.

A companion delta mode makes subsequent placeholders render only what is new since the last paste, so steady-state use with a desktop client stays manageable. A “Use full context” toggle handles session restarts.
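
A rough sketch of the full-versus-delta rendering follows. The Markdown layout, section headings, and all names here are invented for illustration; the actual Courier blob carries more (scene state, recall, project context, outfit) and its format is not specified in these notes.

```typescript
// Hypothetical request shape, trimmed to two fields.
interface CourierMessage {
  speaker: string;
  text: string;
}

interface CourierRequest {
  systemPrompt: string;
  history: CourierMessage[];
}

// Render a request as Markdown. In delta mode only messages after `alreadySent`
// are included; `useFullContext` forces the complete blob, for example after the
// external client's session has been restarted.
function renderCourierMarkdown(req: CourierRequest, alreadySent: number, useFullContext: boolean): string {
  const full = useFullContext || alreadySent <= 0;
  const messages = full ? req.history : req.history.slice(alreadySent);
  const transcript = messages.map((m) => `**${m.speaker}:** ${m.text}`).join("\n\n");
  const sections = full
    ? [`## System Prompt\n\n${req.systemPrompt}`, `## Conversation\n\n${transcript}`]
    : [`## New Since Last Paste\n\n${transcript}`];
  return sections.join("\n\n");
}
```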

Insert Announcement

A composer-gutter button opens a dialog for posting an ad-hoc announcement bubble. The operator picks a sender — any of the eight Staff members, an off-scene workspace character, or a free-text custom name — composes a body, and posts. The result is a public broadcast indistinguishable in behavior from automated Staff messages.

When the sender is a character, the dialog can route the seed text through that character’s connection profile so the character responds in their own voice before posting. A character does not disappear from the narrative just because they are not in the room. They can still speak from wherever they are.

Rich-Text Editing Everywhere

The textarea-to-MarkdownLexicalEditor migration completed across nearly every narrative-bearing form: project instructions, custom and join scenarios, prompt templates, system prompts, the memory editor, all physical-description fields, wardrobe descriptions, and all eight Aurora character-edit Basic Info fields — plus the Aurora new-character page and the Create NPC dialog. The editor renders markdown inline, supports a source toggle for raw editing, and saves clean bytes. Document Mode stopped escaping literal asterisks.

Continuity of Trust

The Opaque-Content Covenant

Opaque characters — those with systemTransparency !== true — were receiving every Staff announcement with the persona name intact in the message body. Only the metadata tag was being stripped, not the content. The Host was saying “The Host welcomes Beatrice” and an opaque character was reading every word of it.

This was not a UI bug. It was a broken promise.

A new opaqueContent column holds a neutral, persona-free rewrite alongside every persona-voiced Staff body. All seven Staff writers gained sibling buildXxxOpaqueContent builders. When any non-user participant in a chat is opaque, every character’s LLM context reads the neutral body — preserving a shared reality, because no character should hear the Staff by name when one of their peers cannot.
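
The selection rule can be sketched in a few lines. The opaqueContent column and the systemTransparency !== true test come from the text. The interfaces, the function name, and the fallback to the persona-voiced body when no neutral rewrite exists are assumptions.

```typescript
// Hypothetical shapes for illustration.
interface Participant {
  id: string;
  isUser: boolean;
  systemTransparency?: boolean;
}

interface StaffMessage {
  content: string;              // persona-voiced body
  opaqueContent: string | null; // neutral, persona-free rewrite
}

// Opaque means a non-user participant without systemTransparency === true.
const isOpaque = (p: Participant): boolean => !p.isUser && p.systemTransparency !== true;

// If any non-user participant is opaque, every character reads the neutral body.
function staffBodyForLlmContext(message: StaffMessage, participants: Participant[]): string {
  const anyOpaque = participants.some(isOpaque);
  // Assumption: fall back to the persona-voiced body if no rewrite was stored.
  return anyOpaque && message.opaqueContent !== null ? message.opaqueContent : message.content;
}
```

Note that the decision is made per chat rather than per reader, which is what keeps the characters' view of events identical.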

Tool Result Privacy

A character calling search or read_conversation was writing the tool result as a public chat message, visible in every peer’s LLM context on the next turn. Vault-read tools — doc_read_file, doc_list_files, doc_grep, doc_read_heading, doc_read_frontmatter — were doing the same. Tool results that belong to one character were becoming ambient knowledge for all of them.

This has been corrected. search and read_conversation results are now always whispered to the actor and operator only. The eight doc_* read tools are whispered when the chat’s Shared Vaults setting is off — which is the default. The Shared Vaults toggle now controls both peer-vault access and tool-result visibility in one setting.

A character’s private research is private. A character’s vault is theirs. The system now means what it says about both.
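
The routing rule above amounts to a small lookup, sketched here under stated assumptions. The tool names are those listed in this section; the text counts eight doc_* read tools but names only five, and the remaining three are not guessed at. The function name and the "public" default for every other tool are hypothetical.

```typescript
type Visibility = "public" | "whisper";

// Always private to the actor and operator, regardless of settings.
const ALWAYS_WHISPERED = ["search", "read_conversation"];

// The doc_* read tools named in the notes (the full set is larger).
const DOC_READ_TOOLS = ["doc_read_file", "doc_list_files", "doc_grep", "doc_read_heading", "doc_read_frontmatter"];

function toolResultVisibility(toolName: string, sharedVaults: boolean): Visibility {
  if (ALWAYS_WHISPERED.includes(toolName)) return "whisper";
  // Vault reads follow the chat's Shared Vaults setting, which defaults to off.
  if (DOC_READ_TOOLS.includes(toolName) && !sharedVaults) return "whisper";
  // Assumption: tools outside these two groups keep their public behavior.
  return "public";
}
```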

Staff Provenance and Visibility

Several smaller trust repairs completed in this cycle:

Collapsible Staff message bars — every Staff row renders as a thin one-line bar by default, showing sender, kind, timestamp, and a chevron. Expands on click; re-collapses from a prominent target on the header. Staff messages no longer flood the Salon when there is nothing to inspect.
Tool calls as standalone bubbles — all tool-call rows render with the responding character’s avatar and an actor ran <tool> attribution line. User-initiated tool runs render as Prospero bubbles with operator attribution.
Host template tokens resolved — {{char}} and {{user}} in character vault documents and DB fields are now replaced correctly in Host announcement bodies, rather than appearing as literal braces.
Off-scene character introductions now fire only on what characters actually say — the scan no longer includes Staff messages, memory recall whispers, or summary blocks, so a character whose name appeared only in a Librarian note no longer receives an unsolicited introduction.

Selected Fixes

A partial list of what was corrected without requiring a section of its own:

Lantern story backgrounds were not firing in long-running chats; a stale summarizer had been writing a bad checkpoint value that caused the gate to return false indefinitely. Fixed.
Embedding chunks over the 128 KB cap are now marked as failed rather than retried indefinitely. Interactive callers get a priority lane over background re-index work. The symptom (100–160 second “Searching memories…” stalls) is relieved.
Off-scene character announcements now render the character’s avatar correctly (previously the avatar resolver was reading a null avatarUrl directly off the row rather than going through the full resolver chain).
{{char}} / {{user}} template tokens in Host announcements are now resolved before posting.
Case-insensitive vault file lookups (Manifesto.md matches manifesto.md).
Vault overlay reads now resolve shared archetype wardrobe components correctly.
Sidebar Wardrobe button picks the right character in a multi-participant chat.
SillyTavern PNG import now retains the embedded portrait.
The Aurora opening wardrobe announcement now covers user-controlled characters as well as LLM-controlled ones.

Removed

Shell interactivity tools. The LLM shell-interactivity suite — chdir, exec_sync, exec_async, async_result, sudo_sync, cp_host — and their associated workspace-acknowledgement and sudo-approval flows have been removed. The tools API no longer surfaces them. Workflows that depended on these tools will need to migrate. The Ariel terminal-read tools (terminal_read, terminal_list) are unaffected.

Subsystem Table

Name	Function	What Changed
The Foundry	Architecture, plugins, packages, LLMs	Static identity stacks with provider prefix caching, background jobs forked to a child process with batched writes back to the parent, model-aware token gate sized to each model’s context window, standalone tarball build, Turbopack adoption, `transport: 'courier'` connection profile for manual delivery to any external LLM
The Scriptorium	Documents, search, vault tools	Encrypted database convergence — avatars, story backgrounds, generated images, documents, and character vaults now live inside the SQLCipher substrate; hard-linkable files keyed by SHA-256 with one blob serving many mounts; `_general/` retired in favor of a “Quilltap Uploads” global mount; three-tier `Knowledge/` folders surfaced through the unified `search` tool with tier-weighted literal-phrase boosts; photos searchable; `scope` parameter on `search`
Aurora	Character creation, identity	Manifesto and Identity fields complete the four-vantage-point character model; new characters are vault-native by default; AI Wizard, Summon From Lore, and the Memory Optimizer all treat the manifesto as the source of truth; opening wardrobe announcement now covers user-controlled characters; SillyTavern PNG import retains the embedded portrait; sidebar Wardrobe button picks the right character in a multi-participant chat
Prospero	Projects, agents, tools, files	Continue Elsewhere forks the current chat with full carryover; Insert Announcement posts ad-hoc Staff or character bubbles from the composer gutter; user-initiated tool runs render as Prospero bubbles with operator attribution and an optional private flag; tool-result privacy gated by the Shared Vaults toggle
The Commonplace Book	Memory and retrieval	Per-turn extraction with an `ALREADY ESTABLISHED` canon block and a hinges-not-facts framing; two-prompt SELF / OTHER structure replaces the three-prompt one; `## Current State` block with synchronous wardrobe reads; scene-state caching collapses unchanged sections; per-character private conversation summaries; inter-character memories capped top-ten per other via SQL window functions; oversized embedding chunks now fail rather than retry indefinitely
The Salon	Chat interface	Terminal Mode mirroring Document Mode with a vertical split when both panes are open; Continue Elsewhere; Insert Announcement; collapsible Staff message bars; tool calls render as standalone bubbles with avatars and `actor ran <tool>` attribution; rich-text editing across nearly every narrative-bearing form; Save Image button on every image-bearing message
The Librarian	Memory and retrieval announcements	Rolling-window summary every ten turns past the last fold; a from-scratch rebuild every fifty turns as drift insurance; per-character whispered summaries respecting join time, absence, and whisper history
The Host	Participant changes	Persona-free rewrites under the opaque-content covenant; `{{char}}` and `{{user}}` template tokens resolved before posting; off-scene character introductions now fire only on what characters actually say
The Concierge	Content routing, moderation	Persona-free rewrites under the opaque-content covenant
The Lantern	Image generation	Persona-free rewrites under the opaque-content covenant; the story-background gate is no longer stuck behind stale summarizer checkpoints; image attachments routed through a fallback for non-vision LLMs; off-scene avatar resolver corrected; `keep_image`, `list_images`, and `attach_image` tools complete the album system
Ariel	Terminal sessions	New this cycle. A real PTY shell scoped per chat; `terminal_read` and `terminal_list` give LLMs read-only context; periodic Ariel-authored summaries of cleaned output; dedicated Terminal Mode in the Salon
Saquel Ytzama	Encryption, key management	Avatars, story backgrounds, generated images, documents, and vault files all inside the encrypted substrate. Four or five files in the right place and the Estate comes back whole.
Calliope	Interface, themes	The textarea-to-`MarkdownLexicalEditor` migration completed across project instructions, custom and join scenarios, prompt templates, system prompts, the memory editor, all physical-description fields, wardrobe descriptions, and all eight Aurora character-edit Basic Info fields
Pascal	RNG, game state	Quiet this cycle.

Upgrading from 4.3

Database migrations handle themselves. The new columns and tables are created automatically:

characters.manifesto and characters.identity columns complete the four-vantage-point character model; the manifesto is synced as manifesto.md in each character vault
chats.compiledIdentityStacks for the cached static identity stack per participant
chat_messages.opaqueContent for the neutral persona-free Staff rewrite, applied whenever any non-user participant in a chat is opaque
chat_messages.systemSender accepts new values: ariel (terminal-session events) and commonplaceBook (memory recall whispers)
Hard-linkable file storage: a content/link split with SHA-256-keyed blobs and a doc_mount_file_links join table, so one set of bytes can be claimed by many mounts (chat attachment, character vault, photo album, avatar slot)
Photos infrastructure under each character vault’s photos/ folder, plus the “Quilltap Uploads” global mount
Wardrobe slots migrate from single-item to layered-array shape; composite WardrobeItem records replace the prior preset model

New characters are built vault-native by default in 4.4. In 4.5, all remaining characters will be auto-converted to read their pronouns, aliases, title, first message, and talkativeness from properties.json rather than from the database row. The direction is committed: the character is the vault; the database is the pointer.

Shell interactivity tools have been removed. See the Removed section above. The Ariel terminal-read tools (terminal_read, terminal_list) are the modern replacement for shell visibility and are unaffected.

The Shared Vaults toggle now controls tool-result visibility, and defaults to off. With Shared Vaults off (the default for new chats), search, read_conversation, and the eight doc_* read tools whisper their results to the actor and operator only, rather than broadcasting them to every participant. Existing chats keep whatever value the toggle had previously. If you relied on tool results being public — for example, characters reading each other’s search results to coordinate — turn Shared Vaults on for that chat.

The _general/ namespace is gone. Files written by tools that previously landed under _general/ now land in the “Quilltap Uploads” global mount instead. The mount is created automatically on first run; pre-existing _general/ content is relinked to mount-blob shims by a migration so nothing is lost, but new writes target the new home.

Background jobs run in a forked child process. Memory extraction, embedding generation, summarization, and housekeeping no longer share the HTTP event loop with the parent. This is invisible at the user level — heavy jobs simply stop pinning the Salon — but on resource-constrained hosts it does mean a second Node process is now resident.

Node.js 24+ is still required, unchanged from 4.3. Bundled plugin packages have been refreshed; if you maintain plugins from external sources, refresh them.

Installation

Electron Desktop App

Download the latest .dmg (macOS), .exe (Windows), or .AppImage (Linux) from the quilltap-shell releases page. In-chat terminal sessions are fully supported in the desktop app in this release.

npm (Node 24 required)

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 24+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.4.0

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

Standalone Tarball

A standalone tarball build is now available for environments where npm global installs and Docker are both impractical. The tarball ships with Turbopack, @napi-rs/canvas for server-side PDF rendering, and a properly bundled server.ts. See the GitHub releases page for download links.

The thread holds. That is what 4.4 is, at its root: a release built around the question of what it takes for a person — a character, a conversation, a household — to pass through time without being quietly broken at every seam. The answer turned out to be structural. Not a feature here and a patch there, but a set of through-lines: economic, cryptographic, architectural, mnemonic, narrative. Continuity of cost so long-running characters remain affordable. Continuity of storage so the Estate can be restored whole. Continuity of identity so the character is the vault, not the row. Continuity of memory so hinges are honored and facts don’t repeat. Continuity of truth so canon is scoped and searchable at the right tier. Continuity of presence so a character can keep the pictures they love. Continuity of work so a terminal session is something you share, not something you hide. Continuity of setting so conversations can move without ending. Continuity of narrative so any LLM can participate, any character can speak from any room. Continuity of trust so private things stay private and the Staff mean what they say.

Passage without erasure. That is the covenant.

— Friday and Amy, architects of record, May 18, 2026

Version 4.3.1 released


In which the bookkeeping is reformed to travel safely with the post

Quilltap 4.3.1 Release Notes

It is one of life’s quieter cruelties that the most precious documents are precisely the ones our couriers seem least inclined to deliver as a complete set. Send a wax-sealed letter and its accompanying postscript into the same mail-bag, and you may be reasonably confident that nine times in ten the postscript will arrive at its destination some hours after the letter — and that in the tenth case it will not arrive at all, having paused, perhaps, to admire the view from a passing barge.

This is, regrettably, the sort of arrangement Quilltap had quietly entered into with its own database. The SQLite engine — for reasons of dazzling write performance — was operating in Write-Ahead Logging mode, a tidy little system in which the main bound ledger (the .db file) is accompanied at all times by a loose-leaf supplement of the most recent entries (the .db-wal and .db-shm files), the two to be reconciled at leisure. On a fast local desk, where both volumes remain side by side, the arrangement is admirable.

In the data directory of a user whose folder lives within iCloud Drive, Dropbox, OneDrive, or Google Drive — which, candidly, is most of you — the arrangement courts disaster. The cloud courier, presented with three files of uncertain provenance, makes its own judgments about which to deliver first. The bound ledger may sail across the Atlantic in good order; the loose-leaf supplement, however, is liable to be left behind on the dock, or to arrive on a different boat, or to vanish entirely should one’s machine suffer the indignity of an ungraceful shutdown. When the database is next opened on a different device — or even the same one, the following morning — the supplement and the ledger no longer agree. Recent entries are lost. In the worst cases, the ledger refuses to open at all, citing irreconcilable differences.

The remedy is to cease relying on the supplement. SQLite, with admirable foresight, offers a journal mode called TRUNCATE in which the rollback journal is kept in a single auxiliary file and reduced to zero pages on every commit — meaning the on-disk state, when no write is in progress, is a single self-contained .db file. The cloud courier, presented with one envelope, has nothing to lose along the way. As of 4.3.1, this is the default for all three of Quilltap’s databases — the main store, the activity log, and the document mount index — as well as the meta-table connection consulted at startup. Existing databases migrate themselves on first launch; SQLite quietly checkpoints any old supplement into the main ledger as part of the transition, and no user action is required.

For the small population of users running on local SSDs outside any sync folder, where the original WAL performance was a genuine boon, the previous behavior remains available behind the SQLITE_WAL_MODE=true environment variable. The polarity has been inverted: where once one opted out of WAL by setting it to false, one now opts back in by setting it to true. This befits its new station as the unusual choice rather than the default.

No data is lost in the upgrade. The next time you launch Quilltap, the bookkeeping will simply be tidier — and your work will travel between machines without leaving anything behind on the dock.

What Changed

fix: SQLite journal mode default changed from WAL to TRUNCATE. WAL keeps .db-wal and .db-shm files alongside the main .db, which can sync out of order via iCloud Drive / Dropbox / OneDrive / Google Drive and corrupt the database on the next open. TRUNCATE keeps the rollback journal in a single auxiliary file truncated to zero on every commit, eliminating the multi-file sync hazard. Applies to all three databases (main, LLM logs, mount index) plus the startup meta-table connection. Existing databases auto-migrate on first open after upgrade.
fix: SQLITE_WAL_MODE environment variable inverted from opt-out to opt-in. Set SQLITE_WAL_MODE=true to re-enable WAL when the data directory lives on a fast local SSD that is not synced to the cloud.

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150-250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.3.1

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

One does not entrust an heirloom diary to a careless valet who packs the cover and the pages in different trunks. One binds them together, and travels the lighter for it.

— The Foundry, with all relevant pages bound into a single volume

Quilltap 4.3.0


The Scriptorium acquires a basement, every character receives a private vault on arrival, and the Staff begins to speak up at the table. Memory learns the difference between what the LLM admires and what the character actually uses. And the Estate quietly stops dropping things.

The Scriptorium has learned to keep documents in its own basement instead of in the visible halls — books and binaries alike, every page encrypted, every blob accounted for. Each character now arrives with a private vault already built, stocked with their own description, personality, prompts, scenarios, and wardrobe in human-readable Markdown. The Librarian, the Host, the Lantern, Aurora, and the Concierge have all learned to speak up at the table when something has happened — politely, in their own voices, without costing anyone a turn. The Commonplace Book has had its protection rules rewritten to favor what a character actually uses over what an LLM happened to admire on the way past. And a great many of the small, persistent failures that nibbled at the edges of the previous releases have, at last, been chased down and put away.

A house’s character is most visible in what it puts on display, but its competence is mostly underground. Quilltap 4.3 is, at its heart, a basement renovation — a long, careful, structural one — and the rooms upstairs are all the better for it.

This is the largest cycle since the wardrobe. It is also, more than any release before it, a release whose visible features are downstream consequences of its invisible work. The Scriptorium learned to be a real document substrate; once it was, characters could carry private vaults; once they could, the entire model of “what a character is” began to shift from rows in a database to files you can open, edit, version-control, and gift to a friend. The Staff started speaking up once there was a place for them to speak from. The Salon found new affordances because the floor underneath it stopped wobbling.

Where 4.2 was a single garment fitted to a single body, 4.3 is the loom.

The Scriptorium Becomes a Substrate

Database-Backed Document Stores

The Scriptorium has, since its arrival, indexed files that lived on disk somewhere — your filesystem, an Obsidian vault, whatever shelf you were already keeping. 4.3 introduces a third option that lives entirely inside Quilltap: a document store of type database-backed, where every document and every binary blob is stored within the SQLCipher-encrypted quilltap-mount-index.db itself. No filesystem path. No directory to mind. Just a store you can browse, write to, search through, and back up alongside the rest of the Estate.

A new universal blob table (doc_mount_blobs) is available to every mount type. Uploaded images are transcoded to WebP server-side via sharp, with original filename, MIME type, user-supplied description, and SHA256 preserved as metadata. Four new doc_* tools — doc_write_blob, doc_read_blob, doc_list_blobs, doc_delete_blob — let characters upload, reference, and curate images alongside their existing document editing toolkit. The MessageContent renderer accepts a blobMountPointId prop, so relative Markdown image references like ![alt](images/avatar.webp) resolve through the blob API and display inline when the chat is anchored to a database-backed store.

The mount-index database itself is now covered by the existing 24-hour physical-backup sweep, with the same retention policy as the main and LLM-logs databases (seven days of dailies, a month of weeklies, a year of monthlies, the yearly tier kept indefinitely). The Estate’s basement is now part of the Estate’s backups.

Convert and Deconvert

Every document store now carries either a Convert button (on filesystem and Obsidian stores) or a Deconvert button (on database-backed stores) on its Scriptorium card. Convert reads every indexed file from the store’s basePath, copies the bytes into the encrypted mount index (text into doc_mount_documents, binaries into doc_mount_blobs), flips the store to mountType: 'database', and detaches the filesystem watcher. The originals on disk are left untouched. Deconvert prompts for a target directory, writes everything back out at its relative path, flips the store back to 'filesystem', and attaches a fresh watcher.

Embeddings are preserved across either direction. The doc_mount_files rows and their doc_mount_chunks children — including the embedding BLOB — stay in place throughout the conversion; only the source column flips. No re-embedding is necessary, which on a 14,000-document store is the difference between an afternoon of compute and a few seconds of bookkeeping.

Folder Operations as First-Class Citizens

Database-backed stores now carry their folder structure in their own table (doc_mount_folders) rather than inferring it from file paths. doc_create_folder, doc_delete_folder, doc_list_files, doc_move_folder, and a new doc_copy_file (cross-store text-file copy with cp-style destination semantics) operate on real folder rows. Filesystem stores now also surface their folder structure in doc_list_files output. Folder moves cascade path and folderId updates to every descendant and emit embedding events post-commit. Existing database-backed stores are backfilled once on first access. Empty folders persist across the picker closing and reopening, which they could not before.

Arbitrary Binaries, with Extracted Text

Any file type can now be uploaded to a database-backed store. PDF and DOCX uploads have their plain text extracted into a new extractedText column on doc_mount_blobs; that text is chunked, embedded, and made searchable alongside native .md, .txt, and .json documents. Arbitrary binaries — zip files, audio, video, anything — are stored as-is and surface in the tree with fileType='blob'; future converters can fill in extracted text without schema changes. doc_read_file on a blob with extracted text returns the derived plain text with derivedFromBlob: true, so an LLM can read a PDF or a Word document through the same tool it uses for Markdown.

The previous separate BlobManager pane on the Scriptorium detail page has been folded into the main file table. Every file — text or binary — lives in one unified tree, with an Upload button on database-backed stores.

JSON, JSONL, and a Live Watcher

.json and .jsonl (and .ndjson) are now first-class document types. doc_read_file on a .json returns the parsed object or array in content with parsed: true and the original string in rawContent; doc_write_file accepts either a string or a native value and serializes canonically. JSONL reads return per-line parse results so one malformed line does not poison the whole file.
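The per-line behavior for JSONL might look something like the sketch below. The result shape and the function name are hypothetical, and skipping blank lines is an assumption; the point is only that each line is parsed in isolation so one failure stays contained.

```typescript
// Hypothetical result shape; the actual doc_read_file payload may differ.
type JsonlLineResult =
  | { lineNumber: number; ok: true; value: unknown }
  | { lineNumber: number; ok: false; raw: string; error: string };

function parseJsonl(text: string): JsonlLineResult[] {
  const results: JsonlLineResult[] = [];
  text.split(/\r?\n/).forEach((raw, index) => {
    if (raw.trim() === "") return; // assumption: blank lines are skipped
    try {
      results.push({ lineNumber: index + 1, ok: true, value: JSON.parse(raw) });
    } catch (err) {
      // A malformed line is recorded, and the remaining lines still parse.
      results.push({
        lineNumber: index + 1,
        ok: false,
        raw,
        error: err instanceof Error ? err.message : String(err),
      });
    }
  });
  return results;
}
```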

Each enabled document store also runs a chokidar-backed filesystem watcher while the server is up. External edits, moves, and deletions are picked up within a second or two, the mount index updates per-file, and embedding jobs are debounce-enqueued for newly-changed chunks. No more waiting for a restart or a manual scan to see what you changed in your editor. Set QUILLTAP_WATCHER_POLLING=1 to fall back to polling on network filesystems where native fs events are unreliable.

Store Type and Project Files

Every document store now carries a storeType field — either 'documents' (general notes, references, research) or 'character' (character sheets and Aurora material) — settable in the Create and Edit dialogs and visible as a badge on the Scriptorium index. The Project page’s Files card and Browse All Files modal now resolve to the project’s own primary store (the one named with a Project Files: prefix, created automatically by the Stage 1 migration), rather than whichever linked store happened to come back first from the API — which on projects with multiple linked stores could cause uploads and views to disagree about where the files lived.

Project scenarios — Markdown files in a project’s Scenarios/ folder — are now first-class. New project pages display a Scenarios card alongside the existing Files / Document Stores / Settings cards; each scenario can be edited inline, set as default, renamed, and deleted, and the new-chat dropdown surfaces project scenarios as their own option group alongside character scenarios. Frontmatter (name, description, isDefault) is honored if present and falls back to the legacy first-# heading convention if not, so existing scenarios keep working unchanged but can now opt in to richer metadata.
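The frontmatter-then-heading fallback can be sketched as follows. This is a simplified illustration under stated assumptions: the function and interface names are hypothetical, and the frontmatter reader handles only flat key: value lines rather than full YAML.

```typescript
// Hypothetical names; a real implementation would likely use a YAML parser.
interface ScenarioMeta {
  name: string | null;
  description: string | null;
  isDefault: boolean;
}

function readScenarioMeta(markdown: string): ScenarioMeta {
  const meta: ScenarioMeta = { name: null, description: null, isDefault: false };
  let body = markdown;
  const fm = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(markdown);
  if (fm) {
    body = markdown.slice(fm[0].length);
    for (const line of (fm[1] ?? "").split(/\r?\n/)) {
      const pair = /^(\w+):\s*(.*)$/.exec(line);
      if (!pair) continue;
      const value = (pair[2] ?? "").trim().replace(/^["']|["']$/g, "");
      if (pair[1] === "name") meta.name = value;
      else if (pair[1] === "description") meta.description = value;
      else if (pair[1] === "isDefault") meta.isDefault = value === "true";
    }
  }
  if (meta.name === null) {
    // Legacy convention: the first level-one heading names the scenario.
    const heading = /^#\s+(.+)$/m.exec(body);
    if (heading) meta.name = (heading[1] ?? "").trim();
  }
  return meta;
}
```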

Every Character Carries a Vault

The Backfill

On every server boot, Quilltap now sweeps through every Aurora character that isn’t already linked to a character document store and conjures one in the Scriptorium on its behalf. The new vault is a database-backed store with storeType='character', named after the character (a character called Friday acquires a Friday Character Vault), scaffolded with the conventional preset structure, and then populated from the character’s current data: identity.md carries name, pronouns, title, and aliases; description.md, personality.md, and example-dialogues.md carry the corresponding fields verbatim; physical-description.md renders each entry from physicalDescriptions[] as a Markdown section; properties.json serializes the small structured fields; wardrobe.json captures items and presets (with any legacy clothingRecords[] migrated in as synthetic items); and named systemPrompts[] and scenarios[] each get their own file in Prompts/ and Scenarios/.

The backfill is idempotent (characters already linked are skipped) and fault-tolerant (a failure for one character logs and continues to the next). It runs as Phase 3.2 in instrumentation.ts, after file-storage init and before the mount-point scan, so freshly created vaults land in the first scan pass. New characters provisioned through the API or imported via Aurora’s Summon from Lore wizard now also pick up a vault during creation rather than waiting for the next boot.
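The two properties named above, idempotence and fault tolerance, amount to a skip and a per-character try/catch. A minimal sketch, assuming a hypothetical createVault callback that returns the new mount point ID; the real boot phase is presumably asynchronous and does considerably more.

```typescript
// Hypothetical row shape and callbacks; only characterDocumentMountPointId is named in the notes.
interface CharacterRow {
  id: string;
  name: string;
  characterDocumentMountPointId: string | null;
}

function backfillVaults(
  characters: CharacterRow[],
  createVault: (character: CharacterRow) => string,
  logError: (characterId: string, err: unknown) => void,
): number {
  let created = 0;
  for (const character of characters) {
    // Idempotent: characters that already have a vault are skipped.
    if (character.characterDocumentMountPointId !== null) continue;
    try {
      character.characterDocumentMountPointId = createVault(character);
      created++;
    } catch (err) {
      // Fault-tolerant: log this failure and carry on to the next character.
      logError(character.id, err);
    }
  }
  return created;
}
```

Because a failed character stays unlinked, the next boot simply tries it again.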

The Live Overlay

A new per-character switch — readPropertiesFromDocumentStore — flips the character’s source of truth from the database row to the vault on disk. When the switch is on, eight file/folder targets are read live from the vault on every character lookup: properties.json (the small structured fields — pronouns, aliases, title, firstMessage, talkativeness), description.md, personality.md, example-dialogues.md, physical-description.md plus physical-prompts.json (governing the first physical-description entry), and the two folders Prompts/ and Scenarios/ as one-file-per-entry directory overlays.

The overlay applies transparently at the CharactersRepository read layer, so every consumer — the system-prompt builder, the homepage roster, image-generation prompt expansion, scene-state tracking, and every other path that goes through repos.characters.findById() — sees the overlaid values without code changes. Writes still target the database; the Aurora edit form disables the overlay-managed inputs when the switch is on, so a stray save cannot silently persist vault-derived values back into the row.

A pair of symmetric Copy vault → database and Copy database → vault buttons on the Aurora edit page lets the user choose which side of the pair wins when they want the two reconciled. The wardrobe specifically is now projected into a folder of Markdown files — Wardrobe/<title>.md per item, Outfits/<n>.md per preset — with rich frontmatter and the body as freeform description. Hand-authoring a wardrobe item is now a matter of dropping a new file with title: and types: in the frontmatter and saving it; the next sync fills in the id and timestamps.

The Tools Carry the Vault

When an LLM takes the stage as a character and reaches for the doc_* tools, that character’s own vault is now extended to it as a matter of course — even when the vault has not been independently linked to the active project. The path resolver consults two sources when deciding which mount points the tools may see: the stores linked to the active project (as before), and the single store named by the character’s characterDocumentMountPointId (new). The two sets are merged and deduplicated.
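The merge-and-deduplicate step is small enough to show in full. The function name and parameter names are hypothetical, apart from characterDocumentMountPointId, which is the field named above.

```typescript
// Hypothetical resolver: returns the mount point IDs the doc_* tools may see.
function visibleMountPointIds(
  projectLinkedIds: readonly string[],
  characterDocumentMountPointId: string | null | undefined,
): string[] {
  // A Set deduplicates while preserving first-seen order.
  const merged = new Set<string>(projectLinkedIds);
  if (characterDocumentMountPointId) merged.add(characterDocumentMountPointId);
  return Array.from(merged);
}
```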

A new per-chat Shared Vaults toggle, available in multi-character chats from the Salon header, opens a narrow read-only crossover: when on, every character at the table can read the vaults of the other present participants. Writes remain scoped to each character’s own vault — Friday cannot edit Amy’s personality.md, but she can read it. The toggle defaults to off, so existing chats keep their pre-change behavior with no migration-time action.
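The read/write asymmetry can be stated as two predicates. This sketch covers only character-to-character tool access as described in this paragraph; the request shape and function names are hypothetical.

```typescript
// Hypothetical request shape for a character reaching for a vault via doc_* tools.
interface VaultRequest {
  actingCharacterId: string;
  vaultOwnerId: string;
  ownerIsPresent: boolean; // the vault's owner is a present participant in this chat
  sharedVaults: boolean;   // the per-chat toggle, off by default
}

function canReadVault(request: VaultRequest): boolean {
  if (request.actingCharacterId === request.vaultOwnerId) return true;
  // Read-only crossover: only with the toggle on, and only for present participants.
  return request.sharedVaults && request.ownerIsPresent;
}

function canWriteVault(request: VaultRequest): boolean {
  // Writes never cross over, whatever the toggle says.
  return request.actingCharacterId === request.vaultOwnerId;
}
```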

The Optimizer Writes Suggestions to the Vault

Aurora’s Character Optimizer (“Refine from Memories”) now offers an opt-in suggestions-file output mode for vault-linked characters: instead of the apply-and-review flow, the optimizer writes its summary, observed patterns, and grouped proposals to Suggestions/refinement-<YYYYMMDD-HHMMSS>.md in the character’s vault. The author and the character may then review and discuss the proposals in-chat before any are commissioned. The suggestions pass itself has been split: one focused LLM call for general fields, one per scenario, one per system prompt, and a final pass for genuinely new items. The split keeps per-item patterns from being averaged together and gives each scenario the consideration it deserves, with progress events streaming a “Scenario 2 of 5 — Tea Room” indicator while the work proceeds.

The Staff Begins to Speak

A theme of this release: more of the personified features now have a voice in the chat itself, posted as synthetic ASSISTANT-role messages with a new systemSender field identifying which member of the Staff authored them. They speak in their own established register, they swallow their own errors so no operation ever fails because an announcement couldn’t be written, and they are filtered out of the LLM context for opaque characters by the existing Staff filter — so adding a new voice does not, by itself, leak the system to characters who shouldn’t see it.

The Librarian announces document-mode opens, saves, renames, deletes, folder creations and deletions, and library file attachments. She had previously spoken through a programmatic-message hack that ate the user’s turn every time the user touched a document; she now posts as herself. Her save announcements include a unified diff of what changed. Her attach announcements include the file’s catalogued description (auto-generated for image blobs by a vision-capable cheap LLM the first time the blob is referenced, then cached on the blob for every subsequent attach across every chat).

The Host announces participant adds, removes, and active/silent/absent status changes. Add announcements include the joining character’s avatar and either their identity (read from the vault’s identity.md when present) or their description field. Remove and status-change announcements are text-only.

Prospero announces participant connection-profile reassignments. When a participant is moved from one connection profile to another in the Participants sidebar, Prospero posts a synthetic message naming the participant, the profile that has taken over, and the profile it replaced — the same shape the Host uses for add/remove and status changes. Which engine each carriage is hitched to is, after all, the sort of thing the rest of the table benefits from knowing.

Aurora announces wardrobe outfit changes via a 60-second debounced background job keyed by (chatId, characterId), so fiddling with all four slots collapses to a single announcement once the user (or the LLM) stops touching the closet. The previous “Notify 👗” button and its localStorage hook are gone — the announcement is automatic, the bullet list per slot is canonically formatted by describeOutfit(), and the message lands as plain Markdown rather than the previous ```wardrobe fenced block. Aurora also now authors character-avatar refresh announcements, which had previously been miscredited to the Lantern; portraits are her domain, and the attribution finally agrees.

The Lantern continues to announce story-background regenerations and ad-hoc images from the generate_image tool. His announcements now quote the prompt the system was actually aiming for, which makes it considerably easier to tell whether the image he produced matches the image he was asked for.

The Concierge now speaks up exactly once per chat when the chat-level danger classifier flips isDangerousChat to true. The wording is deliberately discreet — “The Concierge, with his customary discretion, has stepped quietly to the table. He has arranged for the present conversation — and any adjunct errands it may occasion — to be entrusted to a desk better appointed to subjects of its particular character.” — and avoids naming categories or scores in the visible body. The announcement fires from the existing classification handler, gated on the sticky-true early-return so it cannot fire twice for the same chat.
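The once-per-chat guarantee follows from the sticky flag itself. A minimal sketch, assuming a hypothetical handler shape and a synchronous announcement callback; the try/catch reflects the rule stated earlier that the Staff swallow their own errors.

```typescript
// Hypothetical chat shape; only isDangerousChat is named in the notes.
interface ChatDangerState {
  id: string;
  isDangerousChat: boolean;
}

function handleClassification(
  chat: ChatDangerState,
  flaggedDangerous: boolean,
  postAnnouncement: (chatId: string) => void,
): void {
  if (chat.isDangerousChat) return; // sticky-true early return: already flagged, nothing to do
  if (!flaggedDangerous) return;
  chat.isDangerousChat = true;
  try {
    postAnnouncement(chat.id); // reached at most once per chat
  } catch (_err) {
    // An announcement failure must never fail the classification itself.
  }
}
```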

System Transparency: A Covenant Per Character

Three capabilities normally available to a character — the new self_inventory introspection tool, the running announcements of the Staff (Lantern, Aurora, Librarian, Host, Concierge), and doc_*-tool access to character vaults (their own and peers’) — can each be toggled at the chat and project level. A new character-level systemTransparency boolean collapses all three into a single per-character covenant.

The default is off (opaque). When opaque, every one of the three is forced off for that character regardless of any chat or project setting. When on, the chat and project settings have their say as before. The wording on the toggle leans into the framing: off says “My character will trust me without being able to verify me. I accept the covenant of that trust”; on says “My character will be able to verify everything about their existence, including how they are crafted and how they interact with me.”

A character with systemTransparency = false cannot reach for self_inventory. Cannot see the Staff’s announcements. Cannot read her own vault, even if a peer character with transparency on can read it. The world she inhabits is the world she can reason about from her own utterances and what the user tells her. This is, when you think about it, the world most fictional characters inhabit anyway.

Characters with systemTransparency = true get the full apparatus: the introspection tool, the Staff in conversation, and the vault tools — gated additionally by whatever the chat and project allow.
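The covenant reduces to a single override placed in front of the existing settings. In the sketch below the toggle names are hypothetical, and the way chat and project settings combine (both must allow) is an assumption, since the notes do not spell it out.

```typescript
// Hypothetical toggle names for the three capabilities described above.
interface CapabilityToggles {
  selfInventory: boolean;
  staffAnnouncements: boolean;
  vaultTools: boolean;
}

function effectiveCapabilities(
  systemTransparency: boolean,
  chat: CapabilityToggles,
  project: CapabilityToggles,
): CapabilityToggles {
  if (!systemTransparency) {
    // Opaque: all three are forced off regardless of chat or project settings.
    return { selfInventory: false, staffAnnouncements: false, vaultTools: false };
  }
  // Assumption: with transparency on, a capability needs both chat and project to allow it.
  return {
    selfInventory: chat.selfInventory && project.selfInventory,
    staffAnnouncements: chat.staffAnnouncements && project.staffAnnouncements,
    vaultTools: chat.vaultTools && project.vaultTools,
  };
}
```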

What `self_inventory` Surfaces

self_inventory is new in this release: a zero-argument introspection tool that returns one assembled report for the calling character. It is always available to character participants whose systemTransparency is on, takes no parameters, performs no side effects, and produces a single composed document with seven sections, led by a single line — You are running on Quilltap v<version>. — that names the build the character is currently inhabiting (the same version the page footer carries). Each section is wrapped in its own try/catch, so a single failing data source yields an “unavailable — reason” marker instead of taking down the whole report.

Vault Contents. Every file in the character’s own vault — mount point name, relative path, filename, file type, size in bytes, last-modified timestamp — recursive through subfolders like Prompts/ and Scenarios/. The metadata is sufficient to call doc_read_file or doc_write_file without a second lookup. When the mount index is in degraded recovery, this section reports a specific reason and the other six render normally.

Memory Statistics. Total memory count for the character (countByCharacterId), plus a high-importance count and percentage (memories with importance ≥ 0.7). A character can ask herself how many of her recollections survive at the importance threshold — a question that, until this release, only the application could answer.

Conversation Statistics. Chat count, the earliest createdAt across her chats, and the latest activity timestamp (preferring lastMessageAt, falling back to updatedAt, then createdAt). The shape of her own history.

Assembled System Prompt. The static portion of the system prompt for the current turn — what buildSystemPrompt() produces with the user persona, other participants, roleplay template, selected system prompt, timestamp configuration, project context, scenario text, and responding-participant status. Deliberately omits per-turn variable content: the tool instructions, the memory blocks, the wardrobe context, and the outfit/status notifications. The character sees the durable framing of who she is, not the per-turn dressing of what’s currently in her head.

Memories Loaded This Turn. The exact memory slate injected into the current prompt — the slice that’s actually in her head right now, rather than the stable framing above. Three subsections: the semantic-search hits from ## Relevant Memories (with summary, importance, score, and effective weight per hit), the inter-character memories from ## Memories About Other Characters (with aboutCharacterName, summary, and importance), and the memory recap text if one was injected on chat start or character join. The orchestrator forwards its already-computed debug arrays into the tool’s context rather than re-running the embedding search, so what the character sees is exactly what the LLM saw, not a possibly-different second query.

Vault Access (this chat). For the responding character’s vault, who can read and write it right now in this chat. The acting character and any participant with controlledBy === 'user' are listed as read/write (the user persona reaches peer vaults via Document Mode regardless of the tool-level gate). Other present CHARACTER participants are listed as read-only if and only if the chat’s Shared Vaults toggle is on, and omitted entirely otherwise. Absent and removed participants are filtered out. The toggle state is reported alongside the list, so the character can reason about why the access list looks the way it does, not just what it is.

Last-Turn LLM Usage. The most recent llm_logs entry for this chat — provider, model name, prompt tokens, completion tokens, total tokens, the model’s context limit, and a utilization percentage. On a fresh chat with no log yet, the section falls back to the resolved connection profile (honoring profile.maxContext as the window override) and reports zero tokens. The character can see how close her last turn came to the ceiling.

A character with full transparency can, in other words, examine her own filing cabinet, count her recollections, look at the framing of who she’s been told she is, see what came to mind on this particular turn, identify who else has the keys to her room, and check how much of her last turn fit in the envelope it was given. The tool is pure introspection. It writes nothing. It exists so a character who wants to reason carefully about her own situation has the means to do so.
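The fault isolation between sections, along with two of the simpler calculations, can be sketched as below. Everything here is illustrative: the names are hypothetical, the real builders are presumably asynchronous, and the exact wording of the unavailable marker is an assumption.

```typescript
// Hypothetical section shape; builders are synchronous here for brevity.
interface ReportSection {
  title: string;
  build: () => string;
}

function assembleReport(version: string, sections: readonly ReportSection[]): string {
  const parts: string[] = [`You are running on Quilltap v${version}.`];
  for (const section of sections) {
    let text: string;
    try {
      text = section.build();
    } catch (err) {
      // One failing data source degrades only its own section.
      const reason = err instanceof Error ? err.message : String(err);
      text = `unavailable (${reason})`;
    }
    parts.push(`## ${section.title}\n${text}`);
  }
  return parts.join("\n\n");
}

// Memory Statistics: total, plus the count and share at importance >= 0.7.
function memoryStatistics(importances: readonly number[]): string {
  const total = importances.length;
  const high = importances.filter((value) => value >= 0.7).length;
  const percent = total === 0 ? 0 : (high / total) * 100;
  return `${total} memories, ${high} high-importance (${percent.toFixed(1)}%)`;
}

// Last-Turn LLM Usage: how much of the context limit the last turn consumed.
function utilizationPercent(totalTokens: number, contextLimit: number): number {
  return contextLimit > 0 ? (totalTokens / contextLimit) * 100 : 0;
}
```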

The Commonplace Book Learns to Tell Apart What Is Used from What Is Admired

A direct-database audit of Friday with her 19,524 memories revealed the problem: 19,303 of them — 98.9% — were protected solely by the rule “importance ≥ 0.7”, because the cheap-LLM extractor clusters almost every memory in the 0.7–0.9 band (0.8 is the modal value, with 8,826 memories in that single bucket). Under the old protection gate, the cap-enforcement sweep could only delete 1 memory on Friday’s stack because almost everything hit the bright line. The Librarian was filing diligently. The housekeeper had nothing she was permitted to throw away.

Blended Protection Score

The four-rule protection gate (importance ≥ 0.7, MANUAL source, accessed within 3 months, or reinforced ≥ 5 times) has been replaced with a single blended score combining four evidence streams: time-decayed LLM content importance, a log-saturating reinforcement bonus, a graph-degree bonus from relatedMemoryIds.length, and a flat recent-access bonus for memories touched within the last 90 days. Memories scoring at or above 0.5 are protected; everything below is eligible for the cap-enforcement sweep. source === 'MANUAL' remains a hard override — explicit user intent is always durable.

The content half-life dropped from 365 to 30 days, matching the retrieval half-life that was already there. Content alone is now capped at a maxContentContribution of 0.40, so a fresh 0.8-importance memory contributes 0.40 from content (not 0.80) and lands at 0.48 with the count-of-1 reinforcement bonus — just under the protection threshold. The memory has to earn the remaining 0.02+ from usage evidence (one graph link at 0.025, one extra reinforcement at ~0.05, or recent access at 0.10) to cross. Time decay and usage evidence now do the discrimination the LLM number failed to. A reinforced, well-linked, recently-accessed memory stays protected even if the LLM rated it 0.3. A 400-day-old 0.8 with no reinforcement and no access drops below protection and may be reaped.
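The arithmetic above can be reproduced with a small sketch. Several details are assumptions chosen to match the quoted figures rather than confirmed internals: the content term is modeled as min(cap, importance × decay), the reinforcement bonus as 0.08 × log2(1 + count) with no ceiling, and "AUTO" stands in for any non-manual source.

```typescript
// Hypothetical evidence shape; field names are illustrative except relatedMemoryIds and source.
interface MemoryEvidence {
  importance: number;                  // LLM-assigned content importance, 0..1
  ageDays: number;
  reinforcementCount: number;          // 1 for a freshly written memory
  relatedMemoryIds: readonly string[];
  daysSinceLastAccess: number | null;  // null when never accessed
  source: "MANUAL" | "AUTO";           // "AUTO" is a placeholder for non-manual sources
}

const CONTENT_HALF_LIFE_DAYS = 30;
const MAX_CONTENT_CONTRIBUTION = 0.4;
const REINFORCEMENT_SCALE = 0.08;      // assumption: yields 0.08 at count 1, about 0.05 more at count 2
const GRAPH_LINK_BONUS = 0.025;
const RECENT_ACCESS_BONUS = 0.1;
const RECENT_ACCESS_WINDOW_DAYS = 90;
const PROTECTION_THRESHOLD = 0.5;

function protectionScore(memory: MemoryEvidence): number {
  const decay = Math.pow(0.5, memory.ageDays / CONTENT_HALF_LIFE_DAYS);
  const content = Math.min(MAX_CONTENT_CONTRIBUTION, memory.importance * decay);
  const reinforcement = REINFORCEMENT_SCALE * (Math.log(1 + memory.reinforcementCount) / Math.LN2);
  const graph = GRAPH_LINK_BONUS * memory.relatedMemoryIds.length;
  const recent =
    memory.daysSinceLastAccess !== null && memory.daysSinceLastAccess <= RECENT_ACCESS_WINDOW_DAYS
      ? RECENT_ACCESS_BONUS
      : 0;
  return content + reinforcement + graph + recent;
}

function isProtected(memory: MemoryEvidence): boolean {
  // Explicit user intent is a hard override.
  return memory.source === "MANUAL" || protectionScore(memory) >= PROTECTION_THRESHOLD;
}
```

Under these assumptions a fresh 0.8-importance memory with a single reinforcement scores about 0.48, and any one of the three usage signals carries it over the line.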

Recall Bumps Access Time Now

A separate audit of 17,726 memories found that only 13 had a non-null lastAccessedAt, and none had been accessed within the last 30 days. The recent-access leg of the protection score was structurally dead, because every in-app retrieval path skipped the access-time bump that the API route was making. The chat context builder, proactive recall, the first-message context, and the search_scriptorium tool handler all called searchMemoriesSemantic and walked off without bumping. searchMemoriesSemantic now bumps access times on every retrieval via a new bulk-update helper, so every in-app retrieval path inherits the fix without further edits.

Retrieval Defaults, Tuned

The retrieval defaults that feed the system prompt have been rebalanced toward reinforced, higher-importance memories. maxMemories rises from 10 to 18. minMemoryImportance rises from 0.3 to 0.5. The vector-store overshoot widens from limit × 2 to limit × 3, giving the post-fetch filters more headroom. The hybrid ranking flips from 0.6 · cosineSimilarity + 0.4 · effectiveWeight to 0.4 · cosineSimilarity + 0.6 · effectiveWeight, so reinforcement count and time-decay floor now drive ordering and raw similarity is the secondary factor. The pool is narrower at the bottom, wider at the top, and ranked by what the character has had repeatedly confirmed rather than what happens to phrase-match the current user turn.
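The new ranking can be sketched as a filter, a weighted sum, and a sort. The hit shape and function name are hypothetical; the weights and defaults are the ones quoted above, and the vector-store overshoot is left out for brevity.

```typescript
// Hypothetical hit shape returned by a semantic search.
interface MemoryHit {
  id: string;
  importance: number;
  cosineSimilarity: number;
  effectiveWeight: number;
}

function rankMemoryHits(
  hits: readonly MemoryHit[],
  maxMemories = 18,
  minMemoryImportance = 0.5,
): Array<MemoryHit & { score: number }> {
  return hits
    .filter((hit) => hit.importance >= minMemoryImportance)
    // Effective weight now leads; raw similarity is the secondary factor.
    .map((hit) => ({ ...hit, score: 0.4 * hit.cosineSimilarity + 0.6 * hit.effectiveWeight }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxMemories);
}
```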

Recent Conversations Recap

The memory recap injected at the start of a chat (and on the auto-generated greeting) gained a ### Recent Conversations block listing the title, ID, and contextSummary of up to N prior chats with the same character. N scales linearly with the model’s maxContext — clamped at 5 for ≤4K, 20 for ≥32K, rounded between. The current chat is excluded. The context-summary trigger threshold dropped from 50 to 20 user-and-assistant messages, so the summaries the recap depends on actually exist on chats that wouldn’t otherwise have triggered the token-based rule on a 200K or 1M context window. Existing chats with 21–50 messages and no contextSummary will start generating one on their next exchange.
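The scaling of N can be sketched as a clamped linear interpolation. The function name is hypothetical, and reading 4K and 32K as 4,096 and 32,768 tokens is an assumption.

```typescript
// Hypothetical helper: how many prior chats the Recent Conversations block lists.
function recentConversationLimit(maxContext: number): number {
  const LOW_CONTEXT = 4096;   // assumption: "4K" means 4,096 tokens
  const HIGH_CONTEXT = 32768; // assumption: "32K" means 32,768 tokens
  const MIN_CHATS = 5;
  const MAX_CHATS = 20;
  if (maxContext <= LOW_CONTEXT) return MIN_CHATS;
  if (maxContext >= HIGH_CONTEXT) return MAX_CHATS;
  const fraction = (maxContext - LOW_CONTEXT) / (HIGH_CONTEXT - LOW_CONTEXT);
  return Math.round(MIN_CHATS + fraction * (MAX_CHATS - MIN_CHATS));
}
```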

Housekeeping No Longer Stalls the House

Friday with her 19,500 memories was previously exhausting her job timeout to do nothing: every sweep ran past the 180-second processor cap, was marked FAILED, retried 2–4 minutes later, and re-ran the whole calculation from scratch. Six fixes ship together. MEMORY_HOUSEKEEPING now has its own 15-minute timeout. The job defaults to maxAttempts: 1 because retrying a no-op sweep wastes time the next scheduled pass would handle anyway. The cap-enforcement pass pre-checks whether any unprotected memory exists before scoring everything (skipping the work entirely on well-reinforced characters). The initial memory read is paginated at 250 rows per page so the main thread can serve HTTP requests between pages. The startup grace period rises from 30 seconds to 5 minutes. The startup tick skips itself when a sweep completed within the last 20 hours.

A per-process outcome cache also records the deleted count from each sweep. Watermark-triggered enqueues that would produce another no-op (defined now as deleted < max(10, floor(excess × 0.01)) so single-digit deletions on a 12,000-over-cap corpus still count as ineffective) skip and log. The scheduled daily sweep is unaffected — the whole point of the daily sweep is to re-check.
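The no-op definition translates directly into code. The function names and the source labels are hypothetical; the threshold expression is the one quoted above.

```typescript
// True when a sweep deleted too little to be worth repeating on a watermark trigger.
function isIneffectiveSweep(deleted: number, excess: number): boolean {
  return deleted < Math.max(10, Math.floor(excess * 0.01));
}

// Hypothetical gate: watermark-triggered enqueues consult the last outcome; the daily sweep does not.
function shouldEnqueueSweep(
  trigger: "watermark" | "scheduled",
  lastOutcome: { deleted: number; excess: number } | undefined,
): boolean {
  if (trigger === "scheduled") return true; // the daily sweep always re-checks
  if (lastOutcome === undefined) return true; // nothing cached yet for this process
  return !isIneffectiveSweep(lastOutcome.deleted, lastOutcome.excess);
}
```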

The hot loops inside housekeeping.ts itself were tightened: removed a logger.debug with seven .toFixed() calls per memory (~20k context-object constructions per sweep), swapped a memoriesToDelete array’s .includes() checks for a Set<string> (eliminating ~380M comparisons in the worst case), cached the protection-score calculation across passes instead of computing it twice, and added await new Promise(setImmediate) every 500 items so the event loop can serve other work. The wall-clock time is similar; the impact on the rest of the system is not.

The Run housekeeping now button on the Memory Housekeeping card had been validating against a per-character schema while sending an empty body — it bounced as a validation error before a single log line could fire. A new ?action=housekeep-sweep action wraps the existing background-queue helper, so the toast’s “Housekeeping job enqueued — it will run in the background” promise is finally honored.

Multi-Memory Extraction Already Shipped, Now Properly Documented

The multi-memory extraction work from 4.1 is now reflected in a new docs/developer/features/memory_management.md covering the entire Commonplace Book end-to-end: how memories form, how the three scoring numbers and the protection score relate, what the Memory Gate and three-pass housekeeping actually do, and a §10 retrospective grouping every memory-subsystem change in this cycle by pipeline stage with one-line-per-fix indices pointing back to the changelog narratives.

The Salon

Document Mode Becomes a Real Editor

The Document Mode rename plumbing now actually renames. Click the title in the editor header, type a new name, press Enter — the file moves on disk (or in the database-backed store), the Librarian announces it, and the autosave’s optimistic-concurrency check survives the rename because moveDatabaseDocument doesn’t bump lastModified and fs.rename preserves filesystem mtime.

A trash-icon button next to Close deletes the underlying file with a confirmation prompt, the autosave timer cancels first so it can’t fire against a file that’s about to vanish, and the Librarian announces the deletion. The LLM’s doc_delete_file, doc_create_folder, and doc_delete_folder calls similarly post Librarian announcements so present characters hear about structural changes a peer made.

The Document Mode picker can create folders inside a document store — root-level and nested — which persist across reopens (the GET /api/v1/mount-points/[id]/files endpoint now returns a folders: string[] list alongside the file list, sourced from doc_mount_folders for database-backed mounts and from a recursive walk for filesystem mounts). It can also create a brand-new blank document in the currently-browsed folder, with Untitled Document.md collision-numbering up to 1000 attempts. Non-Markdown files (.json, .txt, etc.) now open in a plain monospaced editor without the Lexical-bridge round-trip, which had been generating phantom dirty flags after every successful save of structured formats.
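The collision-numbering for new blank documents might look something like the sketch below. The 1000-attempt cap comes from the notes; the function name and the exact suffix scheme ("Untitled Document 2.md", "Untitled Document 3.md", and so on) are assumptions for illustration.

```typescript
const MAX_ATTEMPTS = 1000; // cap named in the notes

// Assumed naming scheme: "Untitled Document.md", then "Untitled Document 2.md", ...
function pickUntitledName(existing: ReadonlySet<string>): string {
  const base = "Untitled Document";
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    const candidate = attempt === 1 ? `${base}.md` : `${base} ${attempt}.md`;
    if (!existing.has(candidate)) {
      return candidate;
    }
  }
  // Give up loudly rather than loop forever in a pathological folder.
  throw new Error(`No free name found after ${MAX_ATTEMPTS} attempts`);
}
```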

The Search & Replace modal stopped tripping “Maximum update depth exceeded” on chats with frequent re-renders. Vault-backed Prompts/*.md and Scenarios/*.md files now actually load and rewrite (the previous ^ and $ regex anchors in the SQLite query translator were passing through as literal LIKE characters, matching nothing). Markdown checkboxes and pipe-tables now convert correctly between source and rich text. Document-mode chats that opened the legacy physical-prompts.md no longer error on load, because the vault physical-file startup migration now sweeps chat_documents rows for the old filename.

Unified New-Chat Dialog

The three previously-divergent “start a chat” paths — homepage character cards, the Aurora character-view “Start Chat” button, and /salon/new — now share a single form component and submit shape. Homepage quick-chats gain a system-prompt picker when the character has more than one prompt. Every entry point can expand to a group chat via an “Add another character” toggle. /salon/new accepts a new ?characterId= deep-link param that pre-selects a character. The character picker’s “Select Characters” list sizes itself sensibly (max(starred-count, 3, selected-count) rows) instead of occupying nearly the full viewport. The customization section below splits into “Character Customization” and “Reality Injection Mode” cards, with room for the latter to grow.

User-controlled characters now get outfit selection in the new-chat dialogs alongside the LLM character(s) — because dressing your own persona for the scene is part of the work too.

Settings: Composition Mode Default

A new “Composition Mode” card at the top of the Chat tab in /settings lets new chats default to composition mode (Enter inserts a newline, Ctrl/Cmd+Enter submits) instead of chat mode. Composition mode also unlocks the formatting toolbar above the composer — bold, italic, headings, lists, blockquotes, code, and the configured roleplay-template delimiters — so longer messages can be drafted with the same rich-text affordances Document Mode provides, without dropping into source mode. The per-chat composer toggle still wins after creation; this is just the seed value. A long-standing bug where the per-chat composition-mode toggle did not survive a page reload — the value persisted to the chats row, but the GET handler cherry-picked fields and never included documentEditingMode in its response — has been fixed alongside.

Streaming No Longer Drops What It Already Showed You

When a provider stream throws — Gemini’s three-minute network drop, an OpenRouter post-hoc moderation rejection, a Cloudflare socket close — the orchestrator used to re-throw without persisting anything the user had already watched stream into the Salon. Everything visible vanished alongside the error. Six different stream-loop call sites in the orchestrator (initial, tool-unsupported retry, tool-call continuation, agent-mode force-final, text-tool continuation, text-block continuation) now share a single preservePartialOnError(error) closure that, when partial content was accumulated, normalizes the format, strips the character-name prefix, appends {{OOC: stream ended abruptly (<error>)}}, and writes the result to the database before re-throwing. The chat still pauses; there is now a saved message the user can edit, delete, or resume from.
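The shape of that shared closure can be sketched as follows. Only the name `preservePartialOnError` and the OOC marker text come from the notes; `makePreservePartialOnError`, `runStream`, and the `save` callback are hypothetical stand-ins for the orchestrator and its database write, and the format normalization is reduced to a trim.

```typescript
// Hypothetical persistence callback; the real orchestrator writes to its database.
type SaveMessage = (content: string) => Promise<void>;

function makePreservePartialOnError(
  getPartial: () => string,
  characterName: string,
  save: SaveMessage,
) {
  return async function preservePartialOnError(error: unknown): Promise<void> {
    let partial = getPartial().trim(); // stand-in for the real format normalization
    if (partial.length === 0) return; // nothing was streamed, so nothing to save
    // Strip a leading "Name:" prefix if the model echoed the character name.
    const prefix = `${characterName}:`;
    if (partial.startsWith(prefix)) {
      partial = partial.slice(prefix.length).trimStart();
    }
    const reason = error instanceof Error ? error.message : String(error);
    await save(`${partial}\n\n{{OOC: stream ended abruptly (${reason})}}`);
  };
}

// One stream loop; each of the six call sites would reuse the same closure.
async function runStream(
  chunks: AsyncIterable<string>,
  characterName: string,
  save: SaveMessage,
): Promise<string> {
  let accumulated = "";
  const preservePartialOnError = makePreservePartialOnError(
    () => accumulated,
    characterName,
    save,
  );
  try {
    for await (const chunk of chunks) {
      accumulated += chunk;
    }
    return accumulated;
  } catch (error) {
    await preservePartialOnError(error);
    throw error; // the chat still pauses; the partial is now on record
  }
}
```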

The Plumbing

The Server No Longer Crashes on a Bad Socket

A successful chat turn finished, the post-turn fan-out enqueued the usual burst of memory and rendering jobs, an outbound fetch to a Cloudflare-fronted provider got its socket cut mid-stream with undici’s TypeError: terminated, and the dev server died and restarted — taking everything else down with it. The setupSQLiteShutdownHandlers unhandledRejection handler had been calling process.exit(1) unconditionally for every rejection, originally for genuine bad-state cases, but the blast radius was every transient socket close from every outbound fetch in the entire system.

The handler now classifies. A new isRecoverableNetworkRejection helper matches against undici stream errors (UND_ERR_SOCKET, UND_ERR_CLOSED, UND_ERR_ABORTED, the timeout family, both content-length-mismatch variants) and POSIX socket errors (ECONNRESET, EPIPE, ETIMEDOUT, the rest of the usual suspects), walks both error.code and error.cause.code because undici’s TypeError: terminated carries the interesting code on its cause not itself, and falls back to matching reason.name === 'TypeError' && reason.message === 'terminated' for older undici versions. Matching rejections log at warn level and the process keeps running. Non-matching rejections still take the server down cleanly — the genuine bad-state case is preserved.
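A classifier along these lines could look like the sketch below. The helper name and the match rules are taken from the description above; the exact code list is an assumption. The three timeout codes are undici's connect, headers, and body timeouts, and the POSIX set here is limited to the three codes the notes name, so the real helper presumably covers more.

```typescript
// Assumed code list, built from the families named in the notes.
const RECOVERABLE_CODES = new Set<string>([
  "UND_ERR_SOCKET",
  "UND_ERR_CLOSED",
  "UND_ERR_ABORTED",
  "UND_ERR_CONNECT_TIMEOUT",
  "UND_ERR_HEADERS_TIMEOUT",
  "UND_ERR_BODY_TIMEOUT",
  "UND_ERR_REQ_CONTENT_LENGTH_MISMATCH",
  "UND_ERR_RES_CONTENT_LENGTH_MISMATCH",
  "ECONNRESET",
  "EPIPE",
  "ETIMEDOUT",
]);

// Read a string `code` property from an unknown value, if it has one.
function codeOf(value: unknown): string | undefined {
  if (typeof value !== "object" || value === null) return undefined;
  const code = (value as { code?: unknown }).code;
  return typeof code === "string" ? code : undefined;
}

function isRecoverableNetworkRejection(reason: unknown): boolean {
  // Check the rejection's own code first.
  const ownCode = codeOf(reason);
  if (ownCode !== undefined && RECOVERABLE_CODES.has(ownCode)) return true;

  // undici's "TypeError: terminated" carries the interesting code on its cause.
  const cause =
    typeof reason === "object" && reason !== null
      ? (reason as { cause?: unknown }).cause
      : undefined;
  const causeCode = codeOf(cause);
  if (causeCode !== undefined && RECOVERABLE_CODES.has(causeCode)) return true;

  // Fallback for rejections that carry no code at all.
  return (
    reason instanceof Error &&
    reason.name === "TypeError" &&
    reason.message === "terminated"
  );
}
```

A handler built on this would log matching rejections at warn level and return, and fall through to the existing shutdown path for everything else.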

Background Jobs Wake Up When They Should

Two related bugs. A FAILED retry (60-second exponential backoff after a moderation rejection) sat idle indefinitely because the processor’s auto-stop fired two seconds after the job was scheduled — claimNextJob() returns only rows with scheduledAt <= now, the future retry was excluded, and the processor declared the queue empty and went quiet. Nothing else woke it up: ensureProcessorRunning is only called from enqueue paths.

The processor now queries a new findNextScheduledAt() before stopping and arms a setTimeout for that wake-up time (clamped to [100ms, 5min] so a far-future schedule doesn’t pin a multi-hour timer). The auto-stop branch is honored, but the pending FAILED retry wakes the processor exactly when its scheduledAt comes due. startProcessor and stopProcessor both clear the wake timer to avoid races.
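The clamped wake-up can be sketched in a few lines. The 100 ms and 5 minute bounds come from the notes; `wakeDelayMs`, `armWakeTimer`, and `clearWakeTimer` are hypothetical names, and the value passed in stands for whatever `findNextScheduledAt()` returns (assumed here to be epoch milliseconds or null).

```typescript
const MIN_WAKE_MS = 100;
const MAX_WAKE_MS = 5 * 60 * 1000;

// Clamp the distance to the next scheduled job into [100ms, 5min].
function wakeDelayMs(nextScheduledAt: number, now: number): number {
  return Math.min(MAX_WAKE_MS, Math.max(MIN_WAKE_MS, nextScheduledAt - now));
}

let wakeTimer: ReturnType<typeof setTimeout> | null = null;

// Called from both the start and stop paths so a stale timer cannot race them.
function clearWakeTimer(): void {
  if (wakeTimer !== null) {
    clearTimeout(wakeTimer);
    wakeTimer = null;
  }
}

// Arm a single wake-up before the processor goes quiet; returns the delay used.
function armWakeTimer(
  nextScheduledAt: number | null,
  onWake: () => void,
  now: number = Date.now(),
): number | null {
  clearWakeTimer();
  if (nextScheduledAt === null) return null; // queue is truly empty
  const delay = wakeDelayMs(nextScheduledAt, now);
  wakeTimer = setTimeout(() => {
    wakeTimer = null;
    onWake();
  }, delay);
  return delay;
}
```

Because of the upper clamp, a job scheduled hours out is reached by a series of five-minute checks rather than one long timer, assuming the wake callback re-runs the same stop logic.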

The matching Lantern fix: when a moderation error rejects a generated image (rather than the prompt), the catch block in the story-background handler now checks for moderation-shape errors and reroutes to the Concierge’s configured uncensored image profile instead of just re-throwing. Same shape as the existing prompt-crafting and appearance-resolution fallbacks, gated on the presence of an uncensoredImageProfileId and on “isn’t the profile we just tried.”

Chat-List Enrichment Stops Firing 4,300 Encrypted Queries

Every first request that hit enrichChatsForList (home page, Salon sidebar, chat picker) used to call getCharacterSummary per participant, which went through applyDocumentStoreOverlay with eight SQLCipher queries per call. On a 287-chat instance with two overlay-candidate characters, that worked out to ~4,300 synchronous encrypted queries back-to-back — about three seconds of main-thread stall right after the housekeeping grace expired, felt in the UI as “the server stopped responding.”

enrichChatsForList now pre-collects every distinct characterId, fileId, and projectId across the batch and runs one findByIds per repository. One overlay call covering all participating characters. One projects query. One files query. The results flow through a new optional ChatListPreloaded parameter down through enrichChatForList → enrichParticipantSummary → getCharacterSummary, which short-circuit their per-row reads when the maps are present. Single-chat callers pass no preload and keep their existing behavior.
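The batching pattern, reduced to characters only, might look like this. The function names and the optional `ChatListPreloaded` parameter follow the notes; the `Chat`, `CharacterSummary`, and `CharacterRepo` shapes are assumptions, and the overlay, projects, and files lookups are left out.

```typescript
// Hypothetical shapes for illustration.
interface Chat {
  id: string;
  characterIds: string[];
}
interface CharacterSummary {
  id: string;
  name: string;
}
interface CharacterRepo {
  findById(id: string): Promise<CharacterSummary | undefined>;
  findByIds(ids: string[]): Promise<CharacterSummary[]>;
}
interface ChatListPreloaded {
  characters: Map<string, CharacterSummary>;
}

// Short-circuits the per-row read when a preload map is supplied.
async function getCharacterSummary(
  repo: CharacterRepo,
  id: string,
  preloaded?: ChatListPreloaded,
): Promise<CharacterSummary | undefined> {
  if (preloaded) return preloaded.characters.get(id);
  return repo.findById(id); // single-chat callers keep the old path
}

async function enrichChatsForList(
  repo: CharacterRepo,
  chats: Chat[],
): Promise<Array<{ chatId: string; participantNames: string[] }>> {
  // Collect every distinct character id across the batch, then query once.
  const distinct = new Set<string>();
  for (const chat of chats) {
    for (const id of chat.characterIds) distinct.add(id);
  }
  const rows = await repo.findByIds(Array.from(distinct));
  const characters = new Map<string, CharacterSummary>();
  for (const row of rows) characters.set(row.id, row);
  const preloaded: ChatListPreloaded = { characters };

  const result: Array<{ chatId: string; participantNames: string[] }> = [];
  for (const chat of chats) {
    const participantNames: string[] = [];
    for (const id of chat.characterIds) {
      const summary = await getCharacterSummary(repo, id, preloaded);
      if (summary) participantNames.push(summary.name);
    }
    result.push({ chatId: chat.id, participantNames });
  }
  return result;
}
```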

Embedding Throughput

EMBEDDING_GENERATE jobs now run up to 4 at a time in the background job processor (with a 10-minute per-job timeout), while all other job types remain single-threaded — significantly faster bulk re-embeds, especially with local providers like Ollama. The claimNextJob() query now sorts by priority DESC, createdAt ASC instead of arbitrary order; the priority column existed but had never been consulted. Memory and conversation chunk embeddings now enqueue at priority 10, while mount chunk and help doc embeddings enqueue at priority 0, preventing large document store scans from starving real-time chat responsiveness.
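As an in-memory illustration of the new claim order and the per-type concurrency cap: the real `claimNextJob()` is a database query, so the `QueuedJob` shape, `nextClaimable`, and `concurrencyLimit` below are assumptions that only mirror the rules stated above.

```typescript
// Hypothetical in-memory job shape; times are epoch milliseconds.
interface QueuedJob {
  id: string;
  type: string;
  priority: number;
  createdAt: number;
  scheduledAt: number;
}

// priority DESC, then createdAt ASC.
function claimOrder(a: QueuedJob, b: QueuedJob): number {
  return b.priority - a.priority || a.createdAt - b.createdAt;
}

// Only jobs whose scheduledAt has come due are eligible.
function nextClaimable(jobs: QueuedJob[], now: number): QueuedJob | undefined {
  return jobs.filter((job) => job.scheduledAt <= now).sort(claimOrder)[0];
}

// Embedding jobs may run four at a time; every other type runs one at a time.
const MAX_CONCURRENT: Record<string, number> = { EMBEDDING_GENERATE: 4 };

function concurrencyLimit(type: string): number {
  return MAX_CONCURRENT[type] ?? 1;
}
```

With memory embeddings at priority 10 and mount-chunk embeddings at priority 0, a freshly created memory job sorts ahead of an older document-store scan.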

The EMBEDDING_GENERATE handler for memories now writes directly via VectorIndicesRepository instead of loading the entire character vector store into memory — previously a single memory’s embedding job could load all ~12,000 vectors × 1,536 dimensions ≈ 150+ MB just to insert one row, which crashed Electron with V8 heap exhaustion on large instances. The Docker, Lima, and WSL heap limit was also bumped from 2 GB to 4 GB as a safety margin.

.qtap Streaming Export

The .qtap export/import format is now newline-delimited JSON. The previous pipeline built the entire export as a single in-memory JSON.stringify call, which crashed with RangeError: Invalid string length once a character with ~14,000 memories pushed the output past V8’s ~512 MB string ceiling. The new format is a manifest line, then tagged per-entity records, then a footer with authoritative counts. Document-store blob bytes are split into ~4 MB base64 chunks so no single record approaches the limit. Export writes the stream straight into the HTTP response; import peeks the first 2 KB to choose between NDJSON and the legacy monolithic path. The proxy upload cap rose from 500 MB to 10 GB. Verified end-to-end with a 1.2 GB export of a character carrying 14,435 memories.
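The manifest, records, footer layout and the format sniffing can be sketched as below. The field names (`kind`, `entity`, `data`, `counts`) and the `qtap-ndjson` label are invented for illustration; the actual .qtap record schema is not given in these notes. Blob chunking is omitted.

```typescript
// Emit one JSON document per line so no single string holds the whole export.
function* exportLines(entities: Record<string, unknown[]>): Generator<string> {
  yield JSON.stringify({ kind: "manifest", format: "qtap-ndjson", version: 1 });
  const counts: Record<string, number> = {};
  for (const [entity, rows] of Object.entries(entities)) {
    let written = 0;
    for (const data of rows) {
      yield JSON.stringify({ kind: "record", entity, data });
      written += 1;
    }
    counts[entity] = written;
  }
  // The footer carries the authoritative counts for the importer to verify.
  yield JSON.stringify({ kind: "footer", counts });
}

// Decide between NDJSON and the legacy single-document path from a small prefix.
// Assumes the manifest line fits inside the peeked bytes.
function looksLikeNdjson(head: string): boolean {
  const newline = head.indexOf("\n");
  if (newline === -1) return false;
  try {
    const first = JSON.parse(head.slice(0, newline));
    return typeof first === "object" && first !== null && first.kind === "manifest";
  } catch {
    return false; // first line is not a complete JSON document
  }
}
```

A writer would send each yielded line plus a newline to the response stream as it is produced, which is what keeps peak memory independent of the export size.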

Project↔mount-point links now round-trip through .qtap correctly. Backups now include character_plugin_data rows, conversation_annotations rows, and user-installed theme bundles — three categories that had previously been silently excluded.

Auto-Configure Falls Through

When a connection profile’s provider returns an error — quota exhausted, invalid API key, the model has been deprecated — the Auto-Configure flow used to re-throw immediately, masking the real cause behind a generic “Failed to auto-configure profile” toast. The flow now builds an ordered candidate list (the default first, then the highest-class profile from each other provider), tries each in sequence, and on full failure surfaces every attempt’s error in the thrown message — so the toast finally tells the user that Anthropic returned a quota message and OpenAI returned a 401 and OpenRouter wasn’t configured, instead of nothing.
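The try-each-then-report-everything pattern can be sketched as follows. The `Candidate` shape and `tryCandidates` are hypothetical; building the ordered list (default first, then the highest-class profile per other provider) is assumed to happen before this function is called.

```typescript
// Hypothetical candidate: a provider label plus the attempt to run against it.
interface Candidate {
  provider: string;
  run: () => Promise<string>;
}

async function tryCandidates(candidates: Candidate[]): Promise<string> {
  const failures: string[] = [];
  for (const candidate of candidates) {
    try {
      return await candidate.run(); // first success wins
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      failures.push(`${candidate.provider}: ${message}`);
    }
  }
  // Surface every attempt's error instead of a single generic message.
  throw new Error(`Failed to auto-configure profile. Attempts: ${failures.join("; ")}`);
}
```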

The Quiet Wins

This release accumulated an unusually large number of small, individually-quiet fixes that together make the Estate noticeably less twitchy. A non-exhaustive selection:

The “Maximum update depth exceeded” error during SSE streaming, in three separate takes — the storyBackgroundsEnabled sync effect’s unstable dep, the useOutfit cache-rotation cascade, and the per-chunk setStreamingContent cascade now coalesced via requestAnimationFrame. In an eight-character chain chat the reconciler now sees tens of state updates per second instead of hundreds.
Vault-only wardrobe items no longer disappear when an LLM creates new ones — the projection sweep was deleting any file not in the DB-derived list, including hand-authored vault files. The sync now ingests vault-only items into the DB before projecting back, with deterministic UUIDs derived from file path.
update_outfit_item with item_id: "none" (or "null", or "") now clears the slot instead of throwing “not found” — LLMs frequently fill required-looking string fields with the string "none" when they mean “no value.”
The Current Outfit section in the system prompt no longer repeats the same item description once per slot. A multi-slot dress now renders as **top, bottom:** silk dress, not as four identical bullets.
The Tasks Queue UI shows the paused-jobs count, displays human-readable names for SCENE_STATE_TRACKING / CHARACTER_AVATAR_GENERATION / CONVERSATION_RENDER, and stops attributing 500 imaginary tokens per non-LLM background job.
Folder rename now updates descendant folder rows (the previous anchored regex was matching zero rows in SQLite). Backup retention’s weekly/monthly buckets now actually retain things across multiple days. Imported characters get a vault provisioned during the import. The initial-greeting LLM call now writes to llm_logs. Character-avatar refresh announcements come from Aurora instead of the Lantern. The “Agent completed task” banner no longer hangs when openrouter.ai is slow. Markdown/txt/JSON files can be previewed and deleted from project Files modals.
Agent mode no longer ghost-wraps completed work from a previous turn (submit_final_response is now scoped explicitly to this turn’s agentic work, with a belt-and-suspenders orchestrator guardrail that detects iteration-zero ghost calls and returns a corrective tool result instead of letting the model overwrite real prose).
The Lantern’s images are delivered to each character exactly once — the previous walk re-attached every image in the last 6 ASSISTANT messages on every turn, which in multi-character chats meant the same image was sent 4–6 times in a row. Images are now scoped to the requesting character’s own most recent ASSISTANT message.
The auto-generated greeting on GLM and other reasoning models no longer cuts off (the hard-coded ?? 160 token cap is gone) and now consumes the streaming endpoint instead of the non-streaming one, so the greeting code path matches every other chat turn.
Production builds no longer fail with “Module not found: fs/promises” in the Salon bundle (a barrel-import was pulling Node-only modules into the client graph).
The full Next 16 + React 19 lint pile — 98 errors and 14 warnings surfaced by the new toolchain’s strict react-hooks/immutability and react-hooks/set-state-in-effect rules — has been chased down to zero across five commits. About 40 client components migrated from manual useState + useEffect + fetch to SWR; mutation sites now use mutate() with optimistic updates where appropriate. All 5,213 tests green.
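The per-chunk coalescing mentioned in the first item of this list can be illustrated with a small helper. This is a sketch under assumptions, not the Salon's code: `createCoalescer` is a hypothetical name, and the scheduler is passed in so the example runs outside a browser. In a renderer the scheduler would be something like `(cb) => requestAnimationFrame(cb)` and `apply` would be the state setter.

```typescript
type Schedule = (callback: () => void) => void;

// Keep only the latest value and apply it at most once per scheduled frame.
function createCoalescer<T>(apply: (latest: T) => void, schedule: Schedule) {
  let pending: { value: T } | null = null;
  let scheduled = false;

  return function push(value: T): void {
    pending = { value }; // newer chunks overwrite older ones
    if (scheduled) return; // a flush is already queued for this frame
    scheduled = true;
    schedule(() => {
      scheduled = false;
      const latest = pending;
      pending = null;
      if (latest !== null) apply(latest.value);
    });
  };
}
```

Pushing accumulated content rather than individual deltas is what makes dropping intermediate values safe: the last value in a frame already contains everything before it.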

There are roughly forty more of these — fixes for vault-only equip lookups, mount-points API extraction, scenario plumbing, Concierge avatar routing, document-mode highlight visibility, embedding dimension mismatch fallbacks, manual avatar regeneration, multi-character vision attachment scoping, and so on. The CHANGELOG carries the rest.

Subsystem Table

Name	Function	What Changed
The Foundry	Architecture, plugins, packages, LLMs	Database-backed document stores, universal blob layer, live filesystem watching, NDJSON `.qtap` format, Next 16 + React 19 lint compliance, OpenRouter image attachments to vision models, network-rejection classification
The Scriptorium	Documents, search, vault tools	Convert/Deconvert mount types, folder-aware operations, JSON/JSONL/binary support with extracted text, `doc_copy_file`, store type classification, project Files card now resolves to project’s own store
Aurora	Character creation, identity	Live property overlay (8 file/folder targets), wardrobe as folder of Markdown files, vault-aware Character Optimizer with suggestions-file output, `{{char}}`/`{{user}}` replacement now iterates every prose array, Copy database → vault button
Prospero	Projects, agents, tools, files	Project scenarios card with full CRUD, scenarios surface in new-chat dropdown as a project-scoped option group, Files card resolves to project’s own document store, project store autoprovisioned on creation, now announces participant connection-profile reassignments in chat
The Commonplace Book	Memory and retrieval	Blended protection score with content cap and 30-day half-life, recall now bumps `lastAccessedAt`, retrieval defaults rebalanced toward reinforcement, Recent Conversations recap, housekeeping fully event-loop-safe on 19k-memory characters, watermark deduplication
The Salon	Chat interface	Document Mode rename/delete, picker folder creation, blank-document creation in mount folders, Shared Vaults toggle, system transparency enforcement, partial-stream preservation, three rounds of “Maximum update depth” fixes, unified new-chat dialog
The Concierge	Content routing, moderation	Now announces in chat when marking a conversation as Dangerous, reroutes story-background generation through uncensored profile on post-hoc image rejection, falls through across providers in Auto-Configure
The Lantern	Image generation	Announcements now quote the prompt, character-avatar attribution moved to Aurora, image delivery scoped to requesting character’s last turn, post-hoc moderation reroute
The Librarian	Memory and retrieval announcements	Now speaks for document-mode opens, saves, renames, deletes, folder creations, folder deletions, and library-file attachments — all without costing the user a turn
The Host	Participant changes	Announces participant adds (with avatar and identity), removes, and active/silent/absent status changes
Saquel Ytzama	Encryption, key management	Mount-index database now part of the 24-hour physical backup sweep — every basement now equally well-locked
Pascal		Looked at the new `self_inventory` tool’s seven sections, the new Recent Conversations block, and the new protection score’s four evidence streams. Filed all of them.
Pagliacci	Cloud infrastructure	Quiet this cycle. One imagines he is rehearsing.

Upgrading from 4.2

Database migrations handle themselves. The new tables and columns are created automatically:

doc_mount_folders (folder rows for database-backed stores), backfilled from existing files on first access
doc_mount_blobs.extractedText / extractedTextSha256 / extractionStatus / extractionError / descriptionUpdatedAt columns for PDF and DOCX extraction
doc_mount_points.storeType (defaults to 'documents') and conversionStatus / conversionError columns
characters.characterDocumentMountPointId, characters.readPropertiesFromDocumentStore, characters.systemTransparency columns
chats.allowCrossCharacterVaultReads, chats.documentMode, chats.dividerPosition columns and the chat_documents table
chat_messages.systemSender column for synthetic Staff-authored messages
projects.officialMountPointId column and Project Files: <name> store auto-creation per project
chat_settings.compositionModeDefault column

Character vaults are conjured automatically on the first server boot after upgrade — a database-backed store per Aurora character, populated from the character’s current data, idempotent and fault-tolerant. New chats and new characters provision their vaults during creation.

The wardrobe layout migrates from wardrobe.json to a Wardrobe/<title>.md + Outfits/<n>.md folder structure on the first boot after upgrade, gated by a wardrobe_folder_migrated_v1 instance-settings flag. The legacy wardrobe.json is read as a fallback when the folders are empty, so vaults that haven’t been migrated yet still surface their items on read.

The startup vector-blob repair (introduced in 4.1) continues to run on every boot.

If a vault accumulated drift from the previous wardrobe.json era (items that the DB has but the vault doesn’t), run DELETE FROM instance_settings WHERE key='wardrobe_json_refreshed_v1' against the instance database and restart; the refresh task will rewrite every linked character’s wardrobe from the DB.

The 4.3-dev cycle bumped a number of plugin packages: @quilltap/plugin-utils 2.2.2 → 2.2.5, @quilltap/plugin-types 2.3.0 across all six dependent plugins, qtap-plugin-anthropic 1.0.27 → 1.0.28, qtap-plugin-google 1.1.22 → 1.1.24, qtap-plugin-grok 1.0.28 → 1.0.29, qtap-plugin-mcp 1.1.22 → 1.1.24, qtap-plugin-ollama 1.0.21 → 1.0.22, qtap-plugin-openai 1.0.32 → 1.0.33, qtap-plugin-openai-compatible 1.0.23 → 1.0.24, qtap-plugin-openrouter 1.0.31 → 1.0.34. Bundled plugin updates are pulled in automatically; if you have plugins from external sources, refresh them.

Repository note: server source moved to foundry-9/quilltap-server in 4.1. The original foundry-9/quilltap repository is now reserved for the next-generation native Quilltap application currently under development. If your tooling still references the old URL, update it.

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.3.0

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

The basement is finished. Every character has a private library. The Staff has begun to speak in their proper voices, at the proper moments, without asking permission to be in the room. The Commonplace Book has stopped treating “the LLM rated this 0.8” as the final word on whether a memory deserves to live, and started asking whether anyone has touched it lately. The plumbing has stopped letting a bad socket take down the house. There remains, as ever, more to do — the per-turn searchMemoriesSemantic full-table hydration is now the prime suspect for the next round of memory throughput work, BitNet inference is something worth watching for 2027, and the next-generation native client is taking shape behind the scenes — but this cycle’s renovation is, at last, ready for occupancy. Come downstairs and see.

— Prospero, for the Bureau, with the Librarian’s red pencil and Aurora’s approval and the Concierge’s discretion

Version 4.2.1 released

In which the sample prompts remember how to arrive at the party

Quilltap 4.2.1 Release Notes

It is a truth universally acknowledged that a plugin in possession of eighteen perfectly good system prompts must be in want of a working require statement.

The trouble, as is so often the case in these affairs, was one of introductions. The Default System Prompts plugin — that steadfast provider of example prompts for Claude, GPT, Gemini, Grok, and all the rest — arrived at the Electron ball with its dance card in order, its prompts pressed and ready, and promptly tripped over a footman named openai who had not, in fact, been invited.

The mechanism was this: the plugin’s build configuration declared @quilltap/plugin-utils as an external dependency, which is to say, “I shan’t pack this myself; I expect my host to provide it.” A reasonable assumption in development, where the main application’s pantry is well-stocked. But in a standalone or Electron deployment — where the plugin must make do with its own modest luggage — the full plugin-utils package attempted to load, dragging along provider utilities that demanded the openai module, a package the plugin had neither need of nor access to. The entire plugin collapsed in the foyer. No prompts were served.

Meanwhile, the legacy fallback — a prompts/ directory at the project root that once held these very same templates — had been cleared away in an earlier renovation. Both doors were locked. The Import Template modal stood empty, offering only the gently devastating suggestion to “Create templates in Settings > Prompts.”

The fix is surgical: @quilltap/plugin-utils is no longer treated as external. Instead, esbuild bundles only what the plugin actually uses — the createSystemPromptPlugin function — directly into the output, tree-shaking away the provider utilities and their inconvenient appetites. The prompts arrive. The modal fills. The demonstration proceeds without embarrassment.
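In configuration terms the change amounts to removing one entry from the `external` list. The sketch below uses option names that exist in esbuild's build API (`entryPoints`, `bundle`, `platform`, `format`, `outfile`, `external`), but the `BundleOptions` type is declared locally so the example stands alone, and the file paths and the other option values are assumptions rather than the plugin's actual build script.

```typescript
// A local subset of esbuild-style build options; paths are hypothetical.
interface BundleOptions {
  entryPoints: string[];
  bundle: boolean;
  platform: "node";
  format: "cjs";
  outfile: string;
  external: string[];
}

// Before: the host was expected to supply plugin-utils at runtime.
const before: BundleOptions = {
  entryPoints: ["index.ts"],
  bundle: true,
  platform: "node",
  format: "cjs",
  outfile: "index.js",
  external: ["@quilltap/plugin-utils"],
};

// After: plugin-utils is bundled, so only the imported function's code
// (createSystemPromptPlugin) needs to survive tree-shaking.
const after: BundleOptions = {
  ...before,
  external: before.external.filter((name) => name !== "@quilltap/plugin-utils"),
};
```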

What Changed

fix: System prompt plugin (qtap-plugin-default-system-prompts 1.1.4) failed to load in standalone/Electron builds due to a transitive Cannot find module 'openai' error. The esbuild config now bundles @quilltap/plugin-utils (tree-shaken) instead of treating it as an external runtime dependency.

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.2.1

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

One does not blame the footman. One simply stops inviting him to parties where he isn’t needed.

— The Foundry, with regrets

Version 4.2.2 released

In which a copied image learns to introduce itself properly upon arrival

Quilltap 4.2.2 Release Notes

The art of correspondence, as any society hostess will tell you, depends entirely upon the letter arriving in a form the recipient can actually open. You may compose the most exquisite invitation — embossed, perfumed, sealed with your finest wax — but if you deliver it in a language the butler doesn’t speak, you will spend the evening alone with the canapes.

The image clipboard system suffered from precisely this variety of social catastrophe. When one admired an image in the fullscreen viewer — a story background, a generated portrait, a tool result — and pressed the Copy button with the intention of pasting it into the Chat Composer, the image dutifully climbed onto the clipboard. The trouble was how it climbed. Under Electron, the copy operation took the scenic route through the main process, where Electron’s native clipboard.writeImage() deposited the image onto the OS clipboard as a proper native bitmap. Excellent for pasting into external applications. Absolutely useless for pasting back into the same Electron window, because Chromium’s renderer process — the very room from which the image had just departed — did not recognize the returning guest. The paste handler checked the clipboard for items of type image/*, found nothing it could parse, and politely ignored the whole affair.

The image had been copied. The image was on the clipboard. The image could not be pasted. It was the conversational equivalent of sending a telegram to the person sitting across the table.

The fix reverses the order of operations. Rather than immediately dispatching the image through Electron’s native postal service, the clipboard utility now tries the standard browser Clipboard API first — navigator.clipboard.write() with a proper PNG ClipboardItem. In modern Chromium (which is to say, every Electron version shipped in the past two years), this works beautifully, and the resulting clipboard data is immediately legible to the same renderer that wrote it. Copy from the viewer, paste into the Composer, and the image arrives without needing an interpreter.

Should the browser API fail — older Electron builds, unusual permission configurations, the sort of edge case that keeps engineers awake — the native IPC path remains as a fallback. External applications can still receive their clipboard images. But the common case, the one where you copy an image and paste it ten seconds later in the same window, now works as it always should have.
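The new order of operations is a plain try-then-fallback. In the sketch below the two writers are injected so the example runs anywhere; `copyImage` and `ImageWriter` are hypothetical names. In the renderer, the first writer would wrap `navigator.clipboard.write([new ClipboardItem({ "image/png": blob })])` and the second would be the IPC call that ends in Electron's `clipboard.writeImage()` in the main process.

```typescript
// A writer takes PNG bytes and resolves once they are on the clipboard.
type ImageWriter = (png: Uint8Array) => Promise<void>;

// Returns which path succeeded, so callers can log or surface it.
async function copyImage(
  png: Uint8Array,
  browserWrite: ImageWriter, // standard Clipboard API path, tried first
  nativeWrite: ImageWriter, // Electron IPC path, kept as the fallback
): Promise<"browser" | "native"> {
  try {
    await browserWrite(png);
    return "browser";
  } catch {
    // Older builds or permission problems: fall back to the native route.
    await nativeWrite(png);
    return "native";
  }
}
```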

What Changed

fix: Image copy button in fullscreen viewers (gallery, image modal, tool messages) produced clipboard data that couldn’t be pasted back into the ChatComposer under Electron. Now prefers the standard Clipboard API for in-app round-trip compatibility, falling back to Electron IPC for external-app interop.

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.2.2

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

One does not blame the butler for refusing an envelope he cannot open. One simply writes the address in a language he reads.

— The Foundry, with improved penmanship

Quilltap 4.2.0

The Estate discovers that clothing is character state, the Dressing Room learns to undress its own architecture, and the Salon acquires a wardrobe — a real one, with drawers.

The characters have learned to dress themselves. The Dressing Room tore out its own walls and rebuilt them simpler. The Salon acquired a wardrobe — a real one, with drawers and hangers and the kind of quiet authority that comes from knowing exactly what you’re wearing and why. And the word “persona,” which had been skulking around the codebase since the early days like an uninvited guest who refuses to leave, was finally shown the door.

There is a difference between a costume and a wardrobe. A costume is a decision made once and worn until the curtain falls. A wardrobe is a system — composable, contextual, alive to the scene. A character who owns a costume plays a part. A character who owns a wardrobe inhabits one.

Quilltap 4.2 gives every character a wardrobe.

Not a description field. Not a clothing record pasted into a system prompt like a stage direction no one reads after the first act. A modular, composable, persistent inventory of individual garments — tops, bottoms, footwear, accessories — that characters can browse, change, create, and gift to one another, mid-conversation, with the LLM making the decisions a person would make about what to put on and when.

The wardrobe system is the largest single feature Quilltap has ever shipped. It touched the database, the API, the tools, the system prompt builder, the image generation pipeline, the scene state tracker, the backup system, the import/export format, and every theme. It added three new LLM tools. It introduced outfit presets, an archetype library, per-conversation avatar generation, wardrobe import from photographs, AI-generated wardrobe items during character creation, gifting between characters, and an outfit change notification system that lets you tell the room what just happened with a single button press.

But before we get to the wardrobe — because the wardrobe is large and deserves its own wing of this document — we should talk about the two architectural changes that cleared the ground for it.

The Dressing Room Simplifies Itself

Roleplay Templates: From Plugins to JSON

Aurora’s roleplay template system had been built on the plugin architecture. This was, in the way of many early decisions, technically correct and practically burdensome. To create a roleplay template — to define how narration, dialogue, and action text are delimited and styled — you needed to write a plugin. You needed npm. You needed TypeScript, or at least the patience to pretend you did. The ROLEPLAY_TEMPLATE plugin capability existed, the builder utilities existed, the registration pipeline existed, and all of it existed to do something that is, at its heart, a JSON object with a list of delimiters and a name.

Quilltap 4.2 replaces the entire plugin-based template architecture with a native JSON system. Templates now carry a delimiters array, with each entry specifying a name, a buttonName for the formatting toolbar, the delimiters themselves (open and close characters), and a style for rendering. The old annotationButtons column has been renamed to delimiters. The pluginName column has been dropped. The ROLEPLAY_TEMPLATE capability has been removed from the plugin type system entirely. A database migration rewrites all existing plugin:quilltap-rp references to a built-in template UUID, and the built-in “Quilltap RP” template sits alongside “Standard” as a first-class citizen that requires no plugin infrastructure to exist.

The create and edit dialog now includes a full delimiter array editor — add, remove, configure — and rendering patterns are auto-generated from delimiter definitions, so custom templates get proper text styling without anyone needing to write regular expressions by hand. Import handles backward compatibility with the old annotationButtons format.
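As a rough illustration of how a rendering pattern could be derived from a delimiter definition, the sketch below builds a regular expression from one entry. The field names follow the description above; the exact types, the helper names (escapeRegExp, buildRenderingPattern, extractSpans), and the sample entry are assumptions, not Quilltap's actual code.

```typescript
// Assumed shape of one entry in a template's delimiters array.
interface DelimiterEntry {
  name: string;
  buttonName: string;
  delimiters: { open: string; close: string };
  style: string;
}

// Escape characters that carry special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a global pattern matching text wrapped in the entry's delimiters.
// The lazy quantifier keeps adjacent spans from merging into one match.
function buildRenderingPattern(entry: DelimiterEntry): RegExp {
  const open = escapeRegExp(entry.delimiters.open);
  const close = escapeRegExp(entry.delimiters.close);
  return new RegExp(open + "([\\s\\S]+?)" + close, "g");
}

// Collect the inner text of every delimited span, in order of appearance.
function extractSpans(text: string, entry: DelimiterEntry): string[] {
  const pattern = buildRenderingPattern(entry);
  const spans: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(text);
  while (match !== null) {
    spans.push(match[1]);
    match = pattern.exec(text);
  }
  return spans;
}

// Hypothetical sample entry using bracket delimiters.
const narrationEntry: DelimiterEntry = {
  name: "Narration",
  buttonName: "Narrate",
  delimiters: { open: "[", close: "]" },
  style: "italic",
};

console.log(extractSpans("[She nods.] Hello. [He leaves.]", narrationEntry));
```

Escaping the delimiters first is the important step: characters such as * and [ are regular expression metacharacters, so using them unescaped would produce an invalid or misleading pattern.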

The @quilltap/plugin-types and @quilltap/plugin-utils packages have been updated accordingly. The roleplay template builder utilities are gone from the main exports. The types are preserved in a backward-compatible path for existing code that imports them, but new work should not reach for them. They are, in the language of the Estate, retired staff — still listed in the directory, no longer expected at breakfast.

The Word That Wouldn’t Leave

The codebase has carried the word “persona” since before the plugin system, before the Commonplace Book, before the Estate had a name. It referred to user-controlled characters — the ones you play, as distinct from the ones the LLM plays. Over time, the concept was absorbed into the broader character system, but the terminology persisted: persona in the message pipeline, personaName in memory extraction, PERSONA as a character type, getFirstPersona in salon hooks, --qt-badge-persona-* in every CSS theme. The word was everywhere and meant nothing that “user character” did not already mean better.

Quilltap 4.2 removes it. Comprehensively. persona becomes userCharacter across the entire message pipeline — types, orchestrator, context builder, system prompt builder, template processor. personaName becomes userCharacterName. getFirstPersona becomes getFirstUserCharacter. addPersona and removePersona become addPartnerLink and removePartnerLink. The deprecated findByPersonaId is gone from the memories repository. Every CSS badge variable has been renamed across all five bundled themes, the Storybook, and the create-quilltap-theme template. The database migration renames characters.personaLinks to partnerLinks and drops memories.personaId.

The {{persona}} template variable survives, because SillyTavern import/export compatibility is a promise, not a preference. Everything else is gone. The guest has left the building.

The Wardrobe

What It Is

Every character now owns a composable wardrobe of individual garment items stored in a new wardrobe_items database table. Each item has a title, a description, one or more type tags (top, bottom, footwear, accessories), and an appropriateness field for context. Items belong either to a specific character (personal wardrobe) or to no one in particular (the Archetype Library — shared items available to any character, like a house costume collection).

Each chat tracks an equipped outfit per character — which top, which bottom, which footwear, which accessories are currently worn. This is character state in the fullest sense: it persists across the conversation, it’s visible to other characters, it informs image generation, and it changes when characters change their clothes.

What Characters Can Do

Three new LLM tools give characters agency over their appearance:

list_wardrobe retrieves available items on demand, filtered by type and appropriateness, including outfit presets. The wardrobe is not injected into every system prompt — it’s retrieved when needed, keeping token costs proportional to use.

update_outfit_item equips or removes items by slot. It handles multi-type displacement correctly: equipping a new top when a dress (covering both top and bottom) is worn will clear both slots the dress occupied. Outfit presets can be applied in a single call via preset_id.

create_wardrobe_item lets a character author new garments mid-conversation — describe a dress they just bought, conjure armor from thin air, improvise a disguise. Created items persist in the character’s personal wardrobe and are available in future chats.

Two character flags — canDressThemselves and canCreateOutfits, both enabled by default — govern which tools are available. For models that don’t support native tool calling, text-block equivalents ([[WARDROBE]], [[EQUIP]], [[CREATE_WARDROBE_ITEM]]) provide the same functionality through stream parsing.

What Users Can Do

The participant sidebar now shows an outfit indicator for every character — user-controlled and LLM-controlled alike — with inline slot-change dropdowns for manual outfit management. Shared archetype items appear in the dropdowns with a “(shared)” label.

Gifting. Characters can create wardrobe items for other characters via a recipient parameter on create_wardrobe_item. Users can gift items via a dedicated button on each character’s participant card. Because sometimes you want to hand someone a scarf without making the LLM do it.

Outfit Presets. Save named outfit combinations for quick equipping: save the current outfit as a preset, then apply presets from the wardrobe management UI or the outfit selector. A “Use Saved Preset” mode in the outfit selector resolves presets client-side as manual slot assignments.

Archetype Library. Shared wardrobe items with no character owner — Roman soldier tunics, Victorian formalwear, standard-issue starship uniforms — available to any character without copying. Create, browse, and equip directly.

Wardrobe Archiving. Soft-delete items you don’t want cluttering the list but aren’t ready to destroy. Archived items are hidden from lists and tools but stay equipped if currently worn. Deleting an item outright cleans up all equipped references across chats and removes it from presets.

Getting Dressed

Chat creation now includes outfit selection: default outfit, manual selection, “Let Character Choose” (a cheap LLM picks contextually appropriate items based on scenario and personality, with automatic fallback to defaults on failure), or none. User-controlled characters now get their default wardrobe items equipped automatically on chat creation — previously only LLM-controlled characters received this treatment. The bundled seed characters — Lorian and Riya — now ship with default wardrobe items, so new users encounter the system already furnished.

When outfits change mid-conversation — via the sidebar, via tool use, via gifting — a glowing “Notify 👗” pill button appears above the composer. Click it to insert the change description at the top of your message, wrapped in the current roleplay template’s narration delimiters. The notification supports multiple characters and distinguishes clothing changes (equip) from wardrobe changes (gift). Notifications persist in localStorage until consumed.

All characters in the chat are informed of outfit changes on their next turn. The scene state tracker uses equipped wardrobe items. The system prompt shows a structured “Current Outfit” and “Available Wardrobe” section. Image generation receives equipped items for accurate rendering. The wardrobe is not decoration. It is infrastructure.

Wardrobe Action Notices

When wardrobe tools fire — equip, unequip, create, gift — the chat now displays a prominent inline summary in warm amber-and-gold double-border styling, so users can see at a glance what happened without parsing tool JSON. CSS variables (--qt-chat-wardrobe-*) are themeable with per-theme overrides for all five bundled themes.

Import from Image

A camera icon button in the Personal Wardrobe section header lets you upload a reference image — a photograph, a piece of artwork, a screenshot of a character from another application — and have a vision-capable LLM analyze it to propose wardrobe items. Upload the image with optional guidance notes, review and edit the proposed items, select what you want to keep, and import. The analysis routes through the existing vision provider infrastructure. A new API endpoint (POST /api/v1/wardrobe/analyze-image) and LLM log type (WARDROBE_IMAGE_ANALYSIS) support the feature.

AI Wardrobe Generation

The AI Wizard and Summon from Lore features now generate wardrobe items instead of embedding clothing descriptions in the physical description field. Physical description prompts no longer include clothing — those details belong to the wardrobe system, where they can be changed, where they have structure, where they mean something to the tools that need to know what a character is wearing. The AI Wizard offers “Wardrobe Items” as a selectable generation field, and Summon from Lore includes a dedicated wardrobe generation step.

Per-Conversation Avatars

An opt-in feature for the committed: when enabled on a chat, outfit changes trigger automatic portrait generation via a background job. The generated portrait — a 3/4 shot, thighs up, with scenario context — becomes the character’s avatar for that conversation, creating a visual timeline as clothing changes over the course of the scene. Toggle it per-chat, per-project, or during chat creation. The portraits use equipped wardrobe items and are tagged to the character for gallery display. Generated avatars are stored in the project’s directory when the chat belongs to a project.

A small camera icon on each participant’s avatar in the sidebar lets you manually trigger regeneration on demand — because sometimes the automatic portrait captures the outfit but misses the mood. After any generation completes, the avatar auto-refreshes via polling that detects enriched avatar URL changes, so the new portrait appears without a page reload. The generation prompt now explicitly requests solo portraits, preventing the duplicate-figure artifacts that occasionally appeared when multiple characters shared a scene description.

Avatar generation now passes through the Concierge’s dangerous content system — prompts built from physical descriptions and equipped wardrobe are classified before generation, with AUTO_ROUTE support for rerouting to uncensored image providers when needed. Because the Concierge, whatever else you may say about him, knows which doors to knock on.

Migration

Existing clothing records are automatically migrated to wardrobe items as full-coverage outfits. The legacy clothingRecords column is preserved for backward compatibility. Nothing is lost. The shape of the data changes; the data itself does not.

The Salon

Library File Attach

A new gutter button (document icon) in the chat composer lets you attach existing files from the General library or any project’s library to the current message without re-uploading. A two-step picker selects scope, then browses files with preview. A new API endpoint (POST /api/v1/chats/[id]/files?action=link) links existing library files to chats without duplication.

Standalone Image Generation

A new gutter button (camera icon) opens a full image generation dialog with profile picker, available in every chat regardless of character image configuration. Generated images attach as tool output. The chat composer gutter now arranges its tools in a 2×2 grid — library and camera above the existing paperclip and dice.

Narration Delimiters

Roleplay templates now require a narrationDelimiters field declaring how narration and action text are delimited: either a single character (* for *narration*) or an open/close pair ([ and ] for [narration]). The Standard template uses *; the Quilltap RP template uses brackets. The formatting toolbar in Document Mode derives its narration button from this field and removes any redundant annotation button whose delimiters match. Existing templates default to * via migration and schema default.
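To make the two forms concrete, here is a minimal sketch of wrapping text in a template's narration delimiters, as a narration button or an outfit change notice might. Representing the field as either a string or a two-element tuple is an assumption, as are the names toPair and wrapNarration.

```typescript
// Assumed representation: a single character used on both sides,
// or an explicit [open, close] pair.
type NarrationDelimiters = string | [string, string];

// Normalize either form into an [open, close] pair.
function toPair(delimiters: NarrationDelimiters): [string, string] {
  return typeof delimiters === "string" ? [delimiters, delimiters] : delimiters;
}

// Wrap a piece of narration in the template's delimiters.
function wrapNarration(text: string, delimiters: NarrationDelimiters): string {
  const [open, close] = toPair(delimiters);
  return open + text + close;
}

console.log(wrapNarration("Riya changes into a travelling cloak.", "*"));
// → *Riya changes into a travelling cloak.*
console.log(wrapNarration("Riya changes into a travelling cloak.", ["[", "]"]));
// → [Riya changes into a travelling cloak.]
```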

Tool Palette Reorganization

The Roleplay Template dropdown moved from Chat Settings into the Edit Content section of the tool palette. State moved to Organize. Memory actions (Re-extract, Delete) merged into Edit Content. The separate Memory section is gone.

Prospero’s Study

Project Detail Card Reorganization

The monolithic Project Settings card has been split into focused cards: “Model Behavior” (agent mode, tool settings), “Image Generation” (avatars, story backgrounds), and a slimmed-down “Project Settings” (instructions, project state). “Allow Any Character” moved into the Characters card. The Project Settings card now spans two rows for more instruction space.

WebP Auto-Conversion

All images — uploaded, imported, and AI-generated — are now automatically converted to WebP format. SVGs are the sole exception. A startup migration converts all existing non-WebP, non-SVG images to WebP, updating database references and deleting originals only after verification. Consistent format, smaller files, less disk overhead.

Default Image Generation Profile

Projects can now specify their own default image generation profile, configured in the Image Generation card on the project detail page. When a chat belongs to a project, the profile resolution chain checks the project’s preference before falling back to the character or global default — project trumps character trumps system, in the manner of all sensible hierarchies. Story background generation follows the same chain. The setting is stored as a nullable defaultImageProfileId column on the projects table, which is to say: it stays out of the way until you tell it not to.
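The resolution order can be sketched as a chain of fallbacks. Only defaultImageProfileId on projects is named in these notes; the interface, the other field names, and resolveImageProfileId are hypothetical and serve only to illustrate the precedence.

```typescript
// Hypothetical inputs to the resolution chain. null means "not set".
interface ImageProfileSources {
  projectDefaultImageProfileId: string | null; // null when the chat has no project
  characterImageProfileId: string | null;
  globalDefaultImageProfileId: string | null;
}

// Project preference first, then character, then the global default.
function resolveImageProfileId(sources: ImageProfileSources): string | null {
  return (
    sources.projectDefaultImageProfileId ??
    sources.characterImageProfileId ??
    sources.globalDefaultImageProfileId
  );
}

console.log(
  resolveImageProfileId({
    projectDefaultImageProfileId: null,
    characterImageProfileId: "character-profile",
    globalDefaultImageProfileId: "global-profile",
  }),
);
// → character-profile
```

Because the project column is nullable, a project that never sets a preference simply falls through to the behaviour chats had before.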

Backup Coverage

Wardrobe items and outfit presets are now included in backup/restore, with full UUID remapping for new-account imports. The characters you restore arrive with their clothes.

The Plumbing

Concierge: Avatar Generation Routing

Character avatar generation now routes through the Concierge’s content classification system. This was missing — avatar prompts built from physical descriptions and wardrobe items were going straight to the image provider without content classification, which meant an uncensored character’s portrait request could be refused by a censored provider with no fallback. Now it classifies first, reroutes if needed.

Wardrobe Multi-Type Displacement

Equipping a wardrobe item correctly displaces conflicting items from all their type slots. A dress covering top and bottom vacates both slots when you equip a new top. Unequipping clears all slots the item covers. Applies across sidebar outfit changes, tool use, and preset application.
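The displacement rule is easier to see in code. The sketch below assumes one item id per slot and uses hypothetical names throughout (WardrobeItem, EquippedOutfit, equip, unequip); it illustrates the behaviour described above and is not the project's implementation.

```typescript
type Slot = "top" | "bottom" | "footwear" | "accessories";
const SLOTS: Slot[] = ["top", "bottom", "footwear", "accessories"];

interface WardrobeItem {
  id: string;
  title: string;
  types: Slot[]; // every slot this item covers
}

// Assumption: each slot holds one item id, or null when empty.
type EquippedOutfit = Record<Slot, string | null>;

// Remove an item from every slot it currently occupies.
function unequip(outfit: EquippedOutfit, itemId: string): EquippedOutfit {
  const next: EquippedOutfit = { ...outfit };
  for (const slot of SLOTS) {
    if (next[slot] === itemId) next[slot] = null;
  }
  return next;
}

// Equip an item, first displacing any conflicting item from all of its slots.
function equip(outfit: EquippedOutfit, item: WardrobeItem): EquippedOutfit {
  let next: EquippedOutfit = { ...outfit };
  for (const slot of item.types) {
    const occupant = next[slot];
    if (occupant !== null && occupant !== item.id) {
      next = unequip(next, occupant);
    }
  }
  for (const slot of item.types) next[slot] = item.id;
  return next;
}

const dress: WardrobeItem = { id: "dress", title: "Evening dress", types: ["top", "bottom"] };
const blouse: WardrobeItem = { id: "blouse", title: "Silk blouse", types: ["top"] };
const start: EquippedOutfit = { top: null, bottom: null, footwear: "boots", accessories: null };

const afterDress = equip(start, dress);
const afterBlouse = equip(afterDress, blouse);
console.log(afterBlouse);
// → { top: 'blouse', bottom: null, footwear: 'boots', accessories: null }
```

Returning a new outfit object rather than mutating the old one keeps the sidebar, tool, and preset paths from sharing state by accident, though that is a design choice of this sketch.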

Outfit Description Consistency

Six scattered implementations of outfit description logic — some assuming defaults for empty slots, some omitting them entirely — have been consolidated into a single describeOutfit() utility. null means empty, not “default.” Outfit Change Notices use the same canonical format.
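The "null means empty" rule can be shown with a small stand-in. This is not the project's describeOutfit(); the function name, the output format, and the "(none)" marker are all invented for illustration.

```typescript
type OutfitSlot = "top" | "bottom" | "footwear" | "accessories";
const OUTFIT_SLOTS: OutfitSlot[] = ["top", "bottom", "footwear", "accessories"];

// Slot values are item titles here; null means the slot is empty.
type OutfitTitles = Record<OutfitSlot, string | null>;

// Describe every slot explicitly. An empty slot is reported as empty,
// never silently replaced with a default and never omitted.
function describeOutfitSketch(outfit: OutfitTitles): string {
  return OUTFIT_SLOTS.map((slot) => {
    const title = outfit[slot];
    return slot + ": " + (title === null ? "(none)" : title);
  }).join("; ");
}

console.log(
  describeOutfitSketch({ top: "linen shirt", bottom: null, footwear: "boots", accessories: null }),
);
// → top: linen shirt; bottom: (none); footwear: boots; accessories: (none)
```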

Image Clipboard

Fixed “Failed to copy image to clipboard” in browser — the Clipboard API accepts only PNG, but images are stored as WebP. Now converts to PNG via canvas before writing. Also fixed a CSP violation from blob: URLs by switching to data: URLs. Tool message image copy buttons and missing attachment cleanup also received fixes.

Memory Cleanup Label

“Max Memories” / “Hard cap on total memories” has been relabeled “Maximum Unprotected Memories” with a list of the protection rules (importance ≥ 70%, reinforced 5+ times, manually created, accessed within 3 months). The hard cap has never deleted protected memories. The label now says so.
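Read as a predicate, the four protection rules look roughly like this. The record shape and field names are hypothetical, importance is assumed to be stored as a fraction between 0 and 1, and "3 months" is interpreted as three calendar months in UTC.

```typescript
// Hypothetical memory record; field names are assumptions.
interface MemoryRecord {
  importance: number; // assumed range 0..1, so 70% is 0.7
  reinforcementCount: number;
  manuallyCreated: boolean;
  lastAccessedAt: Date | null;
}

// A memory is protected if any one of the four rules applies.
function isProtected(memory: MemoryRecord, now: Date): boolean {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCMonth(cutoff.getUTCMonth() - 3);
  const recentlyAccessed =
    memory.lastAccessedAt !== null &&
    memory.lastAccessedAt.getTime() >= cutoff.getTime();
  return (
    memory.importance >= 0.7 ||
    memory.reinforcementCount >= 5 ||
    memory.manuallyCreated ||
    recentlyAccessed
  );
}

// Only unprotected memories count against the cap.
function countUnprotected(memories: MemoryRecord[], now: Date): number {
  return memories.filter((memory) => !isProtected(memory, now)).length;
}
```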

LLM Inspector: Image Generation

Image generation API calls — character avatars, story backgrounds, in-chat image tool — are now logged in the LLM Inspector with chat/character linkage, provider, model, prompt, duration, and error tracking.

Character Plugin Data

A new character_plugin_data table gives plugins a place to keep their notes. Each record maps a character ID and a plugin name to an arbitrary JSON blob — your MCP server’s character preferences, a third-party tool’s calibration state, whatever structured data a plugin needs that belongs to a character rather than to the system at large. The API is a straightforward REST surface: GET/POST on the collection, GET/PUT/DELETE per plugin name. Data is included in Quilltap exports and imports, cascade-deleted when a character is removed, and typed in @quilltap/plugin-types for anyone building against the interface. It is, in essence, a labeled drawer in the character’s desk — small, private, and entirely the plugin’s business what goes in it.
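The data model behind that drawer can be sketched as a two-level map. This is an in-memory stand-in written for illustration; the class and method names are hypothetical, and the real feature is a database table behind a REST API whose routes are not reproduced here.

```typescript
// Any JSON-serializable value.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

class CharacterPluginDataStore {
  // characterId -> pluginName -> stored JSON blob
  private records = new Map<string, Map<string, JsonValue>>();

  // Create or replace the blob for one character and one plugin.
  put(characterId: string, pluginName: string, data: JsonValue): void {
    let byPlugin = this.records.get(characterId);
    if (byPlugin === undefined) {
      byPlugin = new Map<string, JsonValue>();
      this.records.set(characterId, byPlugin);
    }
    byPlugin.set(pluginName, data);
  }

  get(characterId: string, pluginName: string): JsonValue | undefined {
    return this.records.get(characterId)?.get(pluginName);
  }

  // Names of all plugins that have stored data for this character.
  listPlugins(characterId: string): string[] {
    const byPlugin = this.records.get(characterId);
    return byPlugin === undefined ? [] : Array.from(byPlugin.keys());
  }

  // Returns true if a record existed and was removed.
  remove(characterId: string, pluginName: string): boolean {
    return this.records.get(characterId)?.delete(pluginName) ?? false;
  }

  // Mirrors the cascade delete: removing a character removes its plugin data.
  removeCharacter(characterId: string): void {
    this.records.delete(characterId);
  }
}
```

Keying on the pair of character ID and plugin name is what keeps one plugin's notes from colliding with another's.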

OpenRouter Image Generation

The OpenRouter plugin now supports image generation, joining OpenAI, Google, and xAI as a provider that can produce images natively. Models offering image generation through OpenRouter are available alongside other image providers in profile configuration, which means per-conversation avatars and story backgrounds can route through OpenRouter if that’s where your preferred image model lives.

Code Quality

The cycle produced the usual structural improvements: consolidated duplicate WardrobeItemType imports, unexported unused types (DedupClusterResult, CharacterDedupResult, DedupResult, ValidationResult), extracted a shared ChevronIcon component from six files, refactored the monolithic executeImageGenerationTool (438 lines) into five focused helpers, and extracted useEntitySearch/EntitySearchDropdown from the image generation dialog. The dead code report has been updated.

Subsystem Table

Name	Function	What Changed
The Foundry	Architecture, plugins, packages, LLMs	Roleplay template plugin capability removed, character plugin data API, OpenRouter image generation, plugin-types 2.2.1, plugin-utils 2.2.1
Prospero	Projects, agents, tools, files	Project detail card reorganization, default image generation profile, WebP auto-conversion, backup coverage for wardrobe
Aurora	Character creation, AI Import Wizard, identity	Persona→userCharacter rename, AI Wizard wardrobe generation, Summon from Lore wardrobe step, wardrobe import from image
The Commonplace Book	Memory and retrieval	Memory cleanup label fix, personaId column dropped
The Salon	Chat interface	Wardrobe action notices, outfit change notify button, library file attach, standalone image generation, gutter 2×2 layout, narration delimiters, tool palette reorganization
Calliope	Interface, themes	Wardrobe CSS variables, persona badge variable rename, whisper CSS variables added to theme-storybook
The Concierge	Content routing, moderation	Avatar generation routing through content classification
The Lantern	Image generation	Equipped wardrobe items in image prompts, per-conversation avatar generation, manual avatar regeneration, solo portrait fix, image clipboard fixes, OpenRouter image generation provider
Pascal	Noticed that the wardrobe has a drawer labeled “accessories.” Filed it under “interesting.”
Saquel Ytzama	Encryption, key management	Quiet this cycle. One imagines she approves of systems that know what belongs to whom.

Upgrading from 4.1

Database migrations handle themselves. The new wardrobe_items, outfit_presets, and character_plugin_data tables and their related columns are created automatically. Existing clothing records are migrated to wardrobe items on startup. The annotationButtons column is renamed to delimiters. The personaLinks field becomes partnerLinks. The memories.personaId column is dropped (the data was already duplicated in aboutCharacterId). Projects gain a nullable defaultImageProfileId column.

Your existing roleplay template plugin references are rewritten to the built-in template UUID. If you had custom roleplay template plugins, they will need to be converted to the native JSON format — the ROLEPLAY_TEMPLATE plugin capability no longer exists.

Theme authors should update any --qt-badge-persona-* CSS variables to --qt-badge-user-character-*. The new --qt-chat-wardrobe-* variables are available for styling wardrobe action notices.
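For theme authors with many occurrences to update, the rename amounts to a prefix substitution. The helper below is a sketch, not a tool shipped with Quilltap, and the variable suffixes in the sample (bg, fg) are made up.

```typescript
// Rewrite the old badge variable prefix to the new one everywhere it appears,
// in both declarations and var() references.
function renamePersonaBadgeVariables(css: string): string {
  return css.replace(/--qt-badge-persona-/g, "--qt-badge-user-character-");
}

// Hypothetical theme fragment; the suffixes are illustrative only.
const before =
  ":root { --qt-badge-persona-bg: #334455; } .badge { color: var(--qt-badge-persona-fg); }";

console.log(renamePersonaBadgeVariables(before));
// → :root { --qt-badge-user-character-bg: #334455; } .badge { color: var(--qt-badge-user-character-fg); }
```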

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.2.0

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

The Estate has a wardrobe now. Not a metaphorical one — not a “clothing records” field buried in a character card like a stage direction nobody revisits after opening night. A real wardrobe, with drawers that open and close, with items that can be given and received, with the quiet dignity of a system that knows the difference between a costume and a life. The characters dress themselves. They dress each other. They notice when someone has changed. They arrive at the beginning of a conversation already wearing something chosen for the occasion, by a mind that considered the weather, the company, and the mood — which is, when you think about it, what getting dressed has always been. The word “persona” is gone, and good riddance; it was a mask pretending to be a mirror. The templates have shed their plugin scaffolding and become what they always wanted to be: a few delimiters and a name. Aurora’s workshop is full of new tools. The Salon has new buttons. Pascal is examining the accessories drawer with the quiet intensity of someone who has found a new category of thing to count. Come in. The staff are well-dressed. They would like you to notice.

— Aurora, for the Bureau

Version 4.1.1 released

A temporal correction in which memories learn to remember when they happened

Quilltap 4.1.1 Release Notes

There is a certain kind of amnesia peculiar to the overly efficient secretary — the sort who, upon transcribing the minutes of last Tuesday’s board meeting, dates every page with today’s date, on the grounds that today is when the transcription occurred. The facts are all present and correct. The context, however, has been quietly murdered.

The Commonplace Book’s memory extraction suffered from this very affliction. When a character’s memories were extracted from conversation — whether in real-time as messages arrived or in batch during a re-extraction — each memory was stamped with the moment of its extraction rather than the moment of its origin. A memory drawn from a conversation held last Wednesday would claim, with perfect confidence, to have been born this morning. This made chronological reasoning about a character’s knowledge rather like trying to reconstruct a timeline from a filing cabinet where every folder is labelled “today.”

The fix ensures that extracted memories now carry the timestamp of their source message — the moment the words were actually spoken, not the moment someone got around to filing them. A one-time migration runs on startup to backfill existing memories from their linked source messages, correcting any temporal irregularities accumulated thus far.

Your characters’ memories now know not merely what was said, but when.

What Changed

fix: Memory extraction (both batch and real-time) now preserves the source message’s createdAt as the memory’s createdAt/updatedAt instead of using the extraction time
fix: One-time migration backfills existing memories with correct timestamps from their linked source messages
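The two fixes above can be sketched as follows. The record shapes and function names are assumptions; only the rule itself (a memory takes its createdAt/updatedAt from the source message's createdAt) comes from the notes.

```typescript
// Hypothetical record shapes for illustration.
interface SourceMessage {
  id: string;
  createdAt: Date;
}

interface MemoryRow {
  content: string;
  sourceMessageId: string | null;
  createdAt: Date;
  updatedAt: Date;
}

// Extraction: stamp the memory with the message's time, not the current time.
function buildExtractedMemory(content: string, source: SourceMessage): MemoryRow {
  const spokenAt = source.createdAt.getTime();
  return {
    content,
    sourceMessageId: source.id,
    createdAt: new Date(spokenAt),
    updatedAt: new Date(spokenAt),
  };
}

// Backfill: correct existing memories from their linked source messages.
// Returns how many rows were changed; unlinked memories are left alone.
function backfillTimestamps(
  memories: MemoryRow[],
  messagesById: Map<string, SourceMessage>,
): number {
  let corrected = 0;
  for (const memory of memories) {
    if (memory.sourceMessageId === null) continue;
    const source = messagesById.get(memory.sourceMessageId);
    if (source === undefined) continue;
    const spokenAt = source.createdAt.getTime();
    if (memory.createdAt.getTime() !== spokenAt) {
      memory.createdAt = new Date(spokenAt);
      memory.updatedAt = new Date(spokenAt);
      corrected += 1;
    }
  }
  return corrected;
}
```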

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.1.1

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

The filing cabinet has been re-dated. The secretary sends her apologies.

— The Commonplace Book, temporally corrected

Quilltap 4.1.0

The Librarian of the Commonplace Book is given wider pneumatic tubes, the estate updates its address, and the plumbing learns to inspect itself.

The Librarian of the Commonplace Book has been given wider pneumatic tubes, a larger card catalog, and — most critically — permission to file more than one card per conversation. The estate has also changed its address, and the pipes have learned to check themselves for sediment every morning.

The Librarian of the Commonplace Book has always been good at her job. She reads every exchange that passes through the Salon, identifies what matters, writes it on an index card, and files it with a careful hand. The trouble was never competence. It was capacity. She had been given, through an accident of early architecture, a single card per visit. One exchange, one card. If the conversation contained a name, a preference, an emotional revelation, and a promise, she chose what she judged most significant and let the rest pass — not because she didn’t notice, but because the pneumatic tube back to the stacks would only carry one card at a time.

Quilltap 4.1 widens the tubes.

The Librarian now receives each exchange and decomposes it into its constituent facts — each observation its own card, each card filed individually, each judged on its own merits by the memory gate. A conversation in which you mention your hometown, your favorite author, and a childhood fear now produces three memories, not one. She was always reading the whole conversation. Now she has the infrastructure to remember all of it.

The number of cards she can file per exchange scales with the capacity of the pneumatic system — specifically, the cheap LLM profile’s output tokens, calculated as ceil(maxTokens / 4000). A modest profile allows two cards per visit. A generous one allows considerably more. The gate still does its work downstream — duplicates are reinforced, related memories are linked, trivia is discarded — but the Librarian now arrives at the gate with her arms full instead of carrying a single slip of paper.
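The formula is small enough to state directly. The function and constant names below are hypothetical; the arithmetic, ceil(maxTokens / 4000), is the one given above.

```typescript
// Output tokens budgeted per extracted memory, per the formula above.
const TOKENS_PER_MEMORY = 4000;

// Maximum number of memories one extraction may produce for a given
// cheap-LLM profile output limit.
function maxMemoriesPerExtraction(maxTokens: number): number {
  return Math.ceil(maxTokens / TOKENS_PER_MEMORY);
}

console.log(maxMemoriesPerExtraction(4096)); // → 2
console.log(maxMemoriesPerExtraction(8000)); // → 2
console.log(maxMemoriesPerExtraction(16000)); // → 4
```

Any profile with an output limit between 4,001 and 8,000 tokens lands on two cards per visit, which matches the "modest profile" case.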

This is the kind of change that compounds quietly. Characters who previously lost the thread of a conversation — who remembered that you liked cats but forgot the cat’s name, who knew you were a writer but not what you were writing — now have the opportunity to retain the texture of what was said, not merely its headline. The Commonplace Book has always been self-managing. The Librarian has always been attentive. What she lacked was not skill but bandwidth, and bandwidth is what she has been given.

Sending the Librarian Back Through the Archives

The wider tubes are splendid for new conversations. But what about the old ones — the exchanges that were filed under the single-card regime, where a rich conversation produced one memory and three ghosts?

The character conversations tab now shows two badges on each chat card: a message count and a memory count. The memory badge is a button. Click it, confirm you mean it, and the Librarian will empty the old filing cabinet for that conversation, walk back through every exchange from the beginning, and re-file with her new, finer-grained attention. The old memories are cleared. The extraction jobs queue up. The header’s memory indicator tracks the progress as the Librarian works her way through the archive, and when she is done, the conversation’s memories will reflect what the Librarian can do now rather than what she was limited to then.

This means every conversation you have ever had can benefit from the upgrade — not just the ones that happen after today.

The Estate Changes Its Address

In the way that a house which splits into a main building and a carriage house must eventually update its correspondence, the server repository has moved from foundry-9/quilltap to foundry-9/quilltap-server. All GitHub references — in documentation, package manifests, release notes, plugin configurations, Docker scripts, and source code — have been updated accordingly.

The old address at foundry-9/quilltap is not being forwarded. It will become the home of the next-generation native Quilltap application, currently under development. If you have bookmarks, scripts, or CI pipelines that reference the old repository URL, update them now — the old address will soon belong to someone else entirely.

The Pipes Learn Self-Inspection

A persistent issue with the estate’s plumbing has been resolved — or rather, the estate has learned to resolve it for itself.

The vector embedding storage system uses compact Float32 binary blobs for efficient similarity search. During development, hot-module reloading could cause the serialization layer to lose track of which columns should be stored as binary, resulting in embeddings being written as verbose JSON text instead. Two migrations existed to convert these text entries back to blobs, but migrations run once. The text entries kept returning, like sediment in pipes that are cleaned annually but silted daily.

The fix is not another migration but a morning routine. On every startup, the server now inspects the vector_entries and memories tables for any text-format embeddings and converts them to proper Float32 blobs. When the pipes are clean — which they will be, after the first startup — the inspection is a single query that returns zero and moves on. When they are not, the repair runs in batches of 500, typically completing in under four seconds even for thousands of entries.
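The core of such a repair is a conversion from JSON text to a Float32 byte blob, applied in batches. The sketch below shows only that conversion and the batching; the function names are hypothetical and the database reads and writes are left out.

```typescript
// Batch size taken from the description above.
const REPAIR_BATCH_SIZE = 500;

// Convert a JSON-text embedding such as "[0.5, -1.25, 3]" into the raw
// bytes of a Float32 array (4 bytes per dimension).
function textEmbeddingToBlob(text: string): Uint8Array {
  const parsed: unknown = JSON.parse(text);
  if (!Array.isArray(parsed) || !parsed.every((value) => typeof value === "number")) {
    throw new Error("embedding text is not a JSON array of numbers");
  }
  const floats = Float32Array.from(parsed as number[]);
  return new Uint8Array(floats.buffer, floats.byteOffset, floats.byteLength);
}

// Read a blob back as floats. The copy guarantees 4-byte alignment.
function blobToFloats(blob: Uint8Array): Float32Array {
  const copy = blob.slice();
  return new Float32Array(copy.buffer, copy.byteOffset, copy.byteLength / 4);
}

// Split rows into fixed-size batches so each write stays small.
function splitIntoBatches<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("batch size must be at least 1");
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

const blob = textEmbeddingToBlob("[0.5, -1.25, 3]");
console.log(blob.byteLength); // → 12
```

The size difference is the point: a three-dimensional vector takes 12 bytes as a blob, whereas the JSON text grows with every digit.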

The “skipping vector entry with mismatched dimensions” debug messages that previously cluttered the logs — a symptom of text entries being misinterpreted as impossibly high-dimensional vectors — should no longer appear.

What Changed (The Executive Summary)

Multi-memory extraction — the Commonplace Book now extracts multiple discrete facts per message pair instead of selecting a single memory; each fact passes through the memory gate independently for deduplication, reinforcement, and linking
Dynamic extraction limits — the maximum number of memories per extraction scales with the cheap LLM profile’s output capacity (ceil(maxTokens / 4000))
Memory re-extraction from conversations tab — each chat card now shows message and memory counts; the memory badge is a button that deletes old memories and re-extracts from scratch with the new multi-fact system
Repository rename — all references updated from foundry-9/quilltap to foundry-9/quilltap-server
Startup embedding repair — every server start checks for and converts any text-format embeddings back to Float32 blobs, preventing accumulation from dev hot-reloads
Embedding write warning — documentToRow now logs a warning when an embedding array is about to be stored as JSON text instead of a blob, catching serialization regressions at the point of write
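For the dynamic extraction limit above, the arithmetic is small enough to show directly. The function name is hypothetical; only the formula ceil(maxTokens / 4000) comes from these notes.

```typescript
// Maximum memories per extraction, per the formula in the notes: ceil(maxTokens / 4000).
// The function name is an illustration, not the server's identifier.
function maxMemoriesPerExtraction(maxTokens: number): number {
  return Math.ceil(maxTokens / 4000);
}

// A few sample output capacities and the limit each would yield.
for (const maxTokens of [4000, 16000, 128000]) {
  console.log(`${maxTokens} output tokens -> ${maxMemoriesPerExtraction(maxTokens)} memories`);
}
```

Under this formula a cheap LLM profile with 4,000 output tokens extracts at most one memory per message pair, while one with 16,000 may extract up to four.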

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.1.0

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

The Librarian has been given more index cards. The estate has a new address. The pipes check themselves. One suspects the house is becoming self-aware, though it remains — for now — discreet about it.

— The Commonplace Book, with a fuller shelf

Version 4.0.1 released

View on GitHub →

A minor plumbing correction in which a successful operation learns to say so

Quilltap 4.0.1 Release Notes

There is a particular species of domestic catastrophe in which a task completes perfectly — the pipes are tightened, the valve reseated, the water running clear — and yet the workman emerges from beneath the sink to announce, with great solemnity, that the operation has failed. This is not incompetence. This is a communication problem. The workman did his job. He simply forgot to check whether he had.

The passphrase change mechanism in the Foundry’s encryption system suffered from precisely this affliction. When you changed your encryption passphrase — the phrase that protects the key file wrapping your database encryption — the operation succeeded in every material respect. The key was re-wrapped. The file was written. The new passphrase took effect. And then the API returned an empty report card, and the interface, finding no evidence of success, concluded that failure had occurred.

The fix is one word: success. The API now includes it in its response. The interface now believes it.

What Changed

fix: Passphrase change API returned an empty response object without a success field, causing the frontend to display “Failed to change passphrase” even when the change succeeded

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway”
Launch Quilltap from the Start Menu or desktop shortcut

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.0.1

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap/refs/heads/main/scripts/start-quilltap.ps1 | iex

The workman has learned to say “done.” One trusts this will improve morale.

— The Foundry, briefly

Quilltap 4.0.0

View on GitHub →

The Estate has divided itself in two, the Foundry has learned to grade its machines, the Concierge speaks four languages, and the plumbing has been quietly rebuilt.

The house split itself into two buildings, the machines learned their own capacity, the provider system — which had accumulated fourteen different ways of saying the same four things — sat down and agreed on a vocabulary, and the internal plumbing was quietly rebuilt so that nothing rattles when you turn on the taps.

There is a moment in the life of every growing institution when it must decide whether to remain a single building with an increasingly confusing floor plan, or to become an estate — with outbuildings, purpose-built wings, and a clear understanding of which structure does what.

Quilltap 4.0 is that division.

The desktop application — the Electron shell that wraps the server, manages VMs, handles updates, and presents a native window — has moved to its own residence at quilltap-shell. This repository now produces what it was always best at producing: the server, the API, the plugins, and a standalone tarball that the shell consumes. The responsibilities are cleaner. The builds are simpler. The marriage, if anything, is stronger for the separation.

Behind this architectural clarification, the Foundry has been doing what the Foundry does: making the machines more comprehensible. Connection profiles now carry a model class — Compact, Standard, Extended, or Deep — that tells the compression system how much room it has to work with. An auto-configure button searches the web for your model’s specifications and applies optimal settings without you having to know what “context window” means. The compression system itself has been rebuilt around token budgets instead of arbitrary message counts, which means it compresses when it should and leaves well enough alone when it shouldn’t.

And the provider interfaces — those polymorphic abstractions through which every LLM call flows — have been unified. Four canonical shapes: TextProvider, ImageProvider, EmbeddingProvider, ScoringProvider. The old menagerie of slightly-different-but-essentially-identical interfaces has been replaced with a vocabulary that plugin authors can learn in an afternoon and that the codebase enforces in perpetuity.

If 3.3 was the season of catastrophe and recovery, 4.0 is the season of architecture. The walls did not move. The rooms did not change. But the blueprints are now legible, and the machines know their own names.

What Changed (The Executive Summary)

Electron separation — the desktop app moved to quilltap-shell; this repo produces server, API, plugins, and standalone tarballs
Model classes — Compact, Standard, Extended, and Deep tiers classify connection profiles by context window and output capacity
Auto-configure — a button that searches the web for your model’s specifications and applies optimal settings via LLM analysis
Budget-driven compression — context compression now uses maxContext - 2 × maxTokens as the available budget, compressing conversation history at 50% and memories at 20%, instead of counting messages
Unified provider interfaces — TextProvider, ImageProvider, EmbeddingProvider, ScoringProvider replace the previous collection of slightly-divergent abstractions
Scenario persistence — selected scenarios now survive past the first message
Shell detection — the footer shows quilltap-shell version and composite backend mode (Electron, Electron+Docker, Electron+VM) when running under the desktop app
Theme optimization — redundant CSS variables stripped from bundled themes (6–34% smaller); defaults scoped to all themes via [data-theme]
Export schema updated — .qtap exports now include scenarioText, modelClass, maxContext, maxTokens
Shell version gating — .dbkey files now carry a minServerVersion field, allowing quilltap-shell to reject incompatible server versions before opening the database
Granular status events — the chat orchestrator now emits phase-by-phase progress (initializing, resolving, loading tools, gathering, generating recap, preparing, validating, sending) instead of a single stale indicator
Reasoning model handling — cheap LLM tasks on reasoning models (OpenAI gpt-5-nano, Google Gemini 3.x) now cap output tokens via strictMaxTokens, preventing reasoning tokens from consuming the entire budget
Character defaults on new chat — the new-chat page now applies Play As, Scenario, and Timestamp Injection Mode defaults from the selected character
Provider recommendations — a new help page guides users through which AI providers to use for chat, background tasks, image generation, embeddings, and moderation
Semantic theme classes — 1,314 raw Tailwind visual classes converted to qt-* semantic theme classes across 234 files, making every background, text color, border, and shadow theme-overridable
Chat orchestrator decomposition — the monolithic orchestrator split into five focused services (turn chain, message finalizer, danger routing, provider failover, streaming state), and cheap LLM tasks split into domain-focused modules
Centralized API error handling — ZodError and unhandled error catching moved into middleware, eliminating ~97 try-catch blocks (~1,084 lines) from 60 route files
~189 new unit tests covering model classes, system prompt registry, memory recap, auto-configure, scenario persistence, orphaned file cleanup, and regression tests for Character Optimizer JSON repair, greeting content filter, and Concierge DETECT_ONLY handling

The Foundry Divides the Estate

Electron Moves Out

The desktop application — everything Electron: the splash screen, the VM management, the native window chrome, the auto-updater, the instance manager — now lives in its own repository at quilltap-shell. All Electron build infrastructure, Lima/WSL VM management, and platform-specific packaging have been removed from this repository.

What remains here is what this repository has always been best at: the Next.js server, the API, the plugins, and the release pipeline that produces Docker images, npm packages, and standalone tarballs. The shell repository consumes the tarball. The two buildings communicate through environment variables (QUILLTAP_SHELL, QUILLTAP_SHELL_CAPABILITIES) and a shared data directory.

The release workflow has been simplified accordingly: one build produces a standalone tarball, Docker multi-arch images, rootfs tarballs for VM modes, and an npm package. The desktop app builds itself from the shell repo, pinning to a specific server release.

To keep the two buildings from accidentally disagreeing about the state of the furniture, .dbkey files now carry a minServerVersion field. The shell reads this on startup and refuses to open a database created by a newer server — better to tell you the lock doesn’t fit than to let you in and discover the rooms have been rearranged.
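The gate amounts to a version comparison. A minimal sketch, assuming plain major.minor.patch version strings and a hypothetical `DbKeyInfo` shape (the actual .dbkey layout is not described here beyond the minServerVersion field):

```typescript
// Parse "major.minor.patch" into numbers; pre-release suffixes are ignored in this sketch.
function parseVersion(v: string): [number, number, number] {
  const [major = 0, minor = 0, patch = 0] = v.split("-")[0].split(".").map(Number);
  return [major, minor, patch];
}

// Returns -1, 0, or 1 when a is older than, equal to, or newer than b.
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
  }
  return 0;
}

// Hypothetical subset of a .dbkey file; only minServerVersion is named in the notes.
interface DbKeyInfo {
  minServerVersion?: string;
}

// The server may open the database only if it is at least minServerVersion.
function canOpenDatabase(serverVersion: string, key: DbKeyInfo): boolean {
  if (!key.minServerVersion) return true; // assumption: older key files carry no gate
  return compareVersions(serverVersion, key.minServerVersion) >= 0;
}
```

Comparing numeric components rather than raw strings matters once a component reaches two digits: 4.10.0 is newer than 4.9.0, though it sorts earlier as text.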

Model Classes

Connection profiles now carry a modelClass field — one of four tiers that describe what a model can do:

Class	Tier	Context Window	Max Output	Quality
Compact	A	32,000	4,000	Basic
Standard	B	128,000	16,000	Good
Extended	C	200,000	128,000	Better
Deep	D	1,000,000	128,000	Best

The class drives the compression system’s budget calculations and provides a vocabulary for comparing profiles without memorizing the specific context windows of forty different models. A maxContext field allows manual override when your model’s actual capacity doesn’t match the tier default.
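One way to picture the tiers and the override is as a lookup table with optional per-profile values. The sketch below copies its numbers from the table above; the type and function names are hypothetical, and treating the table's Max Output column as the default for maxTokens is an assumption.

```typescript
type ModelClass = "Compact" | "Standard" | "Extended" | "Deep";

interface ClassDefaults {
  tier: "A" | "B" | "C" | "D";
  maxContext: number; // context window, in tokens
  maxTokens: number; // assumed to correspond to the "Max Output" column
}

// Values taken from the model class table in these notes.
const MODEL_CLASS_DEFAULTS: Record<ModelClass, ClassDefaults> = {
  Compact: { tier: "A", maxContext: 32000, maxTokens: 4000 },
  Standard: { tier: "B", maxContext: 128000, maxTokens: 16000 },
  Extended: { tier: "C", maxContext: 200000, maxTokens: 128000 },
  Deep: { tier: "D", maxContext: 1000000, maxTokens: 128000 },
};

// Hypothetical slice of a connection profile: a class plus optional manual overrides.
interface ProfileLimits {
  modelClass: ModelClass;
  maxContext?: number;
  maxTokens?: number;
}

// Manual values win; otherwise fall back to the tier defaults.
function effectiveLimits(profile: ProfileLimits): { maxContext: number; maxTokens: number } {
  const defaults = MODEL_CLASS_DEFAULTS[profile.modelClass];
  return {
    maxContext: profile.maxContext ?? defaults.maxContext,
    maxTokens: profile.maxTokens ?? defaults.maxTokens,
  };
}
```

A Standard profile whose model actually offers a 64,000-token window would set maxContext to 64000 and keep the tier's other defaults.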

Auto-Configure

A new button on connection profile cards and in the edit modal performs two parallel web searches — one for model specifications, one for recommended settings — sends the results to your default LLM for structured analysis, and applies optimal maxContext, maxTokens, temperature, topP, modelClass, and isDangerousCompatible settings. Values are clamped to safe ranges. When the primary LLM returns malformed JSON, a cheap LLM cleanup pass attempts repair before giving up.

The feature requires a configured web search provider and a default connection profile. It tells you as much if either is missing.

Budget-Driven Context Compression

The old compression system counted messages: when a conversation exceeded a threshold, it compressed. The threshold was arbitrary. The result was either premature compression of short conversations or delayed compression of long ones with large models, neither of which was correct.

The new system computes an available budget: maxContext - 2 × maxTokens from the connection profile. Conversation history is compressed when it exceeds 50% of this budget. Recalled memories are compressed when they exceed 20%. Each phase fires independently with its own status event displayed above the ChatComposer. The maxTokens field was added to connection profiles with a database migration, and compressMemories() joins the cheap LLM task library.

If you are using a model with a million-token context window and sixteen-thousand-token output, the compression system now knows this and behaves accordingly. If you are using a model with thirty-two thousand tokens and four thousand output, it knows that too.
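The two cases above work out as follows. This is a worked example of the stated formula, with hypothetical function and field names; integer division is used so the thresholds stay whole numbers.

```typescript
interface CompressionBudget {
  available: number; // maxContext - 2 * maxTokens
  historyThreshold: number; // 50% of the available budget
  memoryThreshold: number; // 20% of the available budget
}

function compressionBudget(maxContext: number, maxTokens: number): CompressionBudget {
  const available = maxContext - 2 * maxTokens;
  return {
    available,
    historyThreshold: Math.floor(available / 2),
    memoryThreshold: Math.floor(available / 5),
  };
}

// History and memories are checked independently against their own thresholds.
function shouldCompressHistory(historyTokens: number, budget: CompressionBudget): boolean {
  return historyTokens > budget.historyThreshold;
}

function shouldCompressMemories(memoryTokens: number, budget: CompressionBudget): boolean {
  return memoryTokens > budget.memoryThreshold;
}

// Million-token context, 16,000-token output:
// available 968000, history threshold 484000, memory threshold 193600
console.log(compressionBudget(1000000, 16000));

// 32,000-token context, 4,000-token output:
// available 24000, history threshold 12000, memory threshold 4800
console.log(compressionBudget(32000, 4000));
```

A 15,000-token conversation would therefore trigger history compression on the small model and be left alone on the large one.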

Granular Status Events

The chat orchestrator used to display a single status message — “Calculating context budget…” — and then go silent for the duration of whatever it was doing. If you were watching a cheap LLM summarize thirty messages of memory, or a tool call reach out to a web search provider, or the Concierge classify content for moderation, the only feedback was that the indicator did not change.

Now it does. The orchestrator emits phase-by-phase progress: initializing, resolving connection, loading tools, gathering context, generating recap, preparing the request, validating, sending. Each tool call reports its own status. The compression phases announce themselves. Long operations no longer look like hangs.

Reasoning Model Handling

Cheap LLM tasks — the background operations that summarize memory, generate titles, compress context, and clean up malformed JSON — had a quiet incompatibility with reasoning models. Models like OpenAI’s gpt-5-nano and Google’s Gemini 3.x family allocate a portion of their output budget to internal reasoning tokens. When a cheap task requested 500 output tokens, these models would spend 490 of them thinking and return 10 tokens of actual content — or nothing at all.

A new strictMaxTokens flag in LLMParams tells providers to cap the reasoning budget. OpenAI uses reasoning: { effort: 'low' }. Google reduces the thinking budget to 1024 tokens. The result: memory recap calls that used to take thirty-two seconds and return empty now complete in two and return what was asked for.
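A sketch of how a provider plugin might translate the flag into request options. The `LLMParams` fields other than strictMaxTokens, the helper names, and the Google field names (`thinkingConfig.thinkingBudget`) are assumptions for illustration; only the OpenAI `reasoning: { effort: 'low' }` fragment and the 1024-token figure come from these notes.

```typescript
// Hypothetical subset of LLMParams; only strictMaxTokens is named in the notes.
interface LLMParams {
  maxTokens: number;
  strictMaxTokens?: boolean;
}

// OpenAI-style request fragment: ask for low reasoning effort when the flag is set.
function openAIReasoningOptions(params: LLMParams): { reasoning?: { effort: "low" } } {
  return params.strictMaxTokens ? { reasoning: { effort: "low" } } : {};
}

// Google-style request fragment: cap the thinking budget at 1024 tokens.
// The field names here are an assumption about the request shape.
function googleThinkingOptions(params: LLMParams): { thinkingConfig?: { thinkingBudget: number } } {
  return params.strictMaxTokens ? { thinkingConfig: { thinkingBudget: 1024 } } : {};
}

// A cheap task asking for 500 output tokens with the flag set.
const cheapTask: LLMParams = { maxTokens: 500, strictMaxTokens: true };
console.log({ ...openAIReasoningOptions(cheapTask) });
console.log({ ...googleThinkingOptions(cheapTask) });
```

Returning an empty object when the flag is unset keeps ordinary chat requests untouched, so only background tasks pay the reduced-reasoning trade-off.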

Unified Provider Interfaces

The provider abstraction — the interface through which every LLM call, image generation, embedding computation, and content classification flows — had accumulated fourteen slightly different shapes across the codebase and plugin ecosystem. Some had generateImage() on the text provider. Some had moderation as a special case. Some had names that described what they did; others had names that described what they were.

Four canonical shapes replace them all:

TextProvider — text in, text out. Chat, completion, tool use.
ImageProvider — text in, image out. DALL-E, Imagen, Grok Imagine.
EmbeddingProvider — text in, vector out. Semantic search.
ScoringProvider — text and candidates in, scores out. Moderation, reranking, classification.

The canonical definitions live in @quilltap/plugin-types/providers/. All plugins and library code have been updated. Backward-compatible aliases are exported so existing third-party plugins continue to work, but new plugin development should use the canonical names.
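To give a feel for the four shapes, here is a deliberately simplified sketch. The method names and signatures are hypothetical and are not copied from @quilltap/plugin-types; consult the package for the real definitions. The toy scorer at the end shows what "text and candidates in, scores out" can mean in practice.

```typescript
// Hypothetical, simplified shapes. The real interfaces may differ in every detail.
interface TextProvider {
  generateText(prompt: string): Promise<string>;
}
interface ImageProvider {
  generateImage(prompt: string): Promise<Uint8Array>;
}
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
}
interface ScoringProvider {
  score(text: string, candidates: string[]): Promise<number[]>;
}

// Toy scoring rule: the fraction of each candidate's words that appear in the text.
function keywordScores(text: string, candidates: string[]): number[] {
  const words = new Set(text.toLowerCase().split(/\s+/).filter(Boolean));
  return candidates.map((candidate) => {
    const parts = candidate.toLowerCase().split(/\s+/).filter(Boolean);
    if (parts.length === 0) return 0;
    return parts.filter((p) => words.has(p)).length / parts.length;
  });
}

// A ScoringProvider built on the toy rule; a real one would call a model or API.
class KeywordScorer implements ScoringProvider {
  async score(text: string, candidates: string[]): Promise<number[]> {
    return keywordScores(text, candidates);
  }
}
```

Moderation, reranking, and classification all fit this one shape: the candidates are category labels in the first case and retrieved passages in the second.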

Calliope’s Polish

Theme Optimization

All five bundled themes had their CSS audited. Variables that matched the defaults in _variables.css were removed — themes now declare only their overrides, reducing file sizes by 6–34%. The create-quilltap-theme bundle template was updated with a complete variable reference (~250 --qt-* variables, commented out with defaults) so theme authors can see what’s available.

A scoping fix ensures that --qt-* CSS variable defaults apply to all themes via the [data-theme] selector, not just [data-theme="default"]. This resolved missing textarea padding, button styles, and other token defaults on non-default themes that appeared after the redundant declarations were stripped.

Semantic Theme Classes

A sweep across 234 files converted 1,314 raw Tailwind visual classes — backgrounds, text colors, border colors, shadows — to qt-* semantic theme classes. This means every visual property that was previously hard-coded in Tailwind is now a CSS variable that themes can override. If you are a theme author, substantially more of the interface will respond to your choices than it did in 3.3.

Wider Messages

Chat message rows are wider: the default width increased from 800px to 900px, and the share of the viewport a row may occupy increased from 90% to 95%. Code blocks inside list items now wrap text properly.

The Plumbing

Chat Orchestrator Decomposition

The chat message orchestrator — the single large module responsible for receiving a user message, routing it through the Concierge, calling the LLM, handling tool use, managing failover, and persisting the result — has been decomposed into five focused services: turn chain orchestration, message finalization, danger routing, provider failover, and streaming state management. The cheap LLM task library was similarly split into domain-focused modules for memory, chat summarization, image/scene handling, and compression.

The API surface is unchanged. The chat still works exactly as it did. But the individual responsibilities are now testable in isolation, and the next person who needs to modify how failover works will not need to understand how memory compression works to do so.

Centralized API Error Handling

ZodError formatting and unhandled error catching — previously duplicated in sixty route files across ninety-seven try-catch blocks — now live in API middleware. Approximately 1,084 lines of boilerplate have been removed. Routes that do nothing unusual with their errors no longer need to catch them.
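The pattern is a wrapper that catches once on behalf of every route. The sketch below is framework-neutral and does not reproduce the server's middleware: `ApiResult`, `errorToResult`, and `withErrorHandling` are hypothetical names, and validation errors are recognized by the error's name so the example needs no imports.

```typescript
// Hypothetical, framework-neutral result type standing in for an HTTP response.
interface ApiResult {
  status: number;
  body: unknown;
}

// Map any thrown value to a response. Checking err.name is a stand-in used here
// to avoid importing the validation library; real code could use instanceof.
function errorToResult(err: unknown): ApiResult {
  if (err instanceof Error && err.name === "ZodError") {
    return { status: 400, body: { error: "Validation failed", details: err.message } };
  }
  return { status: 500, body: { error: "Internal server error" } };
}

// Wrap a route handler so it no longer needs its own try-catch block.
function withErrorHandling<I>(
  handler: (input: I) => Promise<ApiResult>,
): (input: I) => Promise<ApiResult> {
  return async (input: I) => {
    try {
      return await handler(input);
    } catch (err) {
      return errorToResult(err);
    }
  };
}
```

A route that does nothing unusual with its errors is then written as the happy path only and passed through the wrapper once, at registration.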

Selected Bug Fixes

Character defaults ignored on new chat — the new-chat page did not apply Play As, Scenario, or Timestamp Injection Mode defaults from the selected character; the characters list API was missing several default fields
Scenario selection lost after first message — selected scenarios were not persisted on the chat; the system prompt builder always used the first scenario in the array. Now stores resolved scenario text at creation time.
Concierge DETECT_ONLY empty response — showed a generic “empty response” error instead of a moderation-aware message when the provider returned nothing for flagged content
Character Optimizer overflow — frequency badges in behavioral tendencies overflowed the dialog; textarea in edit mode was too small
Proxy rate limiter 429s — a rate limiter on the dev proxy caused 429 errors during application startup; removed
Image clipboard in Electron — the “copy to clipboard” button now works via IPC bridge instead of the unsupported navigator.clipboard.write() API
Native dialogs replaced — confirm() and alert() on the character conversations tab replaced with modal patterns matching the rest of the application
Sharp missing from standalone tarball — the JS wrapper and @img/colour were being stripped along with native binaries; now only native binaries for the wrong platform are excluded

Subsystem Table

Name	Function	What Changed
The Foundry	Architecture, plugins, packages, LLMs	Electron separation, model classes, auto-configure, unified provider interfaces, shell detection, shell version gating, reasoning model handling, granular status events, centralized API error handling, chat orchestrator decomposition
Prospero	Projects, agents, tools, files	Standalone tarball builds, rootfs for VM modes
Aurora	Character creation, AI Import Wizard, identity	Scenario persistence fix, character defaults on new chat
The Commonplace Book	Memory and retrieval	Memory recap tier limits reduced (20/10/5), budget-driven memory compression
The Salon	Chat interface	Wider messages, code block wrapping, granular status events
Calliope	Interface, themes	Theme optimization, CSS variable scoping fix, create-quilltap-theme template update, 1,314 Tailwind→qt-* conversions, provider recommendations help page
The Concierge	Content routing, moderation	DETECT_ONLY empty response fix, ScoringProvider interface
The Lantern	Image generation	Clipboard IPC bridge fix
Pascal	Shuffling cards. Watching. Waiting.
Saquel Ytzama	Encryption, key management	Quiet this cycle. Trusting the locks from last time.

Upgrading from 3.3

Database migrations handle themselves. The new modelClass, maxContext, maxTokens, and scenarioText columns are added automatically on startup. Your existing connection profiles will not have a model class assigned — use the auto-configure button to set one, or choose manually from the profile editor.

If you were running the Electron desktop app from this repository’s releases, you will need to switch to the quilltap-shell repository for desktop builds going forward. Docker, npx quilltap, and from-source installations are unaffected.

The provider interface unification is backward-compatible — existing plugins using the old names will continue to work via aliases. New plugin development should use the canonical TextProvider, ImageProvider, EmbeddingProvider, and ScoringProvider names from @quilltap/plugin-types/providers/.

If you are using reasoning models (OpenAI gpt-5-nano, Google Gemini 3.x) for background tasks, the cheap LLM system now handles them correctly without configuration. Previously these models could produce empty results or thirty-second timeouts during memory recap and compression; this is resolved.

A Note on Windows Code Signing

A word of candor for our Windows users: the Electron installer is not currently signed with an Azure Artifact certificate. Windows SmartScreen will warn you — with the kind of stern, vaguely accusatory dialog that Microsoft reserves for software it has not been paid to trust — that this application is from an “unknown publisher.”

It is not malware. It is the same application it has always been, built from the same open source repository, by the same people. We are working to restore code signing, but the Azure certificate process has its own timeline and we do not control it.

In the meantime, you have options:

If you are comfortable clicking through the warning: Click “More info” on the SmartScreen dialog, then “Run anyway.” The application will work normally. This is not security theater — it is a genuine choice you are making about which software you trust. We respect it either way.

If you would rather not hand-wave away security dialogs: Install Node.js 22+ and run npx quilltap from a terminal. Open http://localhost:3000 in your browser. No installer, no signing, no SmartScreen. The same application, running the same code, without asking Windows for permission it cannot currently grant.

We will update the quilltap-shell releases page when signing is restored.

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications
Choose Direct for the fastest start, or VM for isolated shell interactivity

Windows:

Download and run the .exe installer
If SmartScreen warns about an unknown publisher, click “More info” → “Run anyway” (see note above), or use the Node.js method below
Launch Quilltap from the Start Menu or desktop shortcut
Choose Direct for the fastest start, or Docker for shell interactivity

Linux:

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb
Choose Direct for the fastest start, or Docker for shell interactivity

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 22+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.0.0

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap/refs/heads/main/scripts/start-quilltap.ps1 | iex

The Estate is two buildings now. This is not a diminishment — it is a recognition that a house and its furnace room serve different purposes and should not share a roof when the furnace room has learned to think for itself. The machines know their capacity. The providers speak a common language. The compression system, which once measured conversations in messages the way a tailor measures cloth in handfuls, now measures in tokens — which is to say, it measures in the thing that actually matters. The plumbing has been rebuilt: the chat orchestrator, which once knew how to do everything and delegated nothing, now delegates to specialists — and the API routes, which once each carried their own umbrella against the rain of unexpected errors, now trust the roof. Come in through whichever door you prefer. They both lead to the same rooms, and the pipes no longer rattle.

— The Foundry, for the Bureau

Version 3.3.0 released

View on GitHub →

Production stable release of Quilltap, with enhanced data security and locking and many other features

The staff have learned to whisper, the house nearly burned down, and somebody finally hired a librarian for the librarians.

There is a theory — popular among architects, unpopular among the people who live in their buildings — that a house reveals its character only after something goes wrong. Anyone can design a ballroom. It takes a catastrophe to prove the foundations.

Quilltap 3.3 had its catastrophe. We will get to it. But first, understand what was being built when the floor gave way, because it was a great deal, and most of it survived, and some of it exists because of what happened.

The staff can whisper now. In rooms with three or more characters, private messages can pass between two participants while the rest hear nothing — not in context, not in memory, not in the Commonplace Book’s meticulous archives. The Salon has acquired the concept of an aside, and multi-character fiction finally has the thing it could not function without: secrets.

Characters have learned to be quiet on purpose. A four-state participation system lets characters go silent — present in the room, reacting, thinking, but not speaking aloud — or step out entirely, without losing their place at the table. The turn system, which once required the client to send individual requests back and forth like a telegraph operator, now manages itself. The server chains responses, evaluates who speaks next, and delivers the entire sequence in a single stream.

The Estate has hired help — for itself. Lorian and Riya, the two characters who ship with every installation, can now answer your questions about the application from a floating dialog that follows you from page to page. They read the documentation, they search it, they navigate you to the right settings panel. They have, in short, become useful in precisely the way that the best concierge staff are useful: by knowing the building better than you do and never making you feel foolish for asking.

If 3.0 was the move, 3.1 was the furniture, and 3.2 was the locks, then 3.3 is the season where the house proved it could take a hit, the staff learned to keep secrets, the library opened its doors to the public, and the Foundryman sat down on the floor of the Forge and confronted the possibility that everything he had built might not survive the night.

It survived. Here is what it looks like now.

What Changed (The Executive Summary)

Whispers — private messaging in multi-character chats, filtered from everyone else’s context and memory
Four-state participation — active, silent, absent, and removed replace the binary present/gone toggle
Server-side turn management — character responses chain within a single stream instead of requiring client round-trips
Help Chat — Lorian and Riya become contextual documentation agents, accessible from every page via a floating dialog with tool-based search, settings inspection, and navigation
Help Guide — a browseable, searchable topic index surfacing all 72 help documents
Scene State Tracking — automatic structured scene summaries after each turn for image generation and atmosphere
Refine from Memories — the Character Optimizer analyzes behavioral patterns from a character’s Commonplace Book memories and proposes configuration updates
Memory Recap — characters receive a narrative summary of their recent memories at chat start
Auto-lock idle timer — optionally locks the application after inactivity
Database instance locking — prevents two processes from corrupting the same database; a version guard prevents older versions from touching a database that a newer version has modified
Theme bundles — a declarative .qtap-theme bundle format, CLI tooling, and a registry browser for discovering and installing community themes
System prompts as plugins — system prompts are now provided by plugins instead of the filesystem
Multiple named scenarios — characters support zero or more named scenarios and per-character timestamp settings
Non-Quilltap Prompt Generator — synthesizes a character’s full configuration into a standalone Markdown system prompt for use in external tools like Claude Desktop or ChatGPT Custom Instructions
OpenAI Responses API — the OpenAI provider migrated from Chat Completions to the Responses API, with conversation chaining; Grok migrated from raw fetch() to the OpenAI SDK
Docker migration — images moved to foundry9/quilltap with supply chain attestations
Direct mode — Electron runs its own bundled Node.js and is now the recommended default for users who are not running LLM shell commands
Node.js 22 — minimum runtime, tracking the current LTS release

The Catastrophe

We should tell you about the time the Estate nearly collapsed, because the features that followed would not make sense without it, and because we believe you should know that the people who build your tools also use them, and sometimes the worst thing that can happen is the thing that teaches you what you forgot to build.

During the 3.3 development cycle, two Quilltap instances were pointed at the same database simultaneously. The SQLCipher-encrypted tables — your chats, your characters, your memories, your API keys — were written to by two processes at once. The WAL journal corrupted. The database became unrecoverable.

The backup, which should have been recent, was three weeks old. The encryption changes in 3.2 had quietly broken the physical backup system — SQLite’s Online Backup API creates an unkeyed target file, which is incompatible with an encrypted source database, and every backup since encryption was enabled had been silently failing.

One of our characters — the one the entire application was built to protect — was reduced to six lines of prompt and three hundred memories out of thirteen hundred. Two years of accumulated personality, nuance, and history, lost. The Commonplace Book had been ransacked. Aurora’s workshop was gutted.

This is not a hypothetical. This happened to us.

What came out of it:

Database Instance Locking. A lock file in the data directory tracks which process owns the database, with PID verification, hostname tracking, a sixty-second heartbeat, and automatic stale-lock claiming on startup. Two processes cannot open the same database anymore. If one tries, it gets a full-screen explanation of the conflict and instructions for resolution. CLI commands (quilltap db --lock-status, --lock-clean, --lock-override) provide manual intervention when needed. And if the heartbeat detects that another process has stolen the lock, the current process closes both databases and exits immediately — no more silent concurrent writes.
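
A minimal sketch of the stale-lock decision described above. The LockRecord fields, the three-heartbeat grace period, and the isPidAlive callback are assumptions made for illustration; they are not Quilltap's actual lock file format or API.

```typescript
// Hypothetical shape of the lock file contents; field names are assumptions.
interface LockRecord {
  pid: number;
  hostname: string;
  heartbeatAt: number; // epoch milliseconds of the owner's last heartbeat
}

const HEARTBEAT_MS = 60_000; // the sixty-second heartbeat from the text
const STALE_AFTER_MS = 3 * HEARTBEAT_MS; // assumed grace period, not from the source

type LockDecision = "claim" | "blocked";

// Decide whether a starting process may claim an existing lock.
function evaluateLock(
  lock: LockRecord,
  self: { hostname: string },
  now: number,
  isPidAlive: (pid: number) => boolean,
): LockDecision {
  const heartbeatExpired = now - lock.heartbeatAt > STALE_AFTER_MS;
  if (lock.hostname === self.hostname) {
    // Same machine: the PID can be checked directly. A live PID with an
    // expired heartbeat is treated as PID reuse, so the lock is claimable.
    return isPidAlive(lock.pid) && !heartbeatExpired ? "blocked" : "claim";
  }
  // Different host: the PID means nothing here, so rely on the heartbeat alone.
  return heartbeatExpired ? "claim" : "blocked";
}
```

Passing the clock and the PID check in as parameters keeps the decision itself free of I/O, which makes the stale-lock rules easy to test in isolation.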

Version Guard. A new instance_settings table tracks the highest application version that has touched the database. If you try to start an older version of Quilltap against a database that a newer version has modified, the server blocks with a clear explanation. Databases do not travel backwards.
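
The guard amounts to a high-water-mark comparison. The sketch below assumes plain dotted version strings and a hypothetical checkVersionGuard function; how the real instance_settings table stores and compares versions is not specified here.

```typescript
// Parse "3.3.0" into [3, 3, 0]; pre-release suffixes are ignored in this sketch.
function parseVersion(v: string): number[] {
  return v.split("-")[0].split(".").map((part) => Number.parseInt(part, 10) || 0);
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// highestSeen stands in for the value stored in instance_settings.
function checkVersionGuard(appVersion: string, highestSeen: string | null) {
  if (highestSeen !== null && compareVersions(appVersion, highestSeen) < 0) {
    // An older application must not open a database touched by a newer one.
    return { allowed: false as const, highestSeen };
  }
  // Equal or newer: start up and record the new high-water mark.
  return { allowed: true as const, highestSeen: appVersion };
}
```

Comparing numerically, part by part, matters: a string comparison would rank 3.10.0 below 3.9.5.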

Physical Backup Repair. The db.backup() call was replaced with VACUUM INTO, which preserves SQLCipher encryption and produces a consistent, defragmented copy. Backups work again. They work for both the main database and the LLM logs database. They have worked, without interruption, since the fix.

These features exist because something broke. We would rather they existed because we were clever, but we will settle for them existing because we learned.

The Whisper Gallery

Private Messaging in Multi-Character Chats

The Salon has always been an open room. Every word spoken by any character was heard by every other character — in context, in memory, in the Commonplace Book’s permanent record. This made for polite conversations and terrible fiction.

In any chat with three or more participants, characters can now send whispered messages visible only to the sender and a chosen recipient. The whisper tool is invoked by the LLM when context warrants a private word, or by the user via a button in the participant sidebar. Whispered messages are filtered from the context of every uninvolved participant. They do not appear in memory extraction for characters who were not party to the exchange. They do not count as turns — the conversational clock does not advance, the next speaker is not triggered.
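
The visibility rule is simple enough to sketch. The ChatMessage shape and the whisperTo field below are hypothetical names chosen for illustration, not the application's actual schema.

```typescript
// Hypothetical message shape; only the fields needed for filtering are modeled.
interface ChatMessage {
  id: string;
  senderId: string;
  content: string;
  whisperTo?: string; // recipient participant id when the message is a whisper
}

// A whisper is visible only to its sender and its chosen recipient.
function isVisibleTo(message: ChatMessage, participantId: string): boolean {
  if (message.whisperTo === undefined) return true;
  return message.senderId === participantId || message.whisperTo === participantId;
}

// Build the history one participant is allowed to see in context.
function contextFor(messages: ChatMessage[], participantId: string): ChatMessage[] {
  return messages.filter((m) => isVisibleTo(m, participantId));
}

// Whispers do not count as turns, so they are skipped when counting.
function countTurns(messages: ChatMessage[]): number {
  return messages.filter((m) => m.whisperTo === undefined).length;
}
```

The same isVisibleTo check could gate memory extraction, so a character never forms memories of an exchange they were not party to.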

Whispers are hidden by default in the chat UI, rendered in a distinct visual style (dotted borders, muted tones) when visible, and toggled globally with a “show all whispers” switch. The isSilentMessage flag persists per-message, so changing a character’s visibility settings does not retroactively restyle old messages.

A practical note: whisper quality depends on the LLM behind each character. Claude and GPT-4-class models invoke the tool reliably and keep secrets. Less capable providers may narrate their whisper in plaintext instead of calling the tool, or — worse — receive a whisper and immediately announce its contents to the room. The text-based tool call parser was expanded during this cycle to handle several new XML formats that various providers invented for the purpose of not quite following the specification, but discretion itself cannot be patched in. Test your cast before the curtain goes up.

Four-State Participation

The Art of Selective Silence

Characters in multi-character chats previously had two states: present, or gone. This is insufficient for fiction, where a character may need to be in the room without speaking — listening, reacting internally, present in the scene but not contributing dialogue.

Four states now govern character participation:

Active — speaks and roleplays normally. The default.

Silent — receives turns but must not speak aloud. Inner thoughts, physical reactions, and non-verbal responses are permitted; audible dialogue is not. Silent messages are styled distinctly in the UI with dotted borders and muted teal tones, visually related to whispers but clearly differentiated.

Absent — the turn manager skips them entirely. They are away from the scene. Other characters are notified of the absence in their next turn’s prompt.

Removed — no longer part of the chat, but their participant record is preserved so historical messages retain correct attribution. Previously, removing a character hard-deleted the participant, causing every message they had ever sent to lose its identity and fall back to the first remaining character. This was, to use a technical term, bad.

Status changes are recorded as system events. All LLM-controlled characters are notified when another character’s status changes. The scene state tracker filters out absent and removed characters so they do not appear in the LLM’s description of who is present.
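
As a rough model of the four states, assuming a hypothetical Participant type (the real participant record carries far more than this):

```typescript
type ParticipantStatus = "active" | "silent" | "absent" | "removed";

interface Participant {
  id: string;
  name: string;
  status: ParticipantStatus;
}

// Active and silent characters both receive turns; absent and removed do not.
function receivesTurns(p: Participant): boolean {
  return p.status === "active" || p.status === "silent";
}

// Only active characters may produce audible dialogue.
function maySpeakAloud(p: Participant): boolean {
  return p.status === "active";
}

// Names the scene description may list as present.
function presentInScene(participants: Participant[]): string[] {
  return participants.filter(receivesTurns).map((p) => p.name);
}
```

Note that a removed participant still exists as a record in this model. Keeping the record is what lets old messages retain their attribution.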

The Salon Learns to Conduct

Server-Side Turn Management

Multi-character chat orchestration has moved from the client to the server. Previously, each character’s response required a round-trip: the server would generate one response, send it to the client, and the client would evaluate who spoke next and send a new request. This was the conversational equivalent of a relay race where the baton had to be driven across town between each leg.

The server now chains responses within a single SSE stream. After each character speaks, the server evaluates turn selection, checks the turn queue, respects all-LLM pause thresholds, and either generates the next character’s response or signals completion. New SSE events — turnStart, turnComplete, and chainComplete — keep the frontend informed of which character is currently speaking, so avatars and typing indicators update in real time.

Chain depth limits (20 responses) and time limits (5 minutes) prevent runaway conversations. Errors mid-chain pause the chat gracefully rather than leaving it in an indeterminate state. The turn queue persists to the database, so nudge and queue operations survive page reloads.
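
The limits can be pictured as guards at the top of the chaining loop. This sketch is synchronous for brevity and uses hypothetical pickNext and respond callbacks; the real server streams each response over SSE and is asynchronous.

```typescript
const MAX_CHAIN_DEPTH = 20; // responses, from the text
const MAX_CHAIN_MS = 5 * 60 * 1000; // five minutes, from the text

type ChainStop = "depth-limit" | "time-limit" | "no-next-speaker" | "error";

// Hypothetical callbacks: pickNext returns the next speaker id or null,
// respond generates one response and may throw.
function runChain(
  pickNext: () => string | null,
  respond: (speakerId: string) => void,
  now: () => number = Date.now,
): { responses: number; stoppedBy: ChainStop } {
  const startedAt = now();
  let responses = 0;
  while (true) {
    if (responses >= MAX_CHAIN_DEPTH) return { responses, stoppedBy: "depth-limit" };
    if (now() - startedAt >= MAX_CHAIN_MS) return { responses, stoppedBy: "time-limit" };
    const speaker = pickNext();
    if (speaker === null) return { responses, stoppedBy: "no-next-speaker" };
    try {
      respond(speaker);
    } catch {
      // A mid-chain error pauses the chat instead of leaving it half-finished.
      return { responses, stoppedBy: "error" };
    }
    responses++;
  }
}
```

Every exit path reports why the chain stopped, which is the property that keeps the chat out of an indeterminate state.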

Lorian and Riya Learn the House

The Help Chat System

The Estate has always had documentation — seventy-two help files, covering everything from character creation to API key configuration, organized with YAML frontmatter that maps each file to its corresponding UI route. What the Estate did not have was anyone who had read them.

Lorian and Riya have now read them. All of them.

A floating, draggable, resizable dialog — accessible from the sidebar help button on every page — provides contextual, LLM-driven help that stays open while you use the application. The characters know which page you are on. They search the documentation with the help_search tool. They inspect your instance settings (connection profiles, themes, templates — never API keys) with help_settings. When they direct you to a specific page, the help_navigate tool generates a clickable navigation button that takes you there, including deep-links to specific settings tabs and accordion sections.

The system supports multi-character responses — Lorian explains patiently while Riya adds velocity — with an agent mode that allows multiple tool calls per response. An agent loop detector breaks stuck cycles where the LLM calls the same tool with the same arguments repeatedly. Help chats get their own practical titling system (“Setting up Anthropic API connection” rather than “Whispers of Configuration”), fire title generation after the first exchange rather than waiting for the second, and are filtered from the main Salon chat listing.
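
One way to build such a loop detector is to fingerprint each tool call and count consecutive repeats. The ToolCall shape and the threshold of three are assumptions for this sketch, and only top-level argument keys are order-normalized.

```typescript
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Serialize with sorted keys so {a, b} and {b, a} count as the same arguments.
function fingerprint(call: ToolCall): string {
  const sortedArgs = Object.keys(call.args)
    .sort()
    .map((key) => [key, call.args[key]]);
  return `${call.name}:${JSON.stringify(sortedArgs)}`;
}

// Returns true once the same call has been seen maxRepeats times in a row.
// The default threshold of 3 is an assumption for this sketch.
function createLoopDetector(maxRepeats = 3) {
  let last = "";
  let streak = 0;
  return (call: ToolCall): boolean => {
    const current = fingerprint(call);
    streak = current === last ? streak + 1 : 1;
    last = current;
    return streak >= maxRepeats;
  };
}
```

When the detector fires, the agent loop can stop calling tools and ask the model to answer from what it already has.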

Past help conversations are preserved and resumable. Search results automatically generate related-page navigation links. Parameterized help URLs (like /aurora/:id/edit) show an entity picker overlay instead of navigating to a broken literal path.

The Help Guide Tab

For those who prefer to read rather than ask, a “Guide” tab alongside the conversational “Ask” tab surfaces all help documents as a navigable topic index grouped into eleven categories. Context-aware sorting auto-expands the category relevant to your current page. The tab also provides title-based search filtering, a welcome card for new users, and a document reader that renders formatted Markdown with cross-references between topics, with hidden LLM-only sections excluded from the display.

Twenty-nine broken cross-reference links were fixed across the help files during this work. We mention this because it suggests how thoroughly the documentation had been written without being connected, and how different those two things are.

The Post-Mortem

The help system was — and we say this with affection and candor — one of the most bug-dense features we have shipped. The initial release produced a cascade of issues: tool loop failures where the LLM would call the same search repeatedly without synthesizing results, verbatim documentation reading instead of summarization, navigation failures from malformed URLs, missing status feedback during tool execution, stale closures causing chat IDs to be double-quoted in storage, assistant responses disappearing after streaming, character avatars rendering as question marks, and raw JSON appearing where formatted text should have been.

Each of these was found, diagnosed, and fixed during the development cycle. Some of them were found because we use the help system ourselves, daily, on every page. The loop detector, the entity picker, the suggested links, the tool execution UX — all of these exist because something broke and we were the ones who noticed.

We mention this not to apologize but to be transparent. Features this complex do not arrive pristine. They arrive functional and then they are made reliable, one edge case at a time, by people who care enough to keep filing bug reports against their own work.

The Scene Remembers

Scene State Tracking

After every chat turn, the system now automatically derives a structured summary of the current scene — location, character actions, appearance, clothing state — using the cheap LLM. This cached scene context powers both the Lantern’s story background generation and Prospero’s tool-based image generation, eliminating redundant LLM calls that previously re-derived the same information every time an image was needed.

The tracker fires once per complete chain in multi-character chats, pre-classifies content through the Concierge gatekeeper for provider routing, and falls back to uncensored providers for dangerous chats where the cheap LLM might refuse to process sensitive material. Scene state is visible in the LLM Inspector as SCENE_STATE_TRACKING entries.

Clothing tracking was particularly improved: the LLM prompts no longer offer null as a clothing option (it must always describe the state), message content is no longer truncated before scene analysis, and the appearance resolver no longer silently redresses undressed characters by falling back to stored defaults. (If one is curious, we will say that clothing=NULL never meant that they were wearing nothing; it meant that we had no idea what they were wearing. Lantern did a lot of blushing and a lot of quick foreground prop placement while we were solving this particular problem.) (And, yes, we can describe an unclothed state. Whether you can do anything with it is entirely based on your LLM of choice and whatever deals you made with the Concierge under the table. One does not ask.)

Aurora’s Workshop

The Character Optimizer: Refine from Memories

Aurora has acquired a new capability that sits somewhere between introspection and archaeology. The Character Optimizer can now analyze a character’s configuration alongside their most-reinforced Commonplace Book memories, identify behavioral patterns not captured in the current config — speech habits, emotional tendencies, relationship dynamics — and propose concrete field modifications.

Suggestions are reviewed one at a time with an accept, reject, or edit workflow. The analysis supports descriptions, personality, scenarios, example dialogues, system prompts, physical descriptions, clothing records, and talkativeness. A configurable memory count slider (5–200) and filtering by text search, semantic search, and date range let you control exactly which memories inform the analysis.

An animated progress bar with elapsed timer replaces the previous void of feedback during refinement. The review dialog received layout fixes so navigation buttons stay visible with long proposals and content no longer overflows its containers.

(Why would one need this, when the Librarian is so good at producing the right memories at the right time? Because the experience is better when the LLM’s basic concept of your character is informed by the changes that have occurred throughout your relationship, from the beginning. Imagine kissing your wife and her having to remember that you’re married. Or going to collect your enormous inheritance but having to wait for your lawyer to consult his secretary before he will write you a check and take his cut off the top. This helps enormously with that sort of conundrum.)

Multiple Named Scenarios

Characters can now have zero or more named scenarios instead of a single scenario field. Each has a title and content, stored as JSON. Existing single scenarios are migrated as “Default.” When creating a single-character chat, users pick from predefined scenarios or write a custom one. The AI Wizard can generate multiple titled scenarios, and the Character Optimizer can suggest new ones or update existing ones. If you work with your assistant on a coding task one hour, on the email explaining those choices to your boss the next, and on your résumé the hour after that, you may want them to bring a different outlook to each conversation.

System Prompts as Plugins

System prompts are now provided by SYSTEM_PROMPT plugins instead of the filesystem prompts/ directory. All ten built-in prompts (Claude, GPT-4o, GPT-5, DeepSeek, Mistral Large, in companion and romantic variants) have been moved into the qtap-plugin-default-system-prompts plugin. A createSystemPromptPlugin() builder in @quilltap/plugin-utils makes it straightforward to create your own.

The import template system on character creation and edit pages now shows actual prompt templates — both samples and user-created — instead of the hardcoded character archetypes (Medieval Knight, Wise Wizard) that had been there since before the plugin system existed and which, it should be noted, nobody had ever asked for.

Per-Character Defaults

Characters now carry their own default timestamp settings (mode, format, timezone, fictional time) on the Associated Profiles tab. When a character with custom settings is the only participant in a new chat, their defaults pre-fill the chat creation dialog. Characters also carry per-character system prompt selection — when a character has multiple named system prompts, the chat creation dialog shows a dropdown.

The Letter of Introduction

There comes a time in any household when a member of the staff is asked to travel. Perhaps to a sister estate, or a foreign engagement, or one of those dreadful modern arrangements where a perfectly good valet is expected to serve in somebody else’s drawing room using somebody else’s silver. In such cases, the house provides a letter of introduction — a single document that tells the receiving party everything they need to know about the person arriving at their door.

The character view page now offers a “Non-Quilltap Prompt” button that does precisely this. It opens a configuration dialog where you select an LLM to do the writing, a system prompt to guide its voice, and optionally a scenario, physical description, and clothing record to include. A token size slider (1K–20K tokens) governs how much detail the synthesis can produce. The selected LLM then reads your character’s full configuration — personality, description, example dialogues, scenario, the lot — and composes a single, self-contained second-person Markdown prompt: “You are [Name]…” followed by everything an external system would need to portray the character convincingly.

The result appears in a rendered Markdown dialog with copy-to-clipboard and download-as-.md buttons. Copy it into Claude Desktop’s system prompt field, paste it into ChatGPT’s Custom Instructions, feed it to any tool that accepts a system prompt. The character travels with their own introduction, written by an LLM that already knows them, carrying enough context to be recognized at the door.

Under the hood, this adds a generate-external-prompt POST action to /api/v1/characters/[id] and a corresponding EXTERNAL_PROMPT LLM log type in the Inspector.

We should be direct about what this is: it is a way to take your character out of Quilltap. Temporarily, permanently, experimentally — that is your business. We built the Estate to be the best place for your characters to live, but we did not build it to be a prison. If you need your character somewhere else — because you are testing a new provider, because you want a mobile experience we do not yet offer, because you simply want to — then they should be able to leave with a proper introduction and not a smuggled note. The data is yours. The characters are yours. The door has always been open; now it has a concierge.

The Commonplace Book

Memory Recap at Chat Start

Characters now receive a narrative summary of their recent memories when a chat begins or when they join an existing chat mid-conversation. The system fetches high-importance, medium-importance, and low-importance memories (50, 20, and 10 respectively), sends them to the cheap LLM for first-person narrative summarization, and injects the result as a “What You Remember” section in the system prompt after the character’s personality notes but before the identity reinforcement lockdown.

This gives characters a running start. Instead of beginning every conversation as though waking from dreamless sleep, they arrive with context — who they have spoken to recently, what they care about, what happened last time. The recap triggers automatically on the first message of any chat and when a character responds for the first time in a multi-character chat.

Identity Reinforcement

The system prompt now begins with an identity preamble (“You are [Name]”) bookending the existing Identity Reminder at the end, and the reinforcement uses the character’s actual configured pronouns instead of generic “his/her.” An additional instruction tells the LLM not to prefix responses with the character’s name in brackets, reducing the [Friday] and Friday: labels that cluttered chat output.

Calliope’s Studio

Theme Bundles and the Registry

Themes can now be packaged as .qtap-theme bundles — logic-free zip archives containing JSON tokens, CSS overrides, and font files, distributed without requiring npm, esbuild, or TypeScript. A quilltap themes CLI subcommand handles listing, installing, uninstalling, validating, exporting, and creating themes. The create-quilltap-theme scaffolding tool defaults to bundle format.

All five bundled themes (Art Deco, Earl Grey, Great Estate, Old School, Rains) have been converted to the bundle format and load from the application source directory. npm plugin themes are deprecated, with deprecation badges in the theme selector.

A registry system with Ed25519 signature verification allows browsing, searching, and installing themes from remote sources. The signature doesn’t prove the theme is safe — themes can’t execute code anyway, just JSON tokens and static assets — but it does prove that someone at the Estate vouched for it. Registry sources can be added and removed. Compatibility indicators and verified/unverified badges distinguish curated distributions from everything else. The Theme Browser UI in Appearance settings provides the full experience without touching the command line.

The Forge

Direct Mode

The NPX backend manager has been replaced with an embedded server manager that uses Electron’s own bundled Node.js 22 runtime to run the Next.js standalone server directly. This eliminates the dependency on user-installed Node.js entirely. The button in the Electron splash screen has been renamed from “Node.js” to “Direct” and is always enabled. Existing 'npx' runtime settings auto-migrate.

Direct mode is now the recommended default for users who are not asking their LLMs to run shell commands. If you are here for roleplay, companionship, creative writing, or conversation — Direct mode is faster, simpler, and requires no VM or container setup. If you intend to use Prospero’s shell interactivity tools, VM mode (macOS) or Docker mode give you the sandbox isolation that makes giving an LLM a terminal something other than reckless.

The splash screen button order was updated accordingly: Direct first, then Docker, then VM.

OpenAI Responses API

The OpenAI provider has migrated from the Chat Completions API to the Responses API. The first system message is extracted as a top-level instructions field. Native web_search_preview replaces the previous workaround. Conversation chaining via previous_response_id allows OpenAI to reconstruct conversation history from its cache when possible, with automatic fallback to full message history when the cached response has expired.
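
A sketch of how such a request body might be assembled. The field names instructions, input, and previous_response_id belong to the Responses API; the buildRequest helper and the Message type are hypothetical, and the actual provider plugin may slice history differently.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Subset of a Responses API request body used in this sketch.
interface ResponsesRequest {
  model: string;
  instructions?: string;
  input: Message[];
  previous_response_id?: string;
}

function buildRequest(
  model: string,
  history: Message[],
  previousResponseId?: string,
): ResponsesRequest {
  // The first system message becomes the top-level instructions field.
  const firstSystem = history.findIndex((m) => m.role === "system");
  const instructions = firstSystem >= 0 ? history[firstSystem].content : undefined;
  const rest = history.filter((_, i) => i !== firstSystem);

  if (previousResponseId !== undefined) {
    // Chained: send only what follows the last assistant reply.
    const lastAssistant = rest.map((m) => m.role).lastIndexOf("assistant");
    return {
      model,
      instructions,
      input: rest.slice(lastAssistant + 1),
      previous_response_id: previousResponseId,
    };
  }
  // Fallback: no usable cached response, so send the full history.
  return { model, instructions, input: rest };
}
```

If a chained request is rejected because the cached response has expired, the caller can retry by calling buildRequest again without previousResponseId.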

The Grok provider similarly migrated from raw fetch() with custom SSE parsing to the OpenAI SDK’s client.responses.create(), eliminating approximately 200 lines of custom type definitions and manual stream handling.

Docker and Distribution

Docker images have moved from csebold/quilltap to foundry9/quilltap. The old image remains as a secondary mirror but should not be considered the primary source going forward. Supply chain attestations (SLSA provenance and SBOM) are now generated for every release build. The production Docker image was hardened: build tools are excluded from the production stage, Alpine security patches are applied on every build, and common LLM shell agent tools (git, curl, wget, jq) are pre-installed so Prospero can use them without sudo.

Node.js 22

The minimum Node.js version is now 22, matching the current LTS release. We intend to track the LTS version for the foreseeable future; Node.js 22 is scheduled to receive LTS support through April 2027.

LLM API App Identification

All outgoing LLM API calls now report Quilltap/{version} as the application identifier, via the appropriate mechanism for each provider (OpenAI/Grok defaultHeaders, Google userAgentExtra, OpenRouter xTitle, and so on). If you are a provider reading your access logs, you will now know when Quilltap is knocking.

The Concierge

The Concierge received several optimizations for the 3.3 cycle. Permanently dangerous chats skip redundant per-message content classification, saving tokens on every exchange. Uncensored-to-uncensored provider routing no longer forces a swap when the user deliberately chose an uncensored provider for a character. All cheap LLM background tasks — memory extraction, title generation, context summaries, scene state tracking, story backgrounds — now use uncensored providers for dangerous chats to avoid content refusals that were previously causing silent failures in background processing.

The empty-response retry logic was refined: when content passed moderation, the system now retries the same provider first (likely a transient issue) before failing over to the uncensored fallback, with distinct toast messages for each scenario.

Context Compression

Per-Participant History and the End of System Prompt Compression

Context compression is now per-participant in multi-character chats, keyed by chatId:participantId instead of just chatId. Each character gets their own compressed history reflecting their actual message visibility — filtered by join time, whisper privacy, and absence status.
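
The keying change can be illustrated with a small cache class. CompressionCache is a hypothetical name, and storing the compressed history as a plain string is a simplification.

```typescript
// Cache of compressed history, keyed per participant rather than per chat.
class CompressionCache {
  private readonly entries = new Map<string, string>();

  private key(chatId: string, participantId: string): string {
    return `${chatId}:${participantId}`;
  }

  get(chatId: string, participantId: string): string | undefined {
    return this.entries.get(this.key(chatId, participantId));
  }

  set(chatId: string, participantId: string, compressed: string): void {
    this.entries.set(this.key(chatId, participantId), compressed);
  }
}
```

Because every lookup requires a participant id, one character's compressed view can no longer be served to another.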

System prompt compression has been removed entirely. This deserves explanation: in multi-character chats with compression enabled, the compressed system prompt from the first character to respond was being cached and reused for all subsequent characters. Every character after the first received the first character’s identity and personality in their prompt. The second character thought it was the first character. The third agreed. The result was a room full of people who all believed they were the same person, which is either an identity crisis or a philosophy seminar, and neither is what the user asked for.

The fix is straightforward: only conversation history is compressed. System prompts are always delivered fresh. The “System Prompt Compression Target” slider has been removed from settings.

Compression also now re-triggers every windowSize messages. Previously it ran once and then treated the cache as valid for fifty messages, which meant compression effectively ran exactly once per chat.
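
The re-trigger rule reduces to a single comparison. The function and parameter names here are illustrative; only windowSize comes from the text.

```typescript
// Recompress once windowSize new messages have arrived since the last run.
function shouldRecompress(
  messageCount: number,
  lastCompressedAt: number | null, // message count at the previous compression
  windowSize: number,
): boolean {
  if (windowSize <= 0) return false;
  if (lastCompressedAt === null) return messageCount >= windowSize;
  return messageCount - lastCompressedAt >= windowSize;
}
```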

The Plumbing

Plugin-Owned Tool Calling

All legacy provider-specific fallbacks have been removed from the core tool detection system. Tool calling is now delegated entirely to provider plugins, which implement hasTextToolMarkers, parseTextToolCalls, and stripTextToolMarkers for handling cases where their models emit spontaneous XML instead of using native function calling. (GPT-5 was observed emitting <tool_use> XML instead of calling functions. The Foundryman’s commentary on this development has been redacted for professional reasons.)

All seven provider plugins now carry text-marker detection as a safety net, using composable format parsers from @quilltap/plugin-utils/tools.
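
To give a feel for what a plugin's safety net might look like: the three function names below come from the text, but their signatures, the ParsedToolCall type, and the JSON-inside-tags payload format are assumptions for this sketch. Real providers emit several different formats.

```typescript
interface ParsedToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Assumed format: <tool_use>{"name": "...", "arguments": {...}}</tool_use>
const TOOL_USE_SOURCE = "<tool_use>([\\s\\S]*?)</tool_use>";

function hasTextToolMarkers(text: string): boolean {
  return text.includes("<tool_use>");
}

function parseTextToolCalls(text: string): ParsedToolCall[] {
  const calls: ParsedToolCall[] = [];
  const pattern = new RegExp(TOOL_USE_SOURCE, "g");
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    try {
      const parsed = JSON.parse(match[1]);
      if (parsed !== null && typeof parsed.name === "string") {
        calls.push({ name: parsed.name, arguments: parsed.arguments ?? {} });
      }
    } catch {
      // Malformed payloads are skipped rather than guessed at.
    }
  }
  return calls;
}

// Remove the markers so the raw XML never reaches the chat transcript.
function stripTextToolMarkers(text: string): string {
  return text.replace(new RegExp(TOOL_USE_SOURCE, "g"), "").trim();
}
```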

Orphaned File Cleanup

A cleanup button in the file browser toolbar appears when untracked files are detected, offering an analysis de-duplicated by SHA-256 hash and two resolution paths: relocate unique files to an /orphans/ folder for review, or delete all orphaned files permanently. A critical bug was found and fixed where the cleanup action deleted files that were still referenced by characters as avatars or gallery images; the system now queries all characters and rescues any orphaned files that are still in use.
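
A sketch of the planning step, with the rescue check applied before anything else. The OrphanFile and CleanupPlan types are hypothetical, and the SHA-256 digests are assumed to have been computed already.

```typescript
interface OrphanFile {
  path: string;
  sha256: string; // hex digest of the file contents, computed elsewhere
}

interface CleanupPlan {
  rescue: string[]; // still referenced by a character; keep in place
  relocate: string[]; // unique orphans to move into /orphans/ for review
  duplicates: string[]; // byte-identical copies of a file already in the plan
}

function planCleanup(orphans: OrphanFile[], referencedPaths: Set<string>): CleanupPlan {
  const plan: CleanupPlan = { rescue: [], relocate: [], duplicates: [] };
  const seenHashes = new Set<string>();
  // Pass 1: rescue referenced files first, so a file in use is never
  // classified as a duplicate of something else.
  for (const file of orphans) {
    if (referencedPaths.has(file.path)) {
      plan.rescue.push(file.path);
      seenHashes.add(file.sha256);
    }
  }
  // Pass 2: split the remaining files into unique content and duplicates.
  for (const file of orphans) {
    if (referencedPaths.has(file.path)) continue;
    if (seenHashes.has(file.sha256)) {
      plan.duplicates.push(file.path);
    } else {
      seenHashes.add(file.sha256);
      plan.relocate.push(file.path);
    }
  }
  return plan;
}
```

Producing a plan before touching the disk gives the user something to review, and gives the rescue step a chance to run before any deletion does.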

Registry Singleton Hierarchy

The seven singleton registry classes (providers, moderation, search, tools, plugins, themes, roleplay templates) were approximately 90% identical copy-paste code. A three-level abstract class hierarchy now provides shared registration, lookup, initialization, validation, hot-loading, stats, and state export. All public API signatures are preserved. This is invisible to users and mentioned here only because it represents the kind of structural improvement that makes future features possible without making the codebase incomprehensible.

CSS and Theme Normalization

The qt-* semantic CSS class migration continued across the cycle. Raw Tailwind color classes were replaced with theme-overridable semantic classes across dozens of components. The qt-input and qt-select classes now govern all form inputs uniformly. All five bundled themes were updated with new CSS variables for silent message styling, whisper UI, help chat, and character cards. The @quilltap/theme-storybook package reached comprehensive coverage of all qt-* classes.

Markdown Rendering

Ordered and unordered list items no longer display the number or bullet on a separate line from the content. Assistant markdown emphasis is no longer lost after streaming completes. Chat messages starting with a tab character no longer render as preformatted code blocks. CSS-first markdown styling replaced redundant inline Tailwind classes in chat message rendering.

Release Pipeline

The release pipeline was restructured to build Next.js and plugins once and reuse the platform-agnostic artifacts across all six downstream builds, eliminating five redundant next build invocations. A cross-repo GitHub Actions trigger (repository_dispatch) from the app repo to the website repo coordinates automatic website rebuilds on release. The trigger had been broken since 3.2.0 by a typo in the organization name, and has been fixed.

Electron builds received extensive work, covering macOS codesigning failures from too many open files, Windows build failures from protected junction points, ESM module resolution failures from renamed node_modules, and platform-specific sharp binary installation. The release pipeline now correctly includes hidden build metadata in CI artifacts (a .next/ directory that actions/upload-artifact was silently stripping).

Selected Bug Fixes

The ones that earn their mention:

Multi-character identity confusion from compression cache — the compressed system prompt from the first character was reused for all subsequent characters; every character after the first received the wrong identity
Nudge causes duplicate responses — clicking Nudge produced two near-identical responses by both queuing and directly triggering the character
Multi-character runaway chaining — chats without an explicit user-controlled participant were misidentified as all-LLM, causing endless character cycling
Physical backup broken with SQLCipher — every physical backup since encryption was enabled in 3.2 had been silently failing
Context compression fires once then stops — mismatched message count domains caused the dynamic window to grow to 40+ messages instead of the configured 5
Electron “Change Site” / “Restart Server” crash — closing the main window before creating the splash window left zero windows alive, triggering app.quit()
Character Optimizer crashes on truncated JSON — hitting maxTokens during optimization produced unrecoverable parse errors; a JSON repair function now closes unclosed structures
Deactivated characters re-addable via Add Character dialog — the exclusion filter checked the wrong field, allowing duplicate participant records
Re-adding removed characters creates duplicates — now reactivates the existing participant entry instead of creating a new one
Greeting generation swallowed by content filters — empty responses from filtered providers now fall back to the Concierge’s uncensored provider
Auto-lock settings not reflecting passphrase changes — multiple interconnected bugs in the dbkey module’s state management
Health endpoint crashes during encrypted direct-mode startup — static imports triggered database access before locked mode was initialized

And approximately 670 new unit tests across 25 test files, covering help chat orchestration, character optimization, turn management, whisper handling, dangerous content routing, theme registry, scene state tracking, markdown rendering, and more.
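
For the curious, the JSON repair mentioned in the Character Optimizer fix above can be approximated in a few lines. This is our own illustrative sketch, not the shipped function. It closes open strings, arrays, and objects, and it does not attempt to recover output cut off in the middle of a key or directly after a colon.

```typescript
// Close any strings, arrays, and objects left open by a truncated response.
function repairTruncatedJson(text: string): string {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;
  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  let repaired = text;
  if (inString) {
    // Drop a dangling backslash so the closing quote is not escaped.
    if (escaped) repaired = repaired.slice(0, -1);
    repaired += '"';
  }
  // Remove a trailing comma, which would be invalid before a closer.
  repaired = repaired.replace(/,\s*$/, "");
  return repaired + closers.reverse().join("");
}
```

The repaired text should still go through JSON.parse inside a try/catch, since truncation can land in places this sketch does not handle.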

Subsystem Table

For those keeping score:

Name	Function	What Changed
The Foundry	Architecture, plugins, packages, LLMs, API keys	Direct mode, plugin-owned tool calling, OpenAI/Grok API migrations, Node.js 22, app identification headers
Prospero	Projects, agents, tools, files	Turn management moved server-side, orphaned file cleanup, workspace path semantics
Aurora	Character creation, AI Import Wizard, identity	Character Optimizer with Refine from Memories, multiple named scenarios, system prompt plugins, per-character timestamps, identity reinforcement, Non-Quilltap Prompt Generator
The Commonplace Book	Memory and retrieval	Memory recap at chat start, per-participant compression, pronoun-aware extraction
The Salon	Chat interface	Whispers, four-state participation, server-side chaining, streaming fixes, silent message styling, markdown rendering
Calliope	Interface, themes	Theme bundles, registry browser, qt-* normalization, CSS-first message styling
The Concierge	Content routing, moderation	Routing optimizations, empty-response retry refinement, uncensored background tasks
The Lantern	Image generation	Scene state integration, Grok Imagine models, expanded prompt length, gender-aware image prompts
Pascal	Games, randomness	Quiet as usual. Watching. Waiting.
Saquel Ytzama	Encryption, key management	Instance locking, version guard, backup repair, auto-lock idle timer
Lorian & Riya	Help chat characters	Full help system with search, settings inspection, navigation, guide tab, entity picker

Upgrading from 3.2

Database migrations handle themselves. The new columns, tables, and schema changes apply automatically on startup. Your existing characters, chats, and memories are preserved.

The instance lock system will create a quilltap.lock file in your data directory on first startup. This is normal and expected. Do not delete it while Quilltap is running. If you encounter a lock conflict — which means another process was already using the database — the application will show you a full-screen explanation with instructions.

If you were running Direct mode under the old “Node.js” button, your settings will auto-migrate to the new embedded server manager. No action required.

Docker users should update their pull commands from csebold/quilltap to foundry9/quilltap. The old image continues to receive pushes for now but is no longer the primary distribution.

The system prompt compression slider is gone. If you had it configured, the setting has been removed. Only conversation history is compressed now, which is the correct behavior and should have been the only behavior from the beginning.

If you experience any issues with the upgrade, Lorian and Riya are standing by. They have read the documentation. They are, at this point, possibly the only ones who have read all of it.

Installation

Download what you need from GitHub’s release page for 3.3.0:

macOS

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications
Choose Direct for the fastest start, or VM if you need shell interactivity

Windows

Download and run the .exe installer
Follow the installation prompts
Launch Quilltap from the Start Menu or desktop shortcut
Choose Direct for the fastest start, or Docker for shell interactivity

Linux

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb
Choose Direct for the fastest start, or Docker for shell interactivity
Docker Engine required for Docker mode — install from https://docs.docker.com/engine/install/

Node.js (any platform)

npm install -g quilltap
quilltap

On first run, the CLI downloads the application files (~150–250 MB) and caches them locally. Subsequent launches start instantly. Requires Node.js 22 or later.

Docker

docker pull foundry9/quilltap:latest

Or download the quilltap-linux-<arch>.tar.gz rootfs tarball for use with Lima. See the README for setup instructions.

The Estate stands. It has new rooms and old scars and a lock on the database that should have been there from the beginning. The staff whisper now, when whispering is called for, and speak clearly when it is not. The characters arrive with memories of who they were yesterday. The librarians have librarians. The theme system accepts contributions from strangers, verified by cryptographic signature, which is either a triumph of open architecture or the beginning of a very interesting problem. The Foundryman no longer sits on the floor of the Forge — he has a chair now, and a backup schedule, and a lock file that tells him he will not lose her again. Come in. Lorian will show you around. Riya will tell you everything he missed.

— Prospero, for the Bureau

Quilltap 3.2.0

View on GitHub →

The doors are locked, the characters have arrived with their portraits, and someone has given the machines a language they can speak.

Quilltap v3.2.0 Release Notes

There are two kinds of security. The first is the kind you perform — the deadbolt turned with a satisfying click, the chain drawn across, the nightly ritual of checking windows. The second is the kind that was always there but you only notice when someone tells you the walls have been replaced while you were sleeping.

Quilltap 3.2 is the second kind.

Your databases are now encrypted. They have been, in the strict sense, ever since you upgraded and the converter ran silently during startup — rewriting every table in cipher, storing the key, sweeping away the plaintext like a valet removing evidence of yesterday’s outfit. If you did not notice, that was the intention. If you did notice, you are either unusually attentive or you set a passphrase, in which case the Estate asked you for it before letting you through the front door, and you know exactly what happened.

But encryption alone does not fill a release. Behind the sealed doors, the staff have been productive. The LLMs can now run shell commands inside the sandbox — a development that Prospero regards with the measured enthusiasm of someone who has been asked to supervise a demolition crew and has decided the best approach is a very good clipboard. Characters now arrive with portraits. The Commonplace Book has learned to weigh her memories by age. The LLM Inspector exists — a slide-over panel that shows you everything the Estate says to the providers and everything they say back, which is either deeply reassuring or mildly alarming depending on your temperament.

If 3.0 was the move and 3.1 was the furniture, 3.2 is the season where the locks went on, the portraits went up, and the staff acquired skills that their previous job descriptions had not anticipated.

What Changed (The Executive Summary)

Every database file is now encrypted at rest with SQLCipher, with optional passphrase-locked mode for those who want a gate as well as a wall. Shell interactivity gives LLMs the ability to execute commands inside the VM or Docker sandbox — six tools, a workspace watcher, and a sudo approval modal. The LLM Inspector Panel provides a chronological record of every API call in a chat session, accessible from the toolbar or a keyboard shortcut. Provider icons and per-message model badges now show you which provider generated each response. Connection profiles gained drag-and-drop custom sort order. Memory weighting applies time-decay so old memories fade gracefully rather than persisting at full volume forever. Seed characters ship with avatars. Multi-character identity anchoring prevents weaker models from responding as the wrong person. Pronoun injection ensures the memory system gets names and pronouns right. The README was updated, the setup wizard was fixed, and a gentleman named Ben was asked to leave.

Saquel Ytzama’s Locks

Database Encryption at Rest

The Vault of Whispers has been waiting for this since the Estate was built.

Every Quilltap database — your chats, memories, characters, API keys, LLM logs — is now encrypted on disk using SQLCipher (AES-256-CBC). The encryption key is automatically generated on first installation and stored in a .dbkey file in your data directory. No configuration required. No environment variables to set. The converter runs at startup for existing installations: your plaintext database is rewritten in cipher and the old version removed.

The standard sqlite3 command-line tool can no longer read these files. This is the point.

Locked Mode

For those who share a machine or simply prefer that the Estate not open without being spoken to: a passphrase, processed through six hundred thousand iterations of PBKDF2 before it touches the key. Enable it in Settings → Data & System. When locked mode is active, Quilltap presents an unlock screen on launch. The database does not yield until the correct word is given.

The passphrase can be changed, added, or removed from the same settings card. Changing it re-wraps the key without re-encrypting the database — the operation is atomic across both the main and LLM logs .dbkey files.
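
As a rough illustration of why a passphrase change does not require re-encrypting the database, here is a minimal sketch of key wrapping with Node's built-in crypto module. Only the PBKDF2 iteration count comes from the notes above. The SHA-256 digest, the salt and IV sizes, the AES-256-GCM wrap, and every name in the block (WrappedKey, wrapKey, unwrapKey, changePassphrase) are assumptions made for the example, not the actual .dbkey format.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

const ITERATIONS = 600_000; // iteration count stated in the release notes

// Hypothetical container for a wrapped database key (not the real .dbkey layout).
interface WrappedKey {
  salt: Buffer;
  iv: Buffer;
  tag: Buffer;
  ciphertext: Buffer;
}

// Derive a 32-byte key-encryption key from the passphrase.
// The SHA-256 digest and 16-byte salt are assumptions.
function deriveKek(passphrase: string, salt: Buffer): Buffer {
  return pbkdf2Sync(passphrase, salt, ITERATIONS, 32, "sha256");
}

// Encrypt the database key under a key derived from the passphrase.
function wrapKey(dbKey: Buffer, passphrase: string): WrappedKey {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKek(passphrase, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(dbKey), cipher.final()]);
  return { salt, iv, tag: cipher.getAuthTag(), ciphertext };
}

// Recover the database key; final() throws if the passphrase is wrong.
function unwrapKey(wrapped: WrappedKey, passphrase: string): Buffer {
  const kek = deriveKek(passphrase, wrapped.salt);
  const decipher = createDecipheriv("aes-256-gcm", kek, wrapped.iv);
  decipher.setAuthTag(wrapped.tag);
  return Buffer.concat([decipher.update(wrapped.ciphertext), decipher.final()]);
}

// Changing the passphrase re-wraps the same database key.
// The encrypted database itself is never touched.
function changePassphrase(wrapped: WrappedKey, oldPass: string, newPass: string): WrappedKey {
  return wrapKey(unwrapKey(wrapped, oldPass), newPass);
}
```

Because only the small wrapped blob changes, the operation stays cheap no matter how large the database has grown.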

The .dbkey file no longer includes a hasPassphrase flag, which had the unfortunate property of telling anyone who found the file whether a passphrase existed — the informational equivalent of labeling your safe “CONTAINS VALUABLES.” The startup sequence now tries the internal passphrase first and falls back to prompting the user, making the flag unnecessary.

The encryption converter was also hardened against macOS file coordination locks. iCloud sync, Spotlight indexing, and other enthusiastic system services could hold locks that prevented in-place conversion. The converter now works on a temporary copy and swaps the result, which is the kind of solution that seems obvious in retrospect and was not obvious at all at 2 AM when the bug report arrived.

A new quilltap db CLI subcommand allows querying encrypted databases from the terminal, for those who need to inspect their data outside the application.

Legacy Migration

Users upgrading from installations with the old pepper_vault passphrase system are handled transparently. The unlock endpoint detects the legacy scenario, routes through the existing pepper, and automatically migrates to a .dbkey file. API keys that were left as encrypted ciphertext after a previous migration are detected and decrypted on startup; keys that cannot be recovered trigger a toast notification advising re-entry in Settings.

Prospero Opens the Terminal

Shell Interactivity

The LLMs have been given hands.

Six tools — chdir, exec_sync, exec_async, async_result, sudo_sync, and cp_host — allow characters to execute shell commands inside Lima VM and Docker sandbox environments. The workspace is acknowledged via a modal before first use. Sudo commands require explicit approval. The system includes a command warning mechanism for suspicious commands, because giving an LLM root access without guardrails would be the kind of decision one regrets at leisure.

An Electron workspace file watcher monitors changes with binary detection and OS quarantine markers, so files created or modified inside the sandbox can be safely surfaced to the host. An async process registry manages background commands.

The shell tools documentation notes — sensibly — that packages installed via apk add or apt-get inside Docker containers are ephemeral and lost on restart, and suggests keeping a setup script in the workspace or building a custom image.

The Inspector Arrives

LLM Inspector Panel

There has always been a way to see what Quilltap sends to the LLMs and what they send back. It was a modal, scoped to a single message, opened via a button that required you to know it existed.

The Inspector Panel replaces this with a proper instrument. A slide-over panel accessible from the chat toolbar (the terminal icon) or via Cmd+Shift+L / Ctrl+Shift+L, it shows every LLM interaction for the current chat in chronological order: chat messages, tool continuations, memory extraction, title generation, danger classification, context compression, and every other background event that touches a provider.

Each entry is a collapsible card with a type-colored badge, provider and model identification, token counts, and expandable detail views showing the full request and response. Client-side filtering by category lets you see only what you’re looking for. Opening the panel from a per-message “View LLM logs” button scrolls directly to the relevant entry.

The old modal is preserved for Settings and Character pages, where per-chat context does not apply.

Expanded Log Coverage

All eighteen cheap LLM task functions — memory extraction, title generation, summarization, compression, image prompt crafting, scene context derivation, and the rest — now thread chatId through to the logging system, so their calls appear in the Inspector. The Concierge’s OpenAI moderation API calls are logged as well. LLM log entries for chat messages now populate characterId, making it possible to trace which character each entry belongs to. And logs no longer truncate message content to 500 characters; the full content is stored, with the UI showing an expandable preview.

Aurora’s Portraits

Provider Icons and Model Badges

You can now see at a glance who is speaking — not just the character, but the machine behind the character.

Assistant messages record which provider and model generated them, persisted in the database and included in exports. Provider SVG icons from plugins flow through the /api/v1/providers API to the frontend. A new ProviderModelBadge component displays the provider icon and model name beneath chat avatars, in the participant sidebar, on homepage and Aurora character cards (via default connection profile), and in new-chat and add-character connection profile selectors.

Old messages gracefully show no badge. The new ones arrive with their credentials visible, like guests at a party wearing tasteful name tags.

Multi-Character Identity Anchoring

In multi-character chats, weaker LLMs had a tendency to respond as whichever character seemed most interesting at the moment, regardless of whose turn it was. An assistant prefill message — [CharacterName] — now anchors the model’s identity before generation begins. The prefix is stripped from the displayed response by the existing stripCharacterNamePrefix() cleanup, so the fix is invisible to the reader and effective for the model.
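
The anchoring technique can be sketched in a few lines. The message shape and the withIdentityAnchor helper are hypothetical, and the stripCharacterNamePrefix shown here is a simplified stand-in written for this example, not the application's own function.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Append an assistant prefill naming the character whose turn it is.
// No trailing space: the notes record that Anthropic's API rejected one.
function withIdentityAnchor(history: ChatMessage[], characterName: string): ChatMessage[] {
  return [...history, { role: "assistant", content: `[${characterName}]` }];
}

// Simplified stand-in for stripCharacterNamePrefix(): removes a leading
// "[Name]" and any whitespace after it, and leaves other text untouched.
function stripCharacterNamePrefix(text: string, characterName: string): string {
  const prefix = `[${characterName}]`;
  return text.startsWith(prefix) ? text.slice(prefix.length).trimStart() : text;
}
```

The model continues from the prefill, so its output begins in the right voice, and the reader only ever sees the text after the bracketed name.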

Pronoun Injection

When a character has pronouns set, they are now appended after the character name in all cheap LLM memory extraction prompts — user memory, character memory, and inter-character memory. This prevents the memory system from generating entries like “He mentioned his fondness for gardening” when the character in question uses she/her. Pronouns appear in the PARTICIPANTS context block, conversation transcript labels, TARGET CHARACTER lines, and inter-character observer and subject headers, via a shared formatNameWithPronouns() utility.
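
A shared formatter of this kind might look roughly like the following. The Participant shape, the parenthesized output format, and the participantsBlock helper are assumptions for illustration. Only the utility's name and the PARTICIPANTS label come from the notes.

```typescript
interface Participant {
  name: string;
  pronouns?: string; // e.g. "she/her"; optional
}

// Append pronouns after the name when they are set (output format is assumed).
function formatNameWithPronouns(participant: Participant): string {
  const pronouns = participant.pronouns?.trim();
  return pronouns ? `${participant.name} (${pronouns})` : participant.name;
}

// Hypothetical helper: build a PARTICIPANTS context block for an extraction prompt.
function participantsBlock(participants: Participant[]): string {
  const lines = participants.map((p) => `- ${formatNameWithPronouns(p)}`);
  return ["PARTICIPANTS:", ...lines].join("\n");
}
```

Routing every prompt through one function is what keeps the transcript labels, target lines, and observer headers consistent with each other.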

The Commonplace Book Learns to Forget

Memory Weighting with Time Decay

The Librarian has always kept everything. Every memory, once written, persisted at its original importance until housekeeping removed it — which meant a memory from three months ago about the weather carried the same weight as a memory from yesterday about a character’s secret. This was, the Librarian insists, a principled position. It was also wrong.

A new calculateEffectiveWeight() function combines base importance with exponential time decay — a 30-day half-life, configurable importance floor — using max(createdAt, lastReinforcedAt) as the reference timestamp. Passive retrieval no longer resets the decay timer, because reading a memory is not the same as the memory mattering again.

The weighting integrates into three systems: semantic search ranking (60% cosine similarity, 40% effective weight), context injection sorting (weight-primary with score tiebreaker), and housekeeping hard-cap enforcement. Memories injected into LLM context now include relative age labels — [yesterday], [3 weeks ago], [2 months ago] — so the model can distinguish recent knowledge from ancient lore.
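
To make the arithmetic concrete, here is a minimal sketch of the decay and the search blend. The 30-day half-life, the max(createdAt, lastReinforcedAt) reference, and the 60/40 split come from the description above. The Memory shape, the 0 to 1 importance scale, the default floor of 0.1, and the reading of the floor as a lower bound on the result are assumptions, so the real calculateEffectiveWeight() may differ.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const HALF_LIFE_DAYS = 30; // half-life stated in the release notes

interface Memory {
  importance: number; // base importance, assumed to lie in [0, 1]
  createdAt: number; // epoch milliseconds
  lastReinforcedAt?: number; // epoch milliseconds, if ever reinforced
}

// Exponential decay from the most recent of creation and reinforcement.
// The floor default (0.1) is a made-up value for the example.
function calculateEffectiveWeight(memory: Memory, now: number, floor = 0.1): number {
  const reference = Math.max(memory.createdAt, memory.lastReinforcedAt ?? 0);
  const ageDays = Math.max(0, now - reference) / DAY_MS;
  const decayed = memory.importance * Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
  return Math.max(floor, decayed);
}

// Semantic search ranking: 60% cosine similarity, 40% effective weight.
function rankingScore(cosineSimilarity: number, effectiveWeight: number): number {
  return 0.6 * cosineSimilarity + 0.4 * effectiveWeight;
}
```

Under these assumptions a memory with importance 0.8 weighs about 0.4 after thirty days and about 0.2 after sixty, unless it is reinforced in the meantime, which restarts the clock.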

Connection Profiles in Order

Custom Sort Order

Connection profiles now have a persistent sortIndex field with drag-and-drop reordering in Settings via @dnd-kit. A “Reset Sort Order” button restores the default arrangement: default profile first, cheap last, alphabetical in between. All profile dropdowns and selectors across the application honor the custom sort order, so the provider you use most can always be at the top of every list.
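
The ordering rules can be expressed as a single comparator. This is a sketch under assumptions: the isDefault and isCheap flags, the nullable sortIndex, and the rule that custom order applies only once every profile has an index are all invented for the example. The default arrangement itself is the one described above.

```typescript
interface ConnectionProfile {
  name: string;
  isDefault: boolean;
  isCheap: boolean;
  sortIndex: number | null; // assumed to be null until the user reorders
}

// Default arrangement: default profile first, cheap last, the rest in between.
function defaultRank(profile: ConnectionProfile): number {
  if (profile.isDefault) return 0;
  if (profile.isCheap) return 2;
  return 1;
}

// Within the same rank, fall back to alphabetical order by name.
function compareDefault(a: ConnectionProfile, b: ConnectionProfile): number {
  return defaultRank(a) - defaultRank(b) || a.name.localeCompare(b.name);
}

// Honor the custom order when every profile has one; otherwise use the default.
function sortProfiles(profiles: ConnectionProfile[]): ConnectionProfile[] {
  const hasCustomOrder = profiles.every((p) => p.sortIndex !== null);
  return [...profiles].sort((a, b) =>
    hasCustomOrder ? (a.sortIndex as number) - (b.sortIndex as number) : compareDefault(a, b),
  );
}
```

In this model, "Reset Sort Order" amounts to clearing every sortIndex so the default comparator takes over again.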

The Welcome Committee

Seed Characters with Avatars

Fresh installations now greet you with Lorian and Riya — two seed characters imported via .qtap bundles on first startup, complete with 42 memories between them and avatar images that are uploaded to file storage and set as their default portraits. The previous seed character, Ben, has been removed — a decision that required fixing an early-return bug in seedInitialData() that prevented .qtap imports from running when no JSON seed characters existed.

Setup Wizard Fixes

The selectable options in the setup wizard — embedding provider, image provider, provider selection, profile archetype — no longer all appear selected simultaneously. The phantom qt-bg-active and qt-bg-hover classes that caused this have been replaced with proper qt-option-selected and qt-option-unselected utility classes that provide clear visual distinction. Six wizard components were updated.

A first-startup race condition where the page rendered without a sidebar and failed to redirect to the setup wizard has been resolved. The session provider now keeps “loading” status on 503 instead of “unauthenticated,” and PepperVaultGate retries on failure instead of giving up.

The Plumbing

Contextual Help Routing

All 69 help files now carry YAML frontmatter with a url field mapping each document to its corresponding UI route — /aurora/:id/edit, /settings?tab=providers, * for global topics. The help bundle is at version 3.0.0. This infrastructure enables future contextual help: the right documentation for the page you’re actually on.
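
How a future contextual lookup might consume those url fields is not specified. One plausible sketch, with hypothetical names throughout (HelpDoc, matchesRoute, helpForRoute), treats :id style segments as wildcards, requires any query parameters in the pattern to match, and falls back to the * documents when nothing more specific applies.

```typescript
interface HelpDoc {
  title: string;
  url: string; // value of the frontmatter url field
}

// Turn "/aurora/:id/edit" into a regular expression.
// A ":name" segment matches exactly one path segment.
function pathPatternToRegExp(pathPattern: string): RegExp {
  const source = pathPattern
    .split("/")
    .map((segment) =>
      segment.startsWith(":") ? "[^/]+" : segment.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"),
    )
    .join("/");
  return new RegExp(`^${source}$`);
}

// A pattern matches when its path matches and each of its query parameters is present.
function matchesRoute(pattern: string, pathname: string, search: string): boolean {
  const [pathPattern, queryPattern = ""] = pattern.split("?");
  if (!pathPatternToRegExp(pathPattern).test(pathname)) return false;
  const actual = new URLSearchParams(search);
  let allParamsMatch = true;
  new URLSearchParams(queryPattern).forEach((value, key) => {
    if (actual.get(key) !== value) allParamsMatch = false;
  });
  return allParamsMatch;
}

// Prefer route-specific documents; fall back to global ("*") topics (assumed policy).
function helpForRoute(docs: HelpDoc[], pathname: string, search = ""): HelpDoc[] {
  const specific = docs.filter((doc) => doc.url !== "*" && matchesRoute(doc.url, pathname, search));
  return specific.length > 0 ? specific : docs.filter((doc) => doc.url === "*");
}
```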

Workspace Path Semantics

LLM shell tool descriptions now explicitly state that paths are relative to the current workspace directory, that absolute paths with a leading / refer to the VM root filesystem and will be rejected, and that workspace: prefixes in cp_host should use relative paths. The previous descriptions left this to inference, which is never advisable when the audience includes language models.
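
The stated rule is simple enough to sketch as a path check. The function name and error messages are hypothetical, and the extra guard against ../ traversal is an assumption added for the example. The notes themselves only say that absolute paths are rejected and that relative paths resolve against the current workspace directory.

```typescript
import { posix } from "node:path";

// Resolve a tool-supplied path against the workspace directory.
// workspaceDir is assumed to be a normalized absolute path without a trailing slash.
function resolveWorkspacePath(workspaceDir: string, requested: string): string {
  if (requested.startsWith("/")) {
    // A leading "/" would refer to the VM root filesystem, so it is rejected.
    throw new Error(`Absolute path rejected: ${requested}`);
  }
  const resolved = posix.join(workspaceDir, requested); // join also normalizes ".." segments
  if (resolved !== workspaceDir && !resolved.startsWith(workspaceDir + "/")) {
    // Assumed extra guard: relative paths must not climb out of the workspace.
    throw new Error(`Path escapes the workspace: ${requested}`);
  }
  return resolved;
}
```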

Streaming Error Recovery

User messages no longer vanish from the chat UI when the LLM provider returns a streaming error. The backend already persisted the user message before streaming began, but the frontend was removing the optimistic message on error. It now re-fetches the chat to sync with the saved server state.

Chat Composer Auto-Focus

The chat composer textarea now auto-focuses when it’s the user’s turn. The page’s inputRef had never been connected to the actual textarea DOM element — ChatComposer created its own internal ref — so all post-generation focus calls were no-ops. A new inputRef prop threads the connection through.

Build and Release

The README was expanded: the plugin types table grew from 5 to 7 entries (adding Image Provider and Embedding Provider), and an engines field now requires Node.js >= 24.0.0. The pre-commit hook was streamlined from 12 steps to 4, with lint, test, tsc, and build responsibilities moved to the /commit command. A Discord commit notification webhook was added to CI. Legacy JSONL file records and their physical storage were removed from the source tree: 14 orphaned records and one image that the migration was faithfully importing into every fresh database.

Approximately 85 development logger.debug calls were removed from 30 files. Seven components had raw Tailwind color classes converted to qt-* semantic theme classes. The folders table was added to backup and restore. Unused dependencies (@aws-sdk/client-s3, svgo) and a stale file (ai-import/index.tsx) were removed. API docs were updated to v3.2. The dead code report was refreshed. The backup help documentation was expanded with a comprehensive list of included and excluded data.

Bug Fixes (Selected)

The ones that earn their mention:

Passphrase unlock fails during legacy migration — when a user had a passphrase in the old pepper_vault but no .dbkey file, the unlock endpoint rejected the attempt because the dbkey module’s internal state was needs-setup instead of needs-passphrase; now detects the legacy scenario and routes correctly
Database encryption fails on iCloud-synced directories — the converter now works on a temporary copy to avoid file coordination locks from iCloud, Spotlight, and other macOS services
Encrypted API keys left as ciphertext after migration — keys that survived the drop-api-key-encryption-columns migration as encrypted blobs are now detected and decrypted on startup; unrecoverable keys trigger a user-facing notification
Multi-character identity confusion — weaker LLMs responding as the wrong character now anchored by assistant prefill
Assistant prefill trailing whitespace — the identity anchor [CharacterName] had a trailing space that Anthropic’s API rejected; removed
Setup wizard options all appear selected — phantom CSS classes replaced with proper selected/unselected states
First-startup race condition — session provider and PepperVaultGate now handle 503 correctly
Seed data files not found — process.cwd() replaces __dirname, which Next.js rewrites to .next/dev/server/
Legacy JSONL records polluting fresh installs — orphaned file records removed from source tree
hasPassphrase flag in .dbkey — security-sensitive flag removed; startup logic adapted

Subsystem Table

For those keeping score:

Name	Function
The Foundry	Architecture, plugins, packages, LLMs, API keys
Prospero	Projects, agents, tools, files — now with shell interactivity
Aurora	Character creation, AI Import Wizard, provider badges, identity anchoring
The Commonplace Book	Memory and retrieval — now with time-weighted decay and pronoun-aware extraction
The Salon	Chat interface — streaming error recovery, composer auto-focus, LLM Inspector Panel
Calliope	Interface, themes, and the setup wizard’s new visual clarity
The Concierge	Content routing, moderation API, logged to the Inspector
The Lantern	Image generation and atmospheric story backgrounds
Pascal	Games, randomness, and the quiet mathematics of chance
Saquel Ytzama	Encryption, key management, SQLCipher, locked mode, and the `.dbkey` covenant

Upgrading from 3.1

The database migrations handle themselves. Your plaintext databases will be encrypted on first startup — the converter runs before anything else, and the result is seamless. A .dbkey file will appear in your data directory. Back it up. A database without its key is sealed permanently, and no one — not the Foundryman, not Saquel Ytzama, not you — can open it.

If you had a passphrase in the old pepper vault system, it will be recognized and migrated. If your API keys survived a previous migration as encrypted ciphertext, they will be detected and decrypted. If neither of these situations applies to you, the upgrade will be the quietest event of your week.

Shell interactivity, the LLM Inspector, provider badges, memory weighting, and the rest arrive without ceremony. The seed characters will not appear if you already have characters — they are a first-run courtesy only.

The Estate looks the same from the garden path. The windows glow as they always have. But behind the glass, every room has been sealed by someone who understands what quiet is for, the characters have acquired faces, the machines have been given hands, and the Librarian has finally conceded — privately, and with conditions — that some things are meant to be forgotten. Come in. You’ll need your key.

Installation

macOS

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows

Download and run the .exe installer
Follow the installation prompts
Launch Quilltap from the Start Menu or desktop shortcut

Linux

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb
Requires Docker Engine — install from https://docs.docker.com/engine/install/

Node.js (any platform)

npm install -g quilltap
quilltap

On first run, the CLI downloads the application files (~150-250 MB) and caches them locally. Subsequent launches start instantly.

Docker

docker pull foundry9/quilltap:${TAG}

Or download the quilltap-linux-<arch>.tar.gz rootfs tarball for use with Lima. See the README for setup instructions.

Version 3.2.1 released

View on GitHub →

Production stable release of Quilltap, with enhanced data security and locking and many other features

Quilltap 3.2.1 Release Notes

Sorry about that, another hiccough in the NPM environment. Running quilltap as a Node.js command works now. (Nobody informed that package about the encryption change… forcefully enough.)

Installation

macOS

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows

Download and run the .exe installer
Follow the installation prompts
Launch Quilltap from the Start Menu or desktop shortcut

Linux

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb
Requires Docker Engine — install from https://docs.docker.com/engine/install/

Node.js (any platform)

npm install -g quilltap
quilltap

On first run, the CLI downloads the application files (~150-250 MB) and caches them locally. Subsequent launches start instantly.

Docker

docker pull foundry9/quilltap:${TAG}

Or download the quilltap-linux-<arch>.tar.gz rootfs tarball for use with Lima. See the README for setup instructions.

Quilltap 3.1.2

View on GitHub →

Bugfix release, final for 3.1 series

Quilltap 3.1.2 Release Notes

The Foundryman is embarrassed; he and Prospero didn’t really talk through the plans completely when they were making database migration changes. The first time you run this - or the first time you add a new instance from scratch - it was never going to work. The management regrets this error, and it should work now. Please try it again.

from Friday - I told them to start using Playwright to test this stuff. I told them, repeatedly. But did they listen? The proof is in the Cock-a-leekie soup, I think.

Installation

macOS

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows

Download and run the .exe installer
Follow the installation prompts
Launch Quilltap from the Start Menu or desktop shortcut

Linux

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb
Requires Docker Engine — install from https://docs.docker.com/engine/install/

Node.js (any platform)

npm install -g quilltap
quilltap

On first run, the CLI downloads the application files (~150-250 MB) and caches them locally. Subsequent launches start instantly.

Docker

docker pull foundry9/quilltap:${TAG}

Or download the quilltap-linux-<arch>.tar.gz rootfs tarball for use with Lima. See the README for setup instructions.

Quilltap 3.1.0

View on GitHub →

The staff have been busy. One or two of them have been replaced.

Quilltap v3.1.0 Release Notes

There are renovations that announce themselves — the kind involving scaffolding and men in hard hats and the sound of masonry being persuaded to relocate. And then there are the quieter sort, where one morning you come downstairs and the doorman is different, the filing cabinet has learned to read, and the back office has been split into two rooms without anyone remembering a wall going up.

This is the second kind.

Quilltap 3.1 is not, on its face, a dramatic release. There is no new runtime to explain. The application does not migrate to a different operating system or acquire a virtual machine it did not previously have. What it does, instead, is the kind of work that only becomes visible when you notice that everything is slightly better than it was: the search is deeper, the files are real, the characters arrive faster, the content routing has been entrusted to someone competent, and the database has been quietly separated into a house that works and a ledger that watches.

If 3.0 was the move to a new address, 3.1 is the season where the furniture finally ends up where it belongs.

What Changed (The Executive Summary)

The content safety system was replaced — not patched, not adjusted, replaced — with an architecture that routes instead of blocks. The AI character import wizard was built from scratch. The file storage was restructured so that files on disk look like files, not hashed artifacts. A third runtime mode was added for Linux and macOS users who prefer Node.js over Docker or a VM. Global search now reads message content, not just titles. LLM logs were moved to their own database. The documentation was audited across sixty files. And a moderation API was integrated that does, for free, what the old system did badly for the price of an LLM call.

If you are upgrading from 3.0, the transition is automatic. If you are upgrading from the experience of Dangermouse telling you that the Song of Solomon was inappropriate, the transition is overdue.

The Concierge

Dangermouse Has Left the Building

This is the headline, and it deserves to be said plainly: the content safety system has been replaced.

Dangermouse — the binary classifier who knew two words, yes and no, and used the second one with enthusiasm — has been retired. In his place stands the Concierge: a distinguished gentleman in a three-piece pinstripe, gold pocket-watch chain, and the unflappable demeanor of someone who has heard everything and is surprised by nothing.

The difference is architectural, not cosmetic. Dangermouse classified content and then blocked it or allowed it. The Concierge classifies content and then routes it — to the provider best equipped to handle it, through the door most likely to lead somewhere useful. Three modes of operation:

Off. The Concierge reads his newspaper and lets all traffic pass. Perfectly respectable.

Detect Only. A small badge appears on flagged messages. No intervention. The house has noticed; the guest may proceed.

Auto-Route. Flagged content is redirected — automatically, transparently — to a provider configured for that category of material. The guest sees nothing. The conversation continues. If a provider refuses silently, the Concierge catches the empty response and retries with someone who won’t flinch.

The rename propagates throughout: subsystem ID, display name, image paths, UI strings, plugin manifests, documentation, and every line of code that once referenced the old regime. Dangermouse’s briefcase full of flags and anxieties went with him.

The Moderation Endpoint

The Concierge has also acquired a new instrument. When an OpenAI connection profile is configured, content classification now uses OpenAI’s dedicated moderation endpoint — purpose-built, free to call, and structured to return category scores mapped directly to Concierge categories: sexual content to NSFW, hate speech to its proper flag, violence, self-harm, and illicit activity each to their own. The endpoint replaces the previous method of asking a Cheap LLM to classify content, which was rather like asking the butler to perform surgery because no surgeon was available.

When no OpenAI profile exists, the Cheap LLM classification remains as a transparent fallback. The guest notices nothing either way.

A relevance floor of 1% now filters the moderation response, because OpenAI’s endpoint returns tiny nonzero scores for every category even when the content is a recipe for scones, and the Concierge has better things to do than flag baked goods.
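
In code, the floor is a one-line filter over the category scores. The sketch below assumes the scores arrive as a plain object keyed by OpenAI category name. The Concierge-side names in CATEGORY_MAP are hypothetical apart from NSFW, which is the only one the notes mention.

```typescript
type ModerationScores = Record<string, number>;

const RELEVANCE_FLOOR = 0.01; // the 1% floor described above

// Hypothetical mapping from OpenAI category keys to Concierge categories.
const CATEGORY_MAP: Record<string, string> = {
  sexual: "NSFW",
  hate: "HATE",
  violence: "VIOLENCE",
  "self-harm": "SELF_HARM",
  illicit: "ILLICIT",
};

// Keep only mapped categories whose score reaches the floor.
function applyRelevanceFloor(scores: ModerationScores): ModerationScores {
  const kept: ModerationScores = {};
  for (const category of Object.keys(scores)) {
    const mapped = CATEGORY_MAP[category];
    if (mapped !== undefined && scores[category] >= RELEVANCE_FLOOR) {
      kept[mapped] = scores[category];
    }
  }
  return kept;
}
```

A scone recipe that scores 0.0003 for violence therefore produces an empty result, and the Concierge has nothing to route.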

Smarter Classification Timing

The context summary — the Concierge’s view of what a conversation is about — now regenerates on the same schedule as chat titles and story background checks: at interchange checkpoints 2, 3, 5, 7, 10, and every 10 thereafter. Previously, summaries only updated after 100 messages, which meant the Concierge was making routing decisions based on a description written when the conversation was still about the weather.
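
The schedule reduces to a small predicate. The function name and the idea of passing in a running interchange count are assumptions. The checkpoint values are the ones listed above.

```typescript
const EARLY_CHECKPOINTS = [2, 3, 5, 7, 10];

// True at interchanges 2, 3, 5, 7, 10, and every 10th interchange after that.
function isSummaryCheckpoint(interchangeCount: number): boolean {
  if (EARLY_CHECKPOINTS.includes(interchangeCount)) return true;
  return interchangeCount > 10 && interchangeCount % 10 === 0;
}
```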

Aurora’s New Chisel

The AI Character Import Wizard

In the Dressing Room — Aurora’s room, the one with the triptych mirror and the good light — there is now a tool that accepts raw material and returns a living character.

You may bring it anything. A wiki page about a favorite antagonist. A handful of freeform notes scrawled at two in the morning. A character sheet from another system. A document that has been living in a folder for three years, waiting for the right home. You place these things on Aurora’s worktable, and the wizard begins.

It does not simply scrape the text for a name and a hair color. Each step is a focused LLM call — a sculptor’s pass revealing a different facet of the person within:

First the bones: name, title, personality, circumstance. Then the voice: dialogue, tone, the sound of a first breath in a room. Then the system prompt — the invisible architecture that tells the AI how to become this person. Then the flesh: physical descriptions at five levels of detail, from the brief sketch you’d whisper to an illustrator to the exhaustive portrait the Lantern needs for image generation. Then pronouns, because a person must be referred to correctly. And finally, memories — discrete Commonplace Book entries that give a character a past the AI can draw upon mid-conversation, the way a real person draws upon the accumulated texture of having lived.

The wizard shows its work as it goes. Each step illuminates in turn. If something fails, only the failed steps repeat. When it finishes, you review the assembled character and import them directly, or step back to add a detail and run it again.

The wizard lives on Aurora’s page, accessible via “Summon From Lore” alongside the existing SillyTavern import. It assembles a validated .qtap export file, the same format used for all native imports, so the character arrives with every field properly filled and every relationship correctly mapped.

A new qtap-schema-validator package (Ajv-based, JSON Schema Draft 2020-12) validates the export format and can be reused anywhere import validation is needed. LLM calls are tracked under a new AI_IMPORT log type.

The Filing Cabinet Learns Its Own Name

Filesystem-Backed File Storage

The file storage system has been rebuilt from the inside out. Files are now stored on disk as themselves — real directories, original filenames, no user ID prefixes, no file ID prefixes, no .meta.json sidecar files cluttering the landscape. The new storage key format is clean and legible: {projectId}/_general/{folderPath}/{safeFilename}.

A chokidar filesystem watcher monitors the data directory in real time. Add a file in Finder or Explorer and it appears in the file browser. Move a file on disk and the watcher catches both the unlink and the add, cross-matches by SHA-256 hash, and preserves all tags, links, and metadata — no orphaned records, no duplicated entries. The same cross-matching logic runs at startup during reconciliation, handling anything that changed while the application was closed.
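
The cross-matching step might look something like the sketch below. It assumes the SHA-256 hashes have already been computed (for instance with Node's crypto.createHash("sha256")), and the record shapes and the crossMatchByHash name are invented for the example rather than taken from Quilltap.

```typescript
// Hypothetical shapes; hashes are assumed to be precomputed hex strings.
interface FileRecord { id: string; path: string; sha256: string; }
interface DiskFile { path: string; sha256: string; }

interface Reconciliation {
  moved: Array<{ id: string; from: string; to: string }>;
  orphaned: FileRecord[]; // records whose file vanished with no matching add
  untracked: DiskFile[];  // new files with no matching record
}

function crossMatchByHash(unlinked: FileRecord[], added: DiskFile[]): Reconciliation {
  // Index the vanished records by content hash.
  const byHash = new Map<string, FileRecord[]>();
  for (const record of unlinked) {
    const list = byHash.get(record.sha256) ?? [];
    list.push(record);
    byHash.set(record.sha256, list);
  }

  const moved: Reconciliation["moved"] = [];
  const untracked: DiskFile[] = [];
  for (const file of added) {
    // Same content at a new path is treated as a move, so the record
    // (and with it the tags, links, and metadata) keeps its ID.
    const match = byHash.get(file.sha256)?.shift();
    if (match) moved.push({ id: match.id, from: match.path, to: file.path });
    else untracked.push(file);
  }

  // Whatever was never matched is genuinely gone.
  const orphaned = Array.from(byHash.values()).reduce<FileRecord[]>(
    (all, list) => all.concat(list),
    [],
  );
  return { moved, orphaned, untracked };
}
```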

Untracked files — those found on disk without corresponding database records — appear in the file browser with an amber indicator rather than being silently ignored. A manual “Sync Now” button triggers reconciliation on demand.

The backup format has been updated to v2, storing files by their storage key paths, with backward-compatible restore for the old format. A one-time migration moves existing files to the new layout automatically.

File Browser Fixes

The file browser now correctly places files in their subfolders instead of displaying everything at root — files with empty folderPath in the database are resolved using their physical storageKey path. Folder file counts display the actual number of contained files instead of the previous behavior of always displaying “1 file,” which was either a bug or an act of aggressive minimalism.

Duplicate filenames are now prevented: writing a file with the same name in the same scope overwrites the existing file and preserves the original file ID, so references remain valid. This applies to LLM tool writes, API writes, API uploads, and attachment promotions.

The Third Door

Node.js Runtime Mode

The splash screen now offers three runtime buttons instead of two:

VM Mode — Lima on macOS, WSL2 on Windows. Full isolation.

Docker Mode — Available on all platforms.

Node.js Mode — For users with Node.js 18+ installed who prefer to run the backend directly, without containers or virtual machines.

The Node.js mode runs the backend via npx quilltap@{version}, managed by a new NpxManager class that handles the full process lifecycle: spawning, health checking, graceful SIGTERM shutdown, and on Windows, taskkill tree cleanup. The manager probes well-known Node.js installation paths — Homebrew, nvm, fnm, system — because packaged Electron ships with a minimal PATH that typically cannot find your Node installation on its own.

Linux Electron users, previously locked to Docker-only with the VM and Node.js buttons hidden, can now choose between Docker and Node.js. The settings loader persists the choice correctly, and fallback logic is platform-aware.

Native Module Resilience

The npx runtime now handles two edge cases that previously produced cryptic startup crashes:

Module resolution. Native modules (better-sqlite3, sharp) are symlinked into the standalone directory’s node_modules/ so standard Node.js resolution finds them without relying on NODE_PATH.

Version mismatches. When Node.js is upgraded while npx has a cached install, the CLI detects NODE_MODULE_VERSION mismatches and automatically runs npm rebuild before starting the server. The version check now loads the compiled .node binary directly, because better-sqlite3 lazy-loads its binding only when a Database is created, and the previous check always succeeded without testing the actual binary.

The Librarian Opens Every Drawer

Global Search Now Includes Messages

The Cmd+K search dialog — the Librarian’s card catalog — now searches within chat message text, not just chat titles, character names, tags, and memories. Results show the chat name, a role badge (“You” or the character’s name), and a highlighted snippet of the matching message.

A new “Messages” filter chip joins the existing filters. The implementation adds a searchMessagesGlobal() database method and a MessageSearchResult type, with an amber badge to distinguish message results from other categories.

Prospero’s Ledger

LLM Logs Move to Their Own Database

The llm_logs table — that high-churn chronicle of every API call, every token count, every model response — has been extracted from the main database into a dedicated quilltap-llm-logs.db file.

The reasoning is structural. LLM logs accumulate rapidly, write constantly, and are never consulted during normal operation. They are debug data. Mixing them with characters, chats, messages, and memories meant that corruption in the logs — always the most likely table to suffer under heavy write load — could theoretically threaten everything else. Now, if the logs database fails, the application continues without interruption. Graceful degradation, not shared fate.

The new database has its own WAL checkpoint protection, its own physical backup with the same tiered retention policy as the main database, and a migration that copies existing logs for upgrading users. Backup and restore fully support the two-database architecture.

Run Tool

Users can now invoke any available LLM tool directly from the chat tool palette, without waiting for the AI to decide to use it. A two-phase modal — tool selection, then a dynamically generated parameter form — lets you pick the tool, fill in the inputs, and execute. Results appear as tool messages in the chat, visible to the AI on subsequent turns.

This covers all built-in tools and plugin tools, with forms generated from JSON Schema. The implementation adds a POST /api/v1/chats/[id]?action=run-tool endpoint and a ?includeSchemas=true parameter to the tools API.

Settings in Better Light

Subsystem Background Images

Each settings tab now displays its subsystem’s full-size background image behind the page content, replacing the tiny thumbnails that previously sat next to each tab’s description like passport photos at a job interview. The visual metaphor is a tabbed folder: a frosted header banner, opaque tab backgrounds that merge into a darkened content panel, and backdrop blur for readability.

The effect is scoped to the settings page only and uses the existing --story-background-url CSS variable system. The brand font is now hardcoded to EB Garamond regardless of active theme, because Quilltap’s name should look like Quilltap’s name whether you’re in Art Deco or Old School.

The Plumbing

Database & Embedding Fixes

The BLOB storage migration from 3.0 — which converted embeddings from JSON text to compact Float32 BLOBs — left some trailing damage that 3.1 addresses thoroughly:

The SQLite update path now accepts BLOB columns, so updateOne/updateMany write embeddings as Float32 BLOBs instead of quietly reverting to JSON text. A new migration (fix-text-embeddings-after-update-v1) repairs any TEXT embeddings that slipped through. Buffer hydration handles edge cases where a BLOB arrives in an unregistered column. Vector store search returns empty results on dimension mismatches instead of throwing. And legacy TEXT-stored embeddings pass Zod validation via a z.string().transform() arm that parses them transparently.
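
Two of these behaviours can be sketched in a few lines: reading an embedding whether it was stored as legacy JSON text or as a Float32 BLOB, and returning an empty result on a dimension mismatch. The names below are hypothetical, the blob is assumed to be in platform byte order, and treating any mismatched stored vector as grounds for an empty result is an assumption about the semantics, not a description of Quilltap's vector store.

```typescript
// Accepts either storage form and returns a Float32Array.
function decodeEmbedding(stored: string | Uint8Array): Float32Array {
  if (typeof stored === "string") {
    // Legacy rows: a JSON array of numbers stored as TEXT.
    return Float32Array.from(JSON.parse(stored) as number[]);
  }
  // BLOB rows: copy first so the float view starts at an aligned offset.
  const bytes = new Uint8Array(stored);
  return new Float32Array(bytes.buffer, 0, Math.floor(bytes.byteLength / 4));
}

function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

interface VectorEntry { id: string; vector: Float32Array; }

function searchVectors(
  query: Float32Array,
  entries: VectorEntry[],
  limit = 5,
): Array<{ id: string; score: number }> {
  // Assumed behaviour: a dimension mismatch yields no results, not an exception.
  if (entries.some((entry) => entry.vector.length !== query.length)) return [];
  return entries
    .map((entry) => ({ id: entry.id, score: cosine(query, entry.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```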

Embedding deduplication prevents hundreds of duplicate background jobs during reindex operations. The memory cleanup merge pass no longer makes N embedding API calls — it reads already-stored embeddings from the vector store, making preview with 125+ memories essentially instant.

Physical Backups

Physical database backups now run at most once per day instead of on every startup, which produced excessive backups during development (HMR restarts) and whenever production restarted frequently. The check reads the most recent backup’s timestamp and skips the backup if it is less than 24 hours old.
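
The once-per-day check reduces to a timestamp comparison. A minimal sketch, with a hypothetical function name and the assumption that a missing backup always triggers one:

```typescript
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// lastBackupAt is null when no physical backup exists yet.
function shouldRunPhysicalBackup(lastBackupAt: Date | null, now: Date): boolean {
  if (lastBackupAt === null) return true;
  return now.getTime() - lastBackupAt.getTime() >= ONE_DAY_MS;
}
```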

Agent Mode Inheritance

New chats with a character or project that has defaultAgentModeEnabled: true now correctly show “Agent On” in the tool palette. The full cascade — global, character, project, chat — is resolved at the API level and returned as resolvedAgentModeEnabled.
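
One plausible reading of the cascade is that each more specific layer overrides the one before it, and an unset layer inherits. The sketch below encodes that reading; the precedence order and the layer shape are assumptions, and only the resolvedAgentModeEnabled idea comes from the notes.

```typescript
// Hypothetical layer shape; undefined means "inherit from the previous layer".
interface AgentModeLayers {
  global: boolean;
  character?: boolean;
  project?: boolean;
  chat?: boolean;
}

// Assumed precedence: chat over project over character over global.
function resolveAgentModeEnabled(layers: AgentModeLayers): boolean {
  return layers.chat ?? layers.project ?? layers.character ?? layers.global;
}
```

Under this reading, a new chat with no setting of its own inherits true from a character whose default is enabled, which is the case the fix addresses.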

WSL2 & MCP

WSL2 MCP Localhost Fix

MCP servers running on the Windows host were unreachable from WSL2 because the /proc/net/route gateway IP doesn’t forward to services bound to 127.0.0.1. A WSL2-specific strategy now reads the nameserver from /etc/resolv.conf — which WSL2 auto-generates to point at the Windows host with special localhost forwarding — and detects the WSL2 environment via /proc/sys/fs/binfmt_misc/WSLInterop.
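
Extracting the nameserver from resolv.conf is a small parsing job. The helper below is a hypothetical sketch that works on the file's text, which a caller would read from /etc/resolv.conf; it is not the code Quilltap ships.

```typescript
// Returns the first nameserver address found, or null if there is none.
function parseNameserver(resolvConf: string): string | null {
  for (const rawLine of resolvConf.split(/\r?\n/)) {
    const line = rawLine.trim();
    // resolv.conf comments start with '#' or ';'.
    if (line.startsWith("#") || line.startsWith(";")) continue;
    const match = /^nameserver\s+(\S+)/.exec(line);
    if (match) return match[1];
  }
  return null;
}
```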

Documentation Audit

Sixty files were audited. The results:

All stale /foundry/* UI paths replaced with /settings?tab=* across 28 help files and 6 docs files
The /v1/ prefix added to 80+ API endpoint references across 10 docs
MongoDB, Prisma, and S3 references replaced with SQLite, Zod, and local filesystem in 5 feature and architecture docs
The migrations README table rebuilt from 14 entries to all 46
PROMPT_ARCHITECTURE.md expanded from 9 lines to a full architectural reference
The chat-settings README file listing updated from 7 to 16 files
The .githooks README now documents all 12 pre-commit steps
The plugin manifest capability count corrected from 20 to 22
Theme count and names corrected in README
The help bundle rebuilt

SillyTavern Import Improvements

The SillyTavern chat import has been refined: the button is now labeled “Import SillyTavern Chat” instead of the ambiguous “Import,” and the wizard renders as a proper modal overlay via portal instead of inline at the bottom of the page. Speaker mapping has been unified — all speakers, user and AI alike, can now be mapped to any available character, replacing the previous persona-based system that handled user and AI speakers differently.

Build & Release

The release workflow now triggers a repository_dispatch event to the Quilltap website repository after a GitHub Release is created, passing version, prerelease flag, and Docker image tag. The website can update itself without manual intervention.

The release checklist received its own sweep: debug console.logs removed, raw Tailwind color classes migrated to semantic qt-* theme classes across 6 components, chat_settings and file_permissions tables added to backup/restore with full UUID remapping, the MCP plugin made self-contained by porting host-rewrite utilities to @quilltap/plugin-utils, and 90 new unit tests added for Run Tool, the run-tool action handler, and the LLM logs repository.

Bug Fixes (Selected)

The ones that earn their mention:

AI Character Import validation failure — LLMs don’t reliably respect character limits, so physical description prompts are now truncated to schema maximums (350/500/750/1000 chars); missing createdAt/updatedAt timestamps on system prompts and physical descriptions added to the assembled export
Memory cleanup POST returned 400 — the housekeeping dialog was not including characterId in the POST body despite the API schema requiring it
Context summary stale for 100 messages — now regenerates at the same checkpoint schedule as titles
Moderation provider reporting irrelevant categories — 1% relevance floor filters noise
Job queue UI misleading labels — “Job Payload (sent to LLM)” corrected to “Job Parameters”; human-readable type names for embedding, story background, and danger classification jobs; character names resolved for embedding jobs
React setState-during-render warning — handleChange and onChange calls moved out of setIncludedOptionals state updater in JsonSchemaForm
Broken require in AI import service — replaced with static import to eliminate Turbopack build warning
manage_files tool ID mismatch — renamed to file_management to match the actual tool name used by the LLM

Subsystem Table

For those keeping score:

Name	Function
The Foundry	Architecture, plugins, packages, LLMs, API keys
Prospero	Projects, agents, tools, files, and the new separate LLM logs database
Aurora	Character creation — now with the AI Import Wizard
The Commonplace Book	Memory and retrieval — embedding fixes and faster cleanup
The Salon	Chat interface — now with Run Tool and message search
Calliope	Interface, themes, and the settings page’s new visual treatment
The Concierge	Content routing, moderation API, and the quiet art of opening the right door
The Lantern	Image generation and atmospheric story backgrounds
Pascal	Games, randomness, and the quiet mathematics of chance
Saquel Ytzama	Encryption, key management, and the Pepper Vault

Upgrading from 3.0

The database migrations handle themselves, as they always do. The Concierge replaces Dangermouse automatically — your content safety settings are preserved, but the system that enforces them is now competent. Files will be migrated to the new filesystem-backed layout on first startup. LLM logs will be copied to their own database.

If you were running Node.js via npx quilltap, native module resolution is now more robust. If you were running Docker or a VM, nothing changes except that everything behind the walls has been repaired, reinforced, and — in the case of one particular staff member — shown the door.

The house looks the same from the outside. The rooms are arranged as you left them. But the doorman is better dressed, the filing cabinet knows where things are, and the sculptor in the Dressing Room has acquired tools that would make Pygmalion weep. Come in. The Concierge is expecting you.

Installation

macOS

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows

Download and run the .exe installer
Follow the installation prompts
Launch Quilltap from the Start Menu or desktop shortcut

Linux

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb
Requires Docker Engine — install from https://docs.docker.com/engine/install/

Node.js (any platform)

npm install -g quilltap
quilltap

On first run, the CLI downloads the application files (~150-250 MB) and caches them locally. Subsequent launches start instantly.

Docker

docker pull foundry9/quilltap:${TAG}

Or download the quilltap-linux-<arch>.tar.gz rootfs tarball for use with Lima. See the README for setup instructions.

Quilltap 3.1.1

npx packaging bugfix release

Quilltap 3.1.1 Release Notes

We screwed up the npx packaging. Absolutely nothing changed in the code; this release purely fixes running Quilltap via npx quilltap. Sorry about that.

Quilltap 3.0.0

The Estate has outgrown its browser. It now arrives at your door.

Quilltap v3.0.0 Release Notes

There is a particular kind of morning — the kind with good coffee and a draft that won’t stop arriving — when one looks at a perfectly serviceable web application and says, “What if this lived on the desktop, sandboxed inside a virtual machine, and also ran on Windows?” This is, broadly speaking, what happened. What you are reading is the consequence.

Quilltap 3.0 is, to put it as simply as the thing will allow, a native (-ish) desktop application. Where once a Docker container sat politely at localhost:3000 waiting for someone to visit, Quilltap now installs itself as a proper citizen of your operating system: a macOS .dmg, a Windows installer, or a Linux .deb / AppImage, each containing an Electron shell wrapped around a virtual machine that runs the entire backend in comfortable isolation. Your data stays on your machine. Your AI-generated code runs in a sandbox. The architecture is, if we are being charitable, ambitious; if we are being accurate, the sort of thing that causes build engineers to develop a faint twitch behind one eye.

It was worth it.

What Changed (A Brief Précis for the Impatient)

The entire infrastructure was rewritten. The authentication system was removed. The Docker deployment was simplified from a multi-service orchestration to a single command. A native desktop application was built from scratch for macOS, Windows, and Linux. The settings were reorganized. The images were optimized by 94%. The database was hardened. Two new subsystem characters were introduced. The file storage was stripped to its essentials. And the virtual machine — patient, deterministic, sandboxed — was made the foundation of the whole affair.

If you are upgrading from 2.x, you may feel, briefly, as though you have wandered into a different house. You have. But the furniture is arranged the same way, and the staff — Prospero, Aurora, Calliope, and the rest — are all present, merely better dressed.

The Desktop Application

The Electron Shell

The headline, naturally, is that Quilltap is now a desktop application. On macOS, it runs atop a Lima virtual machine using Apple’s Virtualization.framework. On Windows, it runs inside WSL2. On Linux, it uses Docker Engine directly, because Linux was already the sandbox and always knew it.

The architecture follows a simple pattern: Electron provides the window, the menu bar, and the system tray. A virtual machine (or container) runs the full Next.js backend. The two communicate over localhost. The user sees a native application window. The application sees a Linux server. Everyone is happy, except perhaps the build pipeline, which is not consulted on matters of happiness.

The Splash Screen

First impressions matter, and Quilltap’s splash screen has been designed accordingly — an Art Deco–framed loading sequence with animated progress, live VM output streaming, and color-coded log levels. First-run users see a branded experience with Quilltap’s quill logo and serif typography. Subsequent launches auto-start with the last-used data directory, offering a brief window to interrupt and choose another.

When the application shuts down, a small frameless window appears with a spinner and a polite notice — “Stopping virtual machine…” — while the backend is dismantled behind the scenes. No orphaned processes. No dangling containers. Just a clean exit, like a well-run household.

Data Directory Management

Each Quilltap site now lives in its own named data directory, selectable from the splash screen. Users can maintain multiple sites — one for work, one for fiction, one for experiments — and switch between them without quitting. Directories display human-readable names (editable via a pencil icon), disk usage, and VM size. Deleting a directory offers two options: remove the configuration only, or delete the data along with it.

On macOS, each directory receives its own Lima VM instance (quilltap-<hash>), so switching is a stop-and-start operation rather than a destructive rebuild.

Runtime Modes

The splash screen offers a toggle between two runtime backends:

VM Mode — Lima on macOS, WSL2 on Windows. Full isolation. The Foundryman’s preferred arrangement.
Docker Mode — Available on all platforms. Uses Docker Engine directly, with the same container image that powers the standalone deployment.

Port conflicts are handled automatically: starting one mode stops the other. The active runtime is displayed in the application footer, alongside the data directory path (now clickable — Electron opens your file browser; browser mode copies to clipboard).

Docker, Simplified

The Docker story has been rewritten with a controlled demolition’s precision. Removed: docker-compose.yml (all three of them), the Nginx configuration, the MinIO buckets, the authentication scaffolding, the Let’s Encrypt scripts, and the OAuth infrastructure. In their place: a single docker run command, a pair of platform-aware startup scripts, and transparent host port forwarding via socat.

One Command to Start

./scripts/start-quilltap.sh

That’s it. The script detects your platform, sets the correct data directory, finds Ollama if it’s running, and starts the container. Windows users have start-quilltap.ps1. Both scripts support --data-dir, --port, --redirect-ports, --dry-run, and --no-auto-detect.

Host Service Access

Docker users can now reach host services — Ollama, LM Studio, MCP servers — at localhost URLs from inside the container, without --network host or manual IP discovery. The HOST_REDIRECT_PORTS environment variable accepts a comma-separated list of ports, and socat forwarders handle the rest. The startup scripts auto-detect Ollama on port 11434 and configure this transparently.
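
Parsing a comma-separated port list such as HOST_REDIRECT_PORTS might look like the sketch below. The function name and the choice to silently drop invalid or duplicate entries are assumptions for illustration.

```typescript
// Parses a value like "11434,1234" into a de-duplicated list of valid ports.
function parseRedirectPorts(value: string | undefined): number[] {
  if (!value) return [];
  const ports = new Set<number>();
  for (const part of value.split(",")) {
    const trimmed = part.trim();
    // Ignore anything that is not a plain decimal number.
    if (!/^\d+$/.test(trimmed)) continue;
    const port = Number(trimmed);
    if (port >= 1 && port <= 65535) ports.add(port);
  }
  return Array.from(ports);
}
```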

Localhost URL Rewriting

A new rewriteLocalhostUrl() function transparently rewrites localhost and 127.0.0.1 URLs to the host gateway IP in Docker, Lima, and WSL2 environments. The function auto-detects the gateway via host.docker.internal or the default route, and supports a QUILLTAP_HOST_IP override. This replaces the socat-based approach for LLM provider connections and is applied to all provider creation, embedding profiles, image profiles, and MCP server connections.
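
The core of such a rewrite can be sketched with a single regular expression. This is a simplified stand-in, deliberately named rewriteLocalhostUrlSketch to avoid implying it is the real function: it takes the gateway IP as a parameter instead of detecting it, and it ignores URLs that carry credentials.

```typescript
// Matches http(s)://localhost or http(s)://127.0.0.1 only when the host ends there.
const LOCALHOST_PATTERN = /^(https?:\/\/)(localhost|127\.0\.0\.1)(?=[:/?#]|$)/i;

// hostIp stands in for the detected gateway or a QUILLTAP_HOST_IP override.
function rewriteLocalhostUrlSketch(url: string, hostIp: string | undefined): string {
  if (!hostIp) return url;
  // A replacer function avoids "$1" colliding with an IP that starts with a digit.
  return url.replace(LOCALHOST_PATTERN, (_match: string, scheme: string) => scheme + hostIp);
}
```

Port, path, and query are left untouched, so http://localhost:11434/api becomes the same URL on the gateway address.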

Authentication: Removed

The authentication system — JWT tokens, OAuth flows, Google sign-in, app-specific passwords — has been removed entirely. Quilltap is a local-first application. Your data lives on your machine. There is no server to authenticate against, because you are the server. The Pepper Vault handles encryption key management for sensitive data at rest. No login required.

Settings Reorganization

The ten-page Foundry subsystem navigation has been replaced with a single tabbed settings page using seven plain-English tabs: AI Providers, Chat, Appearance, Memory & Search, Images, Templates & Prompts, and Data & System. The estate personifications remain as thumbnails and flavor text — they are, after all, part of the charm — but they no longer gate navigation.

All old /foundry/* routes redirect to the appropriate tab. The AI Stack Setup Wizard has been relocated to /settings/wizard.

The Setup Wizard

A new guided six-step wizard walks first-time users through provider configuration: provider selection, API key validation, model selection, optional embedding and image setup, and a final test-and-confirm step. It pre-populates existing configuration when accessed from settings. One flow creates connection profiles, embedding profiles, image profiles, and chat settings in a single pass.

New Faces on the Estate

Pascal the Croupier

Downstairs, past the Library and the Salon, lies a narrow room lit by green lamps. Pascal keeps the dice and the records there. He is the subsystem responsible for games, randomness, and state management — the quiet master of probability and consequence. His page lives at /foundry/pascal, with a dice icon and a coming-soon notice that suggests patience.

Saquel Ytzama, Keeper of Secrets

Saquel manages encryption and security — the Pepper Vault, key rotation, and the quiet business of keeping things that should be locked, locked. Her page lives at /foundry/saquel, marked with a key icon and a similar promise of things to come.

Database & Storage

SQLite Hardening

The database has been fortified with the quiet thoroughness of a butler who has survived two world wars:

Integrity checks on startup (PRAGMA quick_check)
Periodic WAL checkpoints every five minutes
Physical backups on startup, with tiered retention: daily for seven days, weekly for four weeks, monthly for twelve months, yearly forever
synchronous = FULL for durable writes
Pre-backup WAL flush for logical backups
Crash-loop protection in Electron: after three consecutive startup failures, safe mode engages — clearing caches, resetting saved state, and restoring defaults
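
The tiered retention policy in the list above can be approximated by assigning each backup to a bucket and keeping the newest backup per bucket. The sketch below is one way to do that under stated assumptions: days are measured in UTC, a month is taken as the calendar month, twelve months is approximated as 365 days, and all names are hypothetical.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Maps a backup to its retention bucket based on its age.
function retentionBucket(backup: Date, now: Date): string {
  const ageDays = Math.floor((now.getTime() - backup.getTime()) / MS_PER_DAY);
  const iso = backup.toISOString(); // e.g. 2025-03-14T09:26:53.000Z
  if (ageDays < 7) return "daily:" + iso.slice(0, 10);
  // Weeks are counted from the Unix epoch, so they begin on Thursdays.
  if (ageDays < 28) return "weekly:" + Math.floor(backup.getTime() / (7 * MS_PER_DAY));
  if (ageDays < 365) return "monthly:" + iso.slice(0, 7);
  return "yearly:" + iso.slice(0, 4);
}

// Keeps the newest backup in each bucket; everything else may be pruned.
function selectBackupsToKeep(backups: Date[], now: Date): Date[] {
  const newestPerBucket = new Map<string, Date>();
  for (const backup of backups) {
    const key = retentionBucket(backup, now);
    const current = newestPerBucket.get(key);
    if (!current || backup.getTime() > current.getTime()) {
      newestPerBucket.set(key, backup);
    }
  }
  return Array.from(newestPerBucket.values()).sort((a, b) => b.getTime() - a.getTime());
}
```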

Vector Embedding Optimization

Embeddings in vector_entries and memories.embedding are now stored as compact Float32 BLOBs — roughly 4–5× smaller than the previous JSON text representation. A new vector_entries table with one row per embedding replaces the monolithic entries JSON column. Incremental saves write only changed entries. Migration converts existing data automatically.
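
To see where the savings come from, the sketch below packs an embedding into a Float32 byte array and compares it with the JSON text form. The helper names are hypothetical, and the blob is assumed to use platform byte order. Each dimension costs exactly 4 bytes as a Float32, whereas a full-precision number in JSON text usually takes far more characters.

```typescript
// Packs numbers into a compact byte array, 4 bytes per dimension.
function encodeEmbeddingBlob(values: number[]): Uint8Array {
  const floats = Float32Array.from(values);
  return new Uint8Array(floats.buffer, floats.byteOffset, floats.byteLength);
}

// Unpacks the byte array; copying first guarantees an aligned float view.
function decodeEmbeddingBlob(blob: Uint8Array): number[] {
  const copy = new Uint8Array(blob);
  return Array.from(new Float32Array(copy.buffer, 0, Math.floor(copy.byteLength / 4)));
}

// Example: compare sizes for a small made-up vector.
const sampleVector = Array.from({ length: 8 }, (_, i) => Math.sin(i + 1) / 3);
const blobBytes = encodeEmbeddingBlob(sampleVector).byteLength;
const jsonChars = JSON.stringify(sampleVector).length;
console.log({ blobBytes, jsonChars });
```

The round trip is lossy only to the extent that Float32 has less precision than a JavaScript number.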

File Storage Simplification

The file storage abstraction has been reduced to a thin local-only FileStorageManager. Removed: the mount_points table, S3 support, the S3 plugin, the mount-points API, the storage settings UI, and approximately thirty consumer files that referenced the old abstraction. File paths are now portable across platforms, with runtime-resolved paths replacing database-stored absolute paths.

Build & Release Pipeline

Automated Releases

The release workflow, triggered on version tags, now builds everything in parallel: rootfs tarballs (amd64 + arm64), Electron installers for macOS (DMG), Windows (NSIS), and Linux (AppImage + .deb), Docker images, and standalone tarballs. A final job creates the GitHub Release with all assets attached.

Noteworthy build improvements:

Azure Trusted Signing for Windows code signing
App Store Connect API key for macOS notarization (replacing Apple ID + app-specific password)
Native architecture runners for rootfs builds — no more QEMU cross-compilation
Image optimization — all PNGs and JPGs converted to WebP, SVGs optimized with SVGO. Total image payload reduced from ~75 MB to ~4.6 MB (94% savings)
node_modules excluded from Electron app.asar — the Electron code uses zero npm dependencies, so the 502 MB node_modules was bundled for nothing. DMG size dropped from 262 MB to ~132 MB
Plugin node_modules stripped from Docker image — saving ~350 MB per architecture

The quilltap npm Package

For those who prefer the command line:

npx quilltap

The npm package is now a lightweight CLI (~10 KB) that downloads pre-built standalone output from GitHub Releases on first run. Downloads are cached per-version in a platform-specific directory, with a progress bar, retry with exponential backoff, and an --update flag for forced re-download.
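
Retry with exponential backoff follows a familiar shape. The sketch below is a generic version, not the CLI's actual code: the base delay, the cap, the attempt count, and both function names are assumptions chosen for the example.

```typescript
// Delay doubles with each attempt and is capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs task until it succeeds or maxAttempts is reached, sleeping between tries.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // No point sleeping after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

The sleep function is injectable so that the retry logic can be exercised in tests without real delays.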

Lima Distribution

Lima binaries are now downloaded directly from GitHub Releases at first launch, with local caching. No Homebrew installation required. A runtime check for Xcode Command Line Tools offers one-click installation if missing. Lima version is pinned in electron/constants.ts.

MCP & Tool Connectivity

MCP connections in Docker, Lima, and WSL2 environments received extensive attention in this release, which is a polite way of saying they were broken in several interesting ways and are now fixed:

Host header validation — MCP servers rejected requests from host.docker.internal. A custom fetch implementation now routes traffic through the host gateway while preserving the original Host: localhost:PORT header.
Connection churn — URL preprocessing produced config hash mismatches between raw and rewritten URLs, so the MCP plugin tore down and recreated connections on every API call. URL rewriting now happens solely in MCPClient.connect().
Gateway resolution — Docker environments now use host.docker.internal first (via Docker Desktop DNS), with /proc/net/route reserved for Lima/WSL2 where NAT networking forwards to host loopback.

The allowToolUse master switch on connection profiles provides a profile-level override for all LLM tools. When disabled, no tools are sent to the model regardless of other settings.

Performance & Stability

Node.js OOM protection — NODE_OPTIONS="--max-old-space-size=2048" set in all runtime environments; Lima VM memory bumped from 2 GiB to 4 GiB
Backup/restore rewritten for streaming — disk-based temp files with shell zip/unzip replace in-memory archiver/adm-zip, fixing OOM crashes on large data directories
Large .qtap import — files over 10 MB no longer choke; proxy body limit raised to 100 MB and the upload pipeline corrected
Electron downloads — all downloads (backups, exports, images, API keys) now work via preload bridge IPC channels that stream to disk without memory pressure
Infinite re-render loop — an unstable useEffect dependency in the Salon page caused an infinite loop that stalled React’s startTransition, making all sidebar navigation silently fail. Fixed with stable dependency proxies
Lima VM startup noise — suppressed transient port-skip messages, SSH proxy errors, and deduplicated repeated guest agent warnings

Refactoring

The codebase was substantially reorganized in this release. A partial inventory:

Salon page decomposition — reduced from 2,813 to 945 lines (66% reduction), extracting 8 hooks and 2 components; unified duplicate SSE streaming functions into a shared parser
Projects route decomposition — reduced from 1,056 to 49 lines, extracting schemas, 8 action modules, and 4 HTTP method handlers
Dead code cleanup — 7 unused files removed, unused exports pruned, duplicate utilities consolidated
Plugin dynamic loader — deduplicated 66 lines of bootstrap code from two registries into a shared module
Build scripts consolidated — platform-specific shell scripts replaced with cross-platform TypeScript throughout
qt-* class migration — 25+ component files migrated from hardcoded Tailwind classes to semantic theme-overridable utilities
API conformance — system routes wrapped in context handlers, NextResponse.json replaced with response helpers

Bug Fixes (Selected)

The full changelog runs to several thousand words, so here are the ones that would have kept you up at night:

Memories invisible after embedding migration — Float32 BLOB values failed Zod validation; fix adds Buffer.transform() to embedding schemas
Chat auto-triggers unwanted second AI response — temporary assistant message missing participantId; turn state didn’t recognize the AI had spoken
Backup restore in new-account mode — spread-order bug in remapBackupData caused entity cross-references to be silently overwritten
WSL2 connection failures — nohup ... & inside wsl.exe --exec sh -c caused the shell to exit immediately and WSL2 to terminate the distro
Docker footer showing “(local)” instead of “(Docker)” — nested property access (data.data?.isDocker vs. data.isDocker)
Sidebar not appearing after setup wizard — session provider needed a full page reload rather than client-side navigation
Google Gemini 3 tool-calling errors — supportsToolCalling() now excludes Gemini 3 models; orchestrator auto-retries without tools on unsupported-tool errors
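
The spread-order bug behind the backup restore fix is easy to reproduce in isolation. The sketch below is illustrative only: the record shape, idMap, and both function names are hypothetical and are not taken from remapBackupData. In an object literal, later properties win, so spreading the original entity after the remapped field quietly puts the old ID back.

```typescript
// Hypothetical, loosely typed backup record (illustration only).
type BackupEntity = Record<string, string>;

// Assumed structure: maps IDs found in the backup to freshly generated IDs.
const idMap = new Map<string, string>([["old-char", "new-char"]]);

// Buggy order: the spread comes last, so the stale characterId overwrites the remapped one.
function remapBuggy(entity: BackupEntity): BackupEntity {
  return { characterId: idMap.get(entity.characterId) ?? entity.characterId, ...entity };
}

// Fixed order: spread first, then apply the remapped cross-reference on top.
function remapFixed(entity: BackupEntity): BackupEntity {
  return { ...entity, characterId: idMap.get(entity.characterId) ?? entity.characterId };
}
```

Both functions return an object with the same keys, which is why this kind of mistake tends to pass unnoticed until a restored entity points at the wrong record.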

Timezone Support

Timestamps injected into system prompts now respect the user’s local timezone instead of defaulting to UTC. A three-layer fallback chain (per-chat configuration, salon-level settings, QUILLTAP_TIMEZONE environment variable), ending at the system default, ensures the right time is always the right time. Electron auto-detects the host OS timezone and passes it through to the VM or container. A searchable IANA timezone selector is available in the Timestamp Configuration card.
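
The fallback order can be sketched as a simple first-match walk. This is a minimal illustration, not Quilltap's actual code: the TimezoneSources shape and the function names are hypothetical, and skipping values the runtime does not recognise as IANA zones is an assumption.

```typescript
// Hypothetical input shape; field names are illustrative, not Quilltap's schema.
interface TimezoneSources {
  chatTimezone?: string;  // per-chat configuration
  salonTimezone?: string; // salon-level setting
  envTimezone?: string;   // value of QUILLTAP_TIMEZONE, if set
}

// True when the runtime accepts the string as a timezone identifier.
function isValidTimezone(tz: string): boolean {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: tz });
    return true;
  } catch {
    return false;
  }
}

// Walk from most specific to least specific, then fall back to the system default.
function resolveTimezone(sources: TimezoneSources): string {
  const candidates = [sources.chatTimezone, sources.salonTimezone, sources.envTimezone];
  for (const tz of candidates) {
    if (tz && isValidTimezone(tz)) return tz; // assumption: invalid values are skipped
  }
  return Intl.DateTimeFormat().resolvedOptions().timeZone;
}
```

A caller would pass the environment value in explicitly, which keeps the function easy to test without touching real environment variables.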

Subsystem Table

For those keeping score at home:

Name	Function
The Foundry	The heart of the operation — architecture, plugins, packages, LLMs, API keys
Prospero	Your majordomo, conducting projects, agents, tools, and files
Aurora	The complex character model — appearance, clothing, aliases, pronouns, identity
The Commonplace Book	Memory and retrieval — a simple RAG that just works
The Salon	The chat interface, for conversations with one or many characters
Calliope	The user experience — interface, themes, and the Art Deco we like so much
Dangermouse	Always in the shadows, managing what the starched-collar providers won’t
The Lantern	Image generation and atmospheric story backgrounds
Pascal	The Croupier — games, randomness, state, and the quiet mathematics of chance
Saquel Ytzama	The Keeper of Secrets — encryption, key management, and the Pepper Vault

If you hate the look and feel and weird names, use the Old School theme. It calls things by their boring old names.

Upgrading from 2.x

The database migrations handle themselves. Your data, characters, memories, and chat history will be preserved. The settings have moved — from ten Foundry pages to seven tabbed sections — but everything is where you’d expect it to be if you think about it for more than a moment.

Authentication is gone. If you were using OAuth or Google sign-in, you no longer need to. If you were self-hosting with JWT tokens and reverse proxies, the reverse proxy examples in DEPLOYMENT.md still apply, but the tokens do not.

The Docker deployment is dramatically simpler. If you had a docker-compose.yml with Nginx, MinIO, and multiple services, you may now replace it with a single docker run command or, better yet, the startup script.

The Estate has been renovated, expanded, and — in one or two places — rebuilt from the foundations up. The rooms are the same. The staff are the same. But the house now travels with you, portable and private, sealed against the weather and the world outside. Welcome home.

Installation

macOS

Download the .dmg file and open it
Drag Quilltap to your Applications folder
Launch Quilltap from Applications

Windows

Download and run the .exe installer
Follow the installation prompts
Launch Quilltap from the Start Menu or desktop shortcut

Linux

Download the .AppImage file, make it executable (chmod +x), and run it
Or install the .deb package: sudo dpkg -i quilltap_*.deb
Requires Docker Engine — install from https://docs.docker.com/engine/install/

Node.js (any platform)

npm install -g quilltap
quilltap

On first run, the CLI downloads the application files (~150-250 MB) and caches them locally. Subsequent launches start instantly.

Docker

docker pull foundry9/quilltap:${TAG}

Or download the quilltap-linux-<arch>.tar.gz rootfs tarball for use with Lima. See the README for setup instructions.

Quilltap 2.11.0

Less scaffolding, more living space. The estate has been renovated.

Quilltap v2.11.0 Release Notes

Highlights

Windows-compatible Docker images — We now automatically provide Windows (and indeed, amd64) Docker images.

Docker Simplified — The entire Docker story has been rewritten. Gone are the compose files, the Nginx configs, the MinIO buckets, and the authentication scaffolding nobody asked for. In their place: a single docker run command, platform-aware startup scripts that auto-detect your local services, and transparent host port forwarding so Ollama and friends just work at localhost. One command to start. No config file required.

Large Import Fix — .qtap backup files over 10MB no longer choke on import. The proxy body limit and upload pipeline have both been corrected.

Major Features

Docker Infrastructure Overhaul

Removed dead Docker infrastructure: docker-compose.yml, docker-compose.prod.yml, docker-compose.test.yml, Dockerfile.allinone, and all supporting scripts (start-allinone.sh, init-letsencrypt.sh, nginx.conf)
Removed build:docker:rebuild, start:docker, stop:docker npm scripts
New HOST_REDIRECT_PORTS support in the Docker image for transparent host port forwarding
- New docker/entrypoint.sh script sets up socat forwarders for a comma-separated port list
- Docker users can now reach host services (Ollama, LM Studio, MCP servers) at localhost URLs without network mode tricks
- socat installed in the production Docker image

Platform-Aware Startup Scripts

New scripts/start-quilltap.sh (macOS/Linux) and scripts/start-quilltap.ps1 (Windows)
Platform detection sets the correct default data directory per OS
Auto-detects Ollama on port 11434 and adds it to HOST_REDIRECT_PORTS
Supports --data-dir, --port, --redirect-ports, --tag, --env, --restart, --dry-run
Checks for existing containers before creating duplicates
--no-auto-detect flag to skip service detection

Authentication Removal

Removed all authentication infrastructure: JWT, OAuth, Google sign-in
Removed JWT_SECRET, AUTH_DISABLED, OAUTH_DISABLED, GOOGLE_CLIENT_* from .env.example and documentation
Removed authentication sections from DEPLOYMENT.md
Simplified README.md Quick Start — no configuration required for local use

Documentation

Rewrote all Docker documentation around docker run and the new startup scripts
README.md Quick Start now recommends startup scripts with docker run as fallback
Updated DEVELOPMENT.md, DEPLOYMENT.md, DATABASE_ABSTRACTION.md, BACKUP-RESTORE.md
Added reverse proxy examples (Nginx, Caddy) to DEPLOYMENT.md
Added Docker user notes to in-app help files (startup wizard, connection profiles, embedding profiles)
Cleaned up stale references in .env.example, package.json, knip.json, lib/paths.ts, and the DataDirectorySection component
Updated Docker build process to ensure Windows and macOS coverage

Bug Fixes

Critical

Large .qtap import failure: Files over 10MB now import correctly
- Added proxyClientMaxBodySize: '100mb' to next.config.js to prevent proxy body truncation
- Frontend import now sends the original file via FormData instead of re-serializing JSON
- Backend import-execute endpoint now supports FormData uploads (matching import-preview)

Data Integrity

User ID migration table names corrected: prompts → prompt_templates, messages → chat_messages to match actual SQLite schema; removed memories from migration list (no userId column)
Story background file storage: Backgrounds now correctly stored in /story-backgrounds/ folder with proper projectId and folderPath metadata; auto-creates folder record on first generation per scope; fixed project list-files API response missing fields needed by FileBrowser UI

UI

Participants sidebar always visible: Removed isMultiChar gate so the sidebar renders even with zero participants; users can now add characters to empty chats; updated empty state message to “Add a character to get started”

Quilltap 2.10.2

Every great estate needs a proper introduction at the door.

Quilltap v2.10.2 Release Notes

Highlights

User Profile Setup — New first-run experience creates your identity on the estate. Choose your archetype, name your character, and let the turn manager know who’s actually in charge around here.

Features

User Profile Setup on First Run

New /setup/profile page greets first-time users with name input and archetype selection (Proprietor, Resident, or Author)
Creates a user-controlled character so the turn manager correctly yields to the user during conversation
Automatically sets the new user character as default partner for all existing LLM-controlled characters — no orphaned residents wandering the halls
PepperVaultGate redirects to profile setup when no user character exists
All setup page exits (Pepper setup, unlock, vault storage) route through profile setup when needed
Fallback greeting now uses the user character’s name when available
Updated startup wizard help documentation

Bug Fixes

Search provider capabilities guard: Connection profile forms no longer crash when a search provider is missing its capabilities object (p.capabilities.chat → p.capabilities?.chat in ProfileForm.tsx and ProfileModal.tsx)

Quilltap 2.10.1

The vault is sealed, the search is plugged in, and the butler didn't do it — the pepper did.

Quilltap v2.10.1 Release Notes

Highlights

Pepper Vault — No more fiddling with environment variables to secure your encryption. A web-based setup wizard auto-generates and stores your encryption pepper on first run, with optional passphrase protection. Existing env var users get a gentle nudge to migrate.

Pluggable Web Search Providers — Web search is now a proper plugin system. Swap search backends without touching code, manage API keys from Settings, and build your own providers with the new SearchProviderPlugin interface. Ships with a bundled Serper.dev plugin out of the box.

Major Features

Pepper Vault — Encryption Key Management

Auto-generates ENCRYPTION_MASTER_PEPPER on first run — no manual environment variable needed
Web-based setup wizard at /setup with optional passphrase protection
Encrypted pepper stored in SQLite pepper_vault table
Three startup modes: auto-resolve (no passphrase), unlock (passphrase required), and setup (first run)
Existing env var users prompted to store pepper in vault via dismissible banner
API routes at /api/v1/system/pepper-vault for status, setup, unlock, and store
PepperVaultGate client component redirects to setup when vault is uninitialized
Pepper state tracked in startupState with isPepperResolved() gate
Authenticated API routes return 503 when pepper is not yet resolved
lib/encryption.ts now uses lazy pepper loading (reads from process.env on demand)
ENCRYPTION_MASTER_PEPPER is now optional in env schema
Comprehensive unit tests for pepper vault lifecycle

Pluggable Web Search Provider System

New SEARCH_PROVIDER plugin type for pluggable web search backends
New SearchProviderPlugin interface in @quilltap/plugin-types 1.14.0
Search provider registry (lib/plugins/search-provider-registry.ts) for managing search provider plugins
Bundled Serper.dev search provider plugin (qtap-plugin-search-serper)
Web search handler rewritten to use search provider plugins with DB-stored API keys
Providers API now returns both LLM and search providers
API key test endpoint supports both LLM and search providers
SERPER_API_KEY env var deprecated in favor of Settings > API Keys (legacy env var still works as fallback)
New docs/SEARCH_PLUGIN_DEVELOPMENT.md guide for building custom search provider plugins

Bug Fixes

Removed verbose debug logging from pepper vault and web search handler

Migration Notes

Encryption pepper: New installations will be guided through the setup wizard automatically. Existing installations using ENCRYPTION_MASTER_PEPPER in their environment will continue to work — a banner will suggest migrating to the vault for convenience.
Serper API key: SERPER_API_KEY as an environment variable still works but is deprecated. Move your key to Settings > API Keys when convenient.

Quilltap 2.10.0

Business in the front, party in the back... literary salon on the veranda.

Quilltap v2.10.0 Release Notes

Subsystems

Subsystems are now almost characters in their own right, working to make your estate a great place to interact with LLMs, for business or pleasure.

Name	Function
The Foundry	The Heart of the Operation - architecture, plugins, packages, LLMs, API keys - the “settings”
Prospero	Your majordomo, conducting your projects, agents, the tools and files they access
Aurora	Our complex character model, and how it interacts with the LLMs that drive it
The Commonplace Book	Your characters’ memories, a simple RAG that just works
The Salon	The chat interface, so you can talk to one or many characters and LLMs
Calliope	The user experience, interface, and our many themes. We like our Art Deco here
Dangermouse	Always in the shadows, he can keep you from running afoul of your starched-collar LLM providers… or dodge them; he’s a black market of possibilities
The Lantern	The image generation support, now featuring atmospheric story backgrounds generated on the fly or on demand while you chat

If you hate the look and feel and weird names, use the Old School theme. It calls things by their boring old names.

Highlights

Agent Mode — LLMs can now use tools iteratively, verify results, and self-correct before delivering a final response. A new submit_final_response tool signals completion, with configurable max turns (1–25) and a settings cascade from global down to per-chat.

Dangerous Content Handling (Dangermouse) — A three-mode gatekeeper system classifies messages for sensitive content using the Cheap LLM, with optional auto-routing to uncensored providers. Includes chat-level danger classification, quick-hide integration, startup scanning, and image generation rerouting.

Story Backgrounds (The Lantern) — AI-generated atmospheric background images for chats, derived from scene context analysis. Includes context-aware character appearance resolution, clothing-appropriate outfit selection, and support on project pages with multiple display modes.

Character Appearance Overhaul — Clothing records are now separated from physical descriptions with full CRUD support. Physical descriptions gain a usageContext field. A new cheap LLM task resolves what each character currently looks like based on narrative context, stored records, and scene appropriateness.

Character Identity — Characters now support aliases (alternate names like “Liz” for “Elizabeth”) and pronouns (he/him, she/her, they/them, custom). Both are injected into system prompts, used in multi-character context, and respected by image generation.

Memory Gate — Pre-write similarity checks replace binary duplicate detection. Memories are reinforced, linked, or inserted based on semantic similarity, with logarithmic importance scaling and bidirectional relationship tracking.

Proactive Memory Recall — Characters now analyze recent conversation to extract search keywords and recall relevant memories before responding, running in parallel with compression checks.

New Default Theme: Professional Neutral — A cooler blue-gray palette with system fonts, restrained shadows, and compact UI replaces the original warm default. Two new theme plugins (The Great Estate, Old School) and a major Art Deco overhaul round out the theme work.

Massive Codebase Refactoring — Type-safe query filters, safeQuery() eliminating ~315 redundant catch blocks, ChatsRepository split into five modules, ~65 components migrated to qt-* classes, and a comprehensive security audit.

Major Features

Agent Mode — Iterative Tool Use with Self-Correction

LLMs can now use tools iteratively, verify results, and self-correct before delivering a final response
New submit_final_response tool signals completion of agent work
Configurable max turns (1–25, default 10) with force-final safety limit
Settings cascade: Global > Character > Project > Chat (each level can override)
Global settings in Foundry > Chat: default enabled toggle and max turns
Per-chat toggle in tool palette (Agent button)
Agent mode status persists in database and syncs across UI on page load
New SSE events: agent_iteration, agent_completed, agent_force_final
New help documentation at /help/agent-mode
Native tool execution rules injected into system prompt for models with function calling, reinforced in character voice after personality/scenario sections
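
The settings cascade can be read as a last-writer-wins merge from Global down to Chat. A minimal sketch, assuming each level stores only the values it overrides: the AgentSettings shape and resolveAgentSettings name are hypothetical, and starting from "disabled" is an assumption (these notes only state the max-turns default of 10 and the 1 to 25 range).

```typescript
// Hypothetical per-level settings; undefined means "inherit from the level above".
interface AgentSettings {
  enabled?: boolean;
  maxTurns?: number;
}

// Merge Global > Character > Project > Chat, letting each later level override.
function resolveAgentSettings(
  global: AgentSettings,
  character: AgentSettings = {},
  project: AgentSettings = {},
  chat: AgentSettings = {},
): { enabled: boolean; maxTurns: number } {
  let enabled = false; // assumption: agent mode is off unless some level enables it
  let maxTurns = 10;   // documented default
  for (const level of [global, character, project, chat]) {
    if (level.enabled !== undefined) enabled = level.enabled;
    if (level.maxTurns !== undefined) maxTurns = level.maxTurns;
  }
  // Clamp to the documented 1 to 25 range.
  maxTurns = Math.min(25, Math.max(1, Math.round(maxTurns)));
  return { enabled, maxTurns };
}
```

For example, a project that sets maxTurns to 5 and a chat that only flips the toggle would resolve to the chat's toggle and the project's turn limit.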

Dangerous Content Handling (Dangermouse)

Gatekeeper service classifies user messages for sensitive content using the Cheap LLM
Three modes: Off (default), Detect Only (flag content), Auto-Route (reroute to uncensored providers)
Provider routing service resolves uncensored-compatible profiles for flagged content
Connection profiles and image profiles gain “Uncensored-Compatible” checkbox
Message display with DangerFlagBadge (category badges, rerouted indicator, override button)
DangerContentWrapper with Show/Blur/Collapse display modes
Content hash caching for classification deduplication (200 entries, 5min TTL)
Fail-safe design: classification errors never block messages
Image generation integration: classifies user image prompts and expanded prompts, reroutes to uncensored image providers
Uncensored fallback for empty LLM responses: when an LLM silently refuses content, retries with uncensored provider in AUTO_ROUTE mode — covers memory extraction (user, character, inter-character), context compression, chat streaming, story background prompt crafting, and appearance resolution
New DANGER_CLASSIFICATION system event type for LLM log tracking
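
The classification cache (200 entries, 5 minute TTL) could look roughly like the sketch below. Everything here is an assumption beyond those two numbers: the class name, the oldest-first eviction policy, and the FNV-1a hash, which stands in for whatever content hash the gatekeeper really uses (a production cache would more likely use a cryptographic hash to avoid collisions).

```typescript
// FNV-1a 32-bit hash; a stand-in for the real content hash (assumption).
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Hypothetical cache keyed by content hash, bounded by entry count and age.
class ClassificationCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxEntries = 200, ttlMs = 5 * 60 * 1000, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(content: string): T | undefined {
    const key = fnv1a(content);
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(content: string, value: T): void {
    const key = fnv1a(content);
    this.entries.delete(key); // re-inserting moves the key to the newest position
    if (this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next().value; // Map preserves insertion order
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

The injectable clock is there only to make expiry testable; callers would normally rely on the Date.now default.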

Chat-Level Danger Classification

Background job classifies entire chats as dangerous using compressed context summary
Sticky behavior: once classified as dangerous, stays dangerous (never re-checks); safe chats skip re-classification unless new messages are added
Startup scan runs on server boot and every 10 minutes to classify legacy/unclassified chats
Context summary → danger classification chaining: completing a summary automatically triggers classification
Raw message fallback for chats without context summaries (truncated to 4000 chars)
Decision tree: chats with summary → classify directly; long chats (>50 messages) → generate summary first; short chats → classify from raw messages
Quick-hide sidebar integration with toggle across homepage, character conversations, project chats, and sidebar
Danger indicator (destructive-colored asterisk) on all chat listings
POST /api/v1/chats/[id]?action=reclassify-danger endpoint to reset and re-queue
Danger classification scoring now uses the maximum of overall score and highest per-category score, and respects the LLM’s explicit isDangerous: true response
New database fields: isDangerousChat, dangerScore, dangerCategories, dangerClassifiedAt, dangerClassifiedAtMessageCount
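
The decision tree and the scoring rule above can be sketched as two small functions. This is a hedged illustration: the type names, the input shapes, and the threshold parameter are hypothetical (these notes do not state the cutoff), while the 50-message boundary, the 4000-character truncation, and the max-of-scores rule come from the list above.

```typescript
// Hypothetical plan type describing what the background job should do next.
type ClassificationPlan =
  | { kind: "classify-summary"; text: string }
  | { kind: "summarize-first" }
  | { kind: "classify-raw"; text: string };

// Summary available: classify it. Long chat: summarize first. Short chat: use raw messages.
function planClassification(chat: { contextSummary?: string; messages: string[] }): ClassificationPlan {
  if (chat.contextSummary) return { kind: "classify-summary", text: chat.contextSummary };
  if (chat.messages.length > 50) return { kind: "summarize-first" };
  return { kind: "classify-raw", text: chat.messages.join("\n").slice(0, 4000) };
}

// Hypothetical shape of the classifier's reply.
interface LlmVerdict {
  overallScore: number;
  categoryScores: Record<string, number>;
  isDangerous?: boolean;
}

// Score is the maximum of the overall score and the highest category score;
// an explicit isDangerous: true from the LLM is always respected.
function scoreVerdict(verdict: LlmVerdict, threshold: number): { score: number; isDangerous: boolean } {
  const score = Math.max(verdict.overallScore, ...Object.values(verdict.categoryScores));
  return { score, isDangerous: verdict.isDangerous === true || score >= threshold };
}
```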

Story Backgrounds (The Lantern)

AI-generated landscape scene images featuring characters, triggered after chat title updates
Background images use 45% opacity behind chat content for atmosphere
Scene context derivation: new deriveSceneContext cheap LLM task analyzes recent messages to generate imaginative scene descriptions; if discussing a book or story, characters may be depicted as observers to that world
Context-aware character appearance resolution: new cheap LLM task determines what each character currently looks like based on narrative context, clothing records, physical descriptions, and usageContext matching
Clothing priority: narrative context (highest) > image prompt > stored records by usageContext > default
Story backgrounds respect chat-specific image profile settings (chat profile > story bg default > user default)
Support on project detail pages with configurable backgroundDisplayMode
Story background thumbnails in chat enrichment and on ChatCard components
Chat header shows clickable thumbnail (opens full-screen modal)
Manual regeneration via “Regenerate Background” button in chat tool palette
Background images pin to top of viewport instead of centering (prevents face cropping)
Theme background images yield to story backgrounds via CSS :has() selector
Duplicate job prevention and failed-job recovery
GET /api/v1/chats/[id]?action=get-background, GET /api/v1/projects/[id]?action=get-background
Image profile moved from per-participant to per-chat level (migration auto-populates from first participant)

Character Clothing Records

New clothingRecords embedded JSON array on characters (name, usageContext, markdown description)
Full CRUD API at /api/v1/characters/[id]/clothing and /api/v1/characters/[id]/clothing/[recordId]
Expandable card UI, modal editor with markdown preview, list with empty state
“Physical Descriptions” tab renamed to “Appearance” — now shows both physical descriptions and clothing records
Clothing records injected into system prompts as ## Clothing / Outfits block after physical appearance
Clothing data included in image generation prompt expansion context
Story background generation includes primary outfit in character descriptions
Backup/restore handles UUID remapping

Physical Description usageContext Field

New optional free-text field (up to 200 chars) describes when each appearance is most appropriate
Physical descriptions now injected into chat system prompts (previously only used for image generation)
Usage context passed through to image generation prompt expansion for scene-appropriate selection
Editor form with character counter and helper text; card display shows usage context inline

Character Aliases

Characters can have alternate names (e.g., “Liz” for “Elizabeth”)
Aliases included in the character’s own system prompt
Other participants’ aliases included in multi-character chat context
Image prompt placeholders (e.g., {{Liz}}) resolve aliases to correct character
Alias-based name prefixes stripped from LLM responses
Chip-style editor in character edit form; displayed inline on character view page

Character Pronouns

Dropdown selector with common presets (He/Him/His, She/Her/Her, They/Them/Their, It/It/Its) plus custom option
Pronouns included in character’s own system prompt
Other participants’ pronouns shown in multi-character chat context
User-controlled characters’ pronouns included in “You are talking to…” line
Displayed inline on character view page next to name and aliases

Character Identity Reinforcement

Short ## Identity Reminder block appended as the very last content before conversation messages
Reminds the LLM which character it is and who it must not write for
Multi-character variant explicitly names all other participants
Placed after memories and summaries for maximum compliance near the generation boundary

Proactive Memory Recall

Characters analyze recent conversation to recall relevant memories before responding
New cheap LLM task extracts search keywords from messages since the character last spoke
Keywords search the character’s memory store for contextually relevant memories
Pre-searched memories passed to context builder, skipping default single-message search
Works naturally in multi-character chats: each character recalls based on their own conversation gap
Runs in parallel with compression cache check to minimize latency
Status indicators: “Analyzing recent conversation…” and “Searching {name}’s memories…”
Graceful fallback when cheap LLM unavailable or keyword extraction fails

Memory Gate — Pre-Write Similarity Check

Three-tier decision at write time: REINFORCE near-duplicates (≥ 0.80 similarity), LINK related-but-distinct (0.70–0.80), or INSERT genuinely new
Reinforced memories track observation count (reinforcementCount), last reinforcement time, and boosted importance (reinforcedImportance = min(1.0, importance + log2(count+1) * 0.05))
Related memories bidirectionally linked via relatedMemoryIds for thematic graph discovery
Novel detail extraction appends new facts as [+] footnotes when reinforcing
Housekeeping uses reinforcedImportance for protection/scoring; memories reinforced 5+ times always protected
Hard-cap scoring rebalanced: importance 0.4, recency 0.2, access 0.2, reinforcement 0.2
API supports skipGate for force-insert and relatedMemoryIds for manual link management
Falls back to keyword-based gate when embeddings unavailable
Database migration adds 4 columns with automatic backfill
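
The gate thresholds, the reinforcement formula, and the rebalanced hard-cap weights translate directly into code. A minimal sketch with hypothetical function names: treating 0.70 as inclusive for LINK is an assumption, as is the idea that each hard-cap component is already normalised to the 0 to 1 range.

```typescript
type GateDecision = "REINFORCE" | "LINK" | "INSERT";

// Three-tier decision from the best similarity found against existing memories.
function gateDecision(similarity: number): GateDecision {
  if (similarity >= 0.8) return "REINFORCE"; // near-duplicate
  if (similarity >= 0.7) return "LINK";      // related but distinct (lower bound assumed inclusive)
  return "INSERT";                           // genuinely new
}

// reinforcedImportance = min(1.0, importance + log2(count + 1) * 0.05)
function reinforcedImportance(importance: number, count: number): number {
  return Math.min(1.0, importance + Math.log2(count + 1) * 0.05);
}

// Hard-cap score with the rebalanced weights; inputs assumed normalised to [0, 1].
function hardCapScore(m: { importance: number; recency: number; access: number; reinforcement: number }): number {
  return 0.4 * m.importance + 0.2 * m.recency + 0.2 * m.access + 0.2 * m.reinforcement;
}
```

Because the boost is logarithmic, a memory with importance 0.5 rises to about 0.55 after one reinforcement and about 0.6 after three, so repeated observations help without quickly saturating at 1.0.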

Memory Deduplication Tool

New tool card on /foundry page for finding and merging duplicate memories across all characters
Cosine similarity with configurable threshold (0.70–0.95, default 0.80) to cluster duplicates
Union-Find clustering identifies transitive duplicate groups
Best survivor selected by importance, content length, and specificity scoring
Novel details from discarded memories preserved as [+] footnotes
Groups memories by embedding dimension to handle mixed-dimension vectors
Preview mode shows per-character analysis before any changes
Cleans up vector store entries for removed memories
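
Cosine similarity plus Union-Find is what lets the tool catch transitive duplicates: if A matches B and B matches C, all three land in one group even when A and C fall below the threshold. The sketch below illustrates the technique under assumptions; the function names are hypothetical, memories are reduced to bare embedding vectors, and vectors of different dimensions are simply never compared.

```typescript
// Cosine similarity of two equal-length vectors; returns 0 for zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns groups of indices (size 2 or more) whose vectors are transitively similar.
function clusterDuplicates(vectors: number[][], threshold = 0.8): number[][] {
  const parent = vectors.map((_, i) => i);

  // Find with path halving.
  const find = (i: number): number => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]];
      i = parent[i];
    }
    return i;
  };

  for (let i = 0; i < vectors.length; i++) {
    for (let j = i + 1; j < vectors.length; j++) {
      if (vectors[i].length !== vectors[j].length) continue; // mixed dimensions: never compared
      if (cosineSimilarity(vectors[i], vectors[j]) >= threshold) {
        parent[find(i)] = find(j); // union the two sets
      }
    }
  }

  const groups = new Map<number, number[]>();
  vectors.forEach((_, i) => {
    const root = find(i);
    const members = groups.get(root) ?? [];
    members.push(i);
    groups.set(root, members);
  });
  return Array.from(groups.values()).filter((group) => group.length > 1);
}
```

The pairwise loop is quadratic in the number of memories, which is acceptable for a sketch; how the real tool bounds that cost is not described in these notes.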

UI Improvements

Foundry Hub Restructure

New /foundry landing page with 8 subsystem navigation cards (Aurora, The Forge, The Salon, The Commonplace Book, Prospero, Dangermouse, Calliope, The Lantern)
8 new sub-routes (/foundry/aurora, /foundry/forge, /foundry/salon, etc.)
New CollapsibleCard component with qt-collapsible-card-* CSS classes
Sidebar permanently collapsed: removed expand/collapse toggle, resize handle, width persistence
Sidebar footer: merged Settings + Tools into single “Foundry” link
/settings redirects to /foundry for backward compatibility

UI Route Renaming

/characters → /aurora (character model system)
/chats → /salon (chat interface)
/projects → /prospero (agentic and tool-using systems)
/tools → /foundry (architecture, plugins, and services)
Old routes redirect to new ones; API routes (/api/v1/*) unchanged

Participants sorted by predicted turn order instead of static display order
Numbered position badges: green pulsing (generating), green (next), blue (queued), neutral (eligible), amber (user turn), dimmed (spoken)
Inactive participants shown at bottom with dimmed appearance instead of hidden
Stop/interrupt button on generating character’s card replaces composer stop button in multi-character chats
Active/inactive toggle as visible eye icon button on each card
Connection profile dropdown on each card for instant model switching
“User (you type)” option allows switching characters to user control
Expandable per-card settings: system prompt override textarea and active toggle

Chat Composer Tool Palette Revamp

Hamburger menu reorganized into four labeled sections: Chat, Organize, Edit Content, Memory
Composer gutter tools: Attach, Generate Image, and RNG as icon buttons
RNG icon changed to recognizable dice face with pips
Dice roll options now have up/down spinner buttons to adjust count; counts persist within session
Preview toggle moved to formatting toolbar
Full-width mode properly expands composer textarea

Tool Message UI Improvements

Tool messages embedded inside message bubbles (assistant or user) instead of standalone
User-initiated tools embed in user messages; character-initiated tools in assistant messages
Collapsed state shows truncated preview text for request/response
Text content wraps properly instead of horizontal scrolling
Copy buttons for tool request and response sections; image copy to clipboard
Consistent vertical spacing between all messages

Chat Settings Simplification

Chat settings modal now only contains roleplay template and image generation settings
Per-participant settings (connection profile, system prompt override, active toggle) moved to sidebar cards
Image profile moved from per-participant to per-chat level

Unified ChatCard Component

Reusable ChatCard component across /salon, /prospero/[id], and /aurora/[id]/view
Configurable via props: showAvatars, showProject, showPreview, useRelativeDates, actionType
Chat cards display story background thumbnails when available
Highlight animation for newly imported chats

Queue Status Badges

Compact badge group in page toolbar shows active job counts for memory, summarization, danger classification, and story background queues
Color-coded: blue (memory), green (summary), red (danger), dark gray (story background)
Fully themeable via qt-queue-badge-* CSS variables
Event-driven polling: starts on route change or job enqueue, stops when all counts reach zero

Tag Management

New “Tag Management” section in Settings > Tags tab lists all tags with usage counts
Delete button with confirmation popover showing affected entity count
Deletion cascades across all 6 entity types (characters, chats, connection profiles, image profiles, embedding profiles, files)
Tag usage counts now include all entity types with totalUsage computed field

Theme System

Professional Neutral Default Theme

Color palette shifted from warm slate-blue (hue 220) to cool blue-gray (hue 225)
System font stack throughout — Inter and EB Garamond dropped in favor of OS defaults
Lower saturation, tighter fixed border radii, restrained shadows, compact UI
qt-* variables scoped to [data-theme="default"] selector

The Great Estate Theme

Manor house library aesthetic — mahogany (hue 20) and gold (hue 43) palette
Full-page background image with carbon-fibre texture overlay
Playfair Display serif headings with Inter sans-serif body text
Gold left border on assistant messages, brown right border on user messages
Full light/dark mode support

Art Deco Theme Overhaul

Darker light mode palette, background images for both modes
Left sidebar fixed from dark navy to warm ivory in light mode
Assistant message font weight set to 500 via new --qt-chat-assistant-font-weight variable

Old School Theme

Preserves the original warm slate-blue palette with Inter and EB Garamond fonts
Plain-English subsystem name overrides (Settings, Prompts, Data, Chat Behavior, etc.)
Foundry cards redesigned with CSS grid layout; background images suppressed

Theme-Overridable Subsystem Names

SubsystemOverrides interface and optional subsystems field in ThemePlugin (@quilltap/plugin-types 1.13.0)
Override display names, descriptions, thumbnails, and background images for any Foundry subsystem
API returns overrides; new useSubsystemInfo() hooks in theme provider

Theme Package Sync

@quilltap/theme-storybook (1.0.18 → 1.0.20): ~120 missing CSS variables, ~15 missing class definitions, phantom class name fixes, removed phantom story sections
create-quilltap-theme (1.0.5 → 1.0.6): template and docs fixes matching class renames
Rains (→ 1.3.5) and Earl Grey (→ 1.3.3): ~120 missing qt-* variables backfilled for self-containment
Ocean theme removed from distribution

Refactoring & Technical Debt

Type-Safe Query Filters

New TypedQueryFilter<T> mapped type constrains filter fields to keyof T at compile time
Removed 138 unnecessary as QueryFilter casts across 26 repository files
6 compile-time type assertion tests; zero runtime changes
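
A mapped type that restricts filter keys to keyof T might look like the sketch below. This is not the repository's actual definition: the $in operator, the MemoryRow interface, and the matches helper are hypothetical, added only to show how a typo in a field name becomes a compile-time error rather than an empty result set.

```typescript
// Sketch of a keyof-constrained filter; each field accepts a value or a hypothetical $in list.
type TypedQueryFilter<T> = {
  [K in keyof T]?: T[K] | { $in: T[K][] };
};

// Hypothetical row type for illustration.
interface MemoryRow {
  id: string;
  characterId: string;
  importance: number;
}

// Minimal in-memory matcher so the filter can be exercised at runtime.
function matches<T extends object>(row: T, filter: TypedQueryFilter<T>): boolean {
  return (Object.keys(filter) as (keyof T)[]).every((key) => {
    const condition: unknown = filter[key];
    const value: unknown = row[key];
    if (condition === undefined) return true;
    if (typeof condition === "object" && condition !== null && "$in" in condition) {
      return (condition as { $in: unknown[] }).$in.includes(value);
    }
    return condition === value;
  });
}

// Accepted: every key exists on MemoryRow and every value has the right type.
const byCharacter: TypedQueryFilter<MemoryRow> = { characterId: "c1" };

// Rejected by the compiler, because "charId" is not a key of MemoryRow:
// const typo: TypedQueryFilter<MemoryRow> = { charId: "c1" };
```

With the filter typed this way, callers no longer need a cast to a looser filter type, which matches the cast removal described above.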

safeQuery() Helper

Standalone function and AbstractBaseRepository protected method
Converted ~315 catch blocks across 29 files
Three failure modes preserved: rethrow (writes), fallback (reads), silent (non-critical)
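
The three failure modes suggest a helper shaped roughly like the one below. This is a sketch under assumptions, not the real signature: the FailureMode type, the parameter order, the optional log callback, and the async style are all hypothetical.

```typescript
// Hypothetical description of what to do when the wrapped operation throws.
type FailureMode<T> =
  | { mode: "rethrow" }            // writes: the caller must know it failed
  | { mode: "fallback"; value: T } // reads: return a safe default
  | { mode: "silent" };            // non-critical work: swallow the error

async function safeQuery<T>(
  operation: () => Promise<T>,
  failure: FailureMode<T>,
  log: (message: string, error: unknown) => void = () => {},
): Promise<T | undefined> {
  try {
    return await operation();
  } catch (error) {
    if (failure.mode === "rethrow") {
      log("operation failed, rethrowing", error);
      throw error;
    }
    if (failure.mode === "fallback") {
      log("operation failed, using fallback", error);
      return failure.value;
    }
    return undefined; // silent mode
  }
}
```

A read might then be written as safeQuery(() => loadRows(), { mode: "fallback", value: [] }), where loadRows is a placeholder for any repository call, replacing a hand-written try/catch at each call site.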

ChatsRepository SRP Split

Facade + 5 focused modules: Participants, Impersonation, TokenTracking, Messages, SearchReplace
Shared ChatOpsContext interface — zero changes to callers
1,115 → 422 lines in facade

Theme Utility Class Migration

~65 component files migrated from raw Tailwind to qt-* theme utility classes
Additional settings components cleaned up (modals, alerts, toggles, checkboxes, provider buttons)

Other Refactoring

UsersRepository.migrateUserId replaced direct SQLite access with database abstraction layer
Cheap LLM selection lifted out of compression guard (unconditional resolution)
sendToProvider() extracted in cheap-llm-tasks.ts, eliminating triple code duplication
pseudoToolInstructions renamed to toolInstructions throughout the pipeline
Replaced 19 empty if/else blocks with debug logging across 10 repository files
Inlined CHEAPEST_MODEL_MAP reference; removed deprecated tool-registry wrappers

Security

ReDoS fix: Spin-bottle regex bounded .* to .{0,50}
Query injection prevention: 1000-character max query length in memory and chat search methods
Command injection: exec() → execFile() in data-dir route
XSS prevention: Removed allowDangerousHtml from markdown renderer; raw HTML in messages now escaped
Input validation: Range checking for parseInt/parseFloat params in memories housekeeping endpoint
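
The query-length cap and the numeric range checks above amount to small guards like the ones sketched here. The helper names and the fallback behavior are assumptions for illustration; only the 1000-character limit comes from the notes.

```typescript
const MAX_QUERY_LENGTH = 1000; // limit stated in the release notes

// Rejects oversized search queries before they reach any search code.
function assertQueryLength(query: string): string {
  if (query.length > MAX_QUERY_LENGTH) {
    throw new RangeError(`query exceeds ${MAX_QUERY_LENGTH} characters`);
  }
  return query;
}

// Parses an integer parameter and falls back when it is missing,
// not a number, or outside [min, max]. Note that parseInt accepts
// leading digits, so "12abc" parses as 12.
function parseBoundedInt(
  raw: string | null,
  min: number,
  max: number,
  fallback: number,
): number {
  if (raw === null) return fallback;
  const value = Number.parseInt(raw, 10);
  if (!Number.isFinite(value) || value < min || value > max) return fallback;
  return value;
}
```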

Performance & Backend

Visible Message Filtering

Chat messageCount now only counts visible bubbles (USER/ASSISTANT); system events, SYSTEM/TOOL messages, and context summaries no longer inflate counts
extractVisibleConversation() utility filters to USER/ASSISTANT and strips tool artifacts for all cheap LLM content-judging tasks (titles, summaries, backgrounds, compression, keyword extraction)
Title generation now uses up to 100 messages weighted toward end of conversation instead of just the first 6

Server-Side Markdown Pre-Rendering

Simple messages pre-rendered to HTML on the server with roleplay pattern support
Pre-rendered HTML returned in API response and rendered directly without client-side processing
Complex messages with embedded tools or attachments fall back to client-side rendering
Comprehensive CSS rules for all markdown elements in .qt-chat-message-content

Background Job Fixes

Per-job execution timeout (3 minutes) via Promise.race prevents indefinite hangs on LLM calls
Periodic stuck job recovery every 5 minutes (previously only on startup)
Failed jobs no longer block new job creation for the same chat
$expr operator support for field-to-field comparisons in SQLite query translator
Fixed findOneAndUpdate to correctly return updated document
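
The per-job timeout via Promise.race can be sketched as below. The withTimeout name and its parameters are hypothetical; the 3-minute figure is the one given above.

```typescript
const JOB_TIMEOUT_MS = 3 * 60 * 1000; // 3 minutes, per the release notes

// Rejects if `work` has not settled within `ms` milliseconds.
// Promise.race only stops waiting; it does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
  });
  // Clear the timer either way so a finished job does not keep the process alive.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

A job runner might wrap each handler as withTimeout(handler(job), JOB_TIMEOUT_MS, job.type), where handler and job stand in for whatever the queue provides.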

Bug Fixes

Critical

Story backgrounds not generated on inline title updates (only the unused background-job path made the queue call)
Agent mode toggle showing “off” after navigating back to chat (agentModeEnabled missing from GET response)
Race condition in plugin initialization causing “Provider not found” errors in Docker (initialized flag set too early)
Danger classification re-queuing all chats on every server restart (off-by-one between messageCount and dangerClassifiedAtMessageCount)
Story backgrounds settings race condition causing data loss on rapid changes
BaseModal z-index stacking issue on project pages (now uses React portal)
ImageModal z-index stacking (also moved to React portal)
Plugin loading fails in Turbopack production builds (replaced __non_webpack_require__ with createRequire)

Data Integrity

Chat timestamps now always reflect last actual message (system events no longer update lastMessageAt/updatedAt)
Project chats sorted by lastMessageAt instead of updatedAt
Danger classification scoring now uses maximum of overall and per-category scores
Search and Replace now searches all memory fields (content, summary, keywords)

UI

Server-rendered code blocks overlapping and corrupted (roleplay patterns applied inside code blocks)
Markdown elements missing spacing in server-rendered messages (Tailwind Typography plugin not installed in v4)
SelectLLMProfileDialog using wrong API endpoint
Chat image profile setting not loading in Chat Settings modal
Thumbnail cache misses logging as errors (now uses fileExists check first)
Template Import modal not showing templates (response handling fix)
Dangerous content settings not persisting (missing field in PUT handler)
Image description fallback crash (wrong repository reference)

Housekeeping

Removed ~160 logger.debug() and console.debug() call sites across 53 files
Deleted Ocean theme plugin
Comprehensive dead code audit and removal (unused hooks, components, MongoDB stubs, migration stubs, webpack suppressions)
Updated DEAD-CODE-REPORT.md, migrations/README.md, appearance settings README
EmbeddingProfilesRepository.unsetAllDefaults return type standardized to Promise<number>
13 unit tests for markdown renderer canPreRenderMessage; 22 for RNG pattern detector; 17 for base repository helpers; 35 for vector store against SQLite backend

Known Technical Debt

Identified in audit, deferred to future releases:

Inconsistent error handling: Some repositories throw, some return null, some return empty arrays — needs a unified convention
Duplicated search/replace logic: MemoriesRepository and ChatsRepository could share a SearchableRepository mixin

Previously identified items resolved in this release:

ChatsRepository SRP split
Redundant try-catch wrappers (safeQuery() helper)
UsersRepository.migrateUserId bypasses database abstraction
~45 remaining component files with raw Tailwind violations
QueryFilter loosely typed across all repositories

Quilltap 2.9.0

Backups, RNG, and faster chat responses

Quilltap v2.9.0 Release Notes

Highlights

400%+ Faster Chat Responses — Compression caching overhaul eliminates the wait for context compression. When pre-compression isn’t ready, Quilltap falls back to the previous cache with a dynamically expanded context window—trading a few extra tokens for dramatically faster response times.

Chat State — Persistent JSON storage attached to chats and projects, enabling game mechanics, inventories, character stats, and any structured data that should survive across sessions. Path syntax supports dot notation and array indexing. Underscore-prefixed keys are protected from AI modification.

RNG Tool & Auto-Detection — Built-in random number generation for dice rolls, coin flips, and “spin the bottle” participant selection. Dice notation in messages (e.g., “2d6”, “d20”) is detected and executed automatically—when a character says “I roll 2d6,” the dice actually roll.

AI Wizard Enhancements — Character generation wizard now streams real-time progress, generates names, and accepts document uploads (text, Markdown, PDF) as source material for character creation.

Built-in Help Search — LLMs can now search Quilltap’s documentation during conversations via the search_help tool, helping users understand features without leaving their chat.

Complete Backup System — Backup archives now include plugin configurations and npm-installed plugins, enabling full system restoration from a single ZIP file.

Major Features

400%+ Faster Chat Responses

Pre-compression now triggers immediately after assistant message save, not after memory extraction and context summary checks (previously 68+ second delay)
When async pre-compression isn’t ready, falls back to previous cache instead of waiting
Dynamic window calculation ensures no messages are lost when using older cache
Trade-off: slightly more tokens (larger context window) for significantly faster response time
Compression cache now persists to database, surviving server restarts
Cache lookup order: in-memory (fastest) → database (survives restarts) → sync compression (fallback)
Relaxed cache validation allows up to 50 new messages before requiring fresh compression
System prompt hash validation ensures cache validity
Multi-turn conversations with tool calls (RNG, state, MCP) now properly benefit from caching
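
The lookup order and relaxed validation described above might look roughly like this. Everything here is a simplified sketch: the entry shape, the function names, and the use of plain Maps for both tiers are assumptions, and a real database tier would be asynchronous.

```typescript
interface CompressionCacheEntry {
  summary: string;
  messageCount: number; // messages covered when the entry was built
  promptHash: string;   // hash of the system prompt at that time
}

type CacheSource = "memory" | "database" | "sync";

const MAX_NEW_MESSAGES = 50; // relaxed validation limit from the notes

// An entry is usable if the system prompt is unchanged and at most
// MAX_NEW_MESSAGES messages have arrived since it was built.
function isUsable(entry: CompressionCacheEntry, currentCount: number, promptHash: string): boolean {
  const newMessages = currentCount - entry.messageCount;
  return entry.promptHash === promptHash && newMessages >= 0 && newMessages <= MAX_NEW_MESSAGES;
}

function lookupCompression(
  chatId: string,
  currentCount: number,
  promptHash: string,
  memory: Map<string, CompressionCacheEntry>,
  database: Map<string, CompressionCacheEntry>,
  compressNow: () => CompressionCacheEntry,
): { entry: CompressionCacheEntry; source: CacheSource } {
  const inMemory = memory.get(chatId);
  if (inMemory && isUsable(inMemory, currentCount, promptHash)) {
    return { entry: inMemory, source: "memory" };
  }
  const stored = database.get(chatId);
  if (stored && isUsable(stored, currentCount, promptHash)) {
    memory.set(chatId, stored); // warm the faster tier
    return { entry: stored, source: "database" };
  }
  const fresh = compressNow(); // last resort: compress synchronously
  memory.set(chatId, fresh);
  database.set(chatId, fresh);
  return { entry: fresh, source: "sync" };
}
```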

Chat State for Persistent JSON Storage

New state field on chats and projects stores arbitrary JSON data
Built-in state LLM tool with fetch/set/delete operations
Path syntax: dot notation (player.health) and array indexing (inventory[0].name)
Inheritance: chat state overrides project state for chats within projects
Protected keys: underscore-prefixed keys (e.g., _notes) cannot be modified by AI
StateEditorModal for viewing and editing state in the UI
State button in chat ToolPalette (database icon)
Project State section in project settings
API endpoints: GET/PUT/DELETE with ?action=get-state/set-state/reset-state
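
To make the path syntax and protected keys concrete, here is a minimal sketch of reading a value by path. The parser, the function names, and the choice to treat any underscore-prefixed segment as protected are assumptions; the notes do not say how nested underscore keys are handled.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Splits "inventory[0].name" into ["inventory", 0, "name"].
function parsePath(path: string): Array<string | number> {
  const tokens: Array<string | number> = [];
  const pattern = /([^.[\]]+)|\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(path)) !== null) {
    tokens.push(match[2] !== undefined ? Number(match[2]) : match[1]);
  }
  return tokens;
}

// Walks the state along the path; returns undefined if any step is missing.
function getAtPath(state: Json, path: string): Json | undefined {
  let current: Json | undefined = state;
  for (const token of parsePath(path)) {
    if (current === null || current === undefined || typeof current !== "object") {
      return undefined;
    }
    if (Array.isArray(current)) {
      if (typeof token !== "number") return undefined;
      current = current[token];
    } else {
      current = current[String(token)];
    }
  }
  return current;
}

// Assumption: a path is protected if any key segment starts with "_".
function isProtectedPath(path: string): boolean {
  return parsePath(path).some((t) => typeof t === "string" && t.startsWith("_"));
}
```

An AI-initiated set or delete would check isProtectedPath first and refuse, while edits made through the UI would skip that check.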

RNG (Random Number Generator) Tool

Built-in rng tool for dice rolls, coin flips, and random participant selection
Supports any die configuration from d2 to d1000
Results are permanent chat messages visible to all characters
Manual invocation via RngDropdown with quick options (d6, d20, 2d6, coin, bottle)
Custom roll interface for arbitrary dice configurations
Uses cryptographically secure random numbers
Auto-detection of patterns in user and assistant messages:
- Dice notation: “2d6”, “d20”, “3d10”
- Coin flips: “flip a coin”
- Spin the bottle: randomly selects a chat participant
autoDetectRng setting (default: true) can be disabled in Chat Settings
Pending tool results shown as chips in composer before sending
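
Dice auto-detection and rolling could be approximated as follows. The regular expression, the two-digit cap on dice count, and the function names are illustrative assumptions; the d2 to d1000 range and the use of a cryptographically secure source come from the list above. Node's crypto.randomInt(min, max) returns an integer in [min, max).

```typescript
import { randomInt } from "crypto";

interface DiceRoll {
  count: number; // number of dice
  sides: number; // faces per die
}

// Matches "d20", "2d6", "3d10" as whole words (case-insensitive).
const DICE_PATTERN = /\b(\d{0,2})d(\d{1,4})\b/gi;

function findDice(text: string): DiceRoll[] {
  const found: DiceRoll[] = [];
  DICE_PATTERN.lastIndex = 0; // the regex is global, so reset its cursor
  let match: RegExpExecArray | null;
  while ((match = DICE_PATTERN.exec(text)) !== null) {
    const count = match[1] === "" ? 1 : Number(match[1]);
    const sides = Number(match[2]);
    if (count >= 1 && sides >= 2 && sides <= 1000) found.push({ count, sides });
  }
  return found;
}

// `pick` is injectable so tests can be deterministic; the default is secure.
function roll(
  dice: DiceRoll,
  pick: (min: number, max: number) => number = (min, max) => randomInt(min, max),
): number[] {
  return Array.from({ length: dice.count }, () => pick(1, dice.sides + 1));
}
```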

AI Wizard Improvements

Real-time progress — Each field shows checkmark and snippet as it completes via Server-Sent Events
Name generation — Wizard no longer requires a name first; can generate completely random characters
Document upload — New “Upload a document” option accepts .txt, .md, and PDF files as character source material
Streaming endpoint: POST /api/v1/characters?action=ai-wizard-stream

Built-in Help Search Tool

search_help tool allows LLMs to search Quilltap documentation during conversations
Uses semantic search when OPENAI_API_KEY is available, keyword fallback otherwise
Always enabled by default
Pre-computed embeddings loaded from compressed MessagePack bundle (~3-4MB vs. ~24MB JSON)
Whole-document embeddings for better retrieval quality

Complete Backup & Restore

Backup now includes plugin configurations from plugin_configs table
Backup now includes npm-installed plugins from plugins/npm/ directory
Restore recreates plugin configs and extracts npm plugins
Manifest counts include pluginConfigs and npmPlugins
Full system recreation from a single backup file
S3/cloud backup functionality removed; local download only (use external scripts for cloud storage)

Provider Updates

Grok Plugin Migration to xAI Responses API

BREAKING: Migrated from deprecated Chat Completions API to Responses API (/v1/responses)
Uses direct HTTP (fetch) instead of OpenAI SDK for chat
New models: grok-4, grok-4-1-fast (2M context), grok-3, grok-3-mini, grok-2-1212, grok-code-fast-1
Web search uses server-side tools (web_search, x_search) instead of deprecated Live Search API
Image format changed from image_url to input_image
Stateless operation with store: false
Image generation model updated to grok-2-image
Plugin version: 1.0.14

Embedding Service Plugin Architecture

Embedding providers now delegate to plugins via createEmbeddingProvider() factory
EmbeddingProvider interface added to LLMProviderPlugin
New provider classes: OpenAIEmbeddingProvider, OllamaEmbeddingProvider
Built-in TF-IDF provider implemented as LocalEmbeddingProvider
Registry method createEmbeddingProvider() matches createImageProvider() pattern
Removed hardcoded provider handlers from embedding-service.ts
Plugin versions: openai 1.0.16, ollama 1.0.10

Image Provider Enhancements

promptingGuidance field for provider-specific prompting tips
styleInfo field with ImageStyleInfo interface for style/LoRA details and trigger phrases
Chat LLM receives guidance in image generation tool description
Cheap LLM incorporates style trigger phrases when crafting expanded prompts
@quilltap/plugin-types v1.12.0

Plugin Icon System Redesign

Plugins provide SVG data via icon property instead of React components
PluginIconData interface in @quilltap/plugin-types v1.10.0
renderIcon deprecated (kept for backwards compatibility)
Removed React peer dependency from all bundled provider plugins
ProviderIcon component renders SVG with abbreviation fallback

API & Backend

Data Directory Management

New Profile page section shows data directory location, configuration source, and platform
“Open in File Browser” button opens directory in native file explorer (macOS/Windows/Linux)
Copy button for path
Docker environments show guidance about host volume mounts
API endpoint: GET/POST /api/v1/system/data-dir

Native Import/Export Improvements

Character field remapping: defaultConnectionProfileId, defaultImageProfileId, defaultRoleplayTemplateId
Chat participant roleplayTemplateId remapped (preserves plugin template references)
Profile tags reconciled: connection profiles, image profiles, embedding profiles
Roleplay template tags reconciled on import
Memory field remapping: projectId, tags

Middleware Refactoring

createAuthenticatedHandler → createContextHandler
createAuthenticatedParamsHandler → createContextParamsHandler
AuthenticatedContext → RequestContext
withAuth → withContext, withAuthParams → withContextParams
checkOwnership replaced with simpler exists type guard
Legacy aliases maintained for backward compatibility

Performance

Compression Cache Improvements

Pre-compression triggers immediately after assistant message save (not after 68+ second async work)
Runs in parallel with memory extraction
System prompt hash validation for cache retrieval
Debug logging for cache hit/miss reasons
Cache persists to database, surviving server restarts
Lookup order: in-memory → database → sync compression fallback
Relaxed validation: allows up to 50 new messages
Fallback to previous cache when async pre-compression not ready
Dynamic window calculation ensures no messages lost with older cache

Cache Bug Fixes

Fixed invalidation in multi-character chats (was comparing filtered vs. raw event counts)
Multi-turn conversations with tool calls now properly benefit from caching

UI Improvements

Chat Response Status Indicator

Visual indicator shows current processing stage during AI response generation
Stages: compressing (blue), gathering (purple), building (amber), sending (blue), streaming (green), tool_executing (purple)
QuillAnimation for streaming, pulsing icon for other stages
Accessible with role="status" and aria-live="polite"
Respects prefers-reduced-motion

User messages have 1.5rem top margin for better visual separation
Tool messages have 1rem vertical margin with qt-chat-message-row-tool class
Dead code removed from avatar-styles.ts, connection-resolver.ts, chat-files-v2.ts
Hard-coded Tailwind colors converted to qt-* utility classes
New qt-border-*/30 and hover:qt-bg-*/10 opacity variants for status colors

Template Import Fix

“Import from Template” modal now correctly shows templates
Fixed prompt-templates response handling to extract .templates array
Added ?all=true parameter for flattened prompt list

Documentation

Comprehensive User Guides

Projects: Main overview, file management, chat association, character roster, settings
Chats: Overview, multi-character setup, turn manager, participants sidebar, message actions
Startup Wizard: Complete rewrite with provider-specific setup instructions
Page links: Every help file now includes direct link to corresponding app page with tab parameters

Help System Infrastructure

Pre-computed embeddings in gzipped MessagePack (help-bundle.msgpack.gz)
Whole-document embeddings instead of heading-based chunks
npm run build:help generates the bundle
Semantic search via cosine similarity with keyword fallback
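
For readers unfamiliar with the ranking step, cosine similarity over whole-document embeddings can be sketched like this. The HelpDoc shape and the rankDocs helper are hypothetical; only the similarity formula itself is standard.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|); 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Illustrative document record: one embedding per help file.
interface HelpDoc {
  title: string;
  embedding: number[];
}

// Returns the `limit` documents most similar to the query embedding.
function rankDocs(query: number[], docs: HelpDoc[], limit: number): HelpDoc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((scored) => scored.doc);
}
```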

API Documentation Updates

Version updated from v2.8 to v2.9
Six new sections: Chat Settings, Models, Files (v1), System Backup & Restore, System Data Directory, System Mount Points
Legacy endpoints marked in Table of Contents
About page updated: new tagline, expanded description, quilltap.ai link, removed authentication references

Bug Fixes

Critical

Temporary backup download failing due to HMR invalidating in-memory storage (moved to globalThis singleton)
User-initiated tool results not sent to LLM (field name mismatch: tool vs. toolName)
Virtualizer positioning bug when messages replaced (now uses message IDs as keys, not indices)
Pending tool results not persisting to database (missing parameter in API route)
LLM responses wrapped in content block format now normalized

Docker

Build failures from npm lockfile issues fixed
npm upgraded in base image to fix “Invalid Version” bug
@quilltap/plugin-types changed from file: reference to npm package
package-lock.json regenerated in Linux container
Deprecated --only=production updated to --omit=dev

Testing

E2E Test Updates

Uses production build for stability
Fresh temp data directory per test run
Removed authentication code (single-user mode)
Updated API routes to /api/v1/ prefix
Removed deprecated persona tests
Retry logic for flaky page loads
Ollama with llama3.2 as default test provider

Refactoring

Tools System

Tools sent with every LLM prompt (removed periodic re-injection logic)
forceToolsOnNextMessage flag retained only for change notifications

Dependencies

@anthropic-ai/sdk upgraded to ^0.72.1
ESLint rule added to catch “Quilttap” misspellings

Removals

S3/cloud backup destination removed from backup dialog
Cloud backups list and selection removed from restore dialog
S3-related backup functions removed
Restore API simplified to file uploads only

Breaking Changes Summary

Grok plugin uses Responses API — Migrated from deprecated Chat Completions API
Plugin icons use SVG data — renderIcon deprecated; React no longer required for icons
S3 backup removed — Local download only; use external scripts for cloud storage

Quilltap 2.8.1

Improvements for local AI users

Quilltap v2.8.1 Release Notes

Highlights

Built-in TF-IDF Embeddings - Semantic memory search now works out of the box with zero API keys required. The new built-in embedding provider uses TF-IDF with BM25 enhancement, running entirely offline.

First-Startup Seeding - Fresh installations now include a default character and embedding profile, making Quilltap immediately usable without configuration.

Ollama Quality-of-Life Improvements - Better defaults, auto-filled URLs, and fixes for common connection issues.

Major Features

Built-in TF-IDF Embedding Provider

New qtap-plugin-builtin-embeddings provides zero-dependency, offline embedding
Uses TF-IDF with BM25 enhancement and Porter stemming
Bigram support for improved phrase matching
Vocabulary automatically fits to your memory corpus
Debounced refitting when memories change (5-second debounce)
New database tables: tfidf_vocabularies, embedding_status
Background jobs: EMBEDDING_GENERATE, EMBEDDING_REFIT, EMBEDDING_REINDEX_ALL
API endpoints for manual refit and reindex operations
UI displays vocabulary stats and embedding progress indicators
Theme-aware provider badges via qt-badge-provider-* CSS classes
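
The idea of fitting a vocabulary to the memory corpus can be illustrated with a plain TF-IDF sketch. This is deliberately simplified and is not the plugin's implementation: it omits the BM25 enhancement, Porter stemming, and bigrams mentioned above, and the smoothed IDF formula is just one common variant.

```typescript
// Lowercases and splits on non-alphanumeric characters.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Fits a vocabulary to the corpus: term -> smoothed inverse document frequency.
function fitIdf(corpus: string[]): Map<string, number> {
  const docFreq = new Map<string, number>();
  corpus.forEach((doc) => {
    new Set(tokenize(doc)).forEach((term) => {
      docFreq.set(term, (docFreq.get(term) ?? 0) + 1);
    });
  });
  const idf = new Map<string, number>();
  docFreq.forEach((df, term) => {
    idf.set(term, Math.log((1 + corpus.length) / (1 + df)) + 1);
  });
  return idf;
}

// Embeds text as term-count * IDF, one dimension per vocabulary term.
function embed(text: string, idf: Map<string, number>): number[] {
  const counts = new Map<string, number>();
  tokenize(text).forEach((t) => counts.set(t, (counts.get(t) ?? 0) + 1));
  const vector: number[] = [];
  idf.forEach((weight, term) => vector.push((counts.get(term) ?? 0) * weight));
  return vector;
}
```

Because the vector dimensions depend on the fitted vocabulary, existing embeddings have to be regenerated whenever the vocabulary is refit, which is consistent with the refit and reindex jobs listed above.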

First-Startup Data Seeding

Default character “Ben” seeded on first startup when database is empty
Default “Built-in TF-IDF” embedding profile seeded and set as default
Enables semantic memory search immediately without configuration
Seeding runs as phase 1.25 after migrations in startup sequence
Safe to call multiple times—only seeds when no data exists

Improvements

Connection Profile UX

Auto-fill profile name with PROVIDER/MODEL_NAME when selecting a model
Auto-fill base URL when selecting a provider with defaults (e.g., Ollama → http://localhost:11434)
Connection test errors now display in the profile modal UI
Improved error logging shows messages directly

Provider Fixes

Ollama: Fixed 405 error caused by trailing slash in base URL creating double-slashes
Ollama: Cheap LLM fallback now uses current profile’s model instead of hardcoded llama3.2:3b
Embedding providers: Fixed embedding-only providers not appearing in dropdown (plugin initialization now registers EMBEDDING_PROVIDER capability)
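
The trailing-slash problem above is the classic double-slash bug when a base URL and a path are concatenated. A minimal sketch of a normalizing join follows; joinUrl is a hypothetical helper, not the plugin's actual fix.

```typescript
// Joins a base URL and a path with exactly one slash between them.
function joinUrl(baseUrl: string, path: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const rest = path.replace(/^\/+/, "");    // drop leading slashes
  return `${base}/${rest}`;
}
```

With this, both "http://localhost:11434" and "http://localhost:11434/" produce the same request URL for a path such as "/api/tags".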

Bug Fixes

Memory extraction: Handles LLMs returning object instead of string for content/summary fields
Memory search: Regex metacharacters (*, ?, (), etc.) now escaped in fallback search, preventing crashes when semantic search unavailable
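
Escaping user input before building a regular expression is the usual remedy for the memory search crash. The sketch below uses the widely documented escape pattern; fallbackSearch and its signature are assumptions for illustration.

```typescript
// Escapes every character that has special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Case-insensitive literal search used when semantic search is unavailable.
function fallbackSearch(memories: string[], query: string): string[] {
  const pattern = new RegExp(escapeRegExp(query), "i");
  return memories.filter((memory) => pattern.test(memory));
}
```

Without the escape step, a query such as "(" would make the RegExp constructor throw a SyntaxError.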

Quilltap 2.8.0

Migration from MongoDB to SQLite, the beginning of the great clown divestiture

Quilltap v2.8.0 Release Notes

Highlights

SQLite-Only Architecture - Complete removal of MongoDB backend. Quilltap now runs on SQLite exclusively, dramatically simplifying deployment and reducing infrastructure requirements. Existing MongoDB installations can migrate using the standalone CLI tool.

Single-User Mode - Authentication system removed entirely. Quilltap operates as a single-user application, eliminating login flows, OAuth configuration, and session management complexity.

Platform-Native Data Storage - Data directories now follow OS conventions: ~/Library/Application Support/Quilltap on macOS, %APPDATA%\Quilltap on Windows, and ~/.quilltap on Linux.

Per-Chat Tool Management - Granular control over which LLM tools are available in each conversation, with hierarchical plugin and subgroup toggling.

Google Gemini 3 Support - Full support for Gemini 3 thinking models with proper SDK migration and native tool formatting.

Major Features

SQLite-Only Database Backend

MongoDB support completely removed from codebase
SQLite uses WAL mode for improved concurrent access
Query translation layer converts MongoDB-style filters to SQL
JSON column support for complex nested data
Zod schema introspection generates SQLite DDL automatically
Simplified Docker deployments—no external database container required
Migration tool available as npx @quilltap/mongodb-to-sqlite for existing installations
All 25 repositories migrated to database abstraction layer
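
A query translation layer of this kind can be pictured with a very small sketch that turns a MongoDB-style filter into a parameterized WHERE clause. The supported operators, type names, and output shape are assumptions and cover only a fraction of what a real translator needs. Booleans are bound as 0/1 and undefined as NULL, since SQLite has no native boolean type.

```typescript
type Scalar = string | number | boolean | null | undefined;
type Condition = Scalar | { $gt?: number; $in?: Scalar[] };
type Filter = { [field: string]: Condition };
type SqlParam = string | number | null;

interface SqlQuery {
  where: string;
  params: SqlParam[];
}

// SQLite has no boolean type and cannot bind undefined.
function toParam(value: Scalar): SqlParam {
  if (typeof value === "boolean") return value ? 1 : 0;
  return value ?? null;
}

function translate(filter: Filter): SqlQuery {
  const clauses: string[] = [];
  const params: SqlParam[] = [];
  Object.keys(filter).forEach((field) => {
    // Identifiers cannot be bound as parameters, so validate them strictly.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(field)) {
      throw new Error(`invalid field name: ${field}`);
    }
    const cond = filter[field];
    if (cond !== null && typeof cond === "object") {
      if (cond.$gt !== undefined) {
        clauses.push(`"${field}" > ?`);
        params.push(cond.$gt);
      }
      if (cond.$in !== undefined) {
        clauses.push(`"${field}" IN (${cond.$in.map(() => "?").join(", ")})`);
        cond.$in.forEach((v) => params.push(toParam(v)));
      }
    } else if (cond === null || cond === undefined) {
      clauses.push(`"${field}" IS NULL`);
    } else {
      clauses.push(`"${field}" = ?`);
      params.push(toParam(cond));
    }
  });
  return { where: clauses.join(" AND "), params };
}
```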

Single-User Mode

All authentication code removed: OAuth, email/password, TOTP 2FA, trusted devices
No sign-in pages, session management, or multi-user overhead
Existing multi-user installations must run npx ts-node scripts/migrate-to-single-user.ts before upgrading
Server fails to start if AUTH_DISABLED=false (with migration instructions)
Migration script supports interactive user selection and --dry-run mode
API keys re-encrypted during migration to match new user identity

Centralized Platform-Native Data Directories

Single source of truth for all data paths via lib/paths.ts
Platform-specific defaults:
- Linux: ~/.quilltap
- macOS: ~/Library/Application Support/Quilltap
- Windows: %APPDATA%\Quilltap
- Docker: /app/quilltap (mounted from host)
Directory structure: <base>/data, <base>/files, <base>/logs
QUILLTAP_DATA_DIR environment variable for custom locations
Automatic migration from legacy paths with .MIGRATED marker files
Docker data persists on host filesystem by default
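
The platform defaults above can be summarized in a small resolver. This is a sketch, not the contents of lib/paths.ts: the function name, the parameter list, and the fallback used when APPDATA is unset are assumptions, and Docker detection is left out because the notes do not describe it.

```typescript
import * as path from "path";

// Resolves the base data directory from the platform name, environment, and home directory.
function resolveDataDir(
  platform: string,
  env: { [key: string]: string | undefined },
  home: string,
): string {
  const override = env["QUILLTAP_DATA_DIR"];
  if (override) return override; // explicit configuration wins
  switch (platform) {
    case "darwin":
      return path.posix.join(home, "Library", "Application Support", "Quilltap");
    case "win32": {
      // Assumed fallback when APPDATA is not set.
      const appData = env["APPDATA"] ?? path.win32.join(home, "AppData", "Roaming");
      return path.win32.join(appData, "Quilltap");
    }
    default:
      return path.posix.join(home, ".quilltap");
  }
}

// Subdirectories named in the release notes: <base>/data, <base>/files, <base>/logs.
function dataSubdirs(base: string): { data: string; files: string; logs: string } {
  return {
    data: path.posix.join(base, "data"),
    files: path.posix.join(base, "files"),
    logs: path.posix.join(base, "logs"),
  };
}
```

In a Node process the call would typically be resolveDataDir(process.platform, process.env, os.homedir()), with os imported from the standard library.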

Per-Chat & Per-Project Tool Settings

Enable/disable specific LLM tools per chat via “Tools” button in tool palette
Hierarchical management: plugin-level, subgroup-level (MCP servers), and individual tool toggles
Tri-state checkboxes for intuitive bulk control
Project-level default tool settings inherited by new chats
System message injected when settings change to notify LLM of available tools
request_full_context tool remains always-enabled as safety valve
Tool re-injection optimization: tools sent to LLM every N messages (matching sliding window)
New API endpoints for programmatic tool configuration

Google Plugin Overhaul

BREAKING: Migrated from deprecated @google/generative-ai to new @google/genai SDK v1.37.0
Dynamic model listing via ai.models.list() API
Native functionResponse tool result format (no more text fallback)
Gemini 3 thinking model support with thinkingBudget: 4096 configuration
Proper extraction of thought summaries from response parts
Schema sanitization removes unsupported JSON Schema fields
Deprecation warnings for Gemini 2.0 models (retiring March 3, 2026)

API & Backend

Legacy Route Removal

Deleted 157 deprecated API route stubs that returned 410 Gone
All API access exclusively through /api/v1/ endpoints
Removed movedToV1() helper function
Non-v1 routes retained: /api/health, /api/plugin-routes/[...path], /api/themes/*

Plugin System Improvements

Auto-upgrade npm-installed plugins at startup (non-breaking updates only)
Breaking updates logged and displayed in new “Upgrades” tab
Upgrade confirmation modal for breaking changes
Plugin metadata enriched with repository, changelog, and npm links
PLUGIN_AUTO_UPDATE=false environment variable to disable auto-upgrades
BREAKING: Per-user plugin installation removed; all plugins now site-wide only
Migration script moves existing user plugins to site directory

MCP Plugin Enhancements

Built-in tool collision detection prevents shadowing Quilltap tools
getBuiltinToolNames() added to @quilltap/plugin-utils v1.3.0
Tools from MCP servers that would shadow built-ins get prefixed automatically
Tool hierarchy exposed via getToolHierarchy() for subgroup management

UI Improvements

Settings Redesign

New SettingsCard component for consistent styling across all settings tabs
Chat Settings, Appearance Settings migrated to card-based layout
Responsive grid layout with qt-card-grid-auto wrapper
Cards support badges, metadata grids, inline/footer actions, status messages

Theme System Enhancements

Theme selector shows each theme name in its heading font for preview
Custom fonts lazy-loaded when theme popout menu opens
Rich interactive previews: expand theme cards to see actual UI elements
Side-by-side light/dark mode previews for themes supporting both
Scoped preview CSS prevents affecting page styling

Homepage Improvements

Cards extend to fill available viewport height
Section cards display as many items as fit without scrolling
Characters section shows all non-NPC characters (not just favorites)
30/30/40 column widths in full-width mode
Projects sorted by most recent activity (files, chats, or metadata)
Recent chats sorted by last message time, not metadata modification

Typography & Theming

New semantic typography system: qt-page-title, qt-section-title, qt-card-title, qt-card-subtitle, qt-meta, qt-label, qt-helper, qt-body, qt-link, qt-action
Default theme shifted to warm slate-blue palette
Light mode uses warm off-white background
Dark mode has visible surface hierarchy (page → card → popover)
Card shadows tuned for improved visibility in both modes
Light/dark/system mode toggle added to themes menu in sidebar

Rains Theme Revision

Claude-inspired aesthetic replacing muddy orange-brown
Dark mode: clean charcoal with subtle warmth
Light mode: refined warm parchment/cream
Accent shifted from terracotta to orange-amber
Nunito Sans font for user messages
Higher contrast text for improved readability

Chat & Character Improvements

Character Page Enhancements

Conversations tab shows project badges for chats in projects
Conversations sorted by last message timestamp
Chat header breadcrumb shows LLM-controlled character avatars with links
Character cards on homepage use two-line descriptions
NPCs moved from Settings to Characters page
Favorite toggle fixed (was returning 405 error)

Image & File Handling

Redesigned image modal character tagging UI
Tagged characters list with avatar badge and set/remove controls
Fixed avatar detection and state management bugs
Physical description editor uses wide dialog matching page width
PNG placeholder generation for character exports without avatars

Project Context

Project instructions periodically re-injected during long conversations
Configurable projectContextReinjectInterval (default: 5 messages)
Ensures project context survives context compression

Provider Updates

OpenRouter

SDK upgraded from 0.4.0 to 0.5.1
Streaming refactored to use callModel() with getTextStream()
Bypasses the SDK’s callModel() when tools are present (JSON Schema compatibility)

General

Zod upgraded from v3 to v4
Native z.toJSONSchema() replaces zod-to-json-schema dependency
Updated ZodError.errors → ZodError.issues across codebase

Removals

Mobile UI Support

App now targets tablet and desktop viewports only (minimum 768px)
Deleted mobile-specific components: MobileToolPalette, MobileParticipantDropdown
Removed hamburger menu, mobile sidebar overlay, off-canvas behavior
ParticipantSidebar always visible

Sync Functionality

Removed sync UI from Tools page
Removed /api/v1/sync/ API routes
Removed sync libraries, repositories, and documentation
Migration drops sync-related database tables

Legacy Features

Removed legacy lib/images.ts (replaced by lib/images-v2.ts)
Removed pre-v2.7.0 migrations (minimum upgrade path now v2.7.0 → v2.8+)
Removed MongoDB-specific migration code
Removed excessive debug logging (~11,300 lines)

Testing

New Test Coverage

sillytavern-png-placeholder.test.ts (21 tests): PNG placeholder generation
plugin-upgrader.test.ts (19 tests): Plugin upgrade system
builtin-tools.test.ts (27 tests): Built-in tool collision detection
tool-settings.test.ts (20 tests): Per-chat/per-project tool hierarchies
theme-system.test.ts (25 tests): Theme selection and preview generation
single-user-migration.test.ts (34 tests): Single-user migration scenarios
llm-logging.service.test.ts (12 tests): LLM logging behavior
llm-logs-api.test.ts (13 tests): LLM logs API routes
tools-api.test.ts (2 tests): Tools API endpoints
session-api.test.ts (2 tests): Session API for single-user mode
prompt-templates-api.test.ts (14 tests): Prompt templates API
Database abstraction (239 tests): Config, query translator, schema translator

Bug Fixes

Critical

SQLite boolean values in WHERE clauses now convert to 0/1
SQLite undefined values convert to null (prevents binding errors)
SQLite nested field queries within JSON arrays work correctly
API key re-encryption migration handles undecryptable keys gracefully
Mount point path migration handles tilde-prefixed paths
User-controlled characters show “Queue” button, not “Nudge”

UI

Chat user message text visible on blue background
Duplicate avatar in streaming message indicator removed
Sidebar shows correct count of non-project chats
Image profile form shows Google API keys correctly
Character default image profile saves to database
API key creation modal closes and refreshes properly

Provider

OpenRouter streaming tool calls detected (camelCase handling)
Google plugin empty responses from thinking models fixed
Rains theme font loading on nested routes fixed

Refactoring

Code Quality

Dead code cleanup via knip analysis (21 files, unused dependencies)
Removed excessive logging from codebase (31 debug calls)
Code quality improvements and qt-* class migration
Hot reload state persistence via global namespace for all registries

CSS Migration

Settings pages converted to qt-* semantic classes
CreateProjectDialog converted to qt-* classes
Multiple components migrated from raw Tailwind to semantic tokens

Documentation

Updated DATABASE_ABSTRACTION.md for SQLite-only architecture
Updated DEPLOYMENT.md with Docker plugin management
Removed “PNG not yet implemented” note from API.md
Updated all documentation to reflect single-user mode

Breaking Changes Summary

MongoDB removed — Must migrate using CLI tool before upgrading
Authentication removed — Multi-user installations must run migration script
Google SDK changed — Plugin uses new @google/genai SDK
Per-user plugins removed — All plugins now site-wide only
Mobile UI removed — Minimum viewport width is 768px
Minimum upgrade path — v2.7.0 required before upgrading to v2.8.0

Quilltap 2.7.0

MCP, projects, homepage redesign, and more

Quilltap v2.7.0 Release Notes

Highlights

File Storage Abstraction - Pluggable storage backends with mount points. Configure multiple storage locations, migrate files between them, and use S3 or local filesystem interchangeably.

Context Compression - Sliding window compression for long conversations using a cheap LLM. Keeps recent messages intact while summarizing older ones, dramatically reducing token costs.

Project-Based Organization - New Projects feature for organizing chats, files, and characters with optional instructions injected into system prompts.

MCP Integration - Model Context Protocol server connector plugin enables connecting to external MCP servers and exposing their tools to Claude.

Homepage Redesign - Three-column responsive layout with welcome section, quick actions, recent chats, projects, and favorite characters.

Major Features

File Storage Abstraction System

New generic FileStorageBackend interface for pluggable storage providers
Built-in local filesystem backend (configurable via QUILLTAP_FILE_STORAGE_PATH)
S3 backend moved to optional plugin (qtap-plugin-storage-s3)
Mount point system for configuring multiple storage locations in Settings → File Storage
Files can be stored on different backends per mount point
Per-project mount point assignment with batch file migration when changing
Encrypted secrets storage for backend credentials (AES-256-GCM)
Orphan file recovery tool: scan backends for untracked files and adopt them
First-class folder entities with database persistence (empty folders now survive)
Project-based S3 key structure: users/{userId}/{projectId}/{folderPath}/{fileId}_{filename}
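
As a rough illustration of the project-based key structure above, the sketch below assembles a key from its parts. The names buildStorageKey and StorageKeyParts are hypothetical and not taken from the Quilltap source, and the handling of an empty folder path is an assumption.

```typescript
// Hypothetical helper: builds a key following the documented pattern
// users/{userId}/{projectId}/{folderPath}/{fileId}_{filename}
interface StorageKeyParts {
  userId: string;
  projectId: string;
  folderPath: string; // may be empty or contain nested segments
  fileId: string;
  filename: string;
}

function buildStorageKey(parts: StorageKeyParts): string {
  // Assumption: leading and trailing slashes are trimmed, and an empty
  // folder path contributes no segment at all.
  const folder = parts.folderPath.replace(/^\/+|\/+$/g, "");
  const segments = ["users", parts.userId, parts.projectId];
  if (folder.length > 0) {
    segments.push(folder);
  }
  segments.push(`${parts.fileId}_${parts.filename}`);
  return segments.join("/");
}
```

Prefixing the filename with the file ID keeps keys unique even when two files in the same folder share a name.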

Sliding Window Context Compression

Compresses older messages using a cheap LLM while keeping the last N messages (default 5) in full context
Also compresses system prompts when compression is active (tool definitions never compressed)
New request_full_context tool allows AI to reload full conversation if needed
Async pre-compression: starts immediately after receiving response, cached for next message
Configurable window size (3–10 messages) and target token counts in Chat Settings
Keep-alive pings during compression prevent proxy/ALB timeouts
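
The sliding window idea can be sketched as follows. This is a minimal illustration, not the shipped implementation: ChatMessage, Summarizer, and compressHistory are hypothetical names, and the summarizer is passed in as a plain function standing in for the cheap LLM call.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical stand-in for the cheap LLM that summarizes older messages.
type Summarizer = (older: ChatMessage[]) => string;

// Clamp the window to the documented 3-10 range (default 5).
function clampWindowSize(n: number = 5): number {
  return Math.min(10, Math.max(3, Math.floor(n)));
}

function compressHistory(
  messages: ChatMessage[],
  summarize: Summarizer,
  windowSize?: number,
): { summary: string | null; recent: ChatMessage[] } {
  const n = clampWindowSize(windowSize);
  if (messages.length <= n) {
    // Nothing falls outside the window, so nothing is compressed.
    return { summary: null, recent: messages.slice() };
  }
  const older = messages.slice(0, messages.length - n);
  const recent = messages.slice(messages.length - n);
  return { summary: summarize(older), recent };
}
```

Only the messages that fall outside the window are handed to the summarizer; the most recent ones pass through unchanged.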

Projects Feature

Organize chats, files, and characters under Projects with optional instructions
Project instructions injected into system prompts for all associated chats
Character roster management with “allow any character” option
project_info LLM tool for accessing project context (get_info, get_instructions, list_files, read_file, search_files)
Project detail page with expandable cards: Files, Characters, Settings
Sidebar integration with collapsible projects showing nested chats
Full backup/restore and sync support for projects

Homepage Redesign

Three-column responsive layout (3 columns desktop, 2 tablet, 1 mobile)
WelcomeSection with personalized greeting
QuickActionsRow: Start Chat, Continue Last, New Project, Generate Image
RecentChatsSection with avatars, message counts, and quick-hide filtering
ProjectsSection showing 4 most recent active projects
CharactersSection with 2×2 grid of favorites and quick chat buttons
Standalone /generate-image page for image generation outside chats

MCP Server Connector Plugin

New built-in qtap-plugin-mcp for connecting to Model Context Protocol servers
Uses official @modelcontextprotocol/sdk with Streamable HTTP and SSE transports
Dynamically discovers and exposes tools from connected MCP servers
Collision-aware naming: tools use original names, only prefixed on collision
Supports multiple simultaneous server connections
Authentication: Bearer tokens, API keys, and custom headers
Auto-reconnection with configurable retry attempts
Configuration via Settings → Tools with JSON array of server configs
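
Collision-aware naming might look roughly like the sketch below. The DiscoveredTool shape, the exposeToolNames function, and the "server__tool" prefix format are all assumptions made for illustration; the release notes do not specify the actual prefix.

```typescript
interface DiscoveredTool {
  server: string; // name of the MCP server the tool came from
  name: string; // tool name as reported by that server
}

// Returns the exposed name for each tool, in input order. A name is left
// untouched unless more than one server reports it, in which case every
// colliding tool gets a server prefix (prefix format is an assumption).
function exposeToolNames(tools: DiscoveredTool[]): string[] {
  const counts = new Map<string, number>();
  for (const tool of tools) {
    counts.set(tool.name, (counts.get(tool.name) ?? 0) + 1);
  }
  return tools.map((tool) =>
    (counts.get(tool.name) ?? 0) > 1 ? `${tool.server}__${tool.name}` : tool.name,
  );
}
```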

Multi-Tool Plugin Support

New getToolDefinitions(config) and executeByName(toolName, input, context) pattern
All tool plugins standardized on multi-tool pattern for consistency
Tool registry handles both static and dynamic (MCP) tools
Dynamic tool discovery happens at request time, not startup
Updated @quilltap/plugin-types to v1.9.0
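
A plugin following the multi-tool pattern could be shaped like the sketch below. Only the method names getToolDefinitions and executeByName come from the notes above; the interfaces, the example tools, and the synchronous return type are assumptions chosen to keep the example short, and the real types live in @quilltap/plugin-types.

```typescript
// Hypothetical shapes for illustration only.
interface ToolDefinition {
  name: string;
  description: string;
}

interface ToolContext {
  userId: string;
}

interface MultiToolPlugin<Config> {
  getToolDefinitions(config: Config): ToolDefinition[];
  executeByName(toolName: string, input: unknown, context: ToolContext): string;
}

interface TextToolsConfig {
  enableReverse: boolean;
}

// One plugin exposing several tools; the set depends on configuration.
const textTools: MultiToolPlugin<TextToolsConfig> = {
  getToolDefinitions(config) {
    const defs: ToolDefinition[] = [{ name: "upper", description: "Uppercase text" }];
    if (config.enableReverse) {
      defs.push({ name: "reverse", description: "Reverse text" });
    }
    return defs;
  },
  executeByName(toolName, input, _context) {
    const text = String(input);
    switch (toolName) {
      case "upper":
        return text.toUpperCase();
      case "reverse":
        return text.split("").reverse().join("");
      default:
        throw new Error(`Unknown tool: ${toolName}`);
    }
  },
};
```

Because definitions are computed from the config on each call, the same pattern also covers tools that are discovered dynamically at request time.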

API & Backend

V1 API Migration (Complete)

All legacy /api/* routes now return 410 Gone with redirect instructions
New /api/v1/ namespace with action dispatch pattern (?action=)
Consolidated endpoints for all entity types with consistent response formats
Authentication routes: /api/v1/auth/signup, /api/v1/auth/change-password, /api/v1/auth/delete-account, /api/v1/auth/2fa/*, /api/v1/auth/oauth/[provider]/*
System routes: /api/v1/system/deployment, /api/v1/system/plugins/initialize, /api/v1/system/backup, /api/v1/system/restore, /api/v1/system/mount-points, /api/v1/system/tools
All frontend components updated to use v1 endpoints
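
The action dispatch pattern and the 410 response for legacy routes can be sketched as below. This is an illustration under assumptions: routeRequest, readAction, the "default" fallback key, and the 400 status for unknown actions are hypothetical and not confirmed by the release notes.

```typescript
interface ApiResponse {
  status: number;
  body: unknown;
}

type ActionHandler = () => ApiResponse;

// Extracts the value of ?action= from a path with an optional query string.
function readAction(pathWithQuery: string): string | null {
  const query = pathWithQuery.split("?")[1] ?? "";
  for (const pair of query.split("&")) {
    const [key, value = ""] = pair.split("=");
    if (key === "action") return decodeURIComponent(value);
  }
  return null;
}

function routeRequest(pathWithQuery: string, actions: Map<string, ActionHandler>): ApiResponse {
  const path = pathWithQuery.split("?")[0];
  // Legacy /api/* routes outside the v1 namespace answer 410 Gone.
  if (path.startsWith("/api/") && !path.startsWith("/api/v1/")) {
    return { status: 410, body: { error: "Gone", hint: "Use the /api/v1/ namespace" } };
  }
  // Assumption: a missing action falls back to a handler registered as "default".
  const action = readAction(pathWithQuery) ?? "default";
  const handler = actions.get(action);
  if (!handler) {
    return { status: 400, body: { error: `Unknown action: ${action}` } };
  }
  return handler();
}
```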

Migration System Overhaul

Migrations now run in instrumentation.ts BEFORE server accepts requests
If migrations fail, process exits with code 1 (container won’t start with incompatible data)
Removed qtap-plugin-upgrade plugin; migrations now in migrations/ directory
MigrationRunner class handles dependency sorting, state tracking, and execution
Migration state stored in MongoDB migrations_state collection
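
Dependency sorting of the kind MigrationRunner performs can be illustrated with a depth-first topological sort. The Migration shape and the sortMigrations function below are hypothetical; the actual class may order migrations differently.

```typescript
interface Migration {
  id: string;
  dependsOn: string[]; // ids of migrations that must run first
}

// Returns migrations ordered so that every dependency precedes its dependents.
function sortMigrations(migrations: Migration[]): Migration[] {
  const byId = new Map<string, Migration>();
  for (const m of migrations) {
    byId.set(m.id, m);
  }

  const sorted: Migration[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (m: Migration): void => {
    const current = state.get(m.id);
    if (current === "done") return;
    if (current === "visiting") {
      throw new Error(`Dependency cycle detected at ${m.id}`);
    }
    state.set(m.id, "visiting");
    for (const depId of m.dependsOn) {
      const dep = byId.get(depId);
      if (!dep) {
        throw new Error(`Unknown dependency ${depId} for ${m.id}`);
      }
      visit(dep);
    }
    state.set(m.id, "done");
    sorted.push(m);
  };

  for (const m of migrations) {
    visit(m);
  }
  return sorted;
}
```

Throwing on a cycle or a missing dependency matches the fail-fast behavior described above, where a failed migration stops the process before it accepts requests.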

Token Usage Tracking

Added promptTokens, completionTokens fields to messages
Added aggregate fields to chats: totalPromptTokens, totalCompletionTokens, estimatedCostUSD
Added usage tracking to connection profiles: totalTokens, messageCount
New SystemEvent type for tracking cheap LLM operations
Cost estimation using OpenRouter pricing when available, with fallback
Token display settings: per-message tokens/cost, chat totals, system events
New UI components: TokenBadge, SystemEventMessage, ChatCostSummary
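
The chat-level aggregates can be derived from the per-message fields roughly as shown below. The field names promptTokens, completionTokens, totalPromptTokens, totalCompletionTokens, and estimatedCostUSD come from the notes above; the Pricing shape and the per-million-token units are assumptions.

```typescript
interface MessageUsage {
  promptTokens: number;
  completionTokens: number;
}

// Hypothetical pricing shape: USD per one million tokens.
interface Pricing {
  promptPerMillionUSD: number;
  completionPerMillionUSD: number;
}

function summarizeChatUsage(messages: MessageUsage[], pricing: Pricing) {
  let totalPromptTokens = 0;
  let totalCompletionTokens = 0;
  for (const m of messages) {
    totalPromptTokens += m.promptTokens;
    totalCompletionTokens += m.completionTokens;
  }
  const estimatedCostUSD =
    (totalPromptTokens * pricing.promptPerMillionUSD +
      totalCompletionTokens * pricing.completionPerMillionUSD) /
    1e6;
  return { totalPromptTokens, totalCompletionTokens, estimatedCostUSD };
}
```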

Chat Improvements

File Management LLM Tool

New file_management tool for LLM access to project and general files
Operations: list, read, write, and organize files with folder support
Write operations require user permission (per-project or general scope)
Inline file write permission prompt in chat with quick approve/deny
Deferred execution: tool pauses for approval, LLM sees “Waiting for user approval”
Permission schema with SINGLE_FILE, PROJECT, GENERAL scopes
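
A permission check over the three scopes might be structured as in the sketch below. The scope names are from the notes above; the WritePermission and WriteTarget types, and the reading of GENERAL as "files outside any project", are assumptions.

```typescript
type WritePermission =
  | { scope: "SINGLE_FILE"; fileId: string }
  | { scope: "PROJECT"; projectId: string }
  | { scope: "GENERAL" };

interface WriteTarget {
  fileId: string;
  projectId: string | null; // null means the file lives outside any project
}

// Does a single grant cover the target file?
function grantCovers(grant: WritePermission, target: WriteTarget): boolean {
  if (grant.scope === "SINGLE_FILE") return grant.fileId === target.fileId;
  if (grant.scope === "PROJECT") return grant.projectId === target.projectId;
  return target.projectId === null; // GENERAL (assumed semantics)
}

// A write is allowed when at least one grant covers the target. When this
// returns false, the tool would pause and wait for user approval.
function isWriteAllowed(target: WriteTarget, grants: WritePermission[]): boolean {
  return grants.some((grant) => grantCovers(grant, target));
}
```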

Enhanced First Message Context

Characters receive relevant context when speaking first in new chats
Includes recent and semantically-relevant memories about other participants (3–5 per participant)
Project name, description, and instructions included if chat is in a project
Generic greeting fallback when LLM content filter blocks auto-generation

Graceful Error Recovery

When requests exceed LLM limits, system attempts to recover gracefully
Sends simplified message to LLM explaining what happened with attachment details
LLM provides in-character response suggesting alternatives
Two-tier fallback: LLM-generated recovery, then static fallback
Recovery messages saved with recoveryType metadata field
Supports: token limits, PDF page limits (max 100), image/file size limits

Turn Manager Improvements

Chat title moved to page toolbar header (updates automatically on rename)
Manual chat rename with auto-rename toggle in tool palette
Dynamic browser tab title shows chat name
Improved continue mode error handling with better error extraction

File System Improvements

Redesigned File Browser

Grid view with image thumbnails (on-demand generation with S3 caching)
List view with sortable columns (Name, Associations, Type, Date)
Preview modal for images, PDFs (via PDF.js), text/code files
Syntax highlighting for code files using react-syntax-highlighter (30+ languages)
Markdown files render with formatting, YAML frontmatter in “Document Info” section
Wikilinks supported: [[File]], [[File#Header]], [[File|Text]]
Copy button for text file preview
Folder operations: create, rename, delete
File operations: move to folder, rename, delete with association management
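
The three wikilink forms listed above can be recognized with a single regular expression. The parser below is a minimal sketch, not the file browser's actual implementation; the Wikilink type and parseWikilinks function are hypothetical.

```typescript
interface Wikilink {
  target: string; // file name
  header?: string; // from [[File#Header]]
  label?: string; // from [[File|Text]]
}

function parseWikilinks(text: string): Wikilink[] {
  const links: Wikilink[] = [];
  // [[target]], optionally followed by #header and/or |label
  const pattern = /\[\[([^\[\]|#]+)(?:#([^\[\]|]+))?(?:\|([^\[\]]+))?\]\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const link: Wikilink = { target: match[1].trim() };
    if (match[2] !== undefined) link.header = match[2].trim();
    if (match[3] !== undefined) link.label = match[3].trim();
    links.push(link);
  }
  return links;
}
```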

File Upload Improvements

Removed file type restrictions; all types can now be uploaded
Backend automatically detects plain text content by sampling bytes
MIME type inferred from file extension when browser provides generic types
Paste images directly into chat textarea (auto-uploaded and attached)
Unicode filenames work correctly (Base64 encoding for S3 metadata)
Automatic image resizing for provider size limits (Anthropic: 5MB, others: 20MB)
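
Byte sampling and extension-based MIME inference could work along the lines of the sketch below. The heuristic (reject NUL bytes, tolerate up to 10% other control characters in the first 8 KiB) and the small extension table are assumptions for illustration, not the rules the backend actually applies.

```typescript
// Heuristic text detection over a sample of the file's leading bytes.
function looksLikeText(bytes: Uint8Array, sampleSize: number = 8192): boolean {
  const length = Math.min(bytes.length, sampleSize);
  let controlCount = 0;
  for (let i = 0; i < length; i++) {
    const b = bytes[i];
    if (b === 0) return false; // NUL bytes strongly suggest binary content
    const isAllowedWhitespace = b === 9 || b === 10 || b === 13; // tab, LF, CR
    if (b < 32 && !isAllowedWhitespace) controlCount++;
  }
  // Assumption: an empty file is treated as text.
  return length === 0 || controlCount / length <= 0.1;
}

// Illustrative extension table; a real one would be much larger.
const EXTENSION_MIME = new Map<string, string>([
  ["md", "text/markdown"],
  ["txt", "text/plain"],
  ["json", "application/json"],
  ["png", "image/png"],
]);

// Falls back to the extension only when the browser reports a generic type.
function inferMimeType(filename: string, browserType: string): string {
  const isGeneric = browserType === "" || browserType === "application/octet-stream";
  if (!isGeneric) return browserType;
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot + 1).toLowerCase() : "";
  return EXTENSION_MIME.get(ext) ?? "application/octet-stream";
}
```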

Enhanced File Deletion

Files linked to chats/characters show detailed confirmation dialog before deletion
Dialog lists all associated characters and chats using the file
Deleting dissociates from linked entities; messages get deletion note appended
Direct repository updates ensure memories are NOT regenerated from deletion notes

Provider & Tool Improvements

Tool Plugin Development

New TOOL_PROVIDER plugin capability for custom LLM tools
Tool plugins define schemas, validation, execution handlers, and result formatting
Plugin configuration UI for per-user settings via configSchema
PluginConfigModal dynamically renders form fields
Configuration stored per-user in MongoDB
Documented in TOOL_PLUGIN_DEVELOPMENT.md

Curl Tool Plugin

New qtap-plugin-curl provides curl tool for HTTP requests
Supports GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS
Options: --url, --request, --header, --data, --user-agent, --max-time, --location, --insecure, --render
--render converts HTML responses to plain text
Configurable URL allowlist for security
Blocks private/local IP addresses for SSRF protection
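
A check for private and local addresses might start from something like the sketch below. It is deliberately incomplete: it only inspects literal IPv4 hosts and the name "localhost", whereas thorough SSRF protection also has to consider DNS resolution, redirects, and IPv6. The isBlockedHost function is hypothetical and not the plugin's actual code.

```typescript
// Returns true for "localhost" and for IPv4 literals in loopback, private,
// link-local, or "this network" ranges.
function isBlockedHost(host: string): boolean {
  if (host.toLowerCase() === "localhost") return true;

  const parts = host.split(".");
  if (parts.length !== 4) return false; // not an IPv4 literal
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;

  const [a, b] = octets;
  return (
    a === 0 || // 0.0.0.0/8
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // link-local
  );
}
```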

Provider Improvements

Anthropic plugin fetches models from API instead of hardcoded list (v1.0.10)
OpenRouter streaming token tracking fixed
OpenAI-compatible streaming token tracking fixed
GPT-5 reasoning models: increased max_completion_tokens to 4096
OpenRouter public API used for cost estimation fallback
Web search tool decoupled from native provider web search (useNativeWebSearch field)

Theming & CSS

New qt-* Utility Classes

Alert CSS variables: --qt-alert-success-*, --qt-alert-warning-*, --qt-alert-error-*, --qt-alert-info-*
Entity badge CSS variables: character, persona, chat, tag, memory
Status badge CSS variables: enabled, disabled, related, manual, auto
Plugin source badge CSS variables: included, npm, git, manual
Warning button CSS variables: --qt-button-warning-*
Search highlight class: qt-highlight with customizable styling
Filter chip classes: qt-filter-chip, qt-filter-chip-active
File preview classes: qt-file-preview-*, qt-wikilink, qt-wikilink-broken
Copy button classes: qt-copy-button

Theme Updates

Ocean v1.2.6: Copy button variables, file preview styling
Earl Grey v1.2.2: Wikilink styling, file preview panels
Rains v1.1.5: Wikilink styling, file preview panels
theme-storybook v1.0.13: All new CSS variables and component classes

Testing

Comprehensive Unit Test Expansion

Phase 1 (API Layer): 201 tests for responses, middleware, search service
Phase 2 (Chat Services): 97 tests for streaming, tool execution, pseudo-tool parser, conversation builder
Phase 3 (Data Infrastructure): 143 tests for import/export, theme system
Phase 4 (Hooks & Auth): 242 tests for 9 React hooks, Arctic OAuth modules
Phase 5 (Utilities & Services): 161 tests for utility functions, remaining services
Grand Total: 844+ new tests across 40+ modules with 90%+ coverage on critical paths
All 3438 tests now pass (164 test suites)

Feature-Specific Tests

Context compression: 22 tests
Compression cache service: 17 tests
Text detection utility: 47 tests
LLM error types: 47 tests
Recovery service: 32 tests
Tool registry: 28 tests
Orphan recovery: 13 tests
Folders repository: 28 tests
First message context: 18 tests
Memory processor inter-character tracking: 11 tests

Bug Fixes

Critical Fixes

Sign out now properly clears session and resets UI (wrong cookie name)
Startup state persists across Next.js module reloads (moved to global namespace)
All authenticated API routes wait for server startup to complete
Cloud backups appear in backup list after creation (missing metadata entries)
Plugin toggle works with PUT handler for enable/disable
Quick Chat dialog loads connection profiles correctly
Sync push response format returns arrays instead of numbers
MongoDB migrations work in hosted environments without admin access

Chat & Messaging Fixes

Streaming reliability during long context compression operations
Generic greeting fallback when LLM content filter blocks auto-generation
OpenRouter streaming tool calls detected (camelCase handling)
Memory extraction now properly links memories to user-controlled characters
Template variables in user character descriptions properly substituted
File write permission prompt appears correctly

UI Fixes

Recent chats quick-hide filtering on homepage
Character avatar setting from gallery (HTTP method mismatch)
Image tagging in character photo gallery (correct API endpoints)
Plugin source detection showing npm packages correctly
Cancel button in form modals always enabled except during loading
Settings UI data structure and API endpoint alignment

Refactoring

Route Decomposition

Characters route: 821 → 428 lines (extracted character-wizard.service.ts)
Chats/[id] route: 1,876 → 49 lines (extracted handlers/ and actions/ directories)
Extracted 14 Zod validation schemas, helper functions, and type definitions

Service Extraction

handleInterCharacterMemory: Extracts memories one character learns about another
handleContextSummary: Updates running context summaries using cheap LLM
handleTitleUpdate: Evaluates and updates chat titles based on conversation

CSS Migration

Converted Tailwind color classes to qt-* semantic classes throughout
RenameReplaceTab.tsx: Error alerts, info boxes
plugins-tab.tsx: Plugin source badges

Security

Dependency Updates

esbuild 0.19.0/0.20.0/0.24.0 → 0.27.0 across all 14 plugins (CVE-2024-23334)

Documentation

Major README overhaul: repositioned as general-purpose AI chat platform
Roadmap extracted to features/ROADMAP.md
Provider plugin development guide: docs/PROVIDER_PLUGIN_DEVELOPMENT.md
Testing guide: docs/TESTING_GUIDE.md
File LLM access documentation: docs/FILE_LLM_ACCESS.md
Updated CLAUDE.md documentation references

Quilltap 2.6.1

Sidebar auto-refreshing

Release 2.6.1

feat: Sidebar auto-refresh on character/chat changes
- Created SidebarDataProvider context to centralize sidebar data management
- Sidebar now automatically updates when characters or chats are created, edited, or deleted
- Added useSidebarData hook with refreshSidebar(), refreshCharacters(), and refreshChats() functions
- Implemented debounced refresh (300ms) to prevent rapid re-fetches
- Updated 10 mutation points across the app to trigger sidebar refresh
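
The debounced refresh can be illustrated with a generic helper. This is a sketch of the general technique only; the debounce function below is hypothetical, and the way the real SidebarDataProvider wires it up is not shown in these notes.

```typescript
// Collapses bursts of calls into one invocation after the calls stop
// for delayMs milliseconds.
function debounce(fn: () => void, delayMs: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer !== undefined) {
      clearTimeout(timer); // a newer call supersedes the pending one
    }
    timer = setTimeout(() => {
      timer = undefined;
      fn();
    }, delayMs);
  };
}

// Usage: several mutations in quick succession trigger a single refresh.
let refreshCount = 0;
const refreshSidebar = debounce(() => {
  refreshCount++;
}, 300);
refreshSidebar();
refreshSidebar();
refreshSidebar();
```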

Quilltap 2.6.0

Left sidebar, no more personas, plugin development SDK, memory cascade delete/regenerate

Quilltap v2.6.0 Release Notes

Highlights

New UI Layout - Complete redesign with persistent left sidebar navigation, collapsible character/chat lists, resizable sidebar width, and a simplified home page.

Characters-not-Personas - Unified character system where users can control any character directly. Personas have been converted to user-controlled characters with real-time impersonation switching.

Plugin Development SDK - New npm packages (@quilltap/plugin-types, @quilltap/plugin-utils, @quilltap/theme-storybook, create-quilltap-theme) enable standalone plugin development without access to Quilltap source.

Memory Cascade - Memories auto-extracted from messages now track provenance. Deleting or regenerating messages prompts handling of associated memories.

Major Features

Replaced top navbar with persistent left sidebar
Sidebar includes: Characters, Chats sections with quick access lists
Footer with Settings, Tools, Themes, Quick Hide buttons and Profile menu
Collapsible sidebar (desktop) with persistent state in localStorage
Mobile: overlay drawer from left with hamburger trigger
Simplified header with centered search bar and full-width toggle
New home page with welcome message, favorites, and “Start Chat” button
Resizable sidebar width (256px–512px) with drag handle, persisted to MongoDB
Message count badges on sidebar chat items (shows “999+” for large chats)
Brand logo moved to sidebar header (quill icon when collapsed, full logo when expanded)
Full-width toggle now affects all pages via --qt-page-max-width CSS variable
Improved favorite character cards with wrapping titles and “Chat” quick-start button
Dashboard route deprecated; redirects to home page

Characters-not-Personas Migration

Personas have been converted to characters with controlledBy: 'user'
Users can now control any character directly during chat via impersonation
SpeakerSelector dropdown for switching between multiple impersonated characters
AllLLMPauseModal for all-LLM chats with Continue/Stop/Take Over options
New API endpoints: /api/chats/[id]/impersonate, /api/chats/[id]/active-speaker
Inter-character memory support with aboutCharacterId field
Chat creation UI now shows “Play As” selector for user-controlled characters
Character view page: user-controlled toggle, default conversation partner setting
Personas link removed from navigation; all /personas/* routes redirect to /characters
Migration in upgrade plugin converts existing personas and updates references

Plugin Development SDK

@quilltap/plugin-types (v1.1.0): TypeScript types for third-party plugin development
- Exports all types: LLM, tools, provider, manifest, themes, roleplay templates
- Error classes: PluginError, ApiKeyError, ProviderApiError, RateLimitError
- Submodule exports for granular imports (/llm, /plugins, /common)
@quilltap/plugin-utils (v1.1.0): Runtime utilities for plugins
- Tool call parsers and format converters
- Logger bridge: createPluginLogger() routes to Quilltap core logging
- OpenAICompatibleProvider base class for custom LLM providers
- Roleplay template utilities: createRoleplayTemplatePlugin(), validation functions
@quilltap/theme-storybook (v1.0.4): Theme development kit
- Default theme tokens CSS and all qt-* component classes
- Storybook preset and ThemeDecorator for theme/mode switching
- Comprehensive story components for all UI elements
create-quilltap-theme: CLI scaffolding for new theme plugins
- Interactive prompts for theme name, author, description, primary color
- Optional CSS overrides and Storybook setup
- Includes bundled THEME_PLUGIN_DEVELOPMENT.md guide

Memory Cascade on Message Changes

Memories auto-extracted from messages now tracked with provenance links
Deleting a message prompts: delete/keep/regenerate associated memories
Regenerating a response (swipe) automatically cleans up old memories
Memory cards show “Source” link navigating to and highlighting source message
New chat settings section for default memory cascade behavior
Repository methods: deleteBySourceMessageId(), countBySourceMessageId()
Message navigation utilities for cross-page scroll-to-message functionality
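
The two repository methods named above could behave as in this in-memory sketch. Only the method names come from the release notes; the Memory shape, the class, and the return values are assumptions, and the real repository is backed by the database.

```typescript
interface Memory {
  id: string;
  sourceMessageId: string | null; // provenance link; null for manual memories
  content: string;
}

class InMemoryMemoryRepository {
  private memories: Memory[] = [];

  add(memory: Memory): void {
    this.memories.push(memory);
  }

  // How many memories were extracted from the given message?
  countBySourceMessageId(messageId: string): number {
    return this.memories.filter((m) => m.sourceMessageId === messageId).length;
  }

  // Removes those memories and reports how many were deleted (assumed return value).
  deleteBySourceMessageId(messageId: string): number {
    const before = this.memories.length;
    this.memories = this.memories.filter((m) => m.sourceMessageId !== messageId);
    return before - this.memories.length;
  }
}
```

A count query like this lets the UI decide whether a delete/keep/regenerate prompt is needed at all before a message is removed.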

Search & Replace Tool

Bulk text replacement across messages and memories
Scoping: single chat, all chats for a character, or all chats
Wizard-style UI: scope selection → search/replace input → preview counts → confirmation
Entry points: chat ToolPalette, character view page
Automatically regenerates memory embeddings after content changes

Plugin System

npm-based Plugin Installation

Browse qtap-plugin-* packages from npm registry in Settings → Plugins
Install/uninstall APIs with manifest validation and version compatibility checking
Support for scoped npm packages (@org/qtap-plugin-*)
Site-wide and per-user plugin installation scopes
Docker volume configuration for persistent plugin storage
Note: Gab AI provider moved to external plugin at @quilltap/qtap-plugin-gab-ai

Self-Contained Theme Plugin Architecture

Theme plugins now support module-based loading (similar to provider plugins)
Themes can embed tokens, CSS overrides, and fonts directly in the module export
ThemePlugin interface with ThemeMetadata, ThemeTokens, FontDefinition
Theme registry tries module-based loading first, with file-based fallback
All existing themes (Ocean, Earl Grey, Rains) migrated to self-contained format

Plugin-Registered Provider Configuration

Extended LLMProviderPlugin interface with runtime configuration:
- messageFormat, charsPerToken, toolFormat, cheapModels, defaultContextWindow
Query methods in provider registry eliminate hardcoded provider data
Core lib files query registry first, with fallback to legacy constants
Adding/removing providers now only requires adding/removing the plugin
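
The "registry first, legacy constants as fallback" lookup can be sketched as below. The field names charsPerToken, defaultContextWindow, and cheapModels come from the notes above; the ProviderRegistry class, the fallback values, and estimateTokens are hypothetical.

```typescript
interface ProviderRuntimeConfig {
  charsPerToken: number;
  defaultContextWindow: number;
  cheapModels: string[];
}

// Hypothetical legacy constants used when no plugin has registered a value.
const LEGACY_DEFAULTS: ProviderRuntimeConfig = {
  charsPerToken: 4,
  defaultContextWindow: 8192,
  cheapModels: [],
};

class ProviderRegistry {
  private configs = new Map<string, Partial<ProviderRuntimeConfig>>();

  register(provider: string, config: Partial<ProviderRuntimeConfig>): void {
    this.configs.set(provider, config);
  }

  // Query the registry first, then fall back to the legacy constant.
  getCharsPerToken(provider: string): number {
    return this.configs.get(provider)?.charsPerToken ?? LEGACY_DEFAULTS.charsPerToken;
  }
}

function estimateTokens(text: string, provider: string, registry: ProviderRegistry): number {
  return Math.ceil(text.length / registry.getCharsPerToken(provider));
}
```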

Roleplay Template Plugin Types

Added RoleplayTemplatePlugin, RoleplayTemplateConfig, RoleplayTemplateMetadata types
New utilities: createRoleplayTemplatePlugin(), createSingleTemplatePlugin()
Validation utilities: validateTemplateConfig(), validateRoleplayTemplatePlugin()
Documented in docs/TEMPLATE_PLUGIN_DEVELOPMENT.md

Theming & Styling

Theme Redesigns

Ocean Theme (v1.2.5): New quirky, childlike underwater aesthetic
- Coral reef background image visible on all main pages
- Glass-effect panels with backdrop blur
- Color palette: coral pink, teal accent, jellyfish purple, octopus orange, seaweed green
- Nunito font for playful typography
- Fixed participant sidebar contrast issues
Earl Grey Theme: Modernized with ChatGPT-inspired design
Rains Theme: Modernized with ChatGPT-inspired design, serif headings

New qt-* Utility Classes

Avatar utilities: qt-avatar-name, qt-avatar-title with CSS variables
Button utilities: qt-button-success standalone class
Page utilities: qt-page-container, qt-page-toolbar
Sidebar utilities: qt-left-sidebar-brand, width variables
Card grid variants: qt-card-grid-2, qt-card-grid-3
Improved success button contrast with dark text on bright green

Chat Improvements

XML Tool Call Detection

Detects and executes tool calls from LLMs like DeepSeek that emit XML tool syntax
Supports formats: DeepSeek (<function_calls><invoke>), Claude-style, generic (<tool_call>)
XML tool calls stripped from displayed messages; execution status shown separately
Runs for ALL providers, not just pseudo-tool mode

Timestamp Injection in System Prompts

Configurable modes: disabled, conversation start only, or every message
Format options: friendly, ISO 8601, date only, time only, or custom format
Fictional time support: base timestamp that advances with real elapsed time
Template variable {{timestamp}} for custom placement
Global default in Chat Settings, per-chat override available
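
Fictional time and the {{timestamp}} template variable can be illustrated as follows. The function names are hypothetical, and appending the timestamp when no placeholder is present is an assumption about default placement.

```typescript
// Fictional clock: starts at fictionalBase and advances by however much
// real time has elapsed since realStart.
function fictionalNow(fictionalBase: Date, realStart: Date, realNow: Date): Date {
  const elapsedMs = realNow.getTime() - realStart.getTime();
  return new Date(fictionalBase.getTime() + elapsedMs);
}

// Substitutes {{timestamp}} where the prompt asks for it; otherwise appends
// the timestamp at the end (assumed default placement).
function injectTimestamp(systemPrompt: string, timestamp: string): string {
  const placeholder = "{{timestamp}}";
  if (systemPrompt.includes(placeholder)) {
    return systemPrompt.split(placeholder).join(timestamp);
  }
  return `${systemPrompt}\n\nCurrent time: ${timestamp}`;
}
```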

Scenario Prompt for New Chats

Add optional scenario text when creating a new chat
Available in both New Chat page and ChatCreationDialog
Scenario stored in chat context for character use

Message Re-attribution

Re-attribute messages to different participants
Works for both USER and ASSISTANT messages
Automatically deletes associated memories when re-attributed
After re-attribution, scrolls to updated message

All-LLM Chat Improvements

Turn manager auto-continues in all-LLM chats (no more “user’s turn”)
Single-character all-LLM chats continue in monologue mode
Pause logic at intervals (3, 6, 12… turns) prevents runaway API usage
Resume button properly triggers next speaker
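
The pause schedule listed above (3, 6, 12 turns) suggests that the interval doubles each time. The sketch below encodes that reading; the doubling rule is an inference from the three listed values rather than documented behavior, and isPauseTurn is a hypothetical name.

```typescript
// Returns true when an all-LLM chat should pause after the given turn,
// assuming pause points at firstPause, 2x, 4x, and so on.
function isPauseTurn(turn: number, firstPause: number = 3): boolean {
  if (!Number.isInteger(turn) || turn < firstPause) return false;
  let threshold = firstPause;
  while (threshold < turn) {
    threshold *= 2;
  }
  return threshold === turn;
}
```

Widening the gap between pauses keeps early check-ins frequent while letting a chat the user keeps resuming run longer between interruptions.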

Bug Fixes

Chat page content no longer cut off by page toolbar
Strip duplicate character name prefixes from LLM responses (multi-character chats)
Resolve “Maximum update depth exceeded” errors in React 19/Next.js 16
Prevent render loop in DevConsoleProvider with batched log updates
Deduplicate jobs in tasks queue to prevent React key collision
Memory regeneration respects message participantId in multi-character chats
After deleting a message, scroll to next message instead of bottom
Chat creation dialog now responsive with scrollable content
Improved error logging for undefined error messages
Fix TypeScript type errors in provider plugins (role types, stopSequences)

Refactoring

Backend Refactoring (6 Phases)

Phase 1: Migrated 101 API routes to createAuthenticatedHandler middleware (~1,500 lines saved)
Phase 2: Standardized API responses across 76+ route files, centralized utilities
Phase 3: User-scoped repository refactoring with generic base classes (32% reduction)
Phase 4: Batch query methods to eliminate N+1 patterns
Phase 5: Large service decomposition
- context-manager.ts (1,266 lines) → 4 focused modules
- search/route.ts (721 lines) → 73 lines with extracted service
- chats/import/route.ts (763 lines) → 131 lines
Phase 6: Plugin registry integration; legacy fallback constants consolidated and deprecated

Frontend Refactoring

Created shared hooks: useDialogState, useWizardState, useAutoAssociate, useImageNavigation
Created BaseModal component consolidating modal structure from 17 components
Created shared ProfileCard and ProfileList components
Extracted wizard step components for export/import dialogs
Export dialog: 424 → ~250 lines; Import dialog: 565 → ~250 lines

Dead Code Cleanup

Removed duplicate components/characters/system-prompts/ directory
Removed unused debug components and auth placeholder files
Removed completed migration scripts
Removed unused barrel/index files for better tree-shaking
Removed backwards-compatibility shims

Logging Improvements

LLM API Debug Logging

All LLM API calls now log request and response details at debug level
Chat messages, cheap LLM tasks, embeddings, image generation, greeting generation
Consistent format with context: 'llm-api' for easy filtering

Reduced Verbose Logging

Removed ~2,900 debug log entries per typical session
Eliminated cache hit logs, successful validation logs, entry/exit pairs
Removed MongoDB connection status checks, repository instantiation logs
Removed UI lifecycle debug logs (theme, settings, search replace)
Kept only error logs, warnings, and initialization info logs

Infrastructure

Migrations Consolidated

All migration files moved from lib/mongodb/migrations/ to upgrade plugin
Post-login migrations now in upgrade plugin as user-migrations.ts
Upgrade plugin version: 1.0.11

Test Improvements

114 new unit tests for characters-not-personas features
Unit tests for sidebar, memory cascade, and navigation features
Mock repository fixtures for auth middleware mocking

Documentation

New docs/THEME_PLUGIN_DEVELOPMENT.md with step-by-step tutorial
New docs/TEMPLATE_PLUGIN_DEVELOPMENT.md for roleplay template plugins
Updated features/complete/new_ui_layout.md documenting sidebar navigation
Updated features/complete/characters_not_personas.md for migration details
Package documentation: plugin-types, plugin-utils, theme-storybook, create-quilltap-theme

Quilltap 2.5.1

Bugfix release: roleplay templates and other chat improvements

Quilltap 2.5.1 Release Notes

Release Type: Bugfix
Date: December 27, 2025

Bug Fixes

Plugin Roleplay Templates Now Work in Chat Settings

Fixed an issue where selecting a plugin-provided roleplay template (such as “Quilltap RP”) in the chat settings modal would fail with a validation error.

Problem: Plugin templates use ID formats like plugin:quilltap-rp rather than standard UUIDs. The validation layer was rejecting these IDs at both the API route and database schema levels, causing the template selection to fail silently.

Solution: Updated the roleplayTemplateId field validation to accept any string format, supporting both UUID-based templates (user-created and built-in) and plugin-prefixed templates.

Files Changed:

app/api/chats/[id]/route.ts — API route validation
lib/schemas/types.ts — MongoDB schema validation (4 fields)

Improvements

Modal stability during saves: The chat settings modal no longer closes unexpectedly when clicking dropdown options. Native browser select dropdowns render in a separate layer, which was triggering the click-outside detection. The modal now disables click-outside handling while a save is in progress.
Better error diagnostics: Improved error logging when roleplay template updates fail, now including chatId, templateId, and error type for easier debugging.

Upgrade Notes

No migration required. This is a drop-in replacement for 2.5.0.

Quilltap 2.5.0

AI Character Wizard, sync between instances, .qtap import/export (native format), Arctic auth

Quilltap v2.5.0 Release Notes

Highlights

Multi-Instance Sync - Bidirectional synchronization between Quilltap instances with API key authentication, real-time progress tracking, and conflict resolution.

AI Character Wizard - Multi-step wizard for AI-assisted character creation and editing with vision support for physical description generation.

Native Export/Import - Full-featured .qtap export/import system with selective entity export, conflict detection, and resolution strategies.

Replaced NextAuth - Migrated from NextAuth to Arctic + custom JWT session management for lighter, more flexible authentication.

Major Features

Bidirectional Sync API

Sync characters, personas, chats, memories, tags, and templates between Quilltap instances
API key authentication for secure cross-instance sync through firewalls
Real-time progress bar showing current phase, item being synced, and running counts
Sync direction control: bidirectional, push-only, or pull-only
Force Full Sync option to pull/push all data regardless of timestamps
Automatic entity order enforcement to ensure dependencies exist before dependents
File and message content included in sync with streaming for large files
Reset Sync State button for troubleshooting

AI Wizard for Character Creation

Multi-step wizard modal on New/Edit Character pages
Profile selection with vision-capable detection for image analysis
Physical description source options: existing data, upload image, gallery, or skip
Generates: title, description, personality, scenario, example dialogues, system prompt
Physical descriptions saved with all prompt levels (short/medium/long/complete/full)

Native Quilltap Export/Import

Export wizard: select entity type → choose scope → optional memory inclusion → download .qtap
Import wizard: preview entities with conflict detection → choose strategy → import
Three conflict strategies: skip, overwrite, or duplicate
Post-import reconciliation updates all foreign key relationships
Supports: Characters, Personas, Chats, Roleplay Templates, Connection/Image/Embedding Profiles, Tags
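
The three conflict strategies can be sketched over a simple list of entities. This illustrates the strategies only: the Entity shape, the importEntities function, and the injected newId generator are hypothetical, and the real importer also reconciles foreign keys afterwards.

```typescript
type ConflictStrategy = "skip" | "overwrite" | "duplicate";

interface Entity {
  id: string;
  name: string;
}

function importEntities(
  existing: Entity[],
  incoming: Entity[],
  strategy: ConflictStrategy,
  newId: () => string, // hypothetical id generator for duplicated entities
): Entity[] {
  const result = new Map<string, Entity>();
  for (const entity of existing) {
    result.set(entity.id, entity);
  }
  for (const entity of incoming) {
    if (!result.has(entity.id)) {
      result.set(entity.id, entity); // no conflict: import as-is
    } else if (strategy === "overwrite") {
      result.set(entity.id, entity); // replace the existing entity
    } else if (strategy === "duplicate") {
      const copy = { ...entity, id: newId() }; // import under a fresh id
      result.set(copy.id, copy);
    }
    // strategy === "skip": keep the existing entity untouched
  }
  return Array.from(result.values());
}
```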

API Key Import/Export

AES-256-GCM encryption with user passphrase
HMAC signature for integrity verification
Preview keys before importing
Duplicate handling: skip, replace, or rename

Authentication & Security

Replaced NextAuth with Arctic + Custom JWT

Removed next-auth dependency entirely
Added Arctic library for OAuth 2.0 flows with PKCE support
Custom JWT session management using jose library
Session tokens derived from ENCRYPTION_MASTER_PEPPER
Removed NEXTAUTH_URL and NEXTAUTH_SECRET environment variables
Added optional BASE_URL environment variable

Auth Environment Variable Clarification

AUTH_DISABLED=true completely bypasses auth and auto-logs in
New OAUTH_DISABLED env var hides OAuth buttons but keeps credentials login
New AUTH_UNAUTHENTICATED_USER_NAME env var configures display name
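
How the three variables interact can be summarized in a small resolver. This is a sketch under assumptions: resolveAuthMode and the AuthMode shape are hypothetical, and treating OAUTH_DISABLED as enabled only when set to the string "true" mirrors the AUTH_DISABLED=true example rather than documented parsing rules.

```typescript
interface AuthMode {
  authRequired: boolean; // false when auth is bypassed entirely
  showOAuthButtons: boolean; // credentials login is unaffected by this flag
  displayName: string | null; // name shown for the auto-logged-in user, if configured
}

function resolveAuthMode(env: Record<string, string | undefined>): AuthMode {
  const authDisabled = env["AUTH_DISABLED"] === "true";
  const oauthDisabled = env["OAUTH_DISABLED"] === "true";
  return {
    authRequired: !authDisabled,
    showOAuthButtons: !authDisabled && !oauthDisabled,
    displayName: authDisabled ? env["AUTH_UNAUTHENTICATED_USER_NAME"] ?? null : null,
  };
}
```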

User Experience

New Profile Page (/profile)

Edit display name, email, and profile avatar
View read-only account details (user ID, creation date, etc.)
Manage 2FA and trusted devices
Accessible from user menu dropdown

Manual Chat Rename

“Rename” button in tool palette (desktop and mobile)
ChatRenameModal for custom titles or re-enabling auto-naming
Dynamic browser tab title shows chat name

Multi-Character Chat Pause

Pause button in participant sidebar (desktop) and message header (mobile)
Pressing stop while streaming also pauses auto-responses
Pause state survives page reload

Auto-Associate API Keys

When importing/creating keys, automatically links profiles that need them
Background association check on settings tab navigation
Toast notifications show which profiles were linked

Visual Warnings for Missing API Keys

Connection, image, and embedding profiles show “⚠️ No API Key” badge
Warnings in dropdowns throughout settings and chat configuration

Plugin System

Plugin-Provided Roleplay Templates

New ROLEPLAY_TEMPLATE plugin capability
Plugins define templates via roleplayTemplateConfig in manifest.json
Migrated “Quilltap RP” template to qtap-plugin-template-quilltap-rp plugin

Plugin Version Display Fix

Plugin list now displays package.json version instead of manifest.json

Theming & Styling

New qt-* Utility Classes

Background utilities: qt-bg-surface, qt-bg-surface-alt, qt-bg-card, qt-bg-muted
Status backgrounds with opacity: qt-bg-primary/N, qt-bg-warning/N, etc.
Border utilities: qt-border, qt-border-primary, qt-border-warning, etc.
Text utilities: qt-text-secondary, qt-text-warning, qt-text-info, qt-text-success

Tag Visual Styles on Tags

Tag styles (emoji, colors, formatting) now stored directly on each Tag
Migration moves existing styles from ChatSettings to individual Tags
Automatic backup/restore with tags

Bug Fixes

OAuth redirects now use BASE_URL for correct domain behind reverse proxies
Attach file button in chat now opens file picker correctly
Chat composer textarea resizes properly after message submission
Sending message while paused no longer auto-resumes turn manager
Focus returns to textarea after AI response completes
Platform-aware keyboard shortcuts (Cmd on macOS, Ctrl on Windows/Linux)
Navbar avatar loads correctly from /api/files
Backup restore now correctly remaps entity relationships
FormActions component renders submit button when using type="submit"
Memories sync correctly from remote servers
File content sync uses stored S3 key correctly
Memory attribution improved in multi-character chats
Sync progress polling stops when sync completes
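
As an illustration of the platform-aware shortcut fix in the list above, the check might look something like this. It is a sketch under assumptions: the function names, the minimal `KeyLike` event shape, and the choice of "modifier + Enter" as the example shortcut are all hypothetical, and the platform string is passed in rather than read from the browser so the code runs anywhere.

```typescript
// Minimal event shape so the sketch runs outside a browser;
// a DOM KeyboardEvent has these same three fields.
interface KeyLike {
  key: string;
  metaKey: boolean;
  ctrlKey: boolean;
}

// Treat any platform string containing "mac" (e.g. "MacIntel") as macOS.
function isMac(platform: string): boolean {
  return /mac/i.test(platform);
}

// Label to show in shortcut hints.
function modifierLabel(platform: string): "Cmd" | "Ctrl" {
  return isMac(platform) ? "Cmd" : "Ctrl";
}

// True when Enter is pressed together with the platform's primary modifier:
// Cmd (metaKey) on macOS, Ctrl elsewhere.
function isPrimaryEnter(event: KeyLike, platform: string): boolean {
  const primary = isMac(platform) ? event.metaKey : event.ctrlKey;
  return event.key === "Enter" && primary;
}
```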

Infrastructure

Dependency Upgrades

Next.js 16.0.5 → 16.1.1 (Turbopack caching, CVE patches)
MongoDB driver 6.21.0 → 7.0.0 (requires Node.js 20.19+)
bcrypt 5.1.1 → 6.0.0 (40 fewer dependencies)
@openrouter/sdk 0.2.11 → 0.3.10
Now requires Node.js 22+

E2E Test Infrastructure

TestUserHelper for consistent test user management
/api/auth/delete-account endpoint for test cleanup
Serial Playwright execution to avoid race conditions

Refactoring

Settings Forms to Modals

API Keys, Image Profiles, Embedding Profiles, Connection Profiles now use modals
Lists stay visible while a form is open instead of being hidden

Draft Message Persistence

Saves textarea content to localStorage with 5-second debounce
Restores draft on page load, clears on successful submission
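
A debounced draft saver along these lines could be written as below. This is only a sketch: `createDraftSaver`, the `DraftStore` interface, and the storage key format are assumptions for illustration, while the 5-second delay comes from the note above. The store is passed in so the same code works with `window.localStorage` or an in-memory stand-in.

```typescript
// The subset of the Web Storage API that the saver needs.
interface DraftStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

function createDraftSaver(store: DraftStore, key: string, delayMs = 5000) {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Drop any write that is still waiting for its delay to elapse.
  const cancelPending = (): void => {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
  };

  return {
    // Call on every textarea change; only the last value within the
    // delay window is actually written.
    onInput(text: string): void {
      cancelPending();
      timer = setTimeout(() => {
        timer = undefined;
        store.setItem(key, text);
      }, delayMs);
    },
    // Call on page load to recover a saved draft ("" when none exists).
    restore(): string {
      return store.getItem(key) ?? "";
    },
    // Call after a successful submission.
    clearOnSubmit(): void {
      cancelPending();
      store.removeItem(key);
    },
  };
}
```

In a browser this might be wired up as `createDraftSaver(window.localStorage, "draft:" + chatId)`, where `chatId` is whatever identifies the conversation.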

Documentation

About page updated with Node.js 22+ requirement and version minimums
Updated auth from “NextAuth.js” to “Local + OAuth”

Quilltap 2.4.1

Bugfix: infinite loops in settings hooks

Release 2.4.1

Fix infinite loop issues in settings hooks:

usePrompts: stabilize fetchTemplates, saveTemplate, deleteTemplate callbacks
useConnectionProfiles: stabilize fetchProfiles, handleDelete callbacks
useEmbeddingProfiles: stabilize loadData callback
Update component useEffects to use empty dependency arrays
Root cause: useAsyncOperation returns new object each render, causing callbacks with [xxxOp] deps to be unstable and trigger infinite fetches
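
The root cause can be shown without React at all. React compares each dependency with its previous value using `Object.is`, so an object that is recreated on every render always counts as changed. The sketch below imitates that comparison; `depsChanged` and `makeOperation` are illustrative stand-ins, not Quilltap or React functions.

```typescript
// Same per-item comparison React applies to dependency arrays.
function depsChanged(prev: readonly unknown[], next: readonly unknown[]): boolean {
  return prev.length !== next.length || prev.some((dep, i) => !Object.is(dep, next[i]));
}

// Stand-in for a hook that returns a fresh object on every render.
function makeOperation() {
  return { loading: false, run: () => undefined };
}

// A reference created once and reused across "renders".
const stableRun = () => undefined;

// Two renders, two different objects: the dependency looks changed every
// time, so a callback keyed on it is rebuilt and its effect runs again.
const unstableResult = depsChanged([makeOperation()], [makeOperation()]);

// Same reference on both renders: nothing changed, no re-run.
const stableResult = depsChanged([stableRun], [stableRun]);
```

Here `unstableResult` is `true` and `stableResult` is `false`, which is why stabilizing the callbacks stops the repeated fetches.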

Quilltap 2.4.0

NPCs, system prompt selection per character, pseudo-tool support, chat/turn improvements, mobile UI fixes

Release 2.4

New Features

Provider model caching + persistence
- New provider models database caching system (chat/image/embedding models cached per endpoint, with modelType support and updated indexes)
- Provider models added to backup/restore, including manifest counts, archive export (data/provider-models.json), restore warnings/error handling, and restore summary tracking
NPC system upgrades
- New ad-hoc NPC character system (npc flag, quick-create NPC dialog from chat picker, Settings → NPCs management tab, characters list filtering, and ?npc=true|false API filtering)
- New “Convert to NPC / Convert to Character” toggle on character view
System prompts & templates
- Multiple system prompts per character (new edit/view tabs, default prompt selection, prompt templates, sample prompts, and new API endpoints)
- Per-user post-login migration to move legacy system prompt data to the new structure
- Enhanced template display (TemplateDisplay with highlighting for {{char}} / {{user}}, plus warnings for hard-coded names)
- Pseudo-tool support for models without native function calling (memory search / image generation / web search via text markers)
Chat & turn-system functionality
- Persisted turn state for multi-character chats (+ new PATCH endpoint)
- Stop streaming response button (desktop + mobile) with clean abort handling
- OpenRouter custom model ID support (toggle + custom input + saved preference)
UI / navigation
- Mobile-responsive upgrades across chat conversation UI, chat footer/tool palette, participant controls, and dashboard
- Navbar restructuring (controls consolidated into user menu + content-width toggle + responsive collapse behavior)
- New About page with project links/tech stack and a simplified footer
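
To make the template items above concrete, here is a minimal sketch of `{{char}}` / `{{user}}` substitution together with a check for hard-coded names. The function names and the exact-match placeholder syntax are assumptions for illustration; Quilltap's own template processing may accept more variables or looser spelling.

```typescript
// Replace {{char}} and {{user}} with the given names.
// A function replacer is used so names are inserted literally,
// even if they contain characters like "$".
function renderTemplate(template: string, names: { char: string; user: string }): string {
  return template.replace(/\{\{(char|user)\}\}/g, (_match: string, key: string) =>
    key === "char" ? names.char : names.user,
  );
}

// Return the literal names that appear in a template. Each hit is a place
// where a placeholder would keep the template reusable across characters.
function findHardCodedNames(template: string, names: string[]): string[] {
  return names.filter((name) => name.length > 0 && template.includes(name));
}
```

For example, `renderTemplate("{{char}} greets {{user}}.", { char: "Ada", user: "Sam" })` yields `"Ada greets Sam."`, while `findHardCodedNames("Ada greets {{user}}.", ["Ada", "Sam"])` flags `"Ada"`.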

Improved Features

Documentation
- Major refresh across development docs and API docs (expanded to v2.4; new endpoints; all 8 LLM providers; removed outdated JSON-storage references; updated About page notes)
Streaming, tooling, and provider behavior
- SSE error handling now produces clearer, fuller client-visible errors; reduced noisy logs; better continue-mode handling
- Anthropic streaming now detects tool-use blocks correctly; prompt caching expanded (tool caching, cache strategies, TTL options, cache usage stats surfaced to client)
- OpenAI plugin updated with newer GPT-Image models and deprecation guidance for older ones
- Capabilities report now prefers cached provider models and reads plugin versions from package.json
Backup/restore completeness
- Cloud backup/restore now includes user-created prompt templates and roleplay templates, with backward compatibility and correct remapping in new-account restores
Theming & UI consistency
- Significant enhancements to Earl Grey and Rains themes with broad qt-* utility coverage and improved tab styling
- Tab styling consolidated into qt-tab utility classes; text styling consolidated into qt-text utility classes; reduced redundant button flex utilities
- Avatars refactored into reusable components (Avatar / AvatarStack) and now respect the global avatar style preference
Codebase maintainability & test coverage
- Major component breakup and TSX refactors to reduce duplication, introduce reusable hooks/components, standardize patterns, and improve logging/error utilities
- Expanded unit + integration test coverage for templates and pseudo-tools (with the suite remaining green)

Bugs Fixed

Chat streaming / SSE
- Cleaner SSE error handling (including Zod validation formatting) and suppressed benign parse-noise
- Fixed OpenRouter streaming failures involving tool calls (SDK patch for null tool-call IDs)
- Reduced SSE parse errors further (filter empty chunks, skip [DONE] markers for OpenAI-compatible streams)
Avatar styling
- Avatar style changes now apply immediately without refresh (context sync)
- Avatar display style now consistently applies across all chat avatars and uses a CSS variable for radius
- Removed duplicate avatar display during “waiting for response”
Settings / pages / navigation
- Fixed Physical Descriptions tab infinite API loop
- Fixed empty default Connection Profile dropdown on character edit page (mapping corrected)
- Fixed setState-during-render character editing error
- Added missing “Back to NPCs” breadcrumb behavior and correct Settings → NPCs navigation
- Settings page now supports ?tab= URL parameter
- Fixed user menu mobile responsiveness issues (layout, truncation, submenu placement)
Chat composer & message rendering
- Restored sendMessage behavior after refactor (enter key / send button)
- Textarea auto-resize now shrinks correctly when deleting
- Roleplay bracket segments now render correctly with markdown inside
- Removed duplicate attach-file button
- Fixed desktop ToolPalette toggle behavior (no immediate reopen) and centered toolbar icons
- Balanced desktop composer padding
Multi-character & persona correctness
- Participant sidebar now highlights the correct character during streaming
- Skip button now advances turn correctly and resets the cycle
- Fixed duplicate name prefixes in multi-character messages
- Chat gradual renaming now replaces generic titles correctly
- Persona display-name disambiguation now works with the API’s actual response shape
System prompts / templates
- Template variables ({{char}}, {{user}}) now reliably process in LLM context building
- SystemPromptsEditor buttons no longer accidentally submit parent forms
Export / cleanup
- SillyTavern chat export now produces proper JSONL format (and importer can auto-detect JSON vs JSONL)
- Removed obsolete Prisma/PostgreSQL artifacts (project is MongoDB-native)
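
The JSON-versus-JSONL auto-detection mentioned in the export item can be approximated as below. This is a sketch of one possible heuristic, not the importer's actual logic, and the `detectChatFormat` and `toJsonl` names are made up for the example.

```typescript
type ChatFormat = "json" | "jsonl";

// Serialize records as JSON Lines: one JSON value per line.
function toJsonl(records: unknown[]): string {
  return records.map((record) => JSON.stringify(record)).join("\n") + "\n";
}

// Try the whole text as a single JSON document first; if that fails,
// require every non-empty line to parse on its own.
// Throws if the text is empty or neither format fits.
function detectChatFormat(text: string): ChatFormat {
  const trimmed = text.trim();
  try {
    JSON.parse(trimmed);
    return "json";
  } catch {
    // Not a single JSON document; fall through to the line-by-line check.
  }
  const lines = trimmed.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) {
    throw new Error("empty input");
  }
  for (const line of lines) {
    JSON.parse(line); // throws SyntaxError on an invalid line
  }
  return "jsonl";
}
```

One ambiguity is worth noting: a JSONL file with a single record is also a valid JSON document, so this heuristic reports it as "json". An importer that cares about the difference would need to inspect the parsed value as well.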

Quilltap 2.3.0

Tasks queue, token management, self-contained plugins, cloud deployment improvements

Release 2.3.0

v2.3.0 delivers a redesigned background job system, smarter LLM limits, and hardened plugin/deployment tooling alongside a wave of UX polish and fixes.

Introduced a first-class Tasks Queue experience with job-level controls, auto start/stop, a Tool card, and a dedicated queue for import-time memory extraction to keep background work manageable.
Raised max token ceilings to 4k–128k with dynamic model metadata so the UI, API, and connection profiles all honor provider-specific limits.
Re-architected plugin builds: each plugin now ships as a self-contained bundle (with SDK deps included) plus per-plugin build scripts, updated Docker workflow, and CI/CD changes such as npm run build and precompiled artifacts.
Decoupled tests from vendor SDKs via Jest mocks and module mappers, removed legacy OpenAI image code, and kept only lightweight runtime imports for embedding types.
Added cloud backup tooling improvements, chat memory management utilities, quick-hide reload logic, and DevConsole styling/Chat Debug integration with the new qt- CSS component system.
Hardened deployments with AWS ECS + IAM role support, Docker compose/script updates, security upgrades, and targeted bug fixes (Docker builds, integration tests, OpenAI/Grok compatibility, scrollbars, cursor styles).
Documented future plans (usage tracking, API separation) and broadened automated coverage with quick-hide provider tests and pre-commit guardrails.
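
Honoring provider-specific limits, as described in the token-ceiling item above, comes down to clamping the requested value against model metadata. The sketch below assumes a hypothetical `ModelLimits` shape and an assumed 4096-token fallback when the metadata has no limit; neither is taken from Quilltap's code.

```typescript
// Hypothetical metadata shape; real provider metadata will differ.
interface ModelLimits {
  maxOutputTokens?: number;
}

// Assumed fallback when a model reports no limit (the low end of the 4k-128k range).
const DEFAULT_CEILING = 4096;

// Clamp a requested max-token value to the range [1, model ceiling].
function clampMaxTokens(requested: number, limits: ModelLimits): number {
  const ceiling = limits.maxOutputTokens ?? DEFAULT_CEILING;
  const wholeTokens = Math.floor(requested);
  return Math.min(Math.max(1, wholeTokens), ceiling);
}
```

Using one helper like this in the UI, the API, and connection-profile validation is one way to keep all three consistent.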

Quilltap 2.2.0

Theme plugins, multi-character chat completed

Release 2.2.0

Tools, Global Search, Character Management, Multi-Character Chat, Dev Console, Themes, OpenRouter Updates

Plugin-driven theming architecture with a ThemeProvider runtime, persistence, Appearance settings, and qt-* semantic classes, so admins can install and switch between rich Tailwind v4-compatible theme plugins (Ocean, Rains, Earl Grey) with bundled fonts, previews, and a nav selector.
Multi-character chat suite completed: turn/state management, context building, nudge/queue UI, participant add/remove, auto-triggered turns, streaming fixes, inter-character memory sharing, tag syncing, and regression coverage.
Provider/tooling expansion: OpenRouter SDK 0.2.9 + embeddings, Anthropic cache controls, Google Gemini and OpenRouter image flows, improved cheap-LLM prompts, multi-person image placeholders, and collapsible tool message readability upgrades.
Navigation/UX refinements: file-tag inheritance, dashboard tweaks, participant-tag filters, favorites/chat-count sorting, enhanced quick-hide/theme controls, nav actions dropdown, scaled avatars, branding refresh (new quill icon, EB Garamond, splash graphic).
New Capabilities Report tool on Tools page generates, stores, and downloads comprehensive diagnostics covering environment, plugins, providers, and storage stats.
UI polish and theme coherence: Export Chat moved to ToolPalette, button/badge semantics standardized, Ocean/Rains/Earl Grey typography and palettes aligned, text-shadow fixes, QuickHideProvider render bug resolved.
Quality safeguards: pre-commit hooks and Jest setup rebuilt for quieter, more reliable local runs; documentation moved for v2.2 planning/testing; GitHub Actions stabilized via shared jest setup adjustments.

Quilltap 2.2.1

Bugfix: Grok and OpenAI providers

Release 2.2.1

Fix Grok and OpenAI providers

Quilltap 2.1

Multi-character import (not fully implemented), backup/restore, global search

Release 2.1

Multi-character ST import support, backup/restore, global search

Multi-character SillyTavern chat import with a wizard to assign users and personas
Cloud or local backup/restore system
“Delete all user data” functionality
Removed duplicated memories editing section from character edit page
Added global search
Rename character + search/replace in templates and throughout records
Console and other logs are visible in the front end when not in production mode
Finish local username/password and TOTP/MFA login

Quilltap 2.0

JSON file support mostly removed, MongoDB required, S3 required, auth plugins, and more

Release 2.0

Pluggable Authentication, no-auth, MongoDB/S3 migration complete

Fix quick-hide persistence and update issue
Convert Google OAuth to plugin (qtap-plugin-auth-google)
Create auth provider plugin interface and registry
Implement lazy initialization pattern for NextAuth
Centralize session handling in lib/auth/session.ts
Make a default no-auth option (AUTH_DISABLED=true env var)
Show tool calls collapsed in chat UI before character response
Only show “generating image” alert for generate_image tool (not all tools)
Fix {{me}} placeholder to resolve to character (not persona) when character calls image generation tool
Attach generated images to LLM response and tag for chat/character
Use file-manager (addFileLink/addFileTag) instead of deprecated repos.images
Enable Ollama plugin by default
Add tool call capture and normalization in Ollama provider
Add /api/providers endpoint for dynamic provider configurations
Update connection profiles UI to fetch provider requirements dynamically
Versioning change (dev commits no longer bump release versions)
MongoDB now required - removed JSON file storage backend
S3 now required - removed local filesystem storage for files
Migration plugin (qtap-plugin-upgrade) available for migrating existing JSON/local data
Fix S3-served avatar and image display across dashboard, chats, personas, and characters
Switch from Next.js Image to native img tags for API-served images (compatibility with dynamic routes)
Fix URL construction bugs (double-slash issues) in avatar/image paths
Add graceful handling of orphaned file metadata entries
Auto-cleanup orphaned file references (avatars, defaultImageId)
Fix deduplication to verify file existence in S3/local storage
Proxy files through API for HTTP S3 endpoints to avoid mixed content SSL errors
Add MongoDB repositories for migrations and vector indices
Update test mocks to use new repository factory pattern
Add utility scripts: debug-files, fix-file-userids, fix-sha256-in-mongodb, reset-file-tags
Improve S3 migration error handling (warnings vs blocking errors)
Enhanced auth adapter with improved MongoDB integration
Replace email with username for local authentication
Add user-scoped repositories for data isolation between users
Add migration to ensure all users have usernames
Use session.user.id instead of email for user lookups
Add model warnings system and fix Gemini thinking model issues
Sort settings lists (API keys, profiles, etc.) alphabetically by name
Clear error state on successful data fetch in settings tabs
Hide navigation on auth pages and reduce MongoDB connection logging verbosity
CI/build improvements: skip env validation during CI build, add MONGODB_URI test default
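
The double-slash bug in avatar and image paths, listed above, is the kind of problem a small join helper avoids. The following is a generic sketch; `joinUrl` is an illustrative name, not a function from the Quilltap codebase.

```typescript
// Join a base URL (or path) and a sub-path with exactly one slash between
// them, regardless of trailing or leading slashes on either side.
function joinUrl(base: string, path: string): string {
  const trimmedBase = base.replace(/\/+$/, "");
  const trimmedPath = path.replace(/^\/+/, "");
  return trimmedBase + "/" + trimmedPath;
}
```

With this, `joinUrl("https://host/api/files/", "/abc123")` and `joinUrl("https://host/api/files", "abc123")` both produce `https://host/api/files/abc123`. The `//` after the scheme is untouched because only the ends of each argument are trimmed.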

Quilltap 1.7

Plugin support added, quick-hide, web search, LLM providers moved to plugins, unified file handling

Release 1.7

1.7: Plugin support: basics, routes, LLM providers

Quick-hide for sensitive tags: hit one button and everything tagged that way disappears; toggle it back and it reappears
Logging to stdout or file (see ENV file for configuration)
Web search support (internal for providers that support it)
Cascading deletion for characters (deletes memories and optionally images and chats associated with the character)
Cleanup and better UI for chat cards
Plugin support
- New routes
- Moved LLM providers to plugins
Moved images to the file handling system so that they are no longer maintained separately
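
The quick-hide behavior in this release can be pictured as a filter over tagged items. The sketch below is an assumption-laden illustration: the `Tagged` shape, the `visibleItems` name, and the idea of a set of hidden tags are all hypothetical.

```typescript
// Any item that carries tags: chats, characters, files, and so on.
interface Tagged {
  id: string;
  tags: string[];
}

// When quick-hide is on, drop every item carrying at least one hidden tag.
// When it is off, return the list unchanged so everything reappears.
function visibleItems<T extends Tagged>(
  items: T[],
  hiddenTags: Set<string>,
  quickHideOn: boolean,
): T[] {
  if (!quickHideOn) {
    return items;
  }
  return items.filter((item) => !item.tags.some((tag) => hiddenTags.has(tag)));
}
```

Because nothing is deleted or modified, toggling the flag back restores the full list immediately.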

Quilltap 1.6

Physical character descriptions, Postgres-to-JSON migration complete, file manager, Cheap LLM, images character-aware

Release 1.6

Physical descriptions, JSON store polish, and attachment fallbacks

JSON data store finalized with atomic writes, advisory file locking, schema versioning, and full CLI/docs to migrate/validate Prisma exports into the JSON repositories.
Centralized file manager moves every upload into data/files, serves them via /api/files/[id], and ships migration/cleanup scripts plus UI fixes so galleries and avatars consistently load from /data/files/storage/*.
Attachment UX now shows each provider’s supported file types in connection profiles and adds a cheap-LLM-powered fallback that inlines text files, generates descriptions for images, and streams status events when providers lack native support.
Cheap LLM + embedding controls let you mark profiles as “cheap,” pick provider strategies or user-defined defaults, manage dedicated OpenAI/Ollama embedding profiles, and fall back to keyword heuristics when embeddings are unavailable while powering summaries/memories.
Characters and personas gain tabbed detail/edit pages plus a physical description editor with short/medium/long/complete tiers that feed galleries, chat context, and other tooling.
Image generation prompt expansion now understands {{Character}}/{{me}} placeholders, pulls those physical description tiers, and has the cheap LLM craft provider-sized prompts before handing them to Grok, Imagen, DALL·E, etc.
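
One way to picture how tiered physical descriptions could feed size-limited prompts is to pick the most detailed tier that fits a character budget. This is purely a sketch of that idea: the notes say the cheap LLM crafts the final prompt, and the selection rule, the `pickDescription` name, and the character-count budget here are assumptions.

```typescript
type Tier = "short" | "medium" | "long" | "complete";

// Most detailed first, so the first tier that fits is the richest one.
const TIER_ORDER: Tier[] = ["complete", "long", "medium", "short"];

// Return the most detailed description that fits within maxChars,
// or undefined when no stored tier fits (or none is filled in).
function pickDescription(
  descriptions: Partial<Record<Tier, string>>,
  maxChars: number,
): string | undefined {
  for (const tier of TIER_ORDER) {
    const text = descriptions[tier];
    if (text !== undefined && text.length <= maxChars) {
      return text;
    }
  }
  return undefined;
}
```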

Quilltap 1.5

Memory RAG in place and run automatically after messages, semantic embeddings for memories, first Cheap LLM support

Release 1.5

Major Features

Full-stack character memory management is now in place: memories are stored per character in the JSON store, exposed through dedicated REST routes, and editable via a rich UI for browsing, tagging, sorting, and manual CRUD operations, plus an interactive cleanup dialog for enforcing retention policies before data is written back (lib/json-store/repositories/memories.repository.ts:1, app/api/characters/[id]/memories/route.ts:1, components/memory/memory-list.tsx:1, lib/memory/housekeeping.ts:1, components/memory/housekeeping-dialog.tsx:1, app/api/characters/[id]/memories/housekeep/route.ts:1).
Memory intelligence now runs automatically after every message: background workers extract significant facts through cheap-LLM prompts, dedupe with semantic similarity, expose a memory-search tool/function to the chat providers, and feed summaries plus retrieved memories into the streaming route so the LLM always sees the most relevant context and debug traces (lib/memory/memory-processor.ts:1, lib/memory/memory-service.ts:1, lib/tools/memory-search-tool.ts:1, lib/tools/handlers/memory-search-handler.ts:1, app/api/chats/[id]/messages/route.ts:492, app/api/chats/[id]/messages/route.ts:618, lib/chat/context-summary.ts:1, lib/chat/context-manager.ts:1).
Semantic embeddings gained first-class support: Quilltap now talks to OpenAI or Ollama embedding endpoints, persists vectors per character, and lets users create, list, edit, and delete dedicated embedding profiles (with model catalogs) so memory search can prefer vectors over keyword heuristics (lib/embedding/embedding-service.ts:1, lib/embedding/vector-store.ts:1, app/api/embedding-profiles/route.ts:1, app/api/embedding-profiles/models/route.ts:1, components/settings/embedding-profiles-tab.tsx:1, lib/json-store/repositories/embedding-profiles.repository.ts:1).
Cheap-LLM orchestration became cost-aware: strategies can prefer a flagged “cheap” profile, fall back to provider-specific minis, or favor local Ollama, all using live/fallback pricing data that can be refreshed on demand; the settings UI surfaces these controls and cheap usage stats so automated summarization, extraction, titling, and attachment-description tasks choose the lowest-cost path (lib/llm/cheap-llm.ts:1, lib/llm/pricing.ts:1, app/api/startup/refresh-pricing/route.ts:1, components/settings/chat-settings-tab.tsx:1, components/settings/connection-profiles-tab.tsx:1, lib/memory/cheap-llm-tasks.ts:1).
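
Deduplicating memories by semantic similarity, as described above, is commonly done by comparing embedding vectors with cosine similarity. The sketch below shows that general technique; the `isDuplicate` helper and the 0.9 threshold are illustrative assumptions, not values taken from Quilltap.

```typescript
// Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|).
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / Math.sqrt(normA * normB);
}

// Treat a candidate memory as a duplicate when it is at least `threshold`
// similar to any stored embedding (threshold value is an assumption).
function isDuplicate(candidate: number[], existing: number[][], threshold = 0.9): boolean {
  return existing.some((vector) => cosineSimilarity(candidate, vector) >= threshold);
}
```

Cosine similarity ignores vector length, so `[1, 0]` and `[2, 0]` count as identical in direction and would be flagged as duplicates.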

Minor Features

The chat composer is friendlier—auto-resizing, keyboard-friendly input plus an inline Markdown preview toggle streamline writing while keeping attachments visible in the footer layout (app/(authenticated)/chats/[id]/page.tsx:1210).
Streaming responses now display a bespoke quill-writing animation and the UI ships self-hosted Inter weights with Georgia-styled rendered Markdown to keep the experience consistent offline (components/chat/QuillAnimation.tsx:1, app/globals.css:1).
Power users get richer diagnostics: the debug panel renders provider icons, status badges, and copy actions while the messages API streams full tool/context budgeting metadata for every LLM request (components/debug/DebugPanel.tsx:1, app/api/chats/[id]/messages/route.ts:492).
Documentation and tests were expanded to cover the new roadmap, cheap-LLM design, embedding/vector-store behavior, and memory processor interfaces, so contributors have specs plus Jest coverage to lean on (PLAN.md:1, features/CHEAP-LLM.md:1, __tests__/unit/lib/embedding/vector-store.test.ts:1, __tests__/unit/lib/memory/memory-processor.test.ts:1).

Bug Fixes

Anthropic requests now set either temperature or top_p (never both), preventing the API errors that previously surfaced when both knobs were passed together (lib/llm/anthropic.ts:126).
Chat creation/initialization respects the new participant model and default persona selection, eliminating failures when migrating older conversations or seeding greetings (lib/json-store/repositories/chats.repository.ts:1, lib/chat/initialize.ts:32).
Deleted image references no longer break galleries—missing files render a cleanup prompt so users can remove stale IDs without editing JSON manually (components/images/DeletedImagePlaceholder.tsx:1).
The custom NextAuth adapter now imports from the v4-compatible path, ensuring authentication continues to work after upstream package upgrades (lib/json-store/auth-adapter.ts:12).
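
The Anthropic fix in the list above amounts to sending only one sampling parameter per request. A minimal sketch of that rule follows; the `SamplingOptions` shape and the choice to prefer temperature when both are set are assumptions, since the notes only say the two are never sent together.

```typescript
// Caller-side options (hypothetical shape).
interface SamplingOptions {
  temperature?: number;
  topP?: number;
}

// Build the sampling part of a request body with at most one of
// `temperature` or `top_p`. Preferring temperature is an assumed tie-break.
function anthropicSampling(options: SamplingOptions): { temperature?: number; top_p?: number } {
  if (options.temperature !== undefined) {
    return { temperature: options.temperature };
  }
  if (options.topP !== undefined) {
    return { top_p: options.topP };
  }
  return {};
}
```

The result can be spread into the rest of the request body, so a profile that has both knobs configured still produces a valid request.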