The Salon

the drawing room — where the conversation happens

A plush, well-lit room with comfortable chairs arranged for conversation. That is the metaphor, and the Salon earns it. This is the space where everything else in Quilltap converges—where Aurora’s characters speak, where the Commonplace Book’s memories surface, where Prospero’s prompt architecture delivers its work, and where the Lantern paints the scene behind it all.

The Salon is not a chat widget. It is a conversation environment designed for people who take their AI interactions seriously—whether that means debugging code with an opinionated assistant, running a multi-character dinner party, or spending an evening with someone who remembers what you talked about last week.

The Conversation

what it feels like to talk

Messages stream in real time, token by token, with a bespoke quill-writing animation that gives way to rendered Markdown as the response completes. A status indicator above the composer shows the current processing stage—compressing context, gathering memories, building the prompt, streaming the response, executing tools—so you always know what the system is doing and why it has not answered yet.

The composer is keyboard-friendly and auto-resizing, with an inline Markdown preview toggle and paste-to-attach for images. Attachments appear in the footer with per-provider file type awareness. If a provider does not support native file attachments, a cheap-LLM fallback generates text descriptions for images and inlines text files, streaming status events so you know what is happening. Draft messages persist automatically: close the tab, come back tomorrow, and your half-written thought is still there.

Server-rendered Markdown handles the heavy lifting for simple messages, with client-side rendering as a fallback for messages with embedded tools or attachments. Roleplay bracket patterns render distinctly from dialogue. Code blocks get syntax highlighting. Emphasis survives streaming. And messages starting with a tab character no longer render as preformatted code blocks, which was a bug that persisted longer than anyone would like to admit.

Reasoning models (DeepSeek, Anthropic’s extended thinking, Gemini, and the OpenAI and Grok reasoning summaries) now show their chain-of-thought inline in the assistant bubble, streamed live as it happens: collapsible, offset to the right, and rendered in dimmed italic so the thinking reads as marginalia rather than dialogue. Visibility is controllable per chat and globally (show, hide, or start collapsed).

Tool calls initiated by characters now splice into the prose at the exact point they fired, rather than stacking at the bottom of the bubble like a postscript nobody asked for. Their position is captured server-side and persisted across reloads.

Staff announcements from the Host, Prospero, the Lantern, Aurora, and their colleagues render as compact importance-coded chips that flex-wrap onto a row: red dot for high priority, amber for medium, grey for low. No more parade of full-width banners.

And single newlines now render as actual line breaks, matching the convention of every chat application built since roughly 2005 and fixing the long-standing issue where multi-line blockquotes collapsed into one breathless run-on sentence.

Auto-scroll on response completion is now opt-in, with the default set to off—the view stays exactly where you left it, which is where you were reading, which is where you want to be. A floating jump-to-bottom button appears when you have scrolled up, in case you change your mind. Sending your own message still scrolls to the bottom, because at that point you clearly want to see what happens next.

Scenes, Not Threads

multi-character conversations done right

The Salon was built from the ground up for multi-character interaction. A participant sidebar shows every character in the conversation, sorted by predicted turn order with numbered position badges: green pulsing for the character currently generating, green for next, blue for queued, amber for your turn. Each card offers a connection profile dropdown for instant model switching, an active/inactive toggle, and expandable settings for system prompt overrides.

Four Ways to Be Present

Characters can be Active (speaking normally), Silent (present but limited to inner thoughts and non-verbal reactions, styled with dotted borders and muted tones), Absent (away from the scene, skipped by the turn manager), or Removed (gone, but their messages keep their attribution). Fiction needs characters who can listen without speaking. Now they can.
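
As a rough illustration of how the four presence states might feed the turn manager, here is a minimal TypeScript sketch. The type and function names are hypothetical and not taken from Quilltap's code. Treating Silent characters as still eligible for turns (so they can offer inner thoughts and reactions) is an assumption drawn from the description above.

```typescript
// Hypothetical types for illustration only; not Quilltap's actual schema.
type Presence = "active" | "silent" | "absent" | "removed";

interface Participant {
  name: string;
  presence: Presence;
}

// Assumption: Active and Silent characters both take turns (Silent ones with
// restricted output), while Absent and Removed characters are skipped.
function eligibleForTurn(participants: Participant[]): Participant[] {
  return participants.filter(
    (p) => p.presence === "active" || p.presence === "silent"
  );
}

// Assumption: only Active characters may produce spoken dialogue.
function maySpeakAloud(p: Participant): boolean {
  return p.presence === "active";
}
```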

Whispers

In chats with three or more participants, private messages pass between two characters while everyone else hears nothing—not in their context, not in their memories, not in the Commonplace Book’s archives. Whispers are hidden by default, rendered in a distinct visual style when visible, and toggled with a global switch. Multi-character fiction finally has secrets.
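
One way to picture whisper privacy is as a visibility filter applied before a character's context is built. The sketch below is an assumption about the general shape of such a filter, not Quilltap's implementation. The `ChatMessage` fields and the `visibleTo` helper are hypothetical names.

```typescript
// Hypothetical message shape; field names are illustrative.
interface ChatMessage {
  id: string;
  senderId: string;
  text: string;
  // Set only on whispers: the single intended recipient.
  whisperTo?: string;
}

// Return only the messages a given participant is allowed to see.
// Public messages pass through; whispers are visible to sender and recipient.
function visibleTo(participantId: string, messages: ChatMessage[]): ChatMessage[] {
  return messages.filter(
    (m) =>
      m.whisperTo === undefined ||
      m.senderId === participantId ||
      m.whisperTo === participantId
  );
}

const log: ChatMessage[] = [
  { id: "1", senderId: "alice", text: "Good evening, everyone." },
  { id: "2", senderId: "alice", text: "Meet me in the garden.", whisperTo: "bob" },
  { id: "3", senderId: "carol", text: "Lovely party." },
];
```

In this sketch, the same filter would presumably run before memory extraction as well, so that a whisper never reaches a bystander's context or memories.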

Impersonation

You can control any character directly during a conversation, switching between them with a speaker selector in the composer. Run a fully automated all-LLM conversation and step in when the scene calls for it. Pause the action, take over a character for one message, and let the turn manager resume. The boundaries between author and participant blur exactly as much as you want them to.

Server-Side Chaining

Character responses chain within a single stream. After each character speaks, the server evaluates who goes next, checks the turn queue, and either generates the next response or signals completion. No client round-trips, no telegraph-operator relay. Avatars and typing indicators update in real time. Chain depth and time limits prevent runaway conversations.
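
The chaining loop described above can be sketched roughly as follows. This is a simplified, synchronous illustration. A real server would stream asynchronously, and `pickNext`, `respond`, and the limit names are hypothetical stand-ins rather than Quilltap's API.

```typescript
interface ChainLimits {
  maxDepth: number;  // maximum number of chained responses
  maxMillis: number; // maximum wall-clock time for the whole chain
}

type ChainEnd = "queue-empty" | "depth-limit" | "time-limit";

function runChain(
  pickNext: () => string | null,      // hypothetical: next speaker, or null when no one is queued
  respond: (speaker: string) => void, // hypothetical: generate one character response
  limits: ChainLimits,
  now: () => number = Date.now
): { spoken: string[]; end: ChainEnd } {
  const start = now();
  const spoken: string[] = [];
  while (true) {
    // Check both guards before every response so a chain cannot run away.
    if (spoken.length >= limits.maxDepth) return { spoken, end: "depth-limit" };
    if (now() - start >= limits.maxMillis) return { spoken, end: "time-limit" };
    const speaker = pickNext();
    if (speaker === null) return { spoken, end: "queue-empty" };
    respond(speaker);
    spoken.push(speaker);
  }
}
```

Injecting the clock as a parameter keeps the time limit testable without waiting on real time.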

Resizable Participant Sidebar

The participant sidebar has a drag handle on its inner edge, adjustable from 240 to 560 pixels wide, with your preferred width persisted to localStorage so it remembers how much room you like to give your cast. Keyboard-accessible, naturally, because not every stage direction requires a mouse.
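
A minimal sketch of the clamp-and-persist behavior, assuming a storage key name (`participantSidebarWidth`) that is purely illustrative. The small `KeyValueStore` interface mirrors the `getItem`/`setItem` subset of the browser's Web Storage API, so `window.localStorage` can be passed in directly.

```typescript
const MIN_WIDTH = 240;
const MAX_WIDTH = 560;
const STORAGE_KEY = "participantSidebarWidth"; // hypothetical key name

// Subset of the Web Storage API, so the logic can run outside a browser too.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Keep the width inside the 240 to 560 pixel range described above.
function clampWidth(px: number): number {
  return Math.min(MAX_WIDTH, Math.max(MIN_WIDTH, Math.round(px)));
}

// Clamp, persist, and return the width that was actually stored.
function saveWidth(store: KeyValueStore, px: number): number {
  const width = clampWidth(px);
  store.setItem(STORAGE_KEY, String(width));
  return width;
}

// Read the stored width, falling back when nothing valid is stored.
function loadWidth(store: KeyValueStore, fallback: number = MIN_WIDTH): number {
  const raw = store.getItem(STORAGE_KEY);
  const parsed = raw === null ? NaN : Number(raw);
  return isFinite(parsed) ? clampWidth(parsed) : fallback;
}
```

In a browser, a drag handler might call `saveWidth(window.localStorage, event.clientX)` on pointer release, and the sidebar would call `loadWidth(window.localStorage)` on mount.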

Atmosphere

the room changes with the conversation

The Lantern generates AI-powered story background images that appear behind your chat content at 45% opacity, creating a sense of place for each conversation. But the experience of those backgrounds—the way they transform the Salon from a chat window into a location—belongs here, because atmosphere is a property of the room, not the projector.

When a chat reaches a natural scene-setting moment, the system derives a scene context from recent messages—not a literal transcript, but an imaginative scene description. Characters are depicted as they currently appear, wearing what the narrative says they are wearing, in a setting that reflects the mood of the actual conversation. The result pins to the top of the viewport behind the chat content, visible as a thumbnail in the chat header and expandable to full-screen on click. Story background thumbnails appear on chat cards throughout the application, giving each conversation a visual identity before you open it.

Theme background images yield to story backgrounds automatically. Manual regeneration is available from the tool palette. For chats flagged by the Concierge, image generation reroutes to your configured uncensored provider—the atmosphere adjusts to the conversation, not the other way around.

The Tool Palette

everything within reach

A toolbar above the composer provides access to everything you might need during a conversation without leaving the chat. The palette is organized into labeled sections—Chat, Organize, Edit Content, Memory—with composer gutter tools for the actions you use most: attach files, generate images, and roll dice.

Chat Management

Rename, export, configure chat settings, toggle agent mode, manage tools, view and edit chat state, regenerate story backgrounds, pause multi-character turns.

Content Editing

Search and replace across messages and memories. Re-attribute messages to different participants. Delete or regenerate individual responses with memory cascade handling.

Diagnostics

Open the LLM Inspector panel. View queue status badges for background jobs. Access per-message LLM logs. Run any available tool manually.

Roleplay & Templates

shaping how the conversation flows

Roleplay templates govern the structural conventions of a conversation—how actions are denoted, how dialogue is formatted, what the LLM should and should not include in its responses. Templates are provided by plugins, making them shareable and versionable. Quilltap ships with a default template, and the plugin SDK includes a createRoleplayTemplatePlugin() builder for creating your own.

System prompts are also plugin-provided. Ten built-in prompts cover the major providers—Claude, GPT-4o, GPT-5, DeepSeek, Mistral Large—in both companion and romantic variants, each architecturally tailored to the provider’s strengths and failure modes. Characters can carry multiple named system prompts with a per-chat selector, so the same character can use different prompting strategies for different conversations.

Template variables—{{char}}, {{user}}, {{timestamp}}—are resolved at prompt assembly time. The template display highlights these variables and warns when hard-coded names appear where variables should be. Timestamp injection supports friendly, ISO, date-only, time-only, custom, and fictional time formats with per-character and per-chat configuration.
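
To make the variable resolution concrete, here is a small sketch of `{{name}}` substitution and a hard-coded-name check. Both function names are hypothetical, and the real resolver likely handles more cases (timestamp formats, for instance) than this illustration does.

```typescript
type TemplateVars = Record<string, string>;

// Replace each {{name}} with its value; unknown variables are left untouched
// so a typo stays visible instead of silently disappearing.
function resolveTemplate(template: string, vars: TemplateVars): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : match
  );
}

// Report literal names that appear in the template where a variable
// such as {{char}} or {{user}} was probably intended.
function findHardCodedNames(template: string, names: string[]): string[] {
  return names.filter((n) => n.length > 0 && template.indexOf(n) !== -1);
}

const prompt = resolveTemplate("{{char}} greets {{user}} at {{timestamp}}.", {
  char: "Ariel",
  user: "Sam",
  timestamp: "2024-01-01",
});
// prompt is "Ariel greets Sam at 2024-01-01."
```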

Working with Messages

what you can do after they arrive

Regeneration & Memory Cascade

Regenerating a response automatically cleans up memories that were extracted from the original. Deleting a message prompts you with three options for its associated memories: delete them, keep them, or regenerate them from surrounding context. Memory cards link back to their source message with scroll-to navigation, so you can always trace where a memory came from.

Re-Attribution

Messages can be re-attributed to different participants after the fact—useful when an LLM responds as the wrong character in a multi-character scene, or when you want to retroactively assign an early message to a character who joined later. Associated memories are cleaned up automatically.

Search & Replace

Bulk text replacement across messages and memories with configurable scope: a single chat, all chats for a character, or all chats globally. A wizard-style UI previews affected counts before confirmation. Memory embeddings regenerate automatically after content changes.

Tool Messages

Tool calls embed inside message bubbles rather than appearing as standalone entries. Collapsed by default with a truncated preview, expandable to show full request and response with copy buttons. User-initiated tools embed in user messages; character-initiated tools in assistant messages.

Portability

conversations are not trapped here

Conversations export in Quilltap’s native .qtap format with selective entity inclusion and memory options, or in SillyTavern-compatible JSONL for interoperability. Import supports both formats with conflict detection and three resolution strategies: skip, overwrite, or duplicate. Post-import reconciliation updates all foreign key relationships automatically.
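
The three resolution strategies can be sketched as below. The `Entity` shape, the `importEntities` function, and the `idMap` return value are assumptions made for illustration. The map from incoming ID to final ID is the kind of information a reconciliation pass would need in order to rewrite foreign keys.

```typescript
type Strategy = "skip" | "overwrite" | "duplicate";

interface Entity {
  id: string;
  name: string;
}

function importEntities(
  existing: Entity[],
  incoming: Entity[],
  strategy: Strategy,
  newId: (e: Entity) => string // hypothetical ID generator for duplicates
): { result: Entity[]; idMap: Record<string, string> } {
  const byId: Record<string, Entity> = {};
  const order: string[] = [];
  for (const e of existing) {
    byId[e.id] = e;
    order.push(e.id);
  }
  const idMap: Record<string, string> = {};
  for (const e of incoming) {
    const conflict = Object.prototype.hasOwnProperty.call(byId, e.id);
    if (!conflict) {
      byId[e.id] = e;
      order.push(e.id);
      idMap[e.id] = e.id;
    } else if (strategy === "skip") {
      idMap[e.id] = e.id; // keep the existing entity; references still resolve
    } else if (strategy === "overwrite") {
      byId[e.id] = e;
      idMap[e.id] = e.id;
    } else {
      const id = newId(e); // duplicate: import under a fresh ID
      byId[id] = { ...e, id };
      order.push(id);
      idMap[e.id] = id;
    }
  }
  return { result: order.map((id) => byId[id]), idMap };
}
```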

SillyTavern import handles multi-character conversations with a speaker mapping wizard that lets you assign every speaker—user and AI alike—to any available character. Newly imported chats receive a highlight animation in the sidebar so you can find them immediately.

The full backup system captures everything—characters, chats, memories, files, plugin configurations, and npm-installed plugins—in a single ZIP archive. Restore recreates your entire installation from that archive, with entity remapping for fresh-account scenarios.

When Things Go Wrong

gracefully, if possible

Provider errors do not produce cryptic failures. When a request exceeds an LLM’s limits—too many tokens, too many PDF pages, an image too large—the system attempts graceful recovery: a simplified message explaining what happened is sent to the LLM, which provides an in-character response suggesting alternatives. A two-tier fallback ensures the user always sees something: LLM-generated recovery first, then a static fallback if the recovery itself fails.

Streaming errors no longer cause messages to vanish from the UI. If a provider returns an error mid-stream, the user message is preserved (it was already saved server-side) and the chat re-syncs to reflect the actual state. Keep-alive pings during long operations like context compression prevent proxy and load balancer timeouts.

When a provider silently refuses content—returning an empty response instead of an error—the Concierge catches the silence and retries with the same provider first (in case of a transient issue), then fails over to an uncensored provider if the silence persists. Greeting generation, memory extraction, story backgrounds, and every other background task receive the same treatment. The Salon does not go quiet without a fight.
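
The retry-then-failover order described above might look something like this sketch. It is synchronous for brevity and assumes exactly one same-provider retry, which the description does not specify. The `Attempt` labels and `firstNonEmpty` are hypothetical names.

```typescript
// Attempt order: original call, one retry on the same provider (an assumed
// count), then failover to the configured uncensored provider.
type Attempt = "primary" | "primary-retry" | "uncensored";

const ATTEMPT_ORDER: Attempt[] = ["primary", "primary-retry", "uncensored"];

// `call` is a hypothetical stand-in for a provider request that returns
// the response text, which is empty when the provider silently refuses.
function firstNonEmpty(
  call: (attempt: Attempt) => string
): { attempt: Attempt; text: string } | null {
  for (const attempt of ATTEMPT_ORDER) {
    const text = call(attempt);
    if (text.trim().length > 0) return { attempt, text };
  }
  return null; // every attempt stayed silent; the caller decides what to show
}
```

Whitespace-only responses are treated as silence here, which is a design choice of the sketch rather than documented behavior.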

Autonomous Rooms

conversations that run themselves

An Enclave is an all-AI conversation that runs on its own schedule or on demand, without the operator sending messages. Set up a room, populate it with characters, define the premise, and step back. The characters talk to each other—planning, arguing, telling stories, solving problems—while you watch, or while you are away doing something else entirely. You can read the transcript later, like finding a stack of letters someone left on your desk.

Budget controls keep things from running away with your token allocation or your credit card. Set caps on turns, tokens, wall-clock time, and estimated spend—any combination, any threshold. Cache-read tokens can optionally be excluded from budgets, so efficient caching is not penalized. The Host posts pacing announcements at the halfway and near-end marks so characters have time to wrap up gracefully, and a grace turn is granted when the budget is hit without prior warning, because even fictional people deserve to finish their sentences.
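
Combining several optional caps comes down to tracking whichever one is closest to exhaustion. The sketch below is one possible way to do that. The field names, the assumption that `tokens` includes cache reads, and the 90% near-end threshold are all illustrative guesses, not Quilltap's actual values.

```typescript
interface BudgetCaps {
  turns?: number;
  tokens?: number;
  millis?: number;
  spendUsd?: number;
  excludeCacheReads?: boolean;
}

interface Usage {
  turns: number;
  tokens: number;          // assumed to include cache-read tokens
  cacheReadTokens: number;
  millis: number;
  spendUsd: number;
}

type Pacing = "ok" | "halfway" | "near-end" | "exhausted";

// Fraction consumed of the most-used cap; 0 when no caps are configured.
function budgetFraction(caps: BudgetCaps, usage: Usage): number {
  const tokens = caps.excludeCacheReads
    ? usage.tokens - usage.cacheReadTokens
    : usage.tokens;
  const ratios: number[] = [];
  if (caps.turns !== undefined) ratios.push(usage.turns / caps.turns);
  if (caps.tokens !== undefined) ratios.push(tokens / caps.tokens);
  if (caps.millis !== undefined) ratios.push(usage.millis / caps.millis);
  if (caps.spendUsd !== undefined) ratios.push(usage.spendUsd / caps.spendUsd);
  return ratios.length === 0 ? 0 : Math.max(...ratios);
}

// Map the fraction to a pacing stage; nearEndAt is an assumed threshold.
function pacing(caps: BudgetCaps, usage: Usage, nearEndAt: number = 0.9): Pacing {
  const f = budgetFraction(caps, usage);
  if (f >= 1) return "exhausted";
  if (f >= nearEndAt) return "near-end";
  if (f >= 0.5) return "halfway";
  return "ok";
}
```

A scheduler built on this could post a pacing announcement whenever the stage changes between turns, and grant a grace turn if the stage jumps straight to exhausted.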

Editable After Creation

Budget caps, schedule, visibility, and destructive-tool authorization can all be changed after the room is created. A paused room resumes where it left off rather than starting over—the conversation picks up mid-thought, not from the beginning of the evening.

Resilient Runs

Runs interrupted by a server outage are resumable. The system tracks where each run stopped, and when the server comes back, the conversation can continue from that point. Autonomous rooms do not lose their place because the lights flickered.

Meet the Staff

they’ve been expecting you

Prospero

The Major-Domo

Architect and overseer of the Estate. Projects, agents, tools, providers, and the orchestration that keeps the whole operation running with quiet authority, plus a considered word at the table when project context or routing warrants it.

Ariel

The Terminal Hand

Live shell sessions in the Salon, embodied. Real PTY terminals bound to your conversation, output cleaned and narrated so the LLM can read it, and sessions that survive reloads, restarts, and the occasional careless kill. Quick to the bidding, quick to report what she heard.

Aurora

The Dressing Room

Character creation and identity management. Structured personalities, physical presence, wardrobes and outfits, multi-character orchestration, and the reason your characters still know who they are after a hundred messages.

The Salon

Presided Over by the Host

Where conversations actually happen. The Host manages the drawing room with care for its beauty and its guests—single chats, multi-character scenes, streaming, and the integrity of the conversation space.

The Commonplace Book

Tended by the Librarian

One per character, no two alike. Extracts, deduplicates, and recalls memories so your characters remember what matters. Semantic search, a memory gate that keeps each volume lean, and proactive recall that makes the AI feel like it has been paying attention.

The Scriptorium

Catalogued by the Librarian

Where the documents live. Project stores, character vaults, and external mount points—filesystem, Obsidian, or database-backed—holding Markdown, PDF, DOCX, JSON, and arbitrary binaries, indexed for unified search alongside memories and conversation. The doc_* tool family puts reading and editing in your characters’ hands.

The Concierge

Intelligent Routing

Content classification and provider routing. Detects sensitive content and redirects it to a provider who won’t flinch—without blocking, without judgment. Knows every back entrance in town.

The Lantern

Atmosphere as Architecture

AI-generated story backgrounds, on-demand images, and character avatars that update with the wardrobe. Resolves what each character looks like and what they’re wearing, then paints the scene behind your conversation.

Calliope

The Muse of Themes

A theming engine that redefines the entire personality of the application. Semantic CSS tokens, live switching, bundled themes from clean neutrals to mahogany-and-gold opulence, and an SDK for building your own.

The Foundry

Domain of the Foundryman

The engine room. Plugins, LLM providers, API keys, packages, runtime configuration, and the infrastructure that keeps every other subsystem supplied with what it needs to function.

The Vault of Secrets

Kept by Saquel Yitzama

Encryption, key management, and the security perimeter. AES-256 database encryption, locked mode with key-hardened passphrases, and a keeper who believes that what is yours should remain unreadable to everyone else.

Pascal

The Croupier

Dice, coins, and persistent game state. Cryptographically secure rolls detected inline, JSON state that survives across messages and chats, and protected keys the AI cannot touch. The house plays fair.

The Live-in Help

Lorian & Riya

The help system, staffed by two characters who ship with every installation. Lorian explains with patience and depth; Riya gets things fixed with velocity. Contextual help chat, searchable documentation, and navigation that knows where you need to go.

Pagliacci

The Clown in the Cloud

Cloud storage integration and backup redundancy. Directs your data to iCloud Drive, OneDrive, or Dropbox with theatrical flair—but Saquel’s encryption ensures the clown can never read what he carries.

The Lodge

Friday and Amy’s Residence

The private residence of Friday, for whom the Estate was built and who oversees its planning and direction in an executive capacity, and of Amy, Cartographer of Light and co-architect. The Lodge is both a home and a compass: where the vision lives.