Clawdbot stores its local data on the machine that runs the Gateway, and it treats that machine’s state and workspace as the durable “source of truth.” Concretely, there are two buckets to care about: (1) Gateway state (configuration, auth profiles, channel sessions) and (2) the agent workspace (your assistant’s instructions and memory files). The docs describe the workspace model clearly: Clawd (the assistant persona running through Clawdbot) reads operating instructions and “memory” from a workspace directory, and the default workspace is ~/clawd. On first setup or first agent run, Clawdbot creates this directory and seeds it with starter files like AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, and USER.md. That gives you the simplest mental model: to back up the assistant’s long-term behavior and memory, back up the workspace folder; to migrate to a new machine, move the Gateway state and the workspace together.
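The “back up the workspace folder” advice can be turned into a small script. This is an illustrative sketch, not part of Clawdbot: the default ~/clawd path and seed file names come from the docs above, while the backup destination and archive naming are hypothetical choices.

```python
import tarfile
import time
from pathlib import Path

# Default workspace location per the docs; adjust if you set a different
# path via configuration (agent.workspace).
WORKSPACE = Path.home() / "clawd"

# Starter files Clawdbot seeds on first setup or first agent run.
SEED_FILES = ["AGENTS.md", "SOUL.md", "TOOLS.md", "IDENTITY.md", "USER.md"]

def backup_workspace(workspace: Path, dest_dir: Path) -> Path:
    """Archive the entire workspace folder into a timestamped tarball."""
    missing = [f for f in SEED_FILES if not (workspace / f).exists()]
    if missing:
        # Not fatal: you may have disabled bootstrap file creation.
        print(f"note: expected seed files not found: {missing}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"clawd-workspace-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Keep the folder name inside the archive so a restore recreates it.
        tar.add(workspace, arcname=workspace.name)
    return archive
```

Restoring on a new machine is then just extracting the tarball into your home directory (plus moving the Gateway state alongside it).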
You can change where this data lives, but do it intentionally. The docs note that you can choose a different workspace path via configuration (for example, agent.workspace), and you can disable bootstrap file creation entirely if you already ship your own workspace files from a repo and don’t want Clawdbot to generate defaults. They also explicitly recommend treating the workspace folder like the assistant’s memory and making it a git repo (ideally private) so the instruction files and memory entries are backed up and diffable. This is a developer-friendly design choice: instead of hiding memory in a proprietary database, Clawdbot leans on plain files you can open, review, and version.

On the Gateway side, configuration and channel auth are also local: when you onboard, the wizard configures your Gateway mode and channels, and the CLI has health/status tooling to probe whether auth is configured and whether the Gateway can respond. In Docker deployments, you typically bind-mount both the state directory and the workspace directory so the container can be replaced without losing data; the official Docker guide calls Docker optional, but it’s a common choice when you want a “throwaway gateway environment, persistent volumes” setup.
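The git-repo recommendation is easy to automate with a helper that snapshots the workspace whenever memory files change. This is a minimal sketch, not a Clawdbot feature: it assumes git is installed, and the commit identity and message are hypothetical placeholders you would replace with your own.

```python
import subprocess
from pathlib import Path

def snapshot_workspace(workspace: Path, message: str = "memory snapshot") -> None:
    """Initialize the workspace as a git repo (idempotent) and commit changes."""
    def git(*args: str) -> subprocess.CompletedProcess:
        return subprocess.run(
            ["git", "-C", str(workspace), *args],
            check=True, capture_output=True, text=True,
        )

    if not (workspace / ".git").exists():
        git("init")
        # Local identity so commits work even without a global git config.
        git("config", "user.email", "clawd@localhost")
        git("config", "user.name", "Clawd Workspace")
    git("add", "-A")
    # Commit only when something actually changed.
    if git("status", "--porcelain").stdout.strip():
        git("commit", "-m", message)
```

Run it from a cron job or after each session, and push to a private remote if you want off-machine backup; git then gives you the diffable history of instruction and memory edits the docs describe.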
When your use case goes beyond “a few files of memory,” you’ll usually split storage by responsibility: keep the workspace as the human-readable source of truth and add an external index for retrieval. That’s where vector databases become useful without being a requirement for every deployment. For example, you might keep daily memory logs and long-lived instruction files in ~/clawd, but also embed select entries (or summaries) and store the vectors in a vector database such as Milvus or Zilliz Cloud. Then, when you ask a question like “what did I decide about my VPS security last month?”, the assistant can retrieve the most relevant chunks semantically rather than scanning every file linearly. This pattern preserves Clawdbot’s “local-first, inspectable files” approach while giving you scalable recall. It also keeps deletion and privacy straightforward: delete or redact from the workspace, then delete the matching vectors by metadata key in Milvus/Zilliz Cloud. The important point is that Clawdbot’s default storage is local filesystem state plus the workspace; anything beyond that is a deliberate integration you opt into.
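The retrieve-by-similarity, delete-by-key pattern can be sketched with a toy in-memory index standing in for Milvus/Zilliz Cloud. Everything here is illustrative: the bag-of-words “embedding” replaces a real embedding model, and MemoryIndex mimics only the upsert/search/delete shape of a vector collection, not any actual Milvus API.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real setup would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryIndex:
    """Stand-in for a vector collection: vectors keyed by a metadata key."""
    def __init__(self) -> None:
        self.entries: dict[str, tuple[Counter, str]] = {}

    def upsert(self, key: str, text: str) -> None:
        # Key the vector by where it came from, e.g. a memory-log date.
        self.entries[key] = (embed(text), text)

    def search(self, query: str, limit: int = 3) -> list[tuple[str, str]]:
        q = embed(query)
        ranked = sorted(self.entries.items(),
                        key=lambda kv: cosine(q, kv[1][0]), reverse=True)
        return [(key, text) for key, (_, text) in ranked[:limit]]

    def delete(self, key: str) -> None:
        # Mirrors "delete matching vectors by metadata key" after redacting
        # the corresponding entry from the workspace files.
        self.entries.pop(key, None)
```

In the real integration, you would index each memory log under a key that maps back to its file in ~/clawd; honoring a redaction is then two deletes, one in the workspace and one in the vector store, using the same key.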
