# Data Migration Workflows

PocketBase does not ship with a one-click import/export pipeline, but the core maintainers outline several supported patterns in GitHub discussion #6287. This guide explains how to choose the right workflow, documents the hardened helper scripts in `scripts/`, and points to extension patterns you can adapt for larger migrations.


## Decision Guide

| Scenario | Recommended Path | Notes |
| --- | --- | --- |
| Small/medium data sets (< 100k records) where you just need JSON dumps | Web API scripts | Works everywhere; slower but simplest to automate |
| You want transactions, schema automation, or better performance | Custom CLI commands | Implement in JS `pb_hooks` or native Go extensions |
| You must transform data from another live database | Mini Go program bridging databases | Connect to PocketBase `pb_data` alongside the legacy DB |
| You already have CSV or SQLite dumps | External tooling | `sqlite3 .import`, community tools like `pocketbase-import` |
| You need full control and understand PB internals | Raw SQLite scripts | Only if you know how PB stores complex field types |

Tip: If you are migrating an application that already works and you do not plan to extend it, consider whether the migration effort is worth it; the PocketBase author recommends staying on the stable stack unless you need PB-specific capabilities.


## Pre-flight Checklist

  1. Back up `pb_data/` first. Use `sqlite3` or the Backups API before experimenting (see the snapshot sketch after this list).
  2. Create collections and fields up-front. Use the Admin UI, migrations (`./pocketbase migrate collections`), or extension code so relations, file fields, and validation rules exist before import.
  3. Map unique keys per collection. Decide which field(s) you will use for upserts (e.g., `email` on `users`).
  4. Audit data types. PocketBase stores multi-selects and relation sets as JSON arrays, and file fields expect PocketBase-managed file IDs.
  5. Plan authentication. Admin endpoints require a superuser token; the helper scripts prompt for credentials when they are not supplied.
  6. Run a dry run. Use the scripts' `--dry-run` flag or a custom command to validate payloads before writing.
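
A minimal snapshot sketch for step 1, using Python's built-in `sqlite3` online backup API. The `pb_data/data.db` path assumes a default PocketBase layout; file uploads live separately under `pb_data/storage/` and should be copied as well:

```python
# backup_pb.py - snapshot pb_data/data.db with SQLite's online backup API,
# which is designed to work against a live database.
import sqlite3
from datetime import datetime

SRC = "pb_data/data.db"
DST = f"data_backup_{datetime.now():%Y%m%d_%H%M%S}.db"

src = sqlite3.connect(SRC)
dst = sqlite3.connect(DST)
with dst:
    src.backup(dst)  # copies all pages, retrying if writes occur mid-backup
src.close()
dst.close()
print(f"wrote {DST}")
```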

## Option 1: Web API Scripts

Use the hardened Python helpers in `scripts/` when you need a portable solution without custom builds.

### Export

```bash
python scripts/export_data.py \
  http://127.0.0.1:8090 \
  pb_export \
  --email admin@example.com \
  --batch-size 500 \
  --format ndjson \
  --exclude _pb_users,_migrations
```
- Authenticates as an admin (password prompt if omitted).
- Enumerates collections dynamically; filter with `--collections` or `--exclude`.
- Streams records page-by-page and writes per-collection `.json` or `.ndjson` files plus a `manifest.json` summary (see the pagination sketch below).
- Use NDJSON for large exports where you want to stream line-by-line elsewhere.
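
For reference, page-by-page streaming boils down to PocketBase's standard list endpoint, which returns `page`, `totalPages`, and `items`. A minimal sketch; the `BASE`/`TOKEN` values and the `export_collection` helper are illustrative, not the script's actual internals:

```python
# Paginated export against PocketBase's records list endpoint.
import requests

BASE = "http://127.0.0.1:8090"
TOKEN = "..."  # superuser token obtained during authentication

def export_collection(name: str, per_page: int = 500):
    page = 1
    while True:
        resp = requests.get(
            f"{BASE}/api/collections/{name}/records",
            params={"page": page, "perPage": per_page},
            headers={"Authorization": TOKEN},
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        yield from data["items"]
        if page >= data["totalPages"]:
            return
        page += 1
```

Records are yielded as plain dicts, so the caller can append them to a `.json` array or emit one NDJSON line per record as it goes.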

### Import

```bash
python scripts/import_data.py \
  http://127.0.0.1:8090 \
  pb_export \
  --email admin@example.com \
  --upsert users=email --upsert orders=orderNumber \
  --concurrency 4 \
  --batch-size 200 \
  --dry-run
```
- Supports `.json` and `.ndjson` dumps.
- Cleans system fields (`id`, `created`, `updated`, `@expand`).
- Optional per-collection upserts via `--upsert collection=field` (use `*=field` as a fallback); see the lookup-then-write sketch below.
- Batches and runs limited concurrency to reduce HTTP latency, with optional throttling between batches.
- `--dry-run` validates payloads without writing to the database. When satisfied, re-run without the flag.
- Fails fast if a collection is missing unless `--skip-missing` is set.
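
The upsert pattern reduces to a lookup on the unique field followed by a `PATCH` (match found) or `POST` (no match). A minimal sketch using PocketBase's documented records endpoints; the names and the naive filter quoting are illustrative:

```python
# Upsert one row: PATCH when a record with the same key exists, POST otherwise.
import requests

BASE = "http://127.0.0.1:8090"
TOKEN = "..."  # superuser token

def upsert(collection: str, key_field: str, row: dict) -> dict:
    headers = {"Authorization": TOKEN}
    url = f"{BASE}/api/collections/{collection}/records"
    # Naive quoting for brevity; a real script must escape the value.
    params = {"filter": f"{key_field}='{row[key_field]}'", "perPage": 1}
    lookup = requests.get(url, params=params, headers=headers, timeout=30)
    lookup.raise_for_status()
    items = lookup.json()["items"]
    if items:
        resp = requests.patch(f"{url}/{items[0]['id']}", json=row,
                              headers=headers, timeout=30)
    else:
        resp = requests.post(url, json=row, headers=headers, timeout=30)
    resp.raise_for_status()
    return resp.json()
```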

This approach is intentionally simple and aligns with the "v1" recommendation from the PocketBase maintainer. Expect higher runtimes for large datasets but minimal setup.


## Option 2: Custom CLI Commands

Register commands inside `pb_hooks/` or a Go extension to bypass the REST layer and operate inside a database transaction.

### JS pb_hooks example

```javascript
/// <reference path="../pb_data/types.d.ts" />

// Command is a built-in constructor in PocketBase's JSVM (a cobra wrapper);
// no require() is needed.
$app.rootCmd.addCommand(new Command({
  use: "data:import <file> <collection>",
  run: (cmd, args) => {
    // Read the JSON dump from disk; toString() converts the raw bytes.
    const rows = JSON.parse(toString($os.readFile(args[0])));
    const collection = $app.findCollectionByNameOrId(args[1]);

    // Wrap all writes in a single transaction.
    $app.runInTransaction((txApp) => {
      for (const row of rows) {
        const record = new Record(collection);
        record.load(row);
        txApp.save(record);
      }
    });
  },
}));

$app.rootCmd.addCommand(new Command({
  use: "data:export <collection> <file>",
  run: (cmd, args) => {
    // For very large collections, page through records instead of
    // loading everything into memory at once.
    const records = $app.findAllRecords(args[0]);
    $os.writeFile(args[1], JSON.stringify(records, null, 2), 0o644);
  },
}));
```
- Invoke with `./pocketbase data:import ./users.json users`.
- Wrap heavy operations in `runInTransaction` and consider `saveNoValidate` only after cleaning data.
- Extend with chunking, progress logs, or schema checks per your needs.

See also: `references/go/go_console_commands.md` for Go equivalents and CLI wiring tips.


## Option 3: Mini Go Bridge

For zero-downtime migrations or complex transformations, create a Go program that embeds PocketBase and connects to your legacy database driver (`database/sql`, `pgx`, etc.).

High-level steps:

  1. Import `github.com/pocketbase/pocketbase` as a module and boot the app in headless mode.
  2. Connect to the legacy database, stream rows, and normalize data types.
  3. Use `app.RunInTransaction` plus `app.FindCollectionByNameOrId` to create records directly.
  4. Batch writes to avoid exhausting memory; reuse prepared statements for speed.

Refer to `references/go/go_database.md` and `references/go/go_migrations.md` for transaction helpers and schema management patterns.


## Option 4: External Import Tools

- `sqlite3` CLI (`.import`, `.dump`, `.excel`): usable when the source data already matches the PocketBase schema. Ensure collections/fields exist first.
- Community tool `michal-kapala/pocketbase-import`: handles CSV and flat JSON, creates text fields dynamically, and wraps operations in a transaction.
- Custom CSV pipelines: parse CSV with your preferred language, then leverage the REST scripts or CLI commands above (see the sketch below).
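
A minimal CSV-to-JSON sketch for the last option, producing a per-collection dump the import script can consume. The file names and the `tags` multi-select column are hypothetical:

```python
# Convert a legacy CSV dump into a JSON array of records.
# Column headers must already match the PocketBase field names.
import csv
import json

with open("legacy_users.csv", newline="", encoding="utf-8") as fh:
    rows = list(csv.DictReader(fh))

# Multi-value fields (multi-selects, relation lists) must be JSON arrays.
for row in rows:
    if "tags" in row:  # hypothetical "a;b;c" style column
        row["tags"] = [t for t in row["tags"].split(";") if t]

with open("pb_export/users.json", "w", encoding="utf-8") as fh:
    json.dump(rows, fh, indent=2)
```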

Always inspect the generated SQLite tables after import to confirm multi-value fields and relation columns are stored as expected.


## Option 5: Raw SQLite Scripts

This path edits `pb_data/data.db` directly. Only attempt it if you fully understand PocketBase's internal schema conventions:

  1. Snapshot the database before touching it.
  2. Insert `_collections` metadata before writing to collection tables so the Admin UI and APIs recognize the data.
  3. Convert non-SQLite dumps (PostgreSQL/MySQL) to SQLite-compatible syntax.
  4. Manually serialize multiselects, relation lists, and JSON fields.

Treat this as a last resort when other methods are impractical.


## Validation & Rollback

  1. Compare counts between source and target collections via SQL or the list endpoint's `totalItems` (see the sketch after this list).
  2. Spot-check a few complex records (relations, files, arrays).
  3. Run application-level smoke tests or automation scripts.
  4. If issues appear, restore the pre-flight backup and iterate.
  5. Document the exact command set you used for future recoveries.
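
A count-comparison sketch for step 1, reading `totalItems` from the list endpoint (`perPage=1` keeps the payload tiny; the expected counts are placeholders):

```python
# Compare record counts between the source system and PocketBase.
import requests

BASE = "http://127.0.0.1:8090"
TOKEN = "..."  # superuser token

def count_records(collection: str) -> int:
    resp = requests.get(
        f"{BASE}/api/collections/{collection}/records",
        params={"perPage": 1},
        headers={"Authorization": TOKEN},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["totalItems"]

expected = {"users": 1200, "orders": 5400}  # counts taken from the source DB
for name, want in expected.items():
    got = count_records(name)
    print(f"{name}: source={want} target={got} {'OK' if got == want else 'MISMATCH'}")
```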


## Summary Checklist

- Pick a workflow that matches the data volume and complexity.
- Prepare schema and unique constraints before importing.
- Run exports with authentication and pagination.
- Test imports with `--dry-run`, then run again without it.
- Validate data counts and integrity, and keep a rollback plan handy.