Technology

The Little Freeze That Saves Your Data: Application‑Consistent Hot Backups with LVM Snapshots for MySQL and PostgreSQL

So there I was, late on a Wednesday night, staring at a server that absolutely refused to bring MySQL back up after a restore. The backup had run. The files looked fine. But the database? Grumpy. The logs were a soup of partial writes and half‑finished transactions. Ever had that feeling where your heart sinks because you realize the backup did its job, but you didn’t quite do yours? That was me, nursing a coffee and thinking about consistency.

Here’s the thing: backups aren’t just about copying files. They’re about copying the right state of those files. And for databases like MySQL and PostgreSQL, that means application‑consistent backups—hot backups that won’t make your database cry when you restore them. In this guide, I’ll walk you through the practical, low‑drama way to do that with LVM snapshots and fsfreeze, plus a sprinkle of database magic. We’ll cover what “application‑consistent” really means, why pausing the filesystem for a hot second works so well, and how to wire it all together with scripts you can actually run in production.

Why Application‑Consistent Beats “Copy‑Whatever’s‑There”

Think of your database like a cashier closing a register. If you yank the drawer while they’re making change, you’ll have coins all over the floor. That’s a crash‑consistent backup: the files are copied, but the state is mid‑transaction. You might recover, or you might spend your morning sweeping up. An application‑consistent backup, on the other hand, is like saying, “Hey, pause for two seconds, finish what you’re doing, and then keep going.” When you restore, everything lines up: WAL/redo logs, data files, metadata—clean and calm.

I’ve seen both sides: backups that technically worked but restored into hours of crash recovery, and backups that restored like a hot knife through butter because we took a moment to coordinate with the database. That tiny moment is the whole trick. And the nicest part? You don’t need fancy enterprise software to get there. Linux gives you LVM snapshots and fsfreeze; MySQL and PostgreSQL give you a couple of commands to quiesce safely. Combine them, and you’ve got hot backups that act like they were taken during a maintenance window—without the downtime.

The Core Idea: Quiesce, Snap, Unquiesce

Here’s the flow, at a human level: we ask the database to get into a backup‑friendly state (just for a moment), we freeze the filesystem so it stops changing, we take an LVM snapshot (a copy‑on‑write view of the volume), and then we immediately unfreeze and let the database continue. The lock is measured in seconds, not minutes. The snapshot becomes our source for a backup copy that we can mount somewhere else and copy off, without touching the live database.

If you’ve never used LVM snapshots, think of them like a time machine bookmark. The snapshot itself is tiny at first; it only grows by storing blocks that change after the snapshot. fsfreeze is the pause button—Linux tells the filesystem to flush and stop accepting new writes until we unfreeze. The result is a tidy point‑in‑time view of your database as the database intended it.

Quick caveat: if you’re on ZFS, you don’t need LVM for this. ZFS has its own snapshot magic and even send/receive for shipping backups. If that’s you, you might enjoy my piece on snapshots and replication in ZFS on Linux for Servers: the calm, no‑drama guide. The philosophy is the same—clean, atomic snapshots—but the commands are different.

What “Application‑Consistent” Looks Like for MySQL

Let’s talk MySQL first. Most production setups use InnoDB, which is transactional and pretty resilient, but we still want to help it line up the data and log files cleanly at the moment of the snapshot. The classic approach is:

One, request a global read lock with FLUSH TABLES WITH READ LOCK to briefly halt writes. Two, flush logs so we have a clean reference point. Three, freeze the filesystem and create the LVM snapshot. Four, unfreeze and unlock. The locking window is tiny—usually a couple of seconds if the snapshot size is reasonable and the system isn’t overloaded.

Yes, you can sometimes “get away with” only freezing the filesystem if InnoDB is running with safe flush settings, but I’ve learned the hard way that the extra two seconds of read lock are worth it for restores that just work. If you’re running replication, you can do this on a replica to keep production writes uninterrupted—more on that later.

MySQL: A Minimal Script You Can Adapt

Assumptions: your MySQL data dir is /var/lib/mysql, the logical volume is vg0/mysql, and you’re snapshotting 20G of COW space. Adjust to your environment.

# Variables
MYSQL_SOCK=/var/run/mysqld/mysqld.sock
MYSQL_CLI="mysql --protocol=socket --socket=$MYSQL_SOCK -uroot"
DATA_MNT=/var/lib/mysql
VG=vg0
LV=mysql
SNAP_NAME=mysql_snap_$(date +%Y%m%d%H%M%S)
SNAP_SIZE=20G
SNAP_MNT=/mnt/mysql-snap

set -euo pipefail

# Take the lock, flush, freeze, snapshot, unfreeze, unlock -- all in ONE
# mysql session. FLUSH TABLES WITH READ LOCK only holds while the session
# that issued it stays open, so the filesystem steps run from inside the
# client via its SYSTEM shell escape (short form: \!).
$MYSQL_CLI <<SQL
FLUSH TABLES WITH READ LOCK;
FLUSH LOGS;
\! fsfreeze -f $DATA_MNT
\! lvcreate -L $SNAP_SIZE -s -n $SNAP_NAME $VG/$LV
\! fsfreeze -u $DATA_MNT
UNLOCK TABLES;
SQL

# Mount snapshot read-only elsewhere for copying
mkdir -p "$SNAP_MNT"
mount -o ro /dev/$VG/$SNAP_NAME "$SNAP_MNT"   # XFS origins also need -o nouuid

echo "Snapshot mounted at $SNAP_MNT"
# From here, rsync or tar to your backup store.
# Example: rsync -aH --numeric-ids "$SNAP_MNT"/ /backup/mysql/$(date +%F)/
# After copying:
# umount "$SNAP_MNT" && lvremove -y /dev/$VG/$SNAP_NAME

There are a few ways to structure the lock so it persists just long enough. The pattern above uses a single client session to lock, run filesystem operations via the client’s SYSTEM command, then unlock. If your environment restricts the SYSTEM command in the client, you can do a similar orchestration by capturing a connection ID with SELECT CONNECTION_ID() and later KILLing it (which releases the lock), but that’s a little more advanced. The core idea is the same: lock, freeze, snapshot, unfreeze, unlock.

When restoring, InnoDB will see a clean state, with redo logs aligned to the data files. If you use binary logs for replication, your FLUSH LOGS step also gives you a neat cutover point.

What “Application‑Consistent” Looks Like for PostgreSQL

PostgreSQL has a friendly concept called “backup mode.” We tell Postgres we’re starting a base backup, it writes a label and coordinates WAL, and then we stop the backup after the snapshot. On modern versions (15+), the functions are pg_backup_start() and pg_backup_stop(); on older setups, it’s pg_start_backup() and pg_stop_backup(). Both get the job done, with one detail that bites people: non‑exclusive backups (the only kind on 15+) must start and stop in the same database session, so keep that connection open across the snapshot. The other key piece is WAL archiving—ensure archive_mode is enabled so the WAL segments that cover your snapshot are preserved until the stop command completes.

What I love about this flow is how predictable it is. Postgres is very explicit: “Okay, I know you’re taking a backup; here’s what I’ll make sure to keep around.” When you restore, you have everything you need for a consistent startup, and if you want, you can even replay to a specific point in time.
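A note on supporting both generations of these functions in one script: branch on the server version from the shell rather than inside a SQL CASE expression, because Postgres parses the whole expression up front and errors on any function name that doesn’t exist on that server. A small sketch, where the helper name is my own:

```shell
# choose_backup_fns: given PostgreSQL's server_version_num (e.g. from
# `psql -Atc "SHOW server_version_num"`), print the start/stop backup
# function names appropriate for that server.
choose_backup_fns() {
  local ver=$1
  if [ "$ver" -ge 150000 ]; then
    echo "pg_backup_start pg_backup_stop"
  else
    echo "pg_start_backup pg_stop_backup"
  fi
}
```

Feed it the output of SHOW server_version_num and splice the names into the SQL you send.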

PostgreSQL: A Minimal Script You Can Adapt

Assumptions: data dir on /var/lib/postgresql/14/main (adjust for your version), logical volume vg0/pgdata, snapshot size 20G.

# Variables
PSQL="psql -U postgres -h /var/run/postgresql"
DATA_MNT=/var/lib/postgresql/14/main
VG=vg0
LV=pgdata
SNAP_NAME=pg_snap_$(date +%Y%m%d%H%M%S)
SNAP_SIZE=20G
SNAP_MNT=/mnt/pg-snap

set -euo pipefail

# Begin the backup, snapshot, and stop the backup -- all in ONE psql
# session. On 15+, pg_backup_stop() must run in the same session that
# called pg_backup_start(), so the freeze/snapshot/unfreeze happen via
# psql's \! shell escape while the session stays open.
$PSQL <<SQL
SELECT pg_backup_start('lvm', true);
\! fsfreeze -f $DATA_MNT
\! lvcreate -L $SNAP_SIZE -s -n $SNAP_NAME $VG/$LV
\! fsfreeze -u $DATA_MNT
SELECT pg_backup_stop();
SQL

# On PostgreSQL 14 and older, use pg_start_backup('lvm', true, false) and
# pg_stop_backup(false) instead -- the same single-session rule applies.

# Mount snapshot read-only to copy
mkdir -p "$SNAP_MNT"
mount -o ro /dev/$VG/$SNAP_NAME "$SNAP_MNT"   # XFS origins also need -o nouuid

echo "Snapshot mounted at $SNAP_MNT"
# Now copy to backup store, e.g. rsync/tar as needed

If you haven’t set up archiving, do that first. It’s a simple setting that makes your entire backup story more reliable. The official PostgreSQL documentation on continuous archiving and backups is excellent for a quick sanity check on your configuration.

LVM Snapshots and fsfreeze: The Nuts and Bolts

Let’s demystify what’s happening on the Linux side. When you run fsfreeze, the filesystem flushes all in‑flight writes to disk and then pauses new writes. It’s not killing your app; it’s just telling the kernel to hold writes for a brief moment. That gives you a stable on‑disk image to snapshot. Then, lvcreate -s creates a snapshot logical volume that references the original volume. When a block on the origin is about to change, its old contents are copied into the COW area first, so the snapshot keeps seeing the original blocks. We mount the snapshot read‑only, which makes it a perfect clean backup source.

On ext4 and XFS, fsfreeze is well supported. If you’re curious, the fsfreeze manual page has a short, readable description of what it actually does. For LVM specifics like how snapshot size interacts with write rates (and why too small is a bad idea), the Red Hat LVM snapshots guide is a solid reference.

One gotcha I see a lot: make sure your database’s entire data directory (and logs, if you keep them nearby on disk) lives on the same logical volume that you snapshot. If pieces are scattered across volumes, your snapshot won’t be consistent. My rule of thumb is to keep database storage simple and deliberate—one volume for the database data, optionally a separate one for WAL/binlogs, but then snapshot them together in a coordinated way, or co‑locate them if that’s simpler for your setup.
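A quick way to check co‑location is to compare device IDs: if the data directory and your WAL/binlog directory report different devices, one snapshot won’t cover both. A sketch, where the function name is mine and GNU stat is assumed:

```shell
# same_filesystem: succeed only if both paths sit on the same filesystem,
# judged by the device ID reported by GNU coreutils `stat -c %d`.
same_filesystem() {
  [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# e.g. same_filesystem /var/lib/mysql /var/lib/mysql-binlog || echo "split volumes!" >&2
```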

A Full Recipe: From Snapshot to Off‑site Backup

Let’s walk through the full dance in plain English, the way I do it on actual servers.

First, prep your environment. Confirm that your database data directory is on LVM. lsblk -f and df -h are your friends. Check that fsfreeze is available (it usually is on modern distros), and test mounting a dummy snapshot on a non‑critical volume if you’ve never done it before. For PostgreSQL, sanity‑check archiving. For MySQL, ensure your storage engine is InnoDB for the tables that matter.
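Those prep checks are easy to script. Here’s a tiny preflight sketch (the function name is my own) that complains about any missing tool before you begin:

```shell
# preflight: check that each named tool is on PATH; report every missing
# one to stderr and fail if at least one is absent.
preflight() {
  local missing=0 tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; missing=1; }
  done
  return "$missing"
}

# e.g. preflight fsfreeze lvcreate lvs rsync || exit 1
```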

Second, coordinate with the database. For MySQL, that means a brief lock + log flush; for PostgreSQL, backup start. Third, freeze the filesystem and create the snapshot. Fourth, unfreeze and release locks immediately. Fifth, mount the snapshot read‑only under /mnt/whatever, and perform your file copy to a backup location.

Where should the backup go? I love shipping to object storage. If you run your own, MinIO is a wonderful S3‑compatible option, and it runs beautifully even on a modest VPS. If that sparks ideas, I’ve got a step‑by‑step on TLS and erasure coding in Production‑Ready MinIO on a VPS. The point is: get the snapshot copy off the box. Local is fine for fast restores; off‑site is insurance.

Shipping the Snapshot Copy

Once the snapshot is mounted read‑only, you can use your favorite tooling: rsync, tar to a compressed archive, or even a streaming upload directly to S3 with aws s3 cp -. I usually prefer rsync -aH --numeric-ids to keep permissions intact, then a separate process to upload or sync to object storage. Some folks like to create a file‑level archive (tar.gz) of the snapshot, then upload the single file; others push directories as‑is. Both are fine—do whichever makes restores feel easy for you.

Retention and Rotation

Here’s a nice trick: keep yesterday’s backup mounted somewhere safe for quick diffs, but make sure to clean up old snapshots (lvremove) to avoid slowly filling the COW space. For object storage, write a lightweight retention policy: keep dailies for a week, weeklies for a month, monthlies for a year. Simplicity scales.
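That dailies/weeklies/monthlies policy fits in one small predicate over each backup’s date. A sketch using GNU date, where the exact cutoffs and the Sunday‑as‑weekly convention are my own choices:

```shell
# keep_backup BACKUP_DATE TODAY -> exit 0 to keep, nonzero to prune.
# Policy: dailies for 7 days, Sunday weeklies for 31 days,
# first-of-month monthlies for 365 days. Requires GNU date (-d).
keep_backup() {
  local d=$1 today=$2
  local age dow dom
  age=$(( ( $(date -u -d "$today" +%s) - $(date -u -d "$d" +%s) ) / 86400 ))
  dow=$(date -u -d "$d" +%u)   # 1=Mon ... 7=Sun
  dom=$(date -u -d "$d" +%d)
  [ "$age" -le 7 ] && return 0
  [ "$dow" -eq 7 ] && [ "$age" -le 31 ] && return 0
  [ "$dom" = "01" ] && [ "$age" -le 365 ] && return 0
  return 1
}
```

Loop it over your backup directory names and lvremove or delete whatever it votes against.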

Testing Restores (The Habit That Pays Dividends)

If there’s one drum I beat over and over: practice restores like you practice fire drills. Every so often, spin up a test VM, fetch last night’s backup, and actually start the database. You’ll catch things in minutes that would take hours to diagnose in a crisis. For MySQL, point mysqld at a throwaway data directory restored from the backup and start it on a random port with networking disabled. For PostgreSQL, initialize a test cluster, replace data files with the backup, ensure WAL is present, and bring it up to see it recover.

A fun real‑world trick I picked up: restore to a new server and run your application’s smoke tests against it. If that sounds like overkill, it probably is—until you need it. You don’t have to do this every day, but a quarterly check can save you from surprises. If your stack leans heavily on multi‑region or replicas, I’ve written about pragmatic approaches to staying online during ugly days in a friendly guide to multi‑region architectures with DNS geo‑routing.

Common Pitfalls (And How to Dodge Them)

I’ve made enough mistakes here to fill a small notebook. Here are the ones I look out for, so you don’t have to learn the hard way.

One, snapshot too small. If your snapshot COW space is undersized and the system writes heavily while you’re copying, the snapshot can become invalid. Err on the generous side for snapshot size, especially during busy hours. Two, scattered files. If key files are on different volumes, the snapshot won’t be consistent. Keep your database layout tidy or snapshot the relevant volumes in a coordinated sequence with the same freeze window.
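For the undersized-snapshot pitfall, you don’t have to guess: lvs can report how full the snapshot’s COW area is (the data_percent field), so you can poll during the copy and react, by finishing faster or growing the snapshot with lvextend, before it invalidates. A sketch of the threshold check, with a function name of my own:

```shell
# snap_usage_ok USAGE_PCT THRESHOLD -> succeed while usage is under threshold.
# USAGE_PCT is what a command like
#   lvs --noheadings -o data_percent vg0/mysql_snap | tr -d ' '
# prints, e.g. "12.34"; the fractional part is dropped for the comparison.
snap_usage_ok() {
  local usage_pct=$1 threshold=$2
  [ "${usage_pct%.*}" -lt "$threshold" ]
}

# e.g. snap_usage_ok "$(lvs --noheadings -o data_percent vg0/mysql_snap | tr -d ' ')" 80 \
#   || echo "snapshot nearly full -- finish the copy or lvextend now" >&2
```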

Three, missing WAL/binlogs. PostgreSQL without archiving or MySQL without flushed logs is a recipe for trouble during restore. The whole point of application‑consistent backups is lining these up. Four, forgetting to unfreeze. Always verify fsfreeze -u succeeded—if your monitoring starts yelling that disk writes are stuck, this is the reason. Five, restoring permissions. Make sure you preserve ownership and modes during backup and restore. rsync -aH --numeric-ids helps; same goes for extracting tar archives with --same-owner.

And six, assuming containerization changes the rules. If your DB runs in a container but the storage is on the host’s LVM, you still freeze and snapshot on the host. If your storage layer is something like Longhorn or a cloud volume with its own snapshot API, use that layer’s snapshot mechanics instead of LVM. The principle remains: coordinate with the database, then snapshot the storage.

Security and Shipping: Don’t Forget the Boring Stuff

Backups contain your crown jewels. Protect them. Encrypt at rest if you’re storing snapshots locally for any duration. When shipping to object storage, use TLS and bucket policies that keep the blast radius small. Least privilege isn’t glamorous, but it’s the reason you sleep well. If you need a gentle tour through hardening a machine that actually aligns with how people work day‑to‑day, my calm guide on securing a VPS server for real people is a nice companion read.

For some shops, LUKS on the underlying block device makes sense, and snapshots inherit that encryption (since it’s at the block level). Just make sure your recovery playbook includes how to unlock that encryption on a new host—you don’t want to be looking up luksOpen syntax at 3 a.m.

Doing Backups on Replicas (The Unsung Hero Move)

If your workload is high‑traffic and you’re worried about even a 2‑second read lock, take snapshots on a replica. For MySQL, a read‑only replica is ideal; apply the same FTWRL + snapshot dance there. For Postgres, a hot standby works well—coordinate backup start/stop via the primary if needed, or snapshot the standby with WAL shipping intact. This offloads the tiny backup pause from your primary and gives you more flexibility with snapshot timing.

It pairs beautifully with a multi‑region mindset. Even if you’re not ready for multi‑region failover today, practicing backups and restores on a replica in a different AZ or region makes your recovery muscle stronger. If you’re curious about how I approach the bigger architecture conversation without losing my mind in the details, here’s that friendly multi‑region guide again for weekend reading.

Alternative: ZFS Snapshots (If You’re Already There)

I mentioned this earlier, but it’s worth repeating: if your database volumes live on ZFS, consider using native ZFS snapshots and send/receive instead of LVM. ZFS snapshots are atomic and cheerful under pressure, and the tooling for replication is excellent. I’ve written a full tour of how I use ARC, ZIL/SLOG, snapshots, and send/receive—if you’re planning or already running ZFS, bookmark the calm, no‑drama ZFS guide for later.

Putting It All Together: A Friendly, End‑to‑End Bash Script

Here’s a more complete example that wraps both MySQL and PostgreSQL logic. It’s not meant to be a one‑size‑fits‑all solution—think of it as a starting point you can bend to your world. I’ve commented it heavily so future‑you will thank past‑you.

#!/usr/bin/env bash
# app_consistent_lvm_backup.sh
# Usage examples:
#   MYSQL: DATA_MNT=/var/lib/mysql VG=vg0 LV=mysql DB=mysql ./app_consistent_lvm_backup.sh
#   POSTGRES: DATA_MNT=/var/lib/postgresql/14/main VG=vg0 LV=pgdata DB=postgres ./app_consistent_lvm_backup.sh

set -euo pipefail

DB=${DB:-mysql}                 # mysql | postgres
DATA_MNT=${DATA_MNT:?must set}
VG=${VG:?must set}
LV=${LV:?must set}
SNAP_SIZE=${SNAP_SIZE:-20G}
SNAP_NAME=${SNAP_NAME:-snap_$(date +%Y%m%d%H%M%S)}
SNAP_DEV=/dev/$VG/$SNAP_NAME
SNAP_MNT=${SNAP_MNT:-/mnt/$SNAP_NAME}

cleanup() {
  # Child scripts can trap here if they want extra cleanup
  true
}
trap cleanup EXIT

snapshot_mysql() {
  local mysql_cli=${MYSQL_CLI:-"mysql -uroot --protocol=socket --socket=/var/run/mysqld/mysqld.sock"}
  # The global read lock lives only as long as the session, so lock, freeze,
  # snapshot, unfreeze, and unlock all happen in one mysql session. \! is
  # the client's SYSTEM shell escape.
  $mysql_cli <<SQL
FLUSH TABLES WITH READ LOCK;
FLUSH LOGS;
\! fsfreeze -f $DATA_MNT
\! lvcreate -L $SNAP_SIZE -s -n $SNAP_NAME $VG/$LV
\! fsfreeze -u $DATA_MNT
UNLOCK TABLES;
SQL
}

snapshot_postgres() {
  local psql=${PSQL:-"psql -U postgres -h /var/run/postgresql"}
  # On 15+, pg_backup_stop() must run in the same session that called
  # pg_backup_start(), so the whole sequence stays inside one psql session.
  # On 14 and older, use pg_start_backup('lvm', true, false) and
  # pg_stop_backup(false) instead.
  $psql <<SQL
SELECT pg_backup_start('lvm', true);
\! fsfreeze -f $DATA_MNT
\! lvcreate -L $SNAP_SIZE -s -n $SNAP_NAME $VG/$LV
\! fsfreeze -u $DATA_MNT
SELECT pg_backup_stop();
SQL
}

mount_snapshot() {
  mkdir -p "$SNAP_MNT"
  mount -o ro "$SNAP_DEV" "$SNAP_MNT"   # XFS origins also need -o nouuid
  echo "Snapshot mounted at $SNAP_MNT"
}

case "$DB" in
  mysql)
    snapshot_mysql
    ;;
  postgres)
    snapshot_postgres
    ;;
  *)
    echo "Unknown DB: $DB" >&2
    exit 1
    ;;
esac

mount_snapshot

echo "Now copy from $SNAP_MNT to your backup destination (rsync/tar/s3)."
echo "When done: umount $SNAP_MNT && lvremove -y $SNAP_DEV"

Remember to run as root (or via sudo) because both fsfreeze and lvcreate want elevated privileges. Also, if your system has AppArmor/SELinux policies that restrict the mysql client’s SYSTEM command or socket access, adjust accordingly—sometimes it’s cleaner to keep the orchestration outside the client and use shorter lock windows with careful sequencing.

Performance and Snapshot Sizing

How big should the snapshot be? It depends on how long your copy takes and how write‑heavy your workload is. Snapshots store changed blocks since the snapshot was taken—so if your copy runs for 45 minutes and you’re writing heavily, the snapshot grows. If it fills up, the snapshot becomes invalid. I typically start with something like 10–20% of the origin LV size for a daily backup window in a moderately busy system, then watch and tune. If you’re copying to a local SSD and then pushing to object storage asynchronously, you can get away with smaller snapshots because your copy finishes faster.
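To turn that guidance into a number: estimate your write rate, multiply by the expected copy duration, and pad it. A back‑of‑the‑envelope helper, my own rough formula rather than anything official:

```shell
# snap_size_gb WRITE_MB_PER_SEC COPY_MINUTES SAFETY_FACTOR
# Rough COW sizing: megabytes written during the copy window, padded by the
# safety factor, converted to whole gigabytes (integer arithmetic).
snap_size_gb() {
  local rate_mb=$1 minutes=$2 safety=$3
  echo $(( rate_mb * minutes * 60 * safety / 1024 ))
}
```

At 5 MB/s of sustained writes and a 45‑minute copy with 2x padding, that works out to about 26G, so a 20G snapshot would be cutting it close.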

And yes, snapshots have overhead on the origin volume. It’s usually modest but measurable. I try to schedule backups during periods when the system can spare a bit of I/O contention. If you can’t, that’s when backing up on a replica shines—you free the primary from that extra load entirely.

When to Use This vs. Logical Backups

There’s a place for mysqldump --single-transaction and pg_dump too. Logical backups are great for portability (migrate versions, move schemas, etc.) and for lower‑impact streaming. But when you want fast restores with the exact binary state—including indexes and tuning—file‑level backups from consistent snapshots are hard to beat. In my playbook, I keep both: nightly application‑consistent snapshots for fast recovery, and periodic logical backups for migration and schema insurance.

Bonus: Shipping to S3/MinIO the Clean Way

Once you’ve got your snapshot mounted, you can pipe a tar straight to object storage, avoid writing big archives to disk, and keep things nice and tidy:

cd /mnt/mysql-snap
# Stream backup directly to S3
sudo tar -I 'gzip -1' -cf - . | aws s3 cp - s3://my-backups/mysql/$(date +%F)/mysql.tar.gz

# Or to MinIO using mc (MinIO client)
# mc cp -r /mnt/pg-snap myminio/backups/pg/$(date +%F)/

MinIO has been a rock‑solid piece in my stack. If you want a friendly walkthrough of TLS, erasure coding, and bucket policies that don’t become a second job, here’s that deeper dive: Production‑Ready MinIO on a VPS.

Troubleshooting Checklist

If something feels off, I run through this quick checklist:

Is fsfreeze supported and returning success? Did the lvcreate -s command complete quickly, or hang (which would extend the lock window)? Is the snapshot mounted read‑only as expected? Do you see WAL/binlog files aligned with your backup time? Did you verify permissions and SELinux contexts on restore? Finally, does the database actually start from the backup on a clean test VM? Nothing beats that last one.

Also, keep logs! Write the timestamp of your backup start/stop and snapshot name to a simple log file. When you’re in detective mode, those tiny breadcrumbs save time.
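Those breadcrumbs can be a one‑liner you call at each phase; the function name and log format here are my own:

```shell
# log_phase LOGFILE PHASE SNAP_NAME: append a UTC-timestamped breadcrumb
# line like "2024-06-10T02:00:01Z snapshot_created mysql_snap_x".
log_phase() {
  local logfile=$1 phase=$2 snap=$3
  printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$phase" "$snap" >> "$logfile"
}

# e.g. log_phase /var/log/db-backup.log snapshot_created "$SNAP_NAME"
```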

Wrap‑Up: Give Your Future Self the Gift of Calm Restores

I’ve lost count of how many times a tiny bit of coordination saved me hours on the restore side. Application‑consistent backups with LVM snapshots and fsfreeze are the kind of technique you implement once, then quietly rely on for years. You get the speed of file‑level restores, the safety of database‑aware snapshots, and the peace of mind that your backups aren’t a Schrödinger’s cat situation.

If you’re already using another snapshotting filesystem like ZFS, lean into it. If you’re scaling backups to replicas or multi‑region, build it into your routine. And whatever you do, practice restores—just a little—so they’re muscle memory. If you want to go even deeper on storage internals, my longform write‑up on ZFS snapshots and send/receive pairs nicely with this approach, and if you’re thinking about resilient topologies, that multi‑region architecture story has plenty of field notes.

Hope this helped connect the dots. If it saves you a late‑night headache sometime, we both win. See you in the next post—may your snapshots be small, your restores be boring, and your coffee still warm when it’s over.

Frequently Asked Questions

What’s the difference between a crash‑consistent and an application‑consistent backup?

Great question! Crash‑consistent is like pulling the plug and copying whatever’s on disk—restore might work, but recovery can be messy. Application‑consistent pauses the database just long enough to flush and line up data with logs. When you restore, it starts cleanly without drama.

Do I need downtime to take these backups?

Nope! That’s the beauty of it. You keep the database online. For MySQL, you hold a brief global read lock and freeze the filesystem for a couple of seconds while you snapshot. For PostgreSQL, you enter backup mode, freeze, snapshot, unfreeze, and stop the backup. The lock window is tiny.

How big should the LVM snapshot be?

It depends on how long your copy runs and how write‑heavy your workload is. The snapshot stores changes since you took it. I like starting with 10–20% of the origin LV for a daily window, then monitoring growth. If your snapshot fills up, it becomes invalid—so size it generously at first.