{"id":1938,"date":"2025-11-16T20:24:22","date_gmt":"2025-11-16T17:24:22","guid":{"rendered":"https:\/\/www.dchost.com\/blog\/zfs-on-linux-for-servers-the-calm-no%e2%80%91drama-guide-to-arc-zil-slog-snapshots-and-send-receive\/"},"modified":"2025-11-16T20:24:22","modified_gmt":"2025-11-16T17:24:22","slug":"zfs-on-linux-for-servers-the-calm-no%e2%80%91drama-guide-to-arc-zil-slog-snapshots-and-send-receive","status":"publish","type":"post","link":"https:\/\/www.dchost.com\/blog\/en\/zfs-on-linux-for-servers-the-calm-no%e2%80%91drama-guide-to-arc-zil-slog-snapshots-and-send-receive\/","title":{"rendered":"ZFS on Linux for Servers: The Calm, No\u2011Drama Guide to ARC, ZIL\/SLOG, Snapshots, and send\/receive"},"content":{"rendered":"<div class=\"dchost-blog-content-wrapper\"><p>So I was sipping an unreasonably strong coffee the other morning, staring at a dashboard that looked like a heart monitor. One of those moments where latency spikes and your pulse kind of sync up. You\u2019ve probably had it too: a storage stack behaving one way in staging, then doing something totally surprising in production. That\u2019s when I found myself quietly grateful for ZFS on Linux. It\u2019s not magic, and it\u2019s not perfect, but when I tune ARC, get ZIL\/SLOG right, and keep snapshot\/replication habits tight, the graph calms down. More importantly, I do too.<\/p>\n<p>If you\u2019ve ever wondered how to dial in ARC so your apps aren\u2019t starved, or whether you need a SLOG device at all, or how to make snapshots and send\/receive feel like a safety net instead of homework, pull up a chair. In this guide, I\u2019ll walk you through how I approach ZFS on Linux for real servers, with the little stories and gotchas that only show up after you\u2019ve fixed a few late\u2011night incidents. 
We'll talk ARC tuning (with guardrails), ZIL/SLOG choices that actually matter, snapshot strategies that don't rot, and send/receive backups that survive bad networks and human mistakes.
href=\"#ZILSLOG_decisions\"><span class=\"toc_number toc_depth_2\">6.3<\/span> ZIL\/SLOG decisions<\/a><\/li><li><a href=\"#Snapshots_and_replication_routine\"><span class=\"toc_number toc_depth_2\">6.4<\/span> Snapshots and replication routine<\/a><\/li><li><a href=\"#Recovery_rehearsals\"><span class=\"toc_number toc_depth_2\">6.5<\/span> Recovery rehearsals<\/a><\/li><\/ul><\/li><li><a href=\"#Observability_The_Little_Checks_That_Prevent_Big_Surprises\"><span class=\"toc_number toc_depth_1\">7<\/span> Observability: The Little Checks That Prevent Big Surprises<\/a><ul><li><a href=\"#Keep_an_eye_on_the_pool_and_ARC\"><span class=\"toc_number toc_depth_2\">7.1<\/span> Keep an eye on the pool and ARC<\/a><\/li><\/ul><\/li><li><a href=\"#Common_Gotchas_I_Still_See_And_How_I_Dodge_Them\"><span class=\"toc_number toc_depth_1\">8<\/span> Common Gotchas I Still See (And How I Dodge Them)<\/a><ul><li><a href=\"#Wrong_block_sizes_for_the_workload\"><span class=\"toc_number toc_depth_2\">8.1<\/span> Wrong block sizes for the workload<\/a><\/li><li><a href=\"#Consumer_SLOGs_with_no_powerloss_protection\"><span class=\"toc_number toc_depth_2\">8.2<\/span> Consumer SLOGs with no power\u2011loss protection<\/a><\/li><li><a href=\"#Letting_snapshots_pile_up_without_a_plan\"><span class=\"toc_number toc_depth_2\">8.3<\/span> Letting snapshots pile up without a plan<\/a><\/li><li><a href=\"#Forgetting_to_test_restores\"><span class=\"toc_number toc_depth_2\">8.4<\/span> Forgetting to test restores<\/a><\/li><\/ul><\/li><li><a href=\"#WrapUp_The_Calm_Confidence_of_a_WellTuned_ZFS\"><span class=\"toc_number toc_depth_1\">9<\/span> Wrap\u2011Up: The Calm Confidence of a Well\u2011Tuned ZFS<\/a><\/li><\/ul><\/div>\n<h2 id=\"section-1\"><span id=\"Why_ZFS_On_Linux_Keeps_Earning_Its_Spot\">Why ZFS On Linux Keeps Earning Its Spot<\/span><\/h2>\n<p>Let me start with a quick story. A few years back, a client\u2019s application began to stall under a strange pattern of synchronous writes. The app team swore nothing had changed; the graph said otherwise. We were on ZFS. What saved the day wasn\u2019t a shiny new array or a heroic rewrite. It was the ability to calmly analyze pool health, peek into ARC behavior, flip a dataset property or two, and add a proper SLOG device that actually fit the workload. Five minutes after switching traffic back, the app breathed again. That\u2019s what I love about ZFS: consistent knobs that map to real\u2011world outcomes.<\/p>\n<p>Here\u2019s the thing: ZFS isn\u2019t just a filesystem; it\u2019s a storage platform. On Linux, that means you get powerful primitives\u2014checksums everywhere, snapshots, clones, copy\u2011on\u2011write, ARC, L2ARC, ZIL, and the send\/receive pipeline\u2014that add up to a toolkit. You don\u2019t have to use every feature, just the ones that improve your story. And the best part? You can iterate: start safe, observe, then nudge. Small nudges are usually all you need.<\/p>\n<p>Before we dive deep, let me offer two rails to keep you on track. First, compression is your friend. I stick with <strong>lz4<\/strong> by default on Linux; it\u2019s fast and saves more space than you\u2019d expect. Second, resist the urge to enable deduplication unless you absolutely know why you need it and have the RAM to back it up. 
## ARC Tuning Without Drama: Give Your Apps Room to Breathe

### The simple mental model

Think of ARC as ZFS's giant brain. It lives in RAM and caches hot data and metadata. The larger and smarter your ARC, the fewer trips you take to disk. But that same RAM is where your applications want to live too. So the game is balance. On dedicated storage nodes, I let ARC be roomy. On app servers that share storage and compute, I put ARC on a shorter leash so the app can stretch out.

### Where I start (and why)

On Linux, ARC sizing is controlled via kernel module parameters. You can set them at runtime or persistently. Runtime is great for experimentation, but persistence wins long-term.

```bash
# Runtime experimentation (bytes)
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max    # 8G
echo 2147483648 > /sys/module/zfs/parameters/zfs_arc_min    # 2G

# Persistent: /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=8589934592
options zfs zfs_arc_min=2147483648
```

I like to start with a modest max, observe for a day or two, and nudge upward. Watch the kernel's memory pressure, swap activity, and app latency. If I see the app gasping, I rein ARC in a bit. If the disks look too busy and the app is waiting on reads, I let ARC stretch. The sweet spot is where the app hums and the disks don't scream during peaks.

### Per-dataset choices that matter

Global ARC sizing is half the story. Per-dataset properties help you steer the cache into the right lanes. If I've got a database dataset that already does its own caching (hello, PostgreSQL, MySQL), I often set:

```bash
zfs set primarycache=metadata pool/db
```

This keeps ARC focused on metadata, not bulk table pages the database will likely cache itself. For write-heavy logs, I might keep default caching but adjust **recordsize** to match the workload. Databases love smaller records (like 16K), while media archives prefer larger records (like 1M). For generic filesystems, I usually leave the defaults and let ZFS adapt.

```bash
# Example: database dataset tuned for smaller blocks
zfs set recordsize=16K pool/db
zfs set atime=off pool/db
zfs set xattr=sa pool/db
zfs set compression=lz4 pool/db
```

Those few properties reduce random write noise, avoid wasting cycles updating access times, and make extended attributes more efficient. Little changes, big difference over months of real traffic.

### Observe, don't guess

When ARC is mis-sized, you'll feel it. Look at swap activity, the kernel's Out-Of-Memory killer history, and ARC hit ratios over time. Tools like arcstat and arc_summary are fantastic. If you want to go deeper later, the [OpenZFS performance and tuning guide](https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/index.html) is thorough without being overwhelming.
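Even without those tools, the raw counters are always available in procfs on ZFS on Linux. A minimal sketch for eyeballing the overall hit ratio:

```bash
# ARC counters live in /proc/spl/kstat/zfs/arcstats (name, type, value)
grep -E '^(hits|misses|size|c_max) ' /proc/spl/kstat/zfs/arcstats

# Rough lifetime hit ratio, in percent
awk '/^hits / {h=$3} /^misses / {m=$3} END {printf "%.1f%%\n", 100*h/(h+m)}' \
  /proc/spl/kstat/zfs/arcstats
```

A hit ratio alone can mislead (streaming workloads legitimately miss a lot), so read it alongside disk busyness and app latency.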
## ZIL and SLOG: When a Tiny SSD Makes a Big Difference

### First, what's the ZIL?

The ZFS Intent Log (ZIL) is the safety diary for synchronous writes. When an application says "write this now and tell me immediately that it's safe," ZFS writes that intent to the log so it can survive a power cut. By default, the ZIL lives on your pool. If the workload leans on sync writes (databases, NFS shares for VMs, certain logging patterns), latency can add up.

### Enter the SLOG device

A Separate LOG (SLOG) device is an SSD, ideally with power-loss protection and low latency, that handles those sync log records. If you add a SLOG, you're not caching everything; you're just accelerating the small, sync-critical part of writes. The TL;DR: if your workload does lots of fsync or O_DSYNC writes, a good SLOG can change the mood of the entire system.

```bash
# Add a mirrored SLOG to avoid a single point of failure
zpool add pool log mirror nvme0n1 nvme1n1
```

Mirroring your SLOG is worth the extra SSD. If you lose a non-mirrored SLOG at the wrong time, your pool will still import cleanly, but you risk losing the most recently acknowledged sync writes. I don't like rolling those dice in production.

### How big should the SLOG be?

This is one of those questions where the answer is "small but sober." The log only needs to absorb a few seconds of your synchronous write burst, not hours of data. In my experience, 8-32 GB is plenty for many servers. Bigger doesn't make it faster; faster makes it faster. Choose an SSD with real power-loss protection and low write latency. Consumer SSDs without PLP can acknowledge data that never actually makes it to non-volatile storage during a power cut, which defeats the purpose.
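The "few seconds" intuition comes from how often ZFS commits transaction groups. A back-of-the-envelope sketch, assuming default tunables and a made-up burst rate:

```bash
# Transaction groups commit at most this many seconds apart (default: 5)
cat /sys/module/zfs/parameters/zfs_txg_timeout

# Hypothetical worst case: 500 MB/s of sustained sync writes across two
# txg intervals ~= 500 * 5 * 2 = 5000 MB, i.e. ~5 GB of log in flight.
# An 8-32 GB device therefore has comfortable headroom.
```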
### A couple of properties that move the needle

ZFS gives you some per-dataset toggles that complement SLOG decisions:

1) **logbias=throughput** tells ZFS to prefer raw throughput over low latency for large streaming writes. That keeps big sequential writes from flooding the SLOG unnecessarily.

```bash
zfs set logbias=throughput pool/media
```

2) **sync** controls how ZFS treats sync operations. The default is *standard*, which respects the application's request. *always* forces everything to be sync (useful for certain NFS or VM guests), and *disabled* treats sync as async. That last one is tempting for benchmarks, but risky in real life. You might get speed, but after a power loss, you can lose acknowledged writes. I save *sync=disabled* for lab tests and sleep better in production.

```bash
zfs set sync=standard pool/db
# zfs set sync=disabled pool/sandbox   # For test environments only
```

If you want the deeper under-the-hood story, the [OpenZFS documentation](https://openzfs.github.io/openzfs-docs/) has excellent sections on how the ZIL behaves under different workloads.

## Snapshots: Safe Points That Don't Get in Your Way

### Make snapshots boring and automatic

Snapshots are copy-on-write bookmarks of your dataset at a point in time. They're instantaneous and space-efficient, until changes accumulate. The trick is routine. I like predictable names and a retention plan that matches the change rate of the data. For fast-moving app data, I keep more frequent snapshots with shorter retention. For archives, fewer snapshots with longer tails.

```bash
# A simple naming pattern and schedule
zfs snapshot -r pool/app@daily-$(date +%Y%m%d)

# List snapshots by creation time
zfs list -t snapshot -o name,creation -s creation
```

Cleanup matters. The easiest way to regret snapshots is to never prune them. Decide how many dailies, weeklies, and monthlies you need; there's a minimal pruning sketch at the end of this section. I've used cron and systemd timers, shell scripts, and later graduated to tools that manage policies for me. If you prefer friendly automation, take a look at [the sanoid/syncoid toolkit](https://github.com/jimsalterjrs/sanoid); it does snapshotting and replication with sanity baked in.

### Clones for safe experiments

One of my favorite ZFS party tricks is cloning a snapshot for testing. When a client is nervous about a migration, I clone the last good snapshot into a scratch dataset, mount it somewhere private, and rehearse. No drama, no risk to the live system. When I'm done, I destroy the clone and the snapshot remains intact.

```bash
# Create and mount a clone for testing
zfs snapshot pool/app@pre-migration
zfs clone pool/app@pre-migration pool/app-scratch
# ... test here ...
zfs destroy pool/app-scratch
```

One more tip: if you're sending a snapshot off-box, consider placing a **hold** on it so automation can't accidentally delete it before replication completes.

```bash
zfs hold backupkeep pool/app@daily-20250101
# Later, when safely replicated
zfs release backupkeep pool/app@daily-20250101
```
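And here's that pruning sketch: a minimal loop that keeps the newest 14 daily snapshots of a single dataset. The `daily-` pattern matches the naming above, and the retention count of 14 is an assumption, not a recommendation. Held snapshots simply refuse to be destroyed, which is exactly the safety net we set up.

```bash
#!/usr/bin/env bash
# Prune pool/app daily snapshots, keeping the newest 14.
# -d 1 limits the listing to this dataset's own snapshots, not children's.
zfs list -H -t snapshot -o name -s creation -d 1 pool/app \
  | grep '@daily-' \
  | head -n -14 \
  | while read -r snap; do
      zfs destroy "$snap" || echo "kept $snap (likely held)" >&2
    done
```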
## send/receive Backups You Actually Trust

### The flow in one breath

Take a snapshot, send it, receive it, keep it, and send incrementals forever after. That's the rhythm. ZFS makes it fast and safe, and modern OpenZFS lets you resume interrupted sends, which is a huge win for flaky links or long distances.

```bash
# Full replication the first time
zfs snapshot -r pool/app@base
zfs send -R pool/app@base | ssh backup.example \
  "zfs receive -uF backup/app"

# Incremental replication later
zfs snapshot -r pool/app@daily-20250115
zfs send -R -I @base pool/app@daily-20250115 | ssh backup.example \
  "zfs receive -uF backup/app"
```

Two flags that make life better: **-R** replicates descendant datasets and properties, and **-I** sends an incremental stream including intermediate snapshots. The **-u** on receive prevents auto-mounting the backup datasets, and **-F** forces a rollback of the target if it diverged.

### Smooth the network with a buffer

Replication often stalls not because of disks, but because the network hiccups. I've had great results inserting an in-memory buffer so the sender and receiver can work at their natural pace.

```bash
zfs send -R pool/app@daily | mbuffer -s 128k -m 1G | \
  ssh backup.example "mbuffer -s 128k -m 1G | zfs receive -suF backup/app"
```

If a stream gets interrupted, modern OpenZFS provides a resume token, as long as you received with **-s** so the partial state is saved. You can pick up right where you left off instead of re-sending the world.

```bash
# Find a resume token on the receiving side
zfs get receive_resume_token backup/app

# Resume from the sender
zfs send -t <TOKEN> | ssh backup.example "zfs receive -suF backup/app"
```

### Encrypted datasets and raw streams

When using native ZFS encryption, you can send raw encrypted data without decrypting on the sender. That means the receiving side never sees plaintext. It looks like this:

```bash
# Create an encrypted dataset (this example prompts for a passphrase)
zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt pool/secure

# Snap and send raw encrypted blocks
zfs snapshot pool/secure@daily
zfs send -w pool/secure@daily | ssh backup.example "zfs receive -uF backup/secure"
```

The receiving side can store the encrypted dataset without access to the key. When it's time to use it, load the key on the receiver and mount. It's a clean model for off-site backups that must remain dark until disaster day.

### Bookmarks, holds, and the paper trail

Bookmarks are like zero-space, always-on pointers to snapshots. They're handy for keeping an incremental base even after you prune old snapshots locally. I make bookmarks before pruning so my replication chain doesn't break.

```bash
# Create a bookmark you can increment from later
zfs bookmark pool/app@daily-20250115 pool/app#base-20250115
```
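The payoff comes later: once the destination holds that snapshot, the local copy can go, and the bookmark still anchors the incremental chain. A sketch (the `@daily-20250201` snapshot name is illustrative):

```bash
# The local snapshot can be destroyed once it's safely replicated;
# the bookmark survives and costs no space
zfs destroy pool/app@daily-20250115

# Later: incremental send from the bookmark to a newer snapshot
zfs send -i pool/app#base-20250115 pool/app@daily-20250201 | \
  ssh backup.example "zfs receive -u backup/app"
```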
Between bookmarks and holds, you get a lifecycle that's predictable, documented, and resilient to accidents. I've had one case where a junior admin nervously deleted what they thought was just a local snapshot; the hold saved us from a broken replication chain.

## A Playbook I Keep Coming Back To

### The setup

Picture a modest VM host running a few Linux guests and a PostgreSQL database. ZFS sits on a pool of mirrored SSDs, with an optional mirrored SLOG of small, power-loss-protected NVMes. Nothing fancy, but everything is thoughtful.

First move after install: set sane defaults.

```bash
# Pool-wide goodness
zfs set compression=lz4 pool
zfs set atime=off pool
zfs set xattr=sa pool

# Dataset structure
zfs create pool/vms
zfs create pool/db
zfs create pool/backups

# Database-appropriate tweaks
zfs set recordsize=16K pool/db
zfs set primarycache=metadata pool/db
zfs set logbias=latency pool/db
```

On the VM dataset, I lean on defaults and keep a closer eye on latency. If my VMs are NFS clients, I'll ensure sync writes are honored end-to-end and consider **sync=always** if the guest behavior demands it. If the hypervisor directly uses ZFS volumes (zvols), I make sure the block size aligns with the guest filesystem's expectations.

### ARC sizing strategy

On this kind of host, I start conservative: set the ARC max to something that leaves comfortable RAM for the guests and for PostgreSQL. After a few days, I pull thread traces and watch ARC hits, then consider bumping it up. If the database is your star, let it keep its memory crown and keep ZFS focused on metadata and writes. You can go months without touching ARC again if you respect the balance early.

### ZIL/SLOG decisions

If the database is issuing lots of fsyncs, a solid SLOG is worth its weight in calm nights. I prefer a mirrored pair of small, low-latency NVMes designed for sustained writes and with power-loss protection. I set **logbias=throughput** on datasets that handle large sequential writes (like backup archives) so they don't crowd the SLOG. Everything else keeps **logbias=latency** to keep those important sync writes snappy.

### Snapshots and replication routine

I keep it boring: hourly snapshots with a 48-hour tail for the database, daily for VMs with a two-week tail, monthly for archives. Replicate to a secondary node or an off-site location nightly. If you're pairing ZFS with object storage for longer retention, you can complement your setup with something like [a production-ready MinIO setup on a VPS](https://www.dchost.com/blog/en/vps-uzerinde-minio-ile-s3%E2%80%91uyumlu-depolama-nasil-uretim%E2%80%91hazir-kurulur-erasure-coding-tls-ve-policyleri-tatli-tatli-anlatiyorum/) and push application-level backups there too. ZFS snapshots keep server state tight; object storage keeps app exports durable and cheap.

### Recovery rehearsals

Every quarter, I run a restore rehearsal. Clone a snapshot, bring up a VM against the clone, or receive a snapshot into a scratch dataset and run the app. The step you don't rehearse is the one that surprises you under pressure. For one client, we shaved two hours off our RTO just by scripting the dataset imports and key loading for encrypted backups. The next outage? It was a non-event.
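That script doesn't have to be clever. A minimal sketch of the rehearsal flow, run on the backup host; the dataset names follow the earlier examples, and the key file path is purely illustrative:

```bash
#!/usr/bin/env bash
# Restore rehearsal: materialize the latest replicated snapshot into a
# scratch dataset, load the key, mount, and hand it to the app team.

SNAP="backup/secure@daily"        # latest replicated snapshot
SCRATCH="scratch/restore-test"    # throwaway dataset on the backup host

zfs send -w "$SNAP" | zfs receive -u "$SCRATCH"
zfs load-key -L file:///root/keys/secure.key "$SCRATCH"   # illustrative key path
zfs mount "$SCRATCH"

# ... bring the application up against $SCRATCH and time the whole thing ...

zfs unmount "$SCRATCH"
zfs destroy -r "$SCRATCH"
```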
## Observability: The Little Checks That Prevent Big Surprises

### Keep an eye on the pool and ARC

I've learned to spot trouble early with a few routine commands. None of them are dramatic; all of them are useful.

```bash
# Pool health and errors
zpool status -xv

# Realtime I/O insight
zpool iostat -v 1

# Snapshot inventory
zfs list -t snapshot -o name,used,creation -s creation

# ARC health (if arcstat is installed)
arcstat 5
```

When something smells off, I sample these and compare to a known-good baseline. That's how I catch a mis-sized ARC, a VM that went rogue with sync writes, or a slowly failing SSD that started to rack up latency outliers before SMART finally tattled.

And when I want to go deeper or double-check a tuning hunch, I revisit the thoughtful bits in the [OpenZFS performance and tuning guide](https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/index.html). It's like a trusted colleague who doesn't mind repeating themselves when I need the refresher.

## Common Gotchas I Still See (And How I Dodge Them)

### Wrong block sizes for the workload

Big blocks feel efficient until a small-IO workload shows up and grinds. Set recordsize thoughtfully on known workloads like databases, and keep the default elsewhere. Don't force it if you don't need to.

### Consumer SLOGs with no power-loss protection

This one bites because it often looks fine, until that one day it's not. For a SLOG, I pay for the boring, enterprise-minded SSD. It's not about speed first; it's about correctness under stress.

### Letting snapshots pile up without a plan

Snap judiciously, prune religiously. Late-night storage runs are no fun, and nobody wants to be the person who deletes half the snapshot tree in a panic. Bookmarks and holds exist to make retention safe. Use them.

### Forgetting to test restores

I used to assume sends were fine because the commands ran clean. Then I had a case where the target had diverged subtly, and receive wasn't actually applying. A quick restore test would have caught it. Lesson learned; quarterly rehearsals ever since.

## Wrap-Up: The Calm Confidence of a Well-Tuned ZFS

If you've read this far, I suspect you're the kind of person who enjoys a quiet dashboard and a little extra sleep. ZFS on Linux rewards that mindset. You don't need tricks, just a few steady practices.
Size ARC so your apps have room to thrive. Give sync-heavy workloads a proper SLOG and set logbias where it makes sense. Snapshot on a rhythm that matches the data's tempo, and prune with care. Replicate with send/receive, use buffers for reliability, and lean on resume tokens when the network throws a tantrum.

I've been saved more than once by habits that felt boring at the time. A well-named snapshot, a small SLOG that quietly did its job, a tuned ARC that didn't crowd the database: these things add up. You'll feel it in your latency charts and in those moments when someone asks, "Can we restore last Tuesday's data for an hour?" and you answer, "Sure, give me ten minutes."

Hope this was helpful! If you've got a weird ZFS story or a tuning question, send it my way. I'll happily trade notes over coffee. Until then, may your pools stay healthy, your snapshots tidy, and your restores boring, in the best possible way.