The Friendly Playbook: Building an Image Optimization Pipeline with AVIF/WebP, Origin Shield, and Smarter Cache Keys to Cut CDN Costs

So there I was, nursing a lukewarm coffee at 10:47 p.m., staring at a CDN bill that had quietly grown fangs. You know that feeling when something is “fine” until it isn’t? A client’s store had been crushing it (yay), but their media-heavy pages were sending bandwidth into orbit (not yay). We weren’t doing anything outrageous: just product photos, a few banners, some hero images. But when you multiply “just a few” by tens of thousands of daily visits, tiny inefficiencies turn into invoice line items you don’t want to remember.

Ever had that moment when you realize your images are driving the majority of your egress—and your cache hit ratio is getting kneecapped by accidental variation? That was me. And that night, I rolled up my sleeves and stitched together a pipeline that made modern formats like AVIF and WebP the default, built a thoughtful origin shield, and tuned cache keys so aggressively (yet safely) that the CDN started acting like the good roommate who always takes out the trash without being asked.

In this guide, I’ll walk you through that pipeline. We’ll talk about how to pick the right format per request, how to use the Accept header without blowing up your cache, why an origin shield is the quiet hero for cutting origin egress, and how cache-key tuning can be the difference between a 40% hit ratio and that sweet, comfortable 90% neighborhood. Along the way, I’ll share the mistakes I made, how I fixed them, and the parts that still make me smile when I check graphs in the morning.

Why Images Hijack Your Bills (And What We Can Do About It)

I remember an early project where we kept optimizing code and queries while pretending the image problem would solve itself. It didn’t. The truth is, images are the loudest guests at the party. They dominate payload, trigger multiple sizes across layout breakpoints, and get requested from every corner of the world. If you treat them as an afterthought, your CDN and your origin will treat you the same way.

Here’s the thing: the fixes are simple in spirit, but they require discipline. Get the formats right (AVIF and WebP where possible, fallback when needed). Normalize and version how you store and request images. Use an origin shield so that your origin only sees a tiny fraction of the requests. Then get ruthless about cache keys so you don’t split the cache for no reason.

Think of it like packing for a trip. You can bring everything “just in case,” or you can bring what’s necessary and make sure each item works with the others. Our pipeline is the second kind of suitcase: small, tidy, and weirdly satisfying.

The Format Game: AVIF, WebP, and Friendly Fallbacks

Formats are where the magic starts. In my experience, AVIF offers the best compression at the same perceived quality for photographic images, and WebP is a great runner-up with wide support and excellent results, especially for web graphics that don’t love being over-compressed. PNGs still have their place for crisp UI elements and transparency-heavy assets, but if you can serve an AVIF or a WebP to a modern browser, you should.

The trick is to let the browser tell you what it understands. That’s where the Accept header comes in. When a request arrives, it often carries hints like image/avif or image/webp. If you only take one thing from this section, let it be this: don’t hardcode format logic by user agent. Negotiate by capability. That way, when a browser adds AVIF support tomorrow, you just start serving it without a redeploy.

When I first rolled this out, I worried about complexity at the edge. It turned out simpler than I expected: the CDN or an edge worker checks Accept, picks a preferred format (AVIF if present, else WebP, else a trusted fallback), and asks our image service for that exact variant. If the service has it, it returns it. If not, it generates, stores, and serves it. Keep quality settings reasonable, and avoid the trap of over-squeezing colorful product shots until they look like watercolor paintings.
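
To make that concrete, here's a minimal sketch of that capability check as it might look in a Fetch-style edge worker. The fmt values, the img.example.com origin, and the handler shape are illustrative assumptions, not any particular vendor's API.

```ts
// Minimal sketch of Accept-based format negotiation at the edge.
// "avif" | "webp" | "orig" are illustrative names for the normalized format.
type Fmt = "avif" | "webp" | "orig";

function pickFormat(acceptHeader: string | null): Fmt {
  const accept = (acceptHeader ?? "").toLowerCase();
  if (accept.includes("image/avif")) return "avif"; // best compression for photos
  if (accept.includes("image/webp")) return "webp"; // broad support, great results
  return "orig";                                    // trusted fallback format
}

// Usage in a Fetch-style handler (Request/Response are standard Fetch API types):
async function handleImage(request: Request): Promise<Response> {
  const fmt = pickFormat(request.headers.get("accept"));
  const incoming = new URL(request.url);
  // Forward to a hypothetical image-service origin with the normalized fmt.
  const upstream = new URL(incoming.pathname, "https://img.example.com");
  upstream.searchParams.set("fmt", fmt);
  return fetch(upstream.toString());
}
```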

Want to experiment with conversions and quality settings without touching production? I’ve spent many late nights with Squoosh to test format and quality tradeoffs. And when you want official docs to sanity-check your assumptions about WebP features like alpha and animation, the WebP documentation is a lifesaver.

Your Source of Truth: Normalization, Storage, and Derivatives

Before you worry about cache keys, get your house in order. Create a single “source of truth” for originals. I usually store full-fidelity uploads in an object storage bucket. Originals get normalized on ingestion: consistent color profile (sRGB), rotation fixed, chroma subsampling chosen on purpose, and metadata stripped unless you truly need it. Why? Because you want every derivative to be reproducible. If your originals are messy, your cache keys won’t save you.

Then, decide how you’ll create derivatives. You have two broad paths. First, pre-generate common sizes and formats at upload time. That’s predictable and fast, but it means betting on which sizes you’ll need. Second, generate on demand, with a tiny delay on first request that pays off forever after as the result gets cached. I tend to use on-demand generation backed by a write-through store: the first request triggers the transformation, then the result is stored and future requests are instant.
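
Here's roughly what that write-through flow looks like in code: a sketch, not a production service. The in-memory Map stands in for a real derivatives bucket, and the loadOriginal/renderVariant callbacks stand in for whatever storage client and transform engine you use.

```ts
// Write-through on-demand generation: serve a stored variant if it exists,
// otherwise render it once, persist it, and return it.
type Fmt = "avif" | "webp" | "orig";

interface VariantKey {
  path: string;  // canonical original, e.g. "products/blue-sneaker.jpg"
  width: number; // already rounded to a known breakpoint
  fmt: Fmt;
}

const derivatives = new Map<string, Buffer>(); // placeholder for a derivatives bucket
const keyOf = (k: VariantKey) => `${k.path}|w=${k.width}|fmt=${k.fmt}`;

async function serveVariant(
  key: VariantKey,
  loadOriginal: (path: string) => Promise<Buffer>,
  renderVariant: (original: Buffer, key: VariantKey) => Promise<Buffer>
): Promise<Buffer> {
  const hit = derivatives.get(keyOf(key));
  if (hit) return hit;                                 // fast path after the first request

  const original = await loadOriginal(key.path);
  const rendered = await renderVariant(original, key); // the expensive part, paid once
  derivatives.set(keyOf(key), rendered);               // write-through: next request is instant
  return rendered;
}
```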

One client was shipping hero images at desktop widths to mobile devices because the design team didn’t want to juggle sizes. We set breakpoints that made sense for the layout—think a handful of widths that cover your grid nicely. You don’t need twenty. Five or six, multiplied by the formats you care about, gives the browser plenty of choice via srcset without exploding the number of variants.

For the transformation engine, use something that’s fast and memory-friendly. I’ve had great luck with libvips-based stacks (Node’s sharp or direct libvips bindings) because they stream and don’t chew through RAM like some older tools. But the exact tool is less important than making sure your pipeline enforces guardrails: maximum width and height, allowed formats, and a whitelist for parameters. You do not want your origin rendering 12000-pixel-wide images because a query parameter accidentally went wild.
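
If you go the sharp route, the guardrailed transform can stay pleasantly small. This is a sketch assuming Node with the sharp package; the limits and the allowed-format list are illustrative, not recommendations.

```ts
import sharp from "sharp";

// Guardrailed transform using sharp (libvips under the hood).
const MAX_WIDTH = 4096;                                // illustrative ceiling
type OutFormat = "avif" | "webp" | "jpeg" | "png";
const ALLOWED_FORMATS = new Set<OutFormat>(["avif", "webp", "jpeg", "png"]);

async function transformImage(
  original: Buffer,
  width: number,
  fmt: OutFormat,
  quality: number
): Promise<Buffer> {
  if (!ALLOWED_FORMATS.has(fmt)) throw new Error(`format not allowed: ${fmt}`);
  if (!Number.isInteger(width) || width < 1 || width > MAX_WIDTH) {
    throw new Error(`width out of bounds: ${width}`);
  }

  return sharp(original)
    .rotate()                                          // honor EXIF orientation, then drop it
    .resize({ width, withoutEnlargement: true })       // never upscale past the original
    .toFormat(fmt, { quality })                        // AVIF/WebP/etc. at the preset quality
    .toBuffer();
}
```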

Origin Shield: The Quiet Hero That Saves Your Origin and Your Wallet

Let’s talk about the piece that made the biggest difference to our bills: origin shield. Think of it as a gatekeeper layer inside your CDN. Instead of every edge location hammering your origin on a miss, they funnel to a single “shield” tier. That shield tier does the heavy lifting, collects the miss once, and then distributes the hot object back to the edges. The result is fewer trips to origin, fewer duplicate renders for on-demand variants, and a calmer, cheaper world.

In my first serious rollout, we had a cluster doing on-demand AVIF/WebP generation behind the CDN. Before origin shield, a sudden traffic burst from multiple regions triggered a dogpile effect—multiple edges missed at once and all hit origin. After enabling shielding, everything lined up: one miss at the shield, a single render, and a cascade of hits across regions. That single change cut the “oh-no” spikes almost overnight.

If you’re using Cloudflare, the feature to look at is Tiered Cache. Their docs explain the shape of it better than I can in a paragraph, so here’s a pointer to the official page for a quick orientation: how Cloudflare’s Tiered Cache routes requests. The general principle applies no matter the vendor: pick a shielding region close to your origin, keep it stable, and let it absorb the chaos.

Two more things helped a lot. First, use stale-while-revalidate so that a sudden refresh doesn’t crater performance. Serving a slightly older image for a moment while the new one gets fetched is usually fine and way better than a thundering herd. Second, let the shield tier have a slightly longer TTL than the edges. That way, your shield’s cache stays warm, smoothing out global traffic.
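
For the stale-while-revalidate part, the header itself is simple; here's the shape I mean, with placeholder numbers. One caveat: standard Cache-Control can't give the shield a different TTL than the edges; that part lives in your CDN's configuration rather than in headers.

```ts
// Illustrative Cache-Control for an image variant response:
// - max-age: how long browsers keep it
// - s-maxage: how long shared caches (edge and shield tiers) keep it
// - stale-while-revalidate: how long a cache may serve a stale copy while refreshing
const IMAGE_CACHE_CONTROL =
  "public, max-age=86400, s-maxage=2592000, stale-while-revalidate=86400";
```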

Cache-Key Tuning: The Art of Not Splitting the Cache

This is where we earn our quiet victories. A cache key is the recipe the CDN uses to decide whether two requests are “the same.” If the key includes too much, you split the cache and kill your hit ratio. If it includes too little, you risk serving the wrong thing. The sweet spot is including exactly the variables that make a binary difference to the resulting bytes—and nothing else.

When we added AVIF and WebP, our first instinct was to slap Vary: Accept on everything. That works, but it can also fragment the cache, because Accept headers are noisy across browsers and versions. I prefer to normalize the Accept logic before it hits the cache key. In other words, map the request to a single “format” variable—say, fmt=avif, fmt=webp, or fmt=orig—based on a clean capability check, and then include only that normalized fmt in your cache key. Now you have exactly three variants instead of a hundred subtle Accept variations.

Same story for size. The browser doesn’t care whether the request had w=1201 or w=1200; if your pipeline rounds to known breakpoints, normalize to them and include the normalized width in the key. I like the cache key to contain: the canonical path to the original asset, the normalized width (or a dpr with a baseline width if that’s how you scale), the fmt, and a quality preset identifier—not the literal number—to avoid accidental fragmentation. Add a version or v parameter that you can bump on deploys when you want to invalidate old variants predictably.
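
Here's a sketch of that normalization, assuming the breakpoints and preset names below are ones you've already chosen for your layout, so swap in your own.

```ts
// Build a normalized cache key from exactly the variables that change the bytes.
const BREAKPOINTS = [320, 640, 960, 1200, 1600, 2000]; // illustrative widths

function snapToBreakpoint(requested: number): number {
  // Pick the nearest configured breakpoint (w=1201 and w=1200 become the same key).
  return BREAKPOINTS.reduce((best, b) =>
    Math.abs(b - requested) < Math.abs(best - requested) ? b : best
  );
}

interface CacheKeyInput {
  canonicalPath: string;                  // e.g. "products/blue-sneaker.jpg"
  requestedWidth: number;
  fmt: "avif" | "webp" | "orig";          // already normalized from Accept
  preset: "thumb" | "standard" | "hero";  // named preset, not a raw quality number
  version: number;                        // bump on deploys to invalidate predictably
}

function buildCacheKey(input: CacheKeyInput): string {
  const w = snapToBreakpoint(input.requestedWidth);
  return `v=${input.version}/fmt=${input.fmt}/w=${w}/q=${input.preset}/${input.canonicalPath}`;
}

// buildCacheKey({ canonicalPath: "products/blue-sneaker.jpg", requestedWidth: 1201,
//                 fmt: "avif", preset: "standard", version: 2 })
// => "v=2/fmt=avif/w=1200/q=standard/products/blue-sneaker.jpg"
```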

And here’s a sneaky one: strip irrelevant query strings. Tracking params like utm_source don’t change the bytes. If your CDN lets you customize the cache key, drop any query that doesn’t affect the result. Early on, we were seeing dozens of keys for the same image because of analytics params leaking into image URLs. One small rule later, the cache key count dropped and the hit ratio popped.
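
The query-string cleanup can be a small allowlist at the edge. Again a sketch; the allowed parameter names are the ones this hypothetical pipeline uses.

```ts
// Only parameters that change the bytes survive into the cache key and upstream URL.
const ALLOWED_PARAMS = new Set(["w", "h", "fmt", "q", "dpr", "v"]); // illustrative allowlist

function stripIrrelevantParams(url: URL): URL {
  const cleaned = new URL(url.toString());
  for (const name of [...cleaned.searchParams.keys()]) {
    if (!ALLOWED_PARAMS.has(name)) cleaned.searchParams.delete(name); // utm_source & friends go here
  }
  return cleaned;
}
```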

On the origin side, make sure you send headers that reinforce the behavior you want: explicit Content-Type, strong Cache-Control with public and a healthy max-age, and ideally ETag or Last-Modified for when revalidation is necessary. But for images that truly don’t change (especially if they have a version parameter in the URL), don’t be shy about long TTLs. That’s where the savings come from.
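
On the origin, the response for a versioned variant can look something like the sketch below, assuming a Fetch-style runtime (Node 18+ or an edge runtime); the hashed ETag is an implementation detail you might do differently.

```ts
import { createHash } from "node:crypto";

// Origin response for an immutable, versioned image variant.
// A strong ETag keeps revalidation cheap when a cache does check back in.
function variantResponse(body: Buffer, contentType: string, ifNoneMatch: string | null): Response {
  const etag = `"${createHash("sha256").update(body).digest("hex").slice(0, 16)}"`;

  if (ifNoneMatch === etag) {
    // Revalidation hit: no body, caches keep the copy they already have.
    return new Response(null, { status: 304, headers: { ETag: etag } });
  }

  return new Response(body, {
    status: 200,
    headers: {
      "Content-Type": contentType,                            // explicit, never guessed
      "Cache-Control": "public, max-age=31536000, immutable", // long TTL for versioned URLs
      ETag: etag,
    },
  });
}
```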

A Real-World Flow: From Request to Bytes on Screen

Let’s walk through the exact journey a request takes through a healthy pipeline, the way I set it up on a recent project. The user requests /media/products/blue-sneaker.jpg. At the edge, a tiny worker inspects Accept and decides: this browser supports AVIF, great. It also looks for w (requested width) and rounds to your nearest supported breakpoint—say 1200. The worker constructs a normalized, signed URL to your image service: something like /img/v2/fmt=avif/w=1200/path=products/blue-sneaker.jpg. That signature prevents abuse and guarantees the request is intentional.
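
Signing those normalized paths doesn't take much. Here's a sketch using Node's crypto module; the path shape mirrors the example above, and how you store and rotate the secret is up to you.

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// HMAC-sign the normalized transform path so only intentional
// parameter combinations ever reach the image service.
function sign(path: string, secret: string): string {
  return createHmac("sha256", secret).update(path).digest("hex");
}

function signedTransformUrl(path: string, secret: string): string {
  return `${path}?sig=${sign(path, secret)}`;
}

function verifyTransformUrl(path: string, sig: string, secret: string): boolean {
  const expected = Buffer.from(sign(path, secret), "hex");
  const given = Buffer.from(sig, "hex");
  return given.length === expected.length && timingSafeEqual(given, expected);
}

// signedTransformUrl("/img/v2/fmt=avif/w=1200/path=products/blue-sneaker.jpg", secret)
```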

Your CDN’s cache key is built from those normalized parameters: fmt=avif, w=1200, v=2, and the canonical path. The edge checks its cache. If there’s a miss, it forwards to the shield. The shield checks its cache. If there’s a miss there too, only then does your origin get a call.

Your image service pulls the canonical original from storage (object storage is my go-to), converts it to AVIF at your chosen quality preset, optionally strips metadata, and writes the variant back to a derivatives bucket with a predictable path. It sends the response with strong caching headers. The shield stores it. The edge grabs it from the shield, stores it, and sends it to the user. Subsequent edges fetch from the shield instead of the origin. The whole thing feels like the internet working with you instead of against you.

This is also where stale-while-revalidate shines. If the edge sees the object is stale but still present, it can serve it immediately and refresh in the background. Users stay happy, your graphs stay boring, and your origin stays under capacity.

The Guardrails: Safety, Quality, and Escape Hatches

I’ve learned to set firm boundaries for image services. Always validate the original path against a whitelist. Keep a maximum width and height. Keep a whitelist of formats, even if your library theoretically supports more. Set a quality floor so no one accidentally ships a variant that looks like it’s been faxed twice. And always sign transform URLs. That one habit is the difference between a quiet weekend and a runaway render loop that wakes you up at 4 a.m.
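
A small validation layer in front of the transform engine covers most of those boundaries. The prefixes, limits, and preset names below are illustrative; the point is that anything outside the allowlist never touches a pixel.

```ts
// Request-level guardrails, checked before any transform work happens.
const ALLOWED_PREFIXES = ["products/", "banners/", "heroes/"]; // illustrative whitelist
const ALLOWED_FMTS = new Set(["avif", "webp", "orig"]);
const ALLOWED_PRESETS = new Set(["thumb", "standard", "hero"]);
const MAX_DIMENSION = 4096;

interface TransformRequest {
  path: string;
  width: number;
  fmt: string;
  preset: string;
}

// Returns an error message, or null when the request is allowed.
function validateTransform(req: TransformRequest): string | null {
  if (req.path.includes("..")) return "path traversal rejected";
  if (!ALLOWED_PREFIXES.some((p) => req.path.startsWith(p))) return "path not allowed";
  if (!Number.isInteger(req.width) || req.width < 1 || req.width > MAX_DIMENSION) {
    return "width out of bounds";
  }
  if (!ALLOWED_FMTS.has(req.fmt)) return "format not allowed";
  if (!ALLOWED_PRESETS.has(req.preset)) return "unknown quality preset";
  return null;
}
```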

Another gentle caution: not every image wants to be AVIF. Logos with sharp edges sometimes look better as crisp PNGs, or you can use WebP at higher quality for a nice middle ground. When in doubt, test with real assets on real devices. I like grabbing a handful of representative images—photography, UI icons, transparent overlays—and running them through conversions with my usual presets. You’ll quickly see where to draw the line.

And if you’re using responsive HTML images (srcset and sizes), remember the browser is in charge. Give it useful choices. Don’t ship twenty sizes. Pick 4–8 that make sense for your design’s breakpoints. You’ll get most of the benefit without multiplying your variants to infinity.
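
If you're generating markup programmatically, a tiny helper keeps the breakpoint list in one place. The URL shape here is the illustrative one from earlier; format selection stays with the Accept negotiation at the edge, so srcset only varies width.

```ts
// Turn a short list of breakpoints into a srcset string the browser can choose from.
const SRCSET_WIDTHS = [320, 640, 960, 1200, 1600, 2000]; // 4-8 widths is plenty

function buildSrcset(path: string, version = 2): string {
  return SRCSET_WIDTHS
    .map((w) => `/img/v${version}/w=${w}/path=${path} ${w}w`)
    .join(", ");
}

// buildSrcset("products/blue-sneaker.jpg")
// => "/img/v2/w=320/path=products/blue-sneaker.jpg 320w, /img/v2/w=640/... 640w, ..."
```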

Choosing Your Tools: Pragmatic Options That Actually Ship

For the transform engine, I gravitate toward stacks built on libvips because they’re fast and memory-efficient. Node’s sharp is a common choice, but I’ve also used Go and Rust wrappers depending on the team. If your CDN offers built-in transforms, that can be a great path, especially if it natively understands Accept, width, and dpr. Edge transforms are fantastic for reducing origin round-trips, but you still want a shield-style pattern so you’re not doing the same work in ten regions at once.

For storage, an object store with versioned paths is your friend. I stick to a predictable folder structure: /originals/ for the full-fidelity uploads, /variants/ for derivatives keyed by normalized parameters. That makes purging painless. When you bump v=3 in your transform URLs, nothing weird happens with old variants; they’ll age out naturally and live happily beside the new ones while caches transition.

If this is part of a WordPress setup, or anything CMS-like, you already know media handling can get messy. When I need to move media off the app server and control caching end-to-end, I use an object store and a CDN in front. If you’re doing something similar, you might enjoy how I approach the basics in moving WordPress media to S3-compatible storage with signed URLs and cache invalidation. It pairs nicely with the pipeline we’re building here.

Measurement: Because Wins Don’t Count Unless You See Them

I once thought I’d “know” when the pipeline was working. Then a dashboard proved me wrong. A good measurement setup makes your progress obvious. First, watch your hit ratio at the edge and the shield. They should climb as you normalize cache keys. If you see a drop, it usually means a parameter slipped into the key or a new variant type got introduced by accident.

Second, watch origin egress after enabling shielding. If you tightened TTLs but didn’t add stale-while-revalidate, you might see more origin traffic than expected. Third, look at errors from the image service. If you start rejecting more requests because of parameter validation, that’s actually good—those were the requests trying to push your origin into overdrive.

Finally, test on slow devices and networks. Optimizing formats and caching is great, but the real win is user-perceived speed. Images should arrive quickly and look clean. I keep a little ritual where I load the top 10 pages on a throttled connection after major changes. It’s amazing how quickly you can hear when a page “feels” better.

Common Mistakes I Keep Seeing (And How to Dodge Them)

First up: over-fragmented cache keys. If your cache key includes raw Accept, a dozen vanity query params, and a fingerprint of the moon phase, your hit ratio will crater. Normalize early, include only what changes the bytes, and throw out the rest. Second: skipping an origin shield. Without it, multiple regions all miss at once, and your origin carries the pain. Third: compressing logos like photos. Vector-like art hates heavy compression. Keep a line between photographic content and UI graphics.

Fourth: no guardrails on transform parameters. It’s only a matter of time before a strange combination of w, h, fit, and bg turns into an expensive render. Validate and sign everything. Fifth: forgetting that not all browsers support the same formats. That’s why we negotiate with Accept instead of guessing. If you need a refresher on WebP’s nuances and helpers, the WebP docs are still my go-to bookmark.

One more that’s sneaky: leaving animations as GIFs. If you can, move them to video containers or modern alternatives; even a short looping MP4 or WebM can be a massive quality and size upgrade over a chunky GIF. Your users and your CDN will both breathe easier.

Quality Settings Without the Anxiety

Everyone asks for the “right” quality values. The answer, annoyingly, is “it depends.” But here’s a calm way to pick them. Grab a set of representative images: a portrait with skin tones, a product with crisp edges, something with gradients. Convert to AVIF and WebP at a handful of presets—think low, medium, and high. View them on a phone and a laptop at normal distances. If you can’t tell the difference from the original at the preset you want to ship, that’s your number.

In practice, I often end up with AVIF at a slightly lower numerical quality than WebP for similar perceived detail. Photographic content tolerates more compression than line art. And remember: if your HTML is doing srcset correctly, you’re already saving bandwidth by serving smaller images to smaller viewports. Quality and size work together. Don’t let either carry the whole burden.
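
In config form, that usually ends up as a small named-preset table, something like the sketch below, where every number is a starting point to test against your own assets rather than a recommendation.

```ts
// Named quality presets per format. AVIF typically sits a bit lower numerically
// than WebP for similar perceived quality. All values here are illustrative.
const QUALITY_PRESETS = {
  thumb:    { avif: 45, webp: 60 },
  standard: { avif: 55, webp: 72 },
  hero:     { avif: 65, webp: 80 },
} as const;

type PresetName = keyof typeof QUALITY_PRESETS;
type PresetFmt = keyof (typeof QUALITY_PRESETS)["standard"];

function qualityFor(preset: PresetName, fmt: PresetFmt): number {
  return QUALITY_PRESETS[preset][fmt];
}
```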

Edge vs Origin Transforms: What I Actually Choose

People expect me to say “always edge transforms,” but I don’t. If your CDN’s transform engine is reliable and you’re comfortable with the cost model, edge transforms can be wonderful—lower latency, fewer round-trips, and nice built-in Accept handling. I use them when the app is simple or when the team doesn’t want to maintain a separate image service.

But on bigger teams, or when I need custom logic (like per-merchant branding rules or watermarking), I still love an origin-side service behind a strong shield and smart cache keys. I sleep well knowing I can roll changes, run canary tests, and version presets without touching the edge logic too often. Both paths can be excellent. Pick based on your team’s comfort and your project’s shape.

Putting It All Together: The Checklist I Keep Reusing

Normalize and store originals

Settle on sRGB. Strip metadata unless required. Keep originals pristine and predictable. If you ever need to regenerate variants, you’ll be grateful you were picky early.

Decide on breakpoints and presets

Pick a sane set of widths for your layout, and decide on quality presets for AVIF and WebP. Fewer, smarter choices beat a giant matrix.

Build the negotiation layer

Use the Accept header to choose a format without relying on user agent strings. Normalize that choice into a clean “fmt” variable for your cache key.

Tune cache keys

Include only normalized width (or DPR), fmt, a quality preset name, the canonical original path, and a version parameter. Strip anything that doesn’t change the bytes.

Add an origin shield

Turn on shielding so only one tier talks to your origin on cache misses. Pair it with stale-while-revalidate to keep edges snappy during refreshes.

Secure the transforms

Validate parameters, sign URLs, enforce maximums. Limit formats. If it’s not in your allowlist, it doesn’t happen.

Measure and iterate

Watch cache hit ratios, origin egress, and your image service error logs. When numbers jump, figure out why—celebrate the good jumps, fix the bad ones.

A Small Story About a Big Win

One of my clients was gearing up for a product launch with a homepage that was basically a love letter to high-resolution imagery. Gorgeous, but heavy. We went live with a basic WebP setup and saw decent gains, but the origin was still getting jabbed during traffic spikes. Two nights later, we added an origin shield, normalized the cache key to a simple set of variables, and turned on stale-while-revalidate. The next spike looked boring. Beautifully boring.

We followed up by introducing AVIF for browsers that could handle it, keeping WebP as fallback. That change, plus a small tweak to our quality presets after visual testing, brought payload down further without sacrificing the vibe. Somewhere in there, I realized our cache key still included some stray marketing query strings. We stripped those, hit ratios climbed again, and the CDN bill took a quiet step down.

The team didn’t brag about it. They just got back to designing. The best infrastructure wins are the ones that turn into background noise.

If You’re Starting Tomorrow, Start Here

If all this feels like a lot, take the first step that gives you leverage. Enable format negotiation and serve WebP or AVIF where supported. Then add a shield. Then clean up your cache key. Each step pays for itself in smoother traffic and fewer surprises. You don’t need a perfect setup out of the gate. You need a good one that you can iterate on.

If you want to explore format behavior hands-on, I still recommend a few quick experiments in Squoosh to build your intuition. And for a deeper look at routing and transport-level performance that nicely complements image work, remember that protocols matter too—HTTP/2 and HTTP/3 can make big pages feel lighter, especially when your images are already efficient.

Wrap-Up: The Calm, Cheap, Fast Way to Ship Images

Lukewarm coffee moments aside, I’ve come to enjoy building image pipelines. They reward patience with compounding wins. The pattern is simple: let the browser tell you what it can handle, generate a perfect-fit version just once, teach the CDN to cache that exact thing everywhere, and keep your origin blissfully unaware of most of the traffic.

To recap: use AVIF and WebP where you can, but keep friendly fallbacks for the holdouts. Normalize your cache key so you only include what changes the bytes. Turn on an origin shield so your origin sees a fraction of the requests. Add stale-while-revalidate to keep edges peppy. Guard your transform service with parameter validation and signed URLs. And measure enough to know when you’ve won.

I hope this gave you a practical blueprint—and maybe a little confidence—to build a pipeline that makes your pages load fast and your bills feel fair. If you try this and run into a quirky edge case, you’re not alone; we’ve all shipped a squishy logo or two. Iterate, keep the guardrails, and you’ll get there. Hope this was helpful! See you in the next post, and may your caches stay warm and your graphs pleasantly boring.

Frequently Asked Questions

Should I serve AVIF, WebP, or both?

Great question! I recommend supporting both. AVIF often delivers the smallest files for photos, while WebP has wide support and shines for many graphics. Negotiate by the Accept header and serve AVIF first when available, then WebP, and fall back to the original format if neither is supported.

Should I pre-generate image variants or create them on demand?

Both can work, but I usually generate on demand with a write-through cache. The first request pays for the transform, then it’s cached at the shield and edges, so subsequent requests are instant. If your use case is ultra-predictable, pre-generating a handful of common sizes is also a solid choice.

How do I use the Accept header without fragmenting my cache?

Normalize before caching. Instead of varying directly on the raw Accept header, map it to a clean fmt value like avif, webp, or orig and include only that in your cache key. Do the same for width by rounding to known breakpoints. Strip unrelated query strings so tracking params don’t fragment your cache.