Technology

How I Host Node.js in Production Without Drama: PM2/Systemd, Nginx, SSL, and Zero‑Downtime Deploys

So there I was one Thursday night, sipping cold coffee and waiting for a Node.js app to “just restart” after a deploy. You know that hold-your-breath moment when you hit restart, stare at the terminal, and silently promise the internet you’ll never cowboy deploy again? Yeah, that one. The site blinked out for a few seconds, a customer pinged me on chat, and I felt that sting I’ve felt a dozen times before: we could do better. That was the night I got serious about a no-drama production setup for Node—PM2 or systemd to supervise the process, Nginx in front like a polite bouncer, TLS locked down, and a deploy routine that doesn’t shrug and hope.

Ever had that moment when the app works great on your laptop but collapses in the wild? Or you’re stuck deciding between PM2 and systemd, doubting your Nginx config, or dreading the next deploy? In this friendly tour, I’ll walk you through the exact mental model and toolkit I keep reusing. We’ll set expectations, wire Nginx as a reverse proxy, grab clean SSL, and build a truly zero‑downtime release flow—without turning your server into a science project. I’ll share the small decisions that save weekends, the gotchas that bit me, and the little habits that make production feel boring in the best possible way.

The Mental Model: App, Proxy, Pipeline

Where everything lives (and why it matters)

When I think about a production setup, I picture three layers having a quiet conversation. The app is your Node process running on an internal port or a Unix socket—simple, focused, and kept safe from the public internet. Then you have Nginx out front, speaking HTTP/2 or HTTP/3 to the world, terminating TLS, handling keep-alives, and passing requests along with the right headers. Finally, there’s the deployment pipeline—your releases, logs, secrets, and all the guardrails that make changes feel uneventful.

It all clicks when you remember that Nginx is the gatekeeper. It knows how to do TLS well, it’s comfortable proxying WebSockets, and it’s happy to retry if a backend stumbles for a second. Your Node app, meanwhile, can concentrate on what it does best—serving your business logic—and let a process manager keep it alive. In practice that means either PM2 or systemd will watch the process, auto-start on boot, and make sure memory blips don’t become outages.

Ports, sockets, and the trust boundary

Here’s the thing: exposing your Node app directly on the public internet is like leaving a half-closed front door. I’ve done it in a pinch, but it never feels right. Instead, bind your app to 127.0.0.1:3000 or to a Unix socket file that only Nginx can access. Nginx then becomes the boundary where TLS happens, where client IPs are recorded correctly, and where you can add rate limiting or caching later without re-architecting your app. It’s not glamorous, but it’s the kind of foundation that saves you from late-night debugging.
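
If you go the socket route, here’s a minimal sketch of the Node side, assuming a hypothetical socket path at /run/myapp/myapp.sock. Nginx would then point its upstream at server unix:/run/myapp/myapp.sock; instead of an IP and port:

// Minimal sketch: bind to a Unix socket instead of a TCP port.
// The path below is an assumption; pick one your nginx user can reach.
const fs = require('fs');
const http = require('http');

const SOCKET = '/run/myapp/myapp.sock';
if (fs.existsSync(SOCKET)) fs.unlinkSync(SOCKET); // clear a stale socket from a crashed run

const server = http.createServer((req, res) => res.end('ok'));
server.listen(SOCKET, () => {
  fs.chmodSync(SOCKET, 0o660); // group read/write so Nginx can connect
  console.log(`Listening on ${SOCKET}`);
});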

PM2 vs Systemd: Pick Your Style (Both Work)

When PM2 shines

I reach for PM2 when I want a tiny bit of Node-flavored magic. It’s dead simple to start, supports cluster mode out of the box, and has a reload command that actually replaces workers gracefully. If you’ve ever tried to keep multiple Node processes in sync across deploys, you’ll appreciate just typing pm2 reload and watching the handoff happen with no downtime. The PM2 documentation is refreshingly practical, and for a lot of solo projects and small teams, PM2 covers 95% of what you need.

When systemd is your steady friend

On the other hand, systemd is already on your server, and it’s battle-tested. I like systemd when I want fewer moving parts, tight integration with the OS, tidy logs, and straightforward resource controls. It’s the boring choice, but boring is a compliment in production. The trade-off? You don’t get PM2’s cluster orchestration built-in. You can still do zero‑downtime deploys with systemd—think blue/green units or socket activation—but it’s a touch more DIY. If you’re already running a platform with consistent systemd units, it’s an elegant fit.

Graceful shutdown: the one habit that saves you

No matter which supervisor you choose, teach your app to leave the stage politely. That means catching SIGTERM and SIGINT, stopping new requests, and letting in-flight requests finish. Without this, restarts can murder active connections. With it, the difference is night and day.

const http = require('http');
const express = require('express');
const app = express();

// Your routes here
app.get('/healthz', (req, res) => res.status(200).send('ok'));

const server = http.createServer(app);
server.listen(process.env.PORT || 3000, '127.0.0.1', () => {
  console.log('Server listening');
});

// Graceful shutdown
let shuttingDown = false;

function shutDown() {
  if (shuttingDown) return;
  shuttingDown = true;
  console.log('Received signal, shutting down gracefully...');
  server.close(err => {
    if (err) {
      console.error('Error during shutdown', err);
      process.exit(1);
    }
    console.log('Closed out remaining connections');
    process.exit(0);
  });

  // Optional hard timeout in case something hangs
  setTimeout(() => process.exit(1), 10000).unref();
}

process.on('SIGTERM', shutDown);
process.on('SIGINT', shutDown);

If you want the deeper why and how behind signals, the Node.js signal events page is a nice quick read. The gist is: let the process manager signal the app, and let your app exit naturally once it’s safe.

PM2 quickstart I keep coming back to

Here’s a tiny ecosystem file I reuse. It sets your app to cluster mode (using all available CPU cores), caps memory, and plays nicely with graceful reloads:

// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: 'myapp',
      script: './server.js',
      exec_mode: 'cluster',
      instances: 'max',
      env: {
        NODE_ENV: 'production',
        PORT: 3000
      },
      max_memory_restart: '500M',
      listen_timeout: 8000,
      kill_timeout: 5000
    }
  ]
};

# Start, save, and enable startup on boot
pm2 start ecosystem.config.js
pm2 save
pm2 startup

# Later, deploy updates with zero downtime
pm2 reload myapp

In my experience, this fits small to medium apps beautifully. You get multi-core performance and a reload that swaps workers without dropping connections. It’s the happy path for a lot of teams.

Systemd unit I trust as a baseline

Prefer systemd? This unit covers the essentials: environment, working directory, restart policy, and lifecycle:

# /etc/systemd/system/myapp.service
[Unit]
Description=My Node.js App
After=network.target

[Service]
Type=simple
WorkingDirectory=/var/www/myapp/current
ExecStart=/usr/bin/node server.js
Restart=always
RestartSec=2
Environment=NODE_ENV=production
Environment=PORT=3000
# Optional: tune file descriptors
LimitNOFILE=65535

# Let Node handle graceful shutdown on SIGTERM
KillSignal=SIGTERM
TimeoutStopSec=15

[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl enable --now myapp
# Redeploys typically do:
systemctl restart myapp

With systemd, zero‑downtime depends on your strategy. A simple restart can be effectively seamless if your app is quick to boot and shuts down gracefully. If you need true never-a-blip deploys under heavy load, you can add a blue/green pattern with two templated units, then switch traffic at Nginx. It’s more steps, but it’s rock solid once scripted.

Nginx Reverse Proxy: The Quiet Hero

Why a reverse proxy makes life easier

Think of Nginx as your venue security: checks IDs (TLS), keeps the line moving (keep-alive, HTTP/2), and quietly handles weirdness (timeouts, buffering). Your Node app doesn’t have to juggle TLS, static assets, and upstream headers all on its own. Nginx is light, reliable, and happily runs forever in the background.

A production-ready Nginx snippet

This is the configuration I’d paste into a fresh server and sleep well. It handles WebSockets, preserves real client IPs, and sets sane timeouts. You’ll bolt TLS onto this in the next section.

# /etc/nginx/conf.d/myapp.conf

# Map the Upgrade header: "upgrade" for WebSocket handshakes, empty
# otherwise, so plain requests can reuse the upstream keepalive pool.
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      '';
}

upstream myapp_upstream {
    server 127.0.0.1:3000;
    keepalive 64;
}

server {
    listen 80;
    listen [::]:80;
    server_name example.com www.example.com;

    # ACME challenge for Let's Encrypt
    location /.well-known/acme-challenge/ {
        root /var/www/letsencrypt;
    }

    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name example.com www.example.com;

    # ssl_certificate and ssl_certificate_key will be added by Certbot (next section)

    # If you serve static assets directly
    location /assets/ {
        alias /var/www/myapp/current/public/assets/;
        access_log off;
        expires 7d;
    }

    location / {
        proxy_pass http://myapp_upstream;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support (uses the map defined at the top of this file)
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;

        # Timeouts and buffering
        proxy_connect_timeout 5s;
        proxy_send_timeout 30s;
        proxy_read_timeout 60s;
        send_timeout 60s;

        # Optional: tune buffering for large payloads
        client_max_body_size 20m;
    }
}

If you’re keen to squeeze every drop of performance, enabling HTTP/2 and even HTTP/3 with QUIC is a treat. I walked through that in more detail in the end-to-end guide to HTTP/2 and HTTP/3 on Nginx + Cloudflare, and the same mindset maps nicely to Node backends.

Real HTTPS: Let’s Encrypt, TLS 1.3, and HSTS without the Panic

Getting a certificate the easy way

I remember doing certs by hand years ago with OpenSSL commands sprawled across my notes. These days, Let’s Encrypt and Certbot make it feel like autopilot. You point Certbot at Nginx, it validates via a quick HTTP challenge, and then it wires the certificates into your server blocks automatically. If you haven’t tried it yet, the Certbot guide is blissfully straightforward.

# On Ubuntu/Debian
apt-get update
apt-get install -y certbot python3-certbot-nginx

# Obtain and install certs for your domain
certbot --nginx -d example.com -d www.example.com

# Test renewal
certbot renew --dry-run

Once you’ve got TLS in place, you can raise the bar with TLS 1.3, OCSP stapling, and HSTS. I won’t drag you through cipher lists here, but if you want a friendly deep dive, I wrote up the exact playbook I reuse in TLS 1.3 Without Tears: OCSP Stapling, HSTS Preload, and PFS on Nginx/Apache. It’s the kind of setup that keeps modern browsers happy and keeps you off the SSL error screens.

Polish that helps in production

A couple small touches pay off over time. Redirect HTTP to HTTPS cleanly. Serve static assets with caching headers (just enough to be useful, not enough to trap stale files forever). If you’re uploading bigger payloads—images, CSVs—nudge client_max_body_size to something comfortable. And always keep a health endpoint, like /healthz, that returns quickly even under load. Nginx can bypass auth and route to it directly, which makes uptime checks painless.
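
As a sketch, the health route can get its own tiny location block inside the HTTPS server from earlier, assuming your app answers /healthz as in the graceful-shutdown example:

# Hypothetical addition to the HTTPS server block above
location = /healthz {
    access_log off;                   # keep uptime pings out of your logs
    proxy_pass http://myapp_upstream; # straight to the app, no auth, no caching
}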

Zero‑Downtime Deploys: Releases, Symlinks, and Safe Rollbacks

That first time you deploy without a blip

One of my clients once DM’d me a skeptical “Did you deploy? I didn’t feel anything.” That’s the level we’re aiming for. The recipe that works for me is simple: each release gets its own folder, we point a current symlink at the active one, and the process manager reloads gracefully. If something smells off, flip the symlink back and reload again. No fishing in Git history at 2 a.m., no “what build is on the server?” mysteries.

The release layout I reuse everywhere

/var/www/myapp/
  releases/
    2024-08-19-1500/
    2024-08-20-0910/
  current -> /var/www/myapp/releases/2024-08-20-0910
  shared/
    .env
    uploads/

During deploy, rsync the new build into a fresh release folder, link shared paths like uploads and your .env, run any migrations, and only then switch current to the new release. That last step takes a blink. After the swap, reload the app. With PM2, that’s a clean pm2 reload. With systemd, a restart paired with graceful shutdown is often near-seamless, especially if your app boots fast.

A tiny deploy script for PM2

#!/usr/bin/env bash
# Assume you run this from your CI/CD runner
set -euo pipefail

APP=myapp
ROOT=/var/www/$APP
RELEASES=$ROOT/releases
RELEASE=$(date +%Y-%m-%d-%H%M%S)
NEW=$RELEASES/$RELEASE

ssh myserver "mkdir -p $RELEASES $ROOT/shared"
rsync -az --delete ./dist/ myserver:$NEW/
ssh myserver "ln -sfn $ROOT/shared/.env $NEW/.env; ln -sfn $NEW $ROOT/current; cd $ROOT/current && pm2 startOrReload ecosystem.config.js --only $APP"

That last command is the magic: startOrReload means PM2 will reload workers if the app is already running, or start it if it’s not, all without dropping traffic. If you want a fuller walkthrough, I wrote a detailed, VPS-friendly recipe in my zero‑downtime CI/CD guide with rsync, symlinks, and systemd. The ideas carry over 1:1 to Node.js.

Doing it with systemd (the calm way)

With systemd, you can keep a single unit and rely on your app’s graceful shutdown, or go the extra mile with blue/green units. The simple path looks like this: build release, flip symlink, systemctl restart myapp. If your app starts quickly and your Nginx proxy has short keepalive timeouts, most users won’t notice a thing. If you need ironclad zero‑downtime under heavy load, spin up two units—myapp@blue and myapp@green—have Nginx point to both upstreams, warm the new one, then drain the old one, and finally stop it. It’s extra ceremony, but entirely scriptable.
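
Here’s a rough sketch of the Nginx side of that dance, assuming two hypothetical units (myapp@blue on port 3001, myapp@green on 3002): warm the idle color, swap which server is marked down, and reload Nginx.

# Hypothetical blue/green upstream; the ports are assumptions.
# Deploy flow: systemctl start myapp@green, wait for /healthz,
# swap the "down" markers below, then: nginx -s reload
upstream myapp_upstream {
    server 127.0.0.1:3001;          # myapp@blue (live)
    server 127.0.0.1:3002 down;     # myapp@green (idle, warming)
    keepalive 64;
}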

Health checks and rollbacks

Set up a health check endpoint that tells the truth—ideally confirming your app can speak to its database and critical services. After a deploy, hit it a few times via Nginx, not just localhost. If anything feels off, flip the symlink back and reload. Rollbacks should be boring. When they are, you’re free to be brave with small, frequent releases instead of fear-driven big bangs.
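
To make that reflex concrete, here’s a hedged sketch of a rollback script under the release layout from earlier. It assumes the release folders sort chronologically by name:

#!/usr/bin/env bash
# Point "current" at the previous release and reload.
# Paths assume the /var/www/myapp layout shown above.
set -euo pipefail

APP=myapp
ROOT=/var/www/$APP
PREV=$(ls -1d "$ROOT"/releases/* | tail -n 2 | head -n 1)

ln -sfn "$PREV" "$ROOT/current"
pm2 reload "$APP"   # or: systemctl restart myapp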

If you’re still mapping out your end-to-end dev-to-live motion, I shared my go-to routines in the no‑stress dev–staging–production workflow. Different stack, same calm principles.

Logs, Monitoring, and Other Quiet Superpowers

Logs you actually read

PM2 writes its own logs; systemd streams to the journal. Either way, keep things tidy and searchable. Don’t dump chatty debug logs into production unless you’re in the middle of a known incident. I like JSON logs for services that funnel into a centralized tool, but plain text is fine if you’re tailing locally. The point is consistency. Make sure your errors land where you’ll actually see them, not just in a folder you swear you’ll check later.
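
If you want a starting point for JSON logs, here’s a minimal sketch using pino (my choice here, not a requirement; any structured logger works the same way):

// Minimal structured logging sketch with pino (npm install pino).
const pino = require('pino');
const logger = pino({ level: process.env.LOG_LEVEL || 'info' });

logger.info({ route: '/checkout', ms: 42 }, 'request served');
logger.error({ err: new Error('demo') }, 'payment call failed'); // errors keep their stack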

Uptime and metrics without drama

I’m a big fan of pragmatic monitoring. A simple uptime checker hitting /healthz every 30 seconds is better than a thousand unmonitored dashboards. When you’re ready to level up your observability stack, this getting-started guide to Prometheus, Grafana, and Uptime Kuma shows how I wire up alerts and graphs without turning it into a second job. Tie alerts to the same channels you actually read, and don’t drown yourself in noise. Calm, useful monitoring beats aggressive, ignored monitoring every day of the week.
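
Before you reach for a full monitoring stack, even a cron-based poke works. Here’s a hypothetical entry; cron only ticks once a minute, which is plenty to start, and the mail command is an assumption about your server:

# crontab -e: poll the health endpoint once a minute, email on failure
* * * * * curl -fsS --max-time 5 https://example.com/healthz > /dev/null || echo "myapp healthz failed" | mail -s "myapp down" you@example.com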

Security and the little locks that matter

Don’t forget the basics. Open only the ports you need (80/443 for Nginx, maybe SSH on a custom port). Keep Node behind Nginx on localhost. Rotate secrets—.env files belong in your shared folder, not baked into builds or Git. If your app processes uploads, scan and validate them. And of course, patch your system occasionally; boring maintenance is better than exciting incidents.
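
On Ubuntu/Debian, that firewall posture is a few ufw commands. A sketch, assuming SSH moved to a hypothetical port 2222:

ufw allow 2222/tcp   # SSH on a custom port (assumption)
ufw allow 80/tcp     # HTTP (redirects to HTTPS)
ufw allow 443/tcp    # HTTPS
ufw enable           # answers a confirmation prompt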

Putting It All Together: A Calm Runbook

A day in the life of your production stack

Let’s walk through a normal day. Your app boots under PM2 or systemd, bound to 127.0.0.1:3000. Nginx sits in front, handling TLS and proxying requests with the right headers. A user lands on your homepage over HTTP/2, Nginx keeps the connection warm, and your Node app does its thing, logging to a stream you’ll actually read later. If a worker hiccups, PM2 restarts it; if the process crashes, systemd brings it back. You sleep.

Later, a deploy comes along. CI builds into a fresh release folder, links shared files, flips the current symlink, and triggers a reload. PM2 rotates workers without dropping requests. Systemd restarts with a graceful shutdown hook, and Nginx politely retries. Your uptime alert sits there quietly, because there’s nothing to report. Meanwhile, TLS certs renew themselves behind the scenes—another job you’re no longer doing at 1 a.m.

If you want to go further on TLS polish, I’ve got a handy checklist in my practical TLS 1.3 + Brotli tune‑up for Nginx. It pairs perfectly with the Nginx block we sketched earlier.

Where to nudge for speed

You don’t need to over-optimize from day one, but there are easy wins. Serve static files directly from Nginx. Cache unchanging assets for a few days. Put your health checks on a separate path that’s fast by design. If you rely on external APIs, set sensible timeouts and fallbacks so a slow vendor doesn’t freeze your entire app. And keep an eye on memory—restart policies are your safety net, not a symptom of failure.
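
For the vendor-timeout point, here’s a small sketch using Node 18+’s built-in fetch with AbortSignal.timeout. The URL and fallback shape are assumptions:

// Don't let a slow vendor freeze your request handlers.
async function getQuote() {
  try {
    const res = await fetch('https://api.vendor.example/quote', {
      signal: AbortSignal.timeout(3000), // hard 3s cap on the vendor call
    });
    if (!res.ok) throw new Error(`vendor responded ${res.status}`);
    return await res.json();
  } catch (err) {
    return { quote: null, degraded: true }; // sensible fallback instead of hanging
  }
}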

Small Gotchas I Learned the Hard Way

Headers and real client IPs

Make sure your app trusts Nginx as a proxy and reads the right headers. In Express, that's app.set('trust proxy', 1) when you're behind Nginx. Otherwise, you'll think every visitor is 127.0.0.1 and your rate limits or logs will be way off.
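
Here’s the two-line version, plus a quick way to verify it’s working. The /whoami route is just a hypothetical debugging aid:

const express = require('express');
const app = express();

app.set('trust proxy', 1); // trust exactly one hop: our Nginx

app.get('/whoami', (req, res) => {
  // With trust proxy set, req.ip comes from X-Forwarded-For
  // and req.protocol reflects X-Forwarded-Proto.
  res.json({ ip: req.ip, proto: req.protocol });
});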

WebSockets and the secret handshake

WebSockets need the Upgrade and Connection headers set. Forgetting those is a classic “works locally, dies in prod” moment. The Nginx snippet above takes care of it.

Body sizes and timeouts

Uploads bigger than a few megabytes? Bump client_max_body_size on Nginx and make sure your Node parsing middleware isn’t too strict either. Also, ensure your Node server has reasonable timeouts. Hanging sockets can look like random slowness for users.
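
As a sketch, the Node side of that advice looks like this. The limits are assumptions you’d tune to your traffic:

const express = require('express');
const http = require('http');

const app = express();
app.use(express.json({ limit: '20mb' })); // match Nginx's client_max_body_size

const server = http.createServer(app);
server.keepAliveTimeout = 65000; // a touch longer than Nginx's keepalive to the upstream
server.headersTimeout = 66000;   // must exceed keepAliveTimeout
server.requestTimeout = 30000;   // fail hung requests instead of letting users stare

server.listen(3000, '127.0.0.1');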

Graceful shutdown really is the secret sauce

I’ll repeat this because it cured so many phantom bugs for me: listen for SIGTERM, stop accepting new connections, let in-flight requests finish, and exit. PM2 reloads and systemd restarts go from scary to boring the day you wire this in.

Wrap‑Up: A Calm, Friendly Production Stack You’ll Reuse

There’s no one true way to host a Node app in production, but there is a way to make it feel calm. Let Nginx guard the door and speak TLS fluently. Let PM2 or systemd keep your process upright. Teach your app to leave the stage gracefully. Deploy with symlinked releases so rollbacks are a snap. And keep a humble health check that tells you, in plain language, that everything is OK.

If you want to keep building from here, layer in monitoring and a few simple alerts. Make your TLS strong but not fussy. And don’t be afraid to practice your deploy and rollback routine on a quiet afternoon; it’s amazing how much confidence you get when you’ve rehearsed. If you’re curious about tightening the bolts further, check out my beginner-friendly monitoring setup and the earlier links on TLS and HTTP/3. I hope this guide saves you from at least one midnight restart. If it did, I’ll count that as a win. See you in the next post, and may your deploys be pleasantly boring.

Frequently Asked Questions

Should I use PM2 or systemd to run Node.js in production?

Great question! You can’t really go wrong with either. PM2 is fantastic when you want easy clustering and true zero‑downtime reloads with a single command. Systemd is rock‑solid, already on your server, and integrates well with logs and resource limits. If you value simplicity, PM2 feels like home. If you prefer fewer dependencies and OS‑native tooling, go with systemd—just be ready to script blue/green or socket activation if you need perfect zero‑downtime.

How do I configure Nginx to proxy Node.js and WebSocket traffic correctly?

Here’s the deal: set proxy headers correctly and upgrade the connection. At minimum, add X‑Forwarded‑For, X‑Forwarded‑Proto, and Host so your app sees the real client and scheme. For WebSockets, include Upgrade and Connection headers. Also enable keep‑alive on your upstream to keep things snappy. With those in place, Nginx will happily proxy both HTTP and WebSocket traffic to your Node app.

What’s the easiest way to get zero‑downtime deploys?

The simplest path is PM2 with a symlinked release structure. Ship to a new release folder, flip the current symlink, then run pm2 reload—workers will roll over gracefully without dropped requests. If you’re using systemd, you can still get near‑zero downtime with fast boots and proper SIGTERM handling, or go all‑in with blue/green units and switch traffic in Nginx. Whichever route you choose, a /healthz endpoint and a quick rollback plan are your best friends.