Technology

MySQL and PostgreSQL Replication on VPS: High Availability and Automatic Failover

Running MySQL or PostgreSQL on a single VPS works fine until the first serious incident: a kernel panic, disk failure, data center network issue, or even a bad OS update. At that moment you do not care about benchmark numbers; you only care about how fast you can recover and how much data you might lose. That is exactly what database replication, high availability (HA) and automatic failover are designed to protect you from. In this guide, we will walk through how to design and implement MySQL and PostgreSQL replication on VPS servers in a way that is realistic, maintainable and testable, using patterns we also apply on dchost.com infrastructure. We will cover core concepts, practical topologies, MySQL and PostgreSQL specific settings, connection routing (proxies, VIP and DNS), and the operational side: backups, monitoring and disaster recovery. The goal is simple: after reading this, you should be able to sketch a concrete HA plan for your own VPS databases and know which tools and patterns to adopt first.

Why Database Replication and HA Matter on a VPS

On a VPS, CPU, RAM and NVMe storage are just the starting point. For any application beyond a hobby project, the real questions quickly become: how much downtime can we tolerate (RTO), and how much data can we afford to lose (RPO)? Replication and HA are your main tools to reduce both.

Replication is the process of continuously copying data changes from one database instance (primary) to one or more others (replicas). High availability adds orchestration around that replication: health checks, automatic promotion, and a way for applications to always connect to the right node.

Without replication, your only protection is backups. Backups are essential but slow to restore, and they always imply some data loss. With replication, you can combine near real-time copies of your data with off-site backups to reach much tighter RPO/RTO targets. We have seen this especially for WooCommerce, SaaS dashboards and internal ERP systems: once revenue or operations depend on the database, a single VPS is simply not enough.

If you are still deciding whether to scale vertically or build HA, it is worth reading our article High Availability vs one big server for e‑commerce and SaaS. For most growing projects, a small HA setup on multiple VPSs becomes safer and often more cost-effective than one oversized machine.

Core Concepts: Replication, Failover and Topologies

Primary / Replica (Master / Slave) Basics

Both MySQL and PostgreSQL follow the same high-level model:

  • Primary: accepts all writes (INSERT/UPDATE/DELETE, DDL). Clients may also read from it.
  • Replica: receives changes from the primary and replays them. Usually read-only, serving analytics or read-heavy traffic.

In MySQL, classic replication is statement-based or row-based using binary logs (binlog). Modern deployments typically use row-based binary logging with GTID (Global Transaction ID) for easier recovery and failover.

In PostgreSQL, HA uses physical streaming replication: the primary writes WAL (Write-Ahead Log) segments, and replicas stream and replay them. You can also use logical replication for more advanced use cases (partial tables, cross-version migrations), but physical replication is the standard for HA.

Synchronous vs Asynchronous Replication

Replication timing directly affects your RPO:

  • Asynchronous: primary commits locally and returns success to the client; replicas catch up in the background. Fast, but a sudden primary failure might lose the last few transactions.
  • Synchronous: primary waits for at least one replica to confirm it has received (and sometimes flushed) the change before acknowledging the commit. Slower, but RPO can be close to zero.

MySQL offers semi-synchronous replication plugins and group replication. PostgreSQL supports synchronous replication with synchronous_commit and synchronous_standby_names. We often recommend starting with asynchronous replication on VPS setups, and then selectively enabling synchronous replication for the most critical workloads once you understand the performance impact.

Typical HA Topologies on VPS

On 2–4 VPS servers, these topologies are common:

  • Single primary + one replica: minimal HA. You get much faster recovery (though failover is often still manual), plus offloaded reads and safer backups from the replica.
  • Single primary + two replicas: better survivability (you can lose one node), room for analytics, and flexibility to do rolling OS/database upgrades.
  • Primary/replica with quorum and fencing: usually with a coordination layer (etcd/Consul for PostgreSQL Patroni, or orchestrator-style tools for MySQL) to safely decide who becomes primary.

MySQL also has Group Replication and MySQL InnoDB Cluster (multi-primary). PostgreSQL has Patroni or repmgr-based clusters. For many VPS-based workloads, a well-tuned primary + replica topology with a proxy in front is simpler to operate than a full multi-primary cluster. If you are curious about more advanced MySQL clustering, we share our experience in Beyond backups: choosing between MariaDB Galera Cluster and MySQL Group Replication for real HA.

Choosing the Right VPS Architecture for HA

How Many VPS Servers Do You Really Need?

At dchost.com we usually start the conversation around three roles:

  • Database nodes: 2–3 VPSs running MySQL or PostgreSQL.
  • Proxy/load balancer: 1–2 VPSs running HAProxy/ProxySQL/PgBouncer, optionally with VRRP/keepalived for a floating IP.
  • Application nodes: your web/app servers, ideally separate from the database VPSs for cleaner scaling and isolation.

If budgets are tight, you can start with 2 database VPSs + 1 app/proxy VPS, then grow towards fully separated roles as traffic increases. Our article When to separate database and application servers for MySQL and PostgreSQL goes deeper into this decision.

Single Region vs Multi-Region Replication

Most HA setups start inside a single data center or region: lower latency, simpler networking and fewer moving parts. Once your RTO/RPO requirements include regional disasters or long outages, you can extend replication across regions.

For that, we like patterns similar to those described in multi‑region architectures with DNS geo‑routing and database replication: keep a primary cluster in one region, stream replication to another region, and use DNS or traffic manager logic to fail over clients. On dchost.com, you can place VPSs in different locations and layer this on top of a proper backup and DR plan.

Network and Storage Considerations

Replication traffic is mostly sequential writes (WAL/binlog streaming), but under heavy write load it can be significant. On a VPS, ensure:

  • Your nodes share a low-latency network (same region, preferably same data center).
  • You use NVMe SSD or at least high-quality SSD for the databases, as replication intensifies disk IO. Our NVMe SSD vs SATA SSD vs HDD comparison explains why NVMe is usually the right choice for write-heavy databases.

MySQL Replication on VPS: Practical Setup and HA Patterns

1. Preparing MySQL for Replication

Assume two VPSs (primary and replica) running a recent MySQL 8.x or compatible server, on Linux (Ubuntu/Debian/AlmaLinux/Rocky).

  1. Configure unique server IDs in my.cnf on each node:
[mysqld]
server_id = 1           # primary
log_bin = mysql-bin
binlog_format = ROW
gtid_mode = ON
enforce_gtid_consistency = ON

On the replica, use server_id = 2 and similar binlog settings (if you also want it to act as a source later).
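
A minimal replica-side my.cnf could look like this (values are illustrative; read_only and super_read_only prevent accidental writes until the node is promoted):

[mysqld]
server_id = 2                 # must be unique in the topology
log_bin = mysql-bin           # keep binary logging so this node can be promoted or chained later
binlog_format = ROW
gtid_mode = ON
enforce_gtid_consistency = ON
read_only = ON
super_read_only = ON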

  2. Create a replication user on the primary:
CREATE USER 'repl'@'10.%' IDENTIFIED BY 'strong-password';
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl'@'10.%';
FLUSH PRIVILEGES;
  3. Take a consistent backup of the primary and restore it on the replica.

You can use mysqldump --single-transaction, physical backup tools, or an LVM snapshot. For a deeper dive into backup choices, see MySQL backup strategies: mysqldump, XtraBackup and snapshots.
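
For example, a simple GTID-aware logical backup and restore could look roughly like this (paths and options are illustrative; large datasets are usually better served by physical backups):

# On the primary: consistent dump that records the GTID set
mysqldump --single-transaction --set-gtid-purged=ON \
  --all-databases --routines --triggers > /backup/full.sql

# On the replica: clear any existing binlog/GTID state, then load the dump
mysql -e "RESET MASTER;"
mysql < /backup/full.sql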

  4. Point the replica at the primary using GTID:
CHANGE MASTER TO
  MASTER_HOST = '10.0.0.1',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'strong-password',
  MASTER_AUTO_POSITION = 1;
START SLAVE;

Check status with SHOW SLAVE STATUS\G (SHOW REPLICA STATUS\G on MySQL 8.0.22 and later) and ensure Seconds_Behind_Master stays low under normal load.
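
Beyond the lag figure, a few fields in that output deserve a quick glance (shown here with the classic names):

SHOW SLAVE STATUS\G
-- Fields worth checking in the output:
--   Slave_IO_Running: Yes            (binlog stream from the primary is flowing)
--   Slave_SQL_Running: Yes           (changes are being applied)
--   Seconds_Behind_Master: 0         (or consistently low)
--   Last_IO_Error / Last_SQL_Error:  empty
--   Retrieved_Gtid_Set vs Executed_Gtid_Set: advancing together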

2. Semi-Synchronous Replication (Optional)

If you want stronger consistency, enable MySQL semi-synchronous replication:

  • Install and enable the semi-sync plugin on both nodes.
  • On the primary, set rpl_semi_sync_master_enabled=ON.
  • On the replica, set rpl_semi_sync_slave_enabled=ON.

This does not make replication fully synchronous, but it reduces the window of unreplicated transactions. Be sure to test the latency impact on your workload.
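
As a rough sketch on MySQL 8.0 (from 8.0.26 the plugins and variables use the newer source/replica names, so check your version first):

-- On the primary
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = ON;
SET GLOBAL rpl_semi_sync_master_timeout = 1000;   -- ms to wait before falling back to async

-- On the replica
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = ON;
STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;      -- restart the IO thread to pick up the change

Remember to persist the same settings in my.cnf so they survive a restart.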

3. Handling Failover for MySQL

Replication alone does not give you HA. You need a way to:

  • Detect primary failure.
  • Promote a replica to primary (stop replication, clear the old replication configuration, disable read_only, and adjust users/GTID as needed; see the sketch after this list).
  • Redirect application connections to the new primary.
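
As a rough sketch, a manual promotion on the surviving replica could look like this (assuming the old primary has already been confirmed down or fenced):

-- On the replica being promoted
STOP SLAVE;
RESET SLAVE ALL;                      -- forget the old primary
SET GLOBAL super_read_only = OFF;     -- if it was enabled
SET GLOBAL read_only = OFF;           -- start accepting writes

-- If the old primary ever comes back, reconfigure it as a replica of the
-- new primary before it is allowed to accept any client connections.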

Common patterns on VPS setups:

  • Application-level failover: apps know both nodes, try primary first, then replica. Simple but pushes complexity into every app.
  • HAProxy or ProxySQL in front: apps connect to a proxy, which knows which backend is primary and which are replicas.
  • Floating IP (VRRP/keepalived): a virtual IP always points to the current primary; proxies/apps connect to that IP.

For MySQL, ProxySQL is particularly powerful: it can route writes to the primary and reads to replicas, handle connection pooling, and integrate with MySQL monitoring. We shared a complete, real-world scenario in ProxySQL read/write split and pooling for WooCommerce and Laravel. The same patterns apply on a VPS-based HA setup.

PostgreSQL Streaming Replication on VPS

1. Preparing PostgreSQL for Streaming Replication

Assume two VPSs (primary and standby) running PostgreSQL 14+ on Linux.

  1. Configure WAL and replication settings on the primary’s postgresql.conf:
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/wal_archive/%f'

Adjust paths and numbers to your needs. On the standby, set hot_standby = on so it can accept read-only queries (this is already the default on modern PostgreSQL versions).

  2. Allow replication connections in pg_hba.conf on the primary:
host replication replicator 10.0.0.0/24 md5
  3. Create a replication role:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'strong-password';
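
Optionally, create a physical replication slot on the primary so it retains WAL until the standby has consumed it (the slot name below is just an example); keep an eye on disk usage, because a stalled standby will stop WAL from being recycled:

SELECT pg_create_physical_replication_slot('standby1_slot');

pg_basebackup can reference the slot with --slot, and together with -R it will record primary_slot_name on the standby.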

2. Base Backup and Standby Configuration

On the standby:

  1. Stop PostgreSQL and clear the data directory.
  2. Run pg_basebackup from the primary:
pg_basebackup -h 10.0.0.1 -D /var/lib/postgresql/data \
  -U replicator -Fp -Xs -P -R

The -R flag creates standby.signal (recovery.conf on versions before PostgreSQL 12) and writes a primary_conninfo entry that tells PostgreSQL how to connect back to the primary for streaming.

After starting PostgreSQL on the standby, check pg_stat_replication on the primary to confirm replication is working.
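
For example, a quick check on the primary might look like this (a row in state 'streaming' with a small lag is what you want to see):

SELECT application_name, client_addr, state, sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;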

3. Synchronous Replication (Optional)

To enable synchronous replication (for near-zero RPO), on the primary set:

synchronous_commit = on
synchronous_standby_names = '1 (standby1)'

Replace standby1 with the actual application_name of your standby (see primary_conninfo). PostgreSQL will now wait for that standby to confirm before completing commits. Test carefully; network issues can block writes if the synchronous standby is unavailable.

4. Automatic Failover with Patroni or repmgr

Manual failover is fine for internal apps, but customer-facing workloads usually need automatic promotion. For PostgreSQL on VPS, two popular approaches are:

  • Patroni: uses a distributed configuration store (etcd, Consul, or even PostgreSQL itself) to manage who is primary, handle automatic failover, and keep configuration in sync.
  • repmgr: a lighter-weight tool that manages replication, monitoring and failover, often combined with external tools for VIP/DNS changes.

Both integrate well with PgBouncer or HAProxy in front of the cluster, so applications simply connect to a stable endpoint. On dchost.com, we see many teams pair Patroni with PgBouncer on a small VPS, then direct app traffic to PgBouncer instead of connecting to PostgreSQL nodes directly.
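
As an illustration only, a minimal Patroni node configuration (YAML) tends to look roughly like this; all addresses, names and passwords are placeholders, and real deployments add bootstrap/DCS tuning on top:

scope: pg-cluster                # cluster name, shared by all nodes
name: node1                      # unique per node
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.1:8008
etcd3:
  hosts: 10.0.0.5:2379
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.1:5432
  data_dir: /var/lib/postgresql/data
  authentication:
    replication:
      username: replicator
      password: strong-password
    superuser:
      username: postgres
      password: strong-password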

If you are planning a serious PostgreSQL deployment on a VPS, also look at our performance-focused article The friendly VPS playbook for PostgreSQL performance and our autovacuum tuning guide. Poorly tuned vacuum and WAL settings can silently increase replication lag and slow down failover.

Automatic Failover Patterns: Proxy, VIP and DNS

1. Proxy-Led HA (HAProxy, ProxySQL, PgBouncer)

In this model, your applications talk to one or two proxy endpoints:

  • For MySQL: ProxySQL or HAProxy.
  • For PostgreSQL: PgBouncer and/or HAProxy.

The proxy periodically checks backends, knows which one is primary, and routes traffic accordingly. During failover, your orchestration tool (Orchestrator/Patroni/repmgr/custom scripts) promotes a replica; the proxy detects the change and updates its backend roles.

This centralizes connection logic and lets you change the cluster topology without touching application code. The cost is an extra VPS (or two, if you want redundant proxies) but in practice this is one of the cleanest VPS HA architectures.
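
As a sketch of what that looks like for a Patroni-managed PostgreSQL cluster: HAProxy can health-check each node's Patroni REST API (port 8008 by default), which answers 200 only on the current primary. Addresses and ports below are placeholders:

listen postgres_primary
    bind *:5000
    mode tcp
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 10.0.0.1:5432 check port 8008
    server node2 10.0.0.2:5432 check port 8008

Applications then connect to port 5000 on the proxy and always reach whichever node Patroni currently considers primary.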

2. Floating IP / VRRP with keepalived

A floating IP (virtual IP) always points to the current primary. keepalived uses VRRP to move this IP between nodes based on health checks. Applications connect to that IP via standard MySQL/PostgreSQL ports.

Advantages:

  • Very simple for applications: they see only one IP.
  • Can be combined with proxies (VIP pointing at the active proxy instead of the database).

Disadvantages:

  • Requires L2/L3 network support for VRRP, which every VPS platform implements differently.
  • Failover is at the IP level; you still need application-level reconnection logic and safe promotion steps to avoid split-brain.
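
If your VPS network does support VRRP (some platforms require unicast peers or do not allow it at all), a minimal keepalived sketch looks like this; the health-check script and all addresses are placeholders you would implement and adapt yourself:

vrrp_script chk_db_primary {
    script "/usr/local/bin/check_db_primary.sh"   # hypothetical script: exits 0 only on the current primary
    interval 2
    fall 3
    rise 2
}

vrrp_instance VI_DB {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100                  # set a higher priority on the preferred node
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass s3cretpw
    }
    virtual_ipaddress {
        10.0.0.100/24             # floating IP that clients connect to
    }
    track_script {
        chk_db_primary
    }
}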

3. DNS-Based Failover

Another approach is to have a DNS name (e.g. db.example.com) that points to the current primary. When failover happens, automation updates the DNS record.

Pros:

  • Simpler from the network perspective (no VRRP).
  • Works well for multi-region or multi-data-center replication where IP-level failover is harder.

Cons:

  • TTL and caching mean failover is not instant.
  • Clients must honor DNS TTLs and reconnect properly.

For high-end architectures, combining DNS geo‑routing with multi-region replication is powerful; our article How Anycast DNS and automatic failover keep your site up explains how DNS‑level strategies complement database HA.

4. Avoiding Split-Brain

Split-brain happens when two nodes both believe they are primary and accept writes independently. Merging that data is painful or impossible.

To avoid it:

  • Always use quorum-based decisions when promoting a new primary (e.g. Patroni with etcd, or orchestration tools that check majority).
  • Ensure the old primary is fenced (shut down, or blocked from accepting connections) before advertising the new primary.
  • Keep health checks conservative; treat network partitions as serious incidents, not something to automatically heal without checks.

Operational Best Practices: Backups, Monitoring and DR

Replication Is Not a Backup

Replication will happily replicate mistakes: dropped tables, bad migrations, or mass updates. You still need proper backups with versioning and point‑in‑time recovery.

We strongly recommend:

  • Regular logical or physical backups from a replica (to lighten load on the primary).
  • Off‑site / off‑VPS backups, ideally to S3‑compatible object storage.
  • Periodic restore drills, not just “backup completed” checks.


Monitoring Replication Lag and Health

Good HA is impossible without good observability. At minimum, monitor:

  • Replication delay (Seconds_Behind_Master in MySQL, WAL replay lag in PostgreSQL).
  • Replica health (IO/SQL thread status in MySQL, pg_stat_replication in PostgreSQL).
  • Disk space, IOPS, CPU, RAM and connection counts on each node.

You can build this with Prometheus + exporters + Grafana on a small VPS. We have a friendly starting point in VPS monitoring and alerts with Prometheus, Grafana and Uptime Kuma. Alert not only on hard failures, but on trend warnings like increasing replication lag or WAL archive backlog.
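
For quick manual checks alongside your dashboards, these queries map directly to the lag metrics above (the write_lag/flush_lag/replay_lag columns exist since PostgreSQL 10):

-- PostgreSQL, on a standby: approximate replay lag
SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;

-- PostgreSQL, on the primary: per-standby lag breakdown
SELECT application_name, write_lag, flush_lag, replay_lag FROM pg_stat_replication;

-- MySQL, on the replica
SHOW SLAVE STATUS\G   -- Seconds_Behind_Master is the classic lag metric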

Disaster Recovery and Failover Drills

HA that is never tested is just a diagram. At dchost.com, we encourage customers to schedule small “game days” where they:

  • Simulate primary failure (stop the service, block the IP, or power off the VPS in a controlled window).
  • Observe how orchestration tools behave: who becomes primary, how proxies react.
  • Measure RTO/RPO against your targets.

Our general DR planning article How I write a no‑drama DR plan is a good companion read if you want to formalize this process.

When Replication on a VPS Is (and Is Not) the Right Choice

When HA Replication Makes Sense

On a VPS, database replication and automatic failover are worth the complexity when:

  • Your application generates direct revenue (e‑commerce, SaaS, booking platforms).
  • Downtime during working hours is not acceptable for customers or internal teams.
  • You already invested in caching and query optimization but still hit single-node limits.
  • You need maintenance windows for OS/database upgrades without full outage.

In those scenarios, starting with a 2‑node cluster + proxy on dchost.com VPSs is often the right balance. As requirements grow, you can move database roles to dedicated NVMe VPSs or even dedicated servers and keep the same logical architecture.

When a Single VPS + Strong Backups Is Enough

For small blogs, internal tools with low business impact, or early MVPs, HA may be overkill. In those cases:

  • Run MySQL or PostgreSQL on a single, well‑sized VPS.
  • Invest in good backups, automated restore tests, and careful schema migration practices.
  • Plan an upgrade path: know when to add a read replica or separate database and application servers.

The key is to design with future HA in mind: use connection strings, environment variables and configuration management that allow you to introduce proxies and replicas later without rewriting everything.
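
As a small illustration (variable names are just an example), an application that reads its database endpoint from the environment can later be pointed at a proxy or floating IP without any code changes:

# .env-style configuration; values are placeholders
DB_HOST=10.0.0.10        # today: the single database VPS; later: a proxy or floating IP
DB_PORT=5432
DB_READ_HOST=10.0.0.10   # later: a read replica or read endpoint behind the proxy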

Wrapping Up: A Practical Path to HA MySQL/PostgreSQL on VPS

High availability on a VPS does not mean you need an enterprise-scale cluster from day one. It means being honest about your RPO/RTO needs, designing a simple but solid topology, and gradually adding pieces: replication, a proxy, backup automation, monitoring, and tested failover procedures.

For MySQL, start with GTID-based replication between two dchost.com VPSs, add ProxySQL or HAProxy in front, and practice manual promotion before introducing full automatic failover. For PostgreSQL, start with physical streaming replication, then evaluate Patroni or repmgr plus PgBouncer once you are comfortable with the basics. In both worlds, never forget that replication does not replace backups; pair your HA setup with robust, tested backup pipelines and, if your risk profile requires it, cross-region replicas as described in our multi‑region and Anycast DNS articles.

If you are planning an HA rollout and you are not sure how to map these patterns onto dchost.com VPS, dedicated or colocation infrastructure, our team can help you choose the right mix of servers, storage and network features. The important part is to start with a design you can understand, operate and test—not a fragile diagram that only looks good on slides. From there, scaling your MySQL or PostgreSQL clusters on VPS becomes a matter of iteration, not a leap of faith.

Frequently Asked Questions

When is database replication on a VPS worth the complexity?

Replication is worth the complexity when database downtime or data loss directly impacts revenue or operations. If you run e‑commerce, SaaS, customer portals, booking systems or critical internal tools, relying on a single VPS means a hardware issue, kernel panic or bad migration can take you offline for hours and force you to restore from backups with data loss. Replication plus a sensible failover plan reduces both downtime (RTO) and data loss (RPO). For small blogs or low‑risk internal tools, a single VPS with strong, tested backups can be enough—but design your stack so that adding replication later does not require a full rewrite.

What is the difference between asynchronous and synchronous replication?

In asynchronous replication, the primary node commits changes locally and returns success to the client; replicas receive and replay those changes later. This is fast and popular, but if the primary fails suddenly you may lose the last few committed transactions. In synchronous replication, the primary waits for at least one replica to confirm it has received (and possibly flushed) the change before acknowledging the commit. This greatly reduces potential data loss but adds latency and introduces new failure modes if the synchronous replica is slow or unreachable. On VPS setups, many teams start with asynchronous replication, then enable synchronous or semi‑synchronous replication only for the most critical workloads or between specific node pairs.

What is split-brain and how do I avoid it?

Split‑brain happens when two nodes both think they are primary and accept writes independently, creating diverging datasets that are extremely hard to merge. To avoid this on VPS deployments, you need coordination and fencing. Use tools that rely on quorum (e.g. Patroni with etcd/Consul for PostgreSQL, or orchestration layers for MySQL) so a node only promotes itself if it sees a majority. Automate fencing of the old primary during failover—for example by shutting down the database service or blocking client access at the firewall or proxy layer. Also, be conservative with health checks: treat network partitions and intermittent packet loss as serious events, not something to auto‑heal without careful validation.

How many VPS servers do I need for a highly available MySQL or PostgreSQL setup?

A minimal HA setup is 2 database VPSs (primary and replica) plus somewhere to run a proxy (which can share a VPS with your application at the beginning). This already gives you faster recovery than backups alone and room for rolling upgrades. For stronger guarantees and automatic failover, 3 database nodes plus 1–2 proxy nodes is more robust: you can maintain quorum, fence failed nodes and survive the loss of a single VPS without downtime. As your workload grows, you can move the database to dedicated NVMe VPSs or dedicated servers while keeping the same logical architecture—primary/replica, proxy endpoint and automated failover logic.