{"id":1435,"date":"2025-11-06T18:22:16","date_gmt":"2025-11-06T15:22:16","guid":{"rendered":"https:\/\/www.dchost.com\/blog\/vps-monitoring-and-alerts-without-tears-getting-started-with-prometheus-grafana-and-uptime-kuma\/"},"modified":"2025-11-06T18:22:16","modified_gmt":"2025-11-06T15:22:16","slug":"vps-monitoring-and-alerts-without-tears-getting-started-with-prometheus-grafana-and-uptime-kuma","status":"publish","type":"post","link":"https:\/\/www.dchost.com\/blog\/en\/vps-monitoring-and-alerts-without-tears-getting-started-with-prometheus-grafana-and-uptime-kuma\/","title":{"rendered":"VPS Monitoring and Alerts Without Tears: Getting Started with Prometheus, Grafana, and Uptime Kuma"},"content":{"rendered":"<div class=\"dchost-blog-content-wrapper\"><div id=\"toc_container\" class=\"toc_transparent no_bullets\"><p class=\"toc_title\">Contents<\/p><ul class=\"toc_list\"><li><a href=\"#The_Quiet_Panic_That_Made_Me_Take_Monitoring_Seriously\"><span class=\"toc_number toc_depth_1\">1<\/span> The Quiet Panic That Made Me Take Monitoring Seriously<\/a><\/li><li><a href=\"#What_Were_Building_The_Trio_That_Keeps_a_VPS_Calm\"><span class=\"toc_number toc_depth_1\">2<\/span> What We\u2019re Building: The Trio That Keeps a VPS Calm<\/a><\/li><li><a href=\"#Prometheus_Setup_The_Simple_Path_That_Actually_Works\"><span class=\"toc_number toc_depth_1\">3<\/span> Prometheus Setup: The Simple Path That Actually Works<\/a><ul><li><a href=\"#The_mental_model\"><span class=\"toc_number toc_depth_2\">3.1<\/span> The mental model<\/a><\/li><li><a href=\"#Install_Node_Exporter_on_the_VPS_you_want_to_monitor\"><span class=\"toc_number toc_depth_2\">3.2<\/span> Install Node Exporter on the VPS you want to monitor<\/a><\/li><li><a href=\"#Install_Prometheus\"><span class=\"toc_number toc_depth_2\">3.3<\/span> Install Prometheus<\/a><\/li><li><a href=\"#Scrape_multiple_VPS_nodes\"><span class=\"toc_number toc_depth_2\">3.4<\/span> Scrape multiple VPS nodes<\/a><\/li><li><a 
href=\"#Optional_HTTP_checks_from_the_inside_with_Blackbox_Exporter\"><span class=\"toc_number toc_depth_2\">3.5<\/span> Optional: HTTP checks from the inside with Blackbox Exporter<\/a><\/li><\/ul><\/li><li><a href=\"#Grafana_Paint_the_Picture_and_Make_Alerts_You_Actually_Trust\"><span class=\"toc_number toc_depth_1\">4<\/span> Grafana: Paint the Picture and Make Alerts You Actually Trust<\/a><ul><li><a href=\"#Connect_Grafana_to_Prometheus\"><span class=\"toc_number toc_depth_2\">4.1<\/span> Connect Grafana to Prometheus<\/a><\/li><li><a href=\"#Starter_panels_that_tell_a_real_story\"><span class=\"toc_number toc_depth_2\">4.2<\/span> Starter panels that tell a real story<\/a><\/li><li><a href=\"#Alerts_that_whisper_until_they_need_to_shout\"><span class=\"toc_number toc_depth_2\">4.3<\/span> Alerts that whisper until they need to shout<\/a><\/li><li><a href=\"#Keep_the_dashboards_calm\"><span class=\"toc_number toc_depth_2\">4.4<\/span> Keep the dashboards calm<\/a><\/li><\/ul><\/li><li><a href=\"#Uptime_Kuma_External_Checks_and_a_Status_Page_People_Actually_Read\"><span class=\"toc_number toc_depth_1\">5<\/span> Uptime Kuma: External Checks and a Status Page People Actually Read<\/a><ul><li><a href=\"#Why_you_want_the_outside_view_too\"><span class=\"toc_number toc_depth_2\">5.1<\/span> Why you want the outside view too<\/a><\/li><li><a href=\"#Quick_install_and_first_monitors\"><span class=\"toc_number toc_depth_2\">5.2<\/span> Quick install and first monitors<\/a><\/li><li><a href=\"#Notifications_that_fit_your_life\"><span class=\"toc_number toc_depth_2\">5.3<\/span> Notifications that fit your life<\/a><\/li><li><a href=\"#Status_pages_that_reduce_support_pings\"><span class=\"toc_number toc_depth_2\">5.4<\/span> Status pages that reduce support pings<\/a><\/li><\/ul><\/li><li><a href=\"#Wrap-Up_A_Calm_Helpful_Monitoring_Setup_Youll_Actually_Keep\"><span class=\"toc_number toc_depth_1\">6<\/span> Wrap-Up: A Calm, Helpful Monitoring Setup You\u2019ll 
Actually Keep<\/a><\/li><\/ul><\/div>\n<h2 id=\"section-1\"><span id=\"The_Quiet_Panic_That_Made_Me_Take_Monitoring_Seriously\">The Quiet Panic That Made Me Take Monitoring Seriously<\/span><\/h2>\n<p>It wasn\u2019t a dramatic outage. No sirens. No frantic calls. Just a quiet Monday morning, a lukewarm coffee, and a weirdly slow site that refused to speed up. I remember staring at a graph that didn\u2019t exist yet\u2014because I hadn\u2019t set one up. That was the moment I realized how blind I was flying. Ever had that sinking feeling when something\u2019s off, but you don\u2019t know where to look? That was me. CPU felt fine, memory looked okay when I pinged in manually, and yet requests were queuing somewhere in the dark.<\/p>\n<p>Here\u2019s the thing about <a href=\"https:\/\/www.dchost.com\/vps\">VPS<\/a> monitoring: it\u2019s not for when things go wrong\u2014it\u2019s for knowing things are <strong>about to<\/strong> go wrong and catching them quietly, before anyone else notices. The trio that changed my day-to-day is dead simple: Prometheus for metrics, Grafana for dashboards and alerts, and Uptime Kuma for external checks and friendly status pages. In this post, I\u2019ll walk you through how I set them up, how I keep the alerts friendly (not noisy), and the small tweaks that make a big difference. Think of it like showing a friend how to keep their house cozy without obsessing over every draft.<\/p>\n<h2 id=\"section-2\"><span id=\"What_Were_Building_The_Trio_That_Keeps_a_VPS_Calm\">What We\u2019re Building: The Trio That Keeps a VPS Calm<\/span><\/h2>\n<p>When folks ask me where to start with VPS monitoring and alerts, I always suggest two views of reality: the inside view and the outside view. Prometheus and Grafana give you the inside\u2014CPU, memory, disks, network, app behavior\u2014while Uptime Kuma gives you the outside\u2014can people reach your site, do APIs respond quickly, is TLS valid? 
Together, they feel like finally turning on the lights in a room you\u2019ve been walking through in the dark.<\/p>\n<p>Prometheus is your metric collector. It \u201cscrapes\u201d numbers from exporters (like Node Exporter on your VPS) and stores time-series data. It\u2019s fast, reliable, and amazingly honest. If there\u2019s a spike in IOwait or a sneaky memory leak, Prometheus doesn\u2019t just tell you\u2014it draws the picture. If you want a friendly deep dive into Node Exporter and why it\u2019s worth the extra minute to install, I\u2019ve shared a complete playbook in <a href=\"https:\/\/www.dchost.com\/blog\/en\/vps-izleme-ve-uyari-nasil-kurulur-prometheus-grafana-ve-node-exporter-ile-sessiz-alarmlari-konusturmak\/\">the stack I trust: Prometheus + Grafana + Node Exporter<\/a>.<\/p>\n<p>Grafana sits on top like your mission control. I like it because it doesn\u2019t make you feel dumb. You can build panels that just make sense: CPU usage with a shaded average, RAM with a subtle threshold line, disk latency with a red \u201cdanger zone.\u201d The new alerting system lets you send notifications where you actually live\u2014Slack, Telegram, email\u2014and adjust rules so you\u2019re not waking up for nothing. The biggest win is designing dashboards that are calm by default and urgent when needed.<\/p>\n<p>Then there\u2019s Uptime Kuma, the friendliest uptime monitor I\u2019ve ever rolled out. It checks your sites and services from the outside, pings your ports, measures response times, and even handles push-based checks if you want your app to say \u201cI\u2019m alive\u201d on a schedule. And the status page? It\u2019s like a little window you can show your team or clients so they don\u2019t have to ask what\u2019s going on\u2014they can see it.<\/p>\n<p>Put these together and you\u2019ve got a layered safety net. Metrics tell you why, uptime tells you whether, and alerts tie it into action. 
And if you\u2019re running a busy store or an app with bursty traffic, pairing this stack with good capacity planning is a lifesaver. I\u2019ve seen it over and over, especially with shops that suddenly hit a promotion wave\u2014if this is you, take a peek at my thoughts on <a href=\"https:\/\/www.dchost.com\/blog\/en\/woocommerce-kapasite-planlama-rehberi-vcpu-ram-iops-nasil-hesaplanir\/\">right-sizing vCPU, RAM, and IOPS for WooCommerce without guesswork<\/a>.<\/p>\n<p>If you\u2019ve ever wondered how storage plays into this, you\u2019re not alone. Disk performance is sneaky, and IOwait numbers can look like ghosts until you track them properly. If you\u2019re curious, I\u2019ve unpacked what really moves the needle in <a href=\"https:\/\/www.dchost.com\/blog\/en\/nvme-vps-hosting-rehberi-hizin-nereden-geldigini-nasil-olculdugunu-ve-gercek-sonuclari-beraber-gorelim\/\">my NVMe VPS hosting guide\u2014where speed actually comes from<\/a>. Seeing IO wait time alongside CPU usage in Grafana tells a story you can act on.<\/p>\n<h2 id=\"section-3\"><span id=\"Prometheus_Setup_The_Simple_Path_That_Actually_Works\">Prometheus Setup: The Simple Path That Actually Works<\/span><\/h2>\n<h3><span id=\"The_mental_model\">The mental model<\/span><\/h3>\n<p>Prometheus works by scraping endpoints that expose metrics in plain text. Your VPS will run Node Exporter to expose system metrics. Prometheus itself can live on the same server for a small setup, or on a separate monitoring box if you want to scale later. Start small, keep it simple, and you\u2019ll be fine.<\/p>\n<h3><span id=\"Install_Node_Exporter_on_the_VPS_you_want_to_monitor\">Install Node Exporter on the VPS you want to monitor<\/span><\/h3>\n<p>I usually start with Node Exporter because it\u2019s lightweight and instantly useful. It exposes CPU loads, memory, disks, filesystems, network, and even systemd status. 
On Debian\/Ubuntu:<\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\"># Create a user\nsudo useradd --no-create-home --shell \/usr\/sbin\/nologin node_exporter\n\n# Download latest (check GitHub releases for the newest version)\nwget https:\/\/github.com\/prometheus\/node_exporter\/releases\/download\/v1.7.0\/node_exporter-1.7.0.linux-amd64.tar.gz\n\ntar -xzf node_exporter-1.7.0.linux-amd64.tar.gz\nsudo cp node_exporter-1.7.0.linux-amd64\/node_exporter \/usr\/local\/bin\/\n\n# Systemd service\nsudo tee \/etc\/systemd\/system\/node_exporter.service &gt;\/dev\/null &lt;&lt; 'EOF'\n[Unit]\nDescription=Node Exporter\nWants=network-online.target\nAfter=network-online.target\n\n[Service]\nUser=node_exporter\nGroup=node_exporter\nType=simple\nExecStart=\/usr\/local\/bin\/node_exporter\n\n[Install]\nWantedBy=multi-user.target\nEOF\n\nsudo systemctl daemon-reload\nsudo systemctl enable --now node_exporter\n\n# Default port is 9100\n# Confirm it works\ncurl http:\/\/127.0.0.1:9100\/metrics | head -n 5\n<\/code><\/pre>\n<p>Open your firewall if needed. I usually allow port 9100 only to the Prometheus server\u2019s IP, not the whole internet. A tiny bit of paranoia goes a long way.<\/p>\n<h3><span id=\"Install_Prometheus\">Install Prometheus<\/span><\/h3>\n<p>You can run Prometheus on the same machine while testing, or spin up a small VPS just for monitoring. For a single VPS, same-box is fine. 
For a few servers or anything customer-facing, I separate it.<\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\"># Create user and directories\nsudo useradd --no-create-home --shell \/usr\/sbin\/nologin prometheus\nsudo mkdir -p \/etc\/prometheus \/var\/lib\/prometheus\n\n# Download (check releases for the latest)\nwget https:\/\/github.com\/prometheus\/prometheus\/releases\/download\/v2.53.1\/prometheus-2.53.1.linux-amd64.tar.gz\n\ntar -xzf prometheus-2.53.1.linux-amd64.tar.gz\nsudo cp prometheus-2.53.1.linux-amd64\/prometheus \/usr\/local\/bin\/\nsudo cp prometheus-2.53.1.linux-amd64\/promtool \/usr\/local\/bin\/\nsudo cp -r prometheus-2.53.1.linux-amd64\/consoles \/etc\/prometheus\/\nsudo cp -r prometheus-2.53.1.linux-amd64\/console_libraries \/etc\/prometheus\/\n\n# Basic config (YAML: indent with spaces only, top-level keys start at column 0)\nsudo tee \/etc\/prometheus\/prometheus.yml &gt;\/dev\/null &lt;&lt; 'EOF'\nglobal:\n  scrape_interval: 15s\n  evaluation_interval: 15s\n\nscrape_configs:\n  - job_name: 'node'\n    static_configs:\n      - targets: ['127.0.0.1:9100']\nEOF\n\n# Validate the config before starting\npromtool check config \/etc\/prometheus\/prometheus.yml\n\nsudo chown -R prometheus:prometheus \/etc\/prometheus \/var\/lib\/prometheus\n\n# Systemd service (a trailing backslash continues ExecStart onto the next line)\nsudo tee \/etc\/systemd\/system\/prometheus.service &gt;\/dev\/null &lt;&lt; 'EOF'\n[Unit]\nDescription=Prometheus\nWants=network-online.target\nAfter=network-online.target\n\n[Service]\nUser=prometheus\nGroup=prometheus\nType=simple\nExecStart=\/usr\/local\/bin\/prometheus \\\n  --config.file=\/etc\/prometheus\/prometheus.yml \\\n  --storage.tsdb.path=\/var\/lib\/prometheus \\\n  --storage.tsdb.retention.time=15d \\\n  --web.listen-address=0.0.0.0:9090\n\n[Install]\nWantedBy=multi-user.target\nEOF\n\nsudo systemctl daemon-reload\nsudo systemctl enable --now prometheus\n\n# Visit http:\/\/&lt;server-ip&gt;:9090 to confirm\n<\/code><\/pre>\n<p>A note on retention: Prometheus keeps 15 days of data by default. I set that value explicitly so the disk footprint stays predictable at first, then tune later. 
If you have a beefy disk or remote storage plans, stretch it. If you\u2019re on a tiny VPS, keep it conservative.<\/p>\n<h3><span id=\"Scrape_multiple_VPS_nodes\">Scrape multiple VPS nodes<\/span><\/h3>\n<p>To add more servers, install Node Exporter on each and list them as targets in prometheus.yml:<\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\">scrape_configs:\n  - job_name: 'node'\n    static_configs:\n      - targets: ['10.0.0.11:9100', '10.0.0.12:9100', '10.0.0.13:9100']\n<\/code><\/pre>\n<p>Or if you like organizing by role, add labels:<\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\">scrape_configs:\n  - job_name: 'node'\n    static_configs:\n      - targets: ['10.0.0.11:9100']\n        labels:\n          role: 'web'\n      - targets: ['10.0.0.12:9100']\n        labels:\n          role: 'db'\n<\/code><\/pre>\n<p>Labels are your future self\u2019s best friend. A month from now you\u2019ll be grateful you can filter dashboards by role or environment.<\/p>\n<h3><span id=\"Optional_HTTP_checks_from_the_inside_with_Blackbox_Exporter\">Optional: HTTP checks from the inside with Blackbox Exporter<\/span><\/h3>\n<p>If you want Prometheus to probe endpoints (HTTP, TCP, ICMP) from inside the network, add Blackbox Exporter:<\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\"># Download and install similarly to Node Exporter\nwget https:\/\/github.com\/prometheus\/blackbox_exporter\/releases\/download\/v0.24.0\/blackbox_exporter-0.24.0.linux-amd64.tar.gz\n...\n\n# Example Prometheus config\nscrape_configs:\n  - job_name: 'blackbox'\n    metrics_path: \/probe\n    params:\n      module: [http_2xx]\n    static_configs:\n      - targets:\n          - https:\/\/example.com\n          - https:\/\/api.example.com\/health\n    relabel_configs:\n      # Pass the listed URL to the exporter as ?target=...\n      - source_labels: [__address__]\n        target_label: __param_target\n      # Keep the probed URL as the instance label so each target stays distinct\n      - source_labels: [__param_target]\n        target_label: instance\n      # Send the actual scrape to the exporter itself\n      - target_label: __address__\n        replacement: 127.0.0.1:9115  # 
blackbox_exporter\n<\/code><\/pre>\n<p>Prometheus is the backbone here. If you\u2019re curious about the core philosophy and what makes it tick, the official docs are short and sweet: <a href=\"https:\/\/prometheus.io\/docs\/introduction\/overview\/\" rel=\"nofollow noopener\" target=\"_blank\">the Prometheus overview explains the model clearly<\/a>.<\/p>\n<h2 id=\"section-4\"><span id=\"Grafana_Paint_the_Picture_and_Make_Alerts_You_Actually_Trust\">Grafana: Paint the Picture and Make Alerts You Actually Trust<\/span><\/h2>\n<h3><span id=\"Connect_Grafana_to_Prometheus\">Connect Grafana to Prometheus<\/span><\/h3>\n<p>Grafana installation is straightforward. Most distros have a package, but the downloads page is also easy to follow. Once it\u2019s running, add Prometheus as a data source by pointing Grafana at your Prometheus URL, usually http:\/\/YOUR_PROM_SERVER:9090. That\u2019s it\u2014now you can query metrics using PromQL.<\/p>\n<p>The first panel I build is CPU usage over a five-minute window, per instance. Then a memory usage panel that subtracts cache and buffers, because raw \u201cused memory\u201d is misleading. Disk IO time and IOwait follow, and finally network throughput. Each panel gets a threshold or two, but I keep the colors gentle. The idea is to keep the dashboard calm. 
You should feel your shoulders drop when you open it.<\/p>\n<h3><span id=\"Starter_panels_that_tell_a_real_story\">Starter panels that tell a real story<\/span><\/h3>\n<p>Here\u2019s a simple set of queries I reach for:<\/p>\n<p><strong>CPU usage (per instance):<\/strong><\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\">100 - (avg by (instance) (irate(node_cpu_seconds_total{mode=&quot;idle&quot;}[5m])) * 100)\n<\/code><\/pre>\n<p><strong>Memory used (excluding cache\/buffers):<\/strong><\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\">(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) \/ node_memory_MemTotal_bytes * 100\n<\/code><\/pre>\n<p><strong>Disk IO time (per device):<\/strong><\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\">avg by (instance, device) (irate(node_disk_io_time_seconds_total[5m]) * 100)\n<\/code><\/pre>\n<p><strong>Network receive\/transmit:<\/strong><\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\">sum by (instance) (irate(node_network_receive_bytes_total[5m]))\nsum by (instance) (irate(node_network_transmit_bytes_total[5m]))\n<\/code><\/pre>\n<p>Once you have these, you\u2019ll start spotting patterns. Maybe CPU is fine, but IO time spikes during backups. Or memory usage creeps up daily until a service restart resets it. It\u2019s like having timestamps on your headaches\u2014you can finally explain them.<\/p>\n<h3><span id=\"Alerts_that_whisper_until_they_need_to_shout\">Alerts that whisper until they need to shout<\/span><\/h3>\n<p>Alerting is where good intentions go to die if you\u2019re not careful. Too many alerts and you\u2019ll start ignoring all of them. Too few and you\u2019ll miss the early warnings. I prefer a layered approach: warnings that nudge you (Slack, email), and critical alerts that break through (Pager, SMS, Telegram). 
Set sensible durations: CPU over 90% for 10 minutes is a warning; over 95% for 20 minutes is critical. Spikes happen; sustained pressure is the danger.<\/p>\n<p>Here\u2019s a sample CPU alert in Grafana\u2019s new alerting system (you can also define alerts in Prometheus, but Grafana\u2019s workflow is a little friendlier if you\u2019re just starting). It uses rate() rather than irate() so the rule evaluates the average over the whole five-minute window instead of only the last two samples:<\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\">Expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode=&quot;idle&quot;}[5m])) * 100) &gt; 90\nFor: 10m\nLabels: severity = 'warning'\nAnnotations: summary = 'High CPU on {{ $labels.instance }}', description = 'CPU over 90% for 10m'\n<\/code><\/pre>\n<p>Duplicate it, bump the threshold and duration, and you\u2019ve got your critical version. Do the same for memory (e.g., over 90%), disk space (e.g., 80% warning, 90% critical), and IOwait. Don\u2019t forget the \u201cdead man\u2019s switch\u201d\u2014an alert that fires if no data comes in for a few minutes. That one catches the \u201cPrometheus died\u201d scenario, which is easy to miss.<\/p>\n<p>For documentation and nuances like contact points, mute times, and silences, I find the official guide genuinely helpful. If you\u2019re bringing teammates on board, share <a href=\"https:\/\/grafana.com\/docs\/grafana\/latest\/\" rel=\"nofollow noopener\" target=\"_blank\">Grafana\u2019s alerting and dashboard docs<\/a>\u2014they\u2019re readable and practical.<\/p>\n<h3><span id=\"Keep_the_dashboards_calm\">Keep the dashboards calm<\/span><\/h3>\n<p>It\u2019s tempting to build the dashboard equivalent of a cockpit with a thousand blinking lights. Resist. Start with a single row per host, one row for storage, one row for networking, and a top row for a heatmap-style overview. Color is for urgency; everything else should be easy on the eyes. 
Your future self, opening this at 2 a.m., will thank you.<\/p>\n<p>By the way, if you\u2019re running Laravel, WordPress, or any framework that has its own quirks, blend app-level metrics into your system dashboard when possible. Seeing queue depth next to CPU, or cache hit rate next to disk IO, connects the dots. If you\u2019re optimizing PHP-FPM, OPcache, Horizon, or Redis, the \u201cthen this, then that\u201d chain becomes obvious. I covered a lot of that practical tuning in a real-world story here: <a href=\"https:\/\/www.dchost.com\/blog\/en\/laravel-prod-ortam-optimizasyonu-nasil-yapilir-php%E2%80%91fpm-opcache-octane-queue-horizon-ve-redisi-el-ele-calistirmak\/\">the Laravel production tune-up I do on every server<\/a>. Even if Laravel isn\u2019t your stack, the logic applies.<\/p>\n<h2 id=\"section-5\"><span id=\"Uptime_Kuma_External_Checks_and_a_Status_Page_People_Actually_Read\">Uptime Kuma: External Checks and a Status Page People Actually Read<\/span><\/h2>\n<h3><span id=\"Why_you_want_the_outside_view_too\">Why you want the outside view too<\/span><\/h3>\n<p>Internal metrics tell you why a server is struggling, but they can\u2019t answer the simplest user question: can I reach it? That\u2019s where Uptime Kuma shines. It feels like a friendly craftsman tool\u2014easy to deploy, easy to use, and surprisingly capable. I often run it on a small separate VPS, because outside-in checks should come from, well, the outside. If your DNS breaks or a firewall rule goes rogue, you\u2019ll catch it fast.<\/p>\n<h3><span id=\"Quick_install_and_first_monitors\">Quick install and first monitors<\/span><\/h3>\n<p>You can run Uptime Kuma via Docker or as a standalone Node.js app. 
Docker is my default because upgrades are easy:<\/p>\n<pre class=\"language-bash line-numbers\"><code class=\"language-bash\"># Trailing backslashes continue the command onto the next line\ndocker run -d \\\n  --name uptime-kuma \\\n  -p 3001:3001 \\\n  -v uptime-kuma:\/app\/data \\\n  louislam\/uptime-kuma:latest\n<\/code><\/pre>\n<p>Open the web UI at http:\/\/YOUR_MONITORING_VPS:3001, set your admin account, and add your first monitor. Start with HTTP(s) checks to your website, your admin panel, and any critical APIs. Then add TCP\/Port checks for MySQL, Redis, or any service you care about. If you have a cron-driven health endpoint, add an HTTP keyword check to make sure specific text is present\u2014perfect for \u201creadiness\u201d or \u201cI\u2019m healthy\u201d messages.<\/p>\n<p>For push-based scenarios (where a job must check in on time), Uptime Kuma\u2019s \u201cPush\u201d monitor is incredibly handy. Your backup job can hit a URL after it completes. If Uptime Kuma doesn\u2019t receive the push within the interval you set, it flags it. I once caught a backup job stuck on a permissions issue this way\u2014without the push, I might not have noticed until it was too late.<\/p>\n<h3><span id=\"Notifications_that_fit_your_life\">Notifications that fit your life<\/span><\/h3>\n<p>Set up notifications where you actually look. Telegram is fast and reliable; Slack is great if your team lives there; email is a good fallback. Keep priority in mind: low-urgency issues can be Slack-only; critical issues might ping your phone. And remember maintenance windows: schedule downtime in Uptime Kuma so you don\u2019t get spurious alerts during planned work.<\/p>\n<h3><span id=\"Status_pages_that_reduce_support_pings\">Status pages that reduce support pings<\/span><\/h3>\n<p>I love plain-language status pages. \u201cEurope API degraded\u201d is better than a wall of metrics\u2014include short updates and action steps only if necessary. Uptime Kuma lets you curate which monitors appear on a public page, so you can share the right level of detail. 
It\u2019s not about oversharing; it\u2019s about being helpful. If you manage client sites, this can reduce \u201cIs it down?\u201d messages by a lot.<\/p>\n<p>If you want to peek under the hood, the project is open-source and easy to follow. Here\u2019s the repo: <a href=\"https:\/\/github.com\/louislam\/uptime-kuma\" rel=\"nofollow noopener\" target=\"_blank\">Uptime Kuma on GitHub<\/a>. I send folks here when they want to automate deploys or tweak advanced settings.<\/p>\n<h2 id=\"section-6\"><span id=\"Wrap-Up_A_Calm_Helpful_Monitoring_Setup_Youll_Actually_Keep\">Wrap-Up: A Calm, Helpful Monitoring Setup You\u2019ll Actually Keep<\/span><\/h2>\n<p>So that quiet Monday morning? It doesn\u2019t scare me anymore. With Prometheus collecting the story, Grafana drawing it clearly, and Uptime Kuma watching from the outside, I feel like I know my servers the way a barista knows their espresso machine\u2014by the sound, the timing, the tiny shifts that tell you when it needs a little love. That\u2019s the real win of a practical VPS monitoring and alerts setup: fewer surprises, faster fixes, and more confidence when traffic gets weird.<\/p>\n<p>If you\u2019re just starting, keep it simple. Install Node Exporter, stand up Prometheus, add Grafana, and build four panels you understand at a glance. Add three alerts that match your life: CPU sustained, disk space, and downtime. Then expand slowly\u2014IOwait, queue depth, HTTP probe latency, status pages for your team. When you tighten the screws later (alert thresholds, mute times, silences, service-level burn alerts), you\u2019ll be tuning something that already works, not fixing chaos.<\/p>\n<p>And don\u2019t forget the bigger picture: metrics aren\u2019t the goal; calmer days are. If you\u2019re dealing with DDoS noise or bot swarms on a WordPress site, monitoring pairs beautifully with smart edge protection. 
I\u2019ve shared my approach to layering Cloudflare, ModSecurity, and Fail2ban in <a href=\"https:\/\/www.dchost.com\/blog\/en\/waf-ve-bot-korumasi-cloudflare-modsecurity-ve-fail2bani-ayni-masada-baristirmanin-sicacik-hikayesi\/\">the layered shield I trust for real projects<\/a>. It all works together: watch carefully, defend wisely, and act early.<\/p>\n<p>Wherever you start, start. Set the first panel, wire the first alert, and let the data teach you. Hope this was helpful! If you\u2019ve got a story from your own setup\u2014or a dashboard you\u2019re proud of\u2014I\u2019d love to hear it next time. Until then, keep your graphs calm and your pages fast.<\/p>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>Contents: 1 The Quiet Panic That Made Me Take Monitoring Seriously 2 What We\u2019re Building: The Trio That Keeps a VPS Calm 3 Prometheus Setup: The Simple Path That Actually Works 3.1 The mental model 3.2 Install Node Exporter on the VPS you want to monitor 3.3 Install Prometheus 3.4 Scrape multiple VPS nodes 3.5 Optional: HTTP checks from the inside with Blackbox 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1436,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[26],"tags":[],"class_list":["post-1435","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-teknoloji"],"_links":{"self":[{"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/posts\/1435","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/comments?post=1435"}],"version-history":[{"count":0,"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/posts\/1435\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/media\/1436"}],"wp:attachment":[{"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/media?parent=1435"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/categories?post=1435"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dchost.com\/blog\/en\/wp-json\/wp\/v2\/tags?post=1435"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}