Post

Deploying a Monitoring Stack with Prometheus, Grafana, and Alertmanager

Deploying a Monitoring Stack with Prometheus, Grafana, and Alertmanager

In my homelab, I use a Docker-based monitoring stack powered by Prometheus, Grafana, and Alertmanager. It’s deployed and configured with Ansible, making it easy to replicate across environments.

Stack Overview

  • Prometheus: collects metrics from targets (including node_exporter on each host)
  • Grafana: visualizes Prometheus data
  • Alertmanager: routes alerts via Discord webhook

Network Layout

  • Each monitored node runs prometheus-node-exporter
  • Prometheus lives on automation1.cal
  • All services run in Docker containers
  • Ports exposed: 9090 (Prometheus), 3000 (Grafana), 9093 (Alertmanager)

Ansible Playbooks

1. install-docker.yml

Installs Docker on all target hosts (if not already present).

1
2
3
4
5
6
7
8
9
10
11
- name: Install Docker engine
  hosts: docker_hosts
  become: true
  tasks:
    - name: install docker engine
      apt:
        name: 
          - docker-ce 
          - docker-ce-cli 
          - containerd.io
        state: present

2. prometheus_exporter.yml

Installs node exporter on all monitored hosts.

1
2
3
4
5
6
7
8
- name: Install and configure Prometheus Node Exporter
  hosts: node-exporter-monitored
  become: true
  tasks:
    - name: Install Node Exporter
      apt:
        name: prometheus-node-exporter
        state: present

3. grafana-promethus-alertmanager-monitoring-stack.yml

Sets up Prometheus, Grafana, and Alertmanager on automation1.cal.

Prometheus scrapes node_exporters and Alertmanager

Alertmanager alerts are routed via Discord webhook

Grafana is pre-provisioned and persists dashboards

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
- name: Create Prometheus config
  copy:
    dest: /opt/prometheus/config/prometheus.yml
    content: |
      global:
        scrape_interval: 15s
      scrape_configs:
        - job_name: 'prometheus'
          static_configs:
            - targets: ['localhost:9090']
        - job_name: 'alertmanager'
          static_configs:
            - targets: ['localhost:9093']
        - job_name: 'node-exporter'
          static_configs:
            - targets: [
                
              ]

You can also define alert rules:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
- name: Create alert rules
  copy:
    dest: /opt/prometheus/config/alert_rules.yml
    content: |
      groups:
        - name: example
          rules:
            - alert: InstanceDown
              expr: up == 0
              for: 1m
              labels:
                severity: critical
              annotations:
                summary: "Instance  down"

And Alertmanager configuration (Discord):

1
2
3
4
5
6
7
8
9
10
11
12
13
- name: Create Alertmanager config
  copy:
    dest: /opt/alertmanager/config/alertmanager.yml
    content: |
      global:
        resolve_timeout: 5m
      route:
        receiver: 'discord'
      receivers:
        - name: 'discord'
          webhook_configs:
            - url: ''
              send_resolved: true

Inventory

The Ansible inventory groups hosts like this:

1
2
3
4
5
6
7
8
9
node-exporter-monitored:
  hosts:
    vm-1.cal:
    vm-2.cal:
    gameserver-01.cal:

docker_hosts:
  children:
    node-exporter-monitored:

Final Notes

Once deployed, Grafana is available on port 3000. Log in with the default admin credentials and start creating dashboards. Prometheus scrapes metrics automatically, and alerts trigger Discord webhooks.

📁 Full playbooks and config: https://github.com/rhysmcqueen/Learning/tree/main/Ansible-Examples/playbooks

This post is licensed under CC BY 4.0 by the author.