CCCHH/ansible-infra

Author	SHA1	Message	Date
June	7f1afef50d	move secrets from sops lookup plugin to sops vars plugin. This makes secret configuration and usage a good bit cleaner. [Ansible Lint (push): failing after 1m54s]	2025-05-04 16:50:15 +02:00
June	bbe4cc131a	eh22-netbox: remove eh22-netbox as it's being decommissioned [Ansible Lint (push): failing after 1m44s]	2025-05-03 23:40:03 +02:00
June	97b8386878	grafana(host): move secrets to SOPS [Ansible Lint (push): failing after 1m49s]	2025-05-03 22:18:26 +02:00
c6ristian	e183f1a2c3	prometheus remote write with alloy using it [Ansible Lint (push): failing after 1m53s]	2025-04-30 01:11:17 +02:00
c6ristian	e21ff26f36	fix: alertmanager. The message template now just outputs a simple string if the list of alerts is too long. [Ansible Lint (push): failing after 1m56s]	2025-04-28 23:02:13 +02:00
c6ristian	456117a789	adding loki [Ansible Lint (push): failing after 1m55s]	2025-04-28 20:31:55 +02:00
June	fce4c2f73b	grafana(host): account in Prom. hyperv. disk alerts for longer backups. Set duration for Prometheus hypervisor disk rw rate and hard disk io alerts to 2h to account for the very long running (over 90m) backup job. [Ansible Lint (push): successful in 1m39s]	2025-02-18 15:38:07 +01:00
June	07511ef723	grafana(host): remove decommissioned nix-box-june from Prometheus targets [Ansible Lint (push): successful in 1m42s]	2025-02-18 04:51:26 +01:00
June	79012fb7f8	eh22-netbox: setup EH22 NetBox [Ansible Lint (push): successful in 1m44s]	2025-02-17 01:23:35 +01:00
June	ac7e8bb6f2	grafana: set dur. for Prom. hyperv. disk rw rate and hdd io aler. to 90m. Set duration for Prometheus hypervisor disk rw rate and hard disk io alerts to 90m to account for the very long running (over an hour) backup job. [Ansible Lint (push): successful in 1m43s]	2025-02-15 06:08:37 +01:00
June	40cddb67b4	grafana: account for long backup jobs in Prom. hyperv. disk rw rate al. [Ansible Lint (pull_request): successful in 1m35s; Ansible Lint (push): successful in 1m34s]	2025-02-06 19:17:21 +01:00
June	c4e35c1adf	grafana: pull out prom. net. rec. err. alerts for OPNs. to ex. wg int. Pull out prometheus network receive error alerts for OPNsense to exclude its WireGuard interfaces, which like to throw errors, but which aren't of importance. [Ansible Lint (push): successful in 1m32s; Ansible Lint (pull_request): successful in 1m30s]	2025-02-06 01:34:45 +01:00
June	ee66631c2d	grafana: diff. prometheus disk io alerts by host task and disk type. Differentiate by host task type (hypervisor or not) and disk type (hard disk or not), not by whether the host is physical or virtual and then by disk type. This is in line with the disk rate alerts changes and allows for fine-grained adjustments based on the host task type, which actually matters for these alerts. [Ansible Lint (push): successful in 1m34s; Ansible Lint (pull_request): successful in 1m32s]	2025-02-06 01:13:10 +01:00
June	9e77a41e3c	grafana: differentiate prometheus disk rate alerts by host task type. Not by a mix of host task type (CI server or not) and whether or not the host is virtual or physical. Also only differentiate on the duration, not the rate, to not accidentally exclude slow hard disks. [Ansible Lint (push): successful in 1m38s; Ansible Lint (pull_request): successful in 1m37s]	2025-02-06 01:05:05 +01:00
June	5016407cef	grafana: group prometheus alert rules for better organization [Ansible Lint (push): successful in 1m40s; Ansible Lint (pull_request): successful in 1m37s]	2025-02-06 00:12:50 +01:00
c6ristian	6fa896dd3f	Remove job for mumble.c3lingo.org since the endpoint appears to no longer exist [Ansible Lint (push): successful in 1m49s]	2025-01-19 21:03:38 +01:00
June	07dbbf055c	reorganize (config) files and templates into one "resources" dir. This groups the files and templates for each host together and therefore makes it easier to see all the (config) files for a host. Also clean up incorrect, unused docker_compose config for mumble and clean up unused engelsystem configs.	2024-12-08 02:55:25 +01:00

17 commits