**application-alerts** · last evaluation: 171ms ago · evaluation time: 1.473ms

| Rule | State | Error | Last Evaluation | Evaluation Time |
| --- | --- | --- | --- | --- |
| **HighHTTPErrorRateLeadGen** (critical): `rate(http_requests_total{service="lead-gen-saas",status=~"5.."}[5m]) > 0.1` for 5m. Summary: High 5xx error rate on Lead Gen API. Description: 5xx error rate is `{{ $value \| printf "%.2f" }}`/s. | ok | | 178ms ago | 719µs |
| **HighHTTPErrorRateInsightful** (warning): `rate(container_fs_read_bytes_total{image=~".*insightful.*"}[5m]) > 1e+07` for 5m. Summary: High I/O on Insightful Booking container. | ok | | 177ms ago | 392.4µs |
| **HighAPILatency** (warning): `histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{service="lead-gen-saas"}[5m])) > 2` for 5m. Summary: High API latency on `{{ $labels.instance }}`. Description: 95th percentile latency is `{{ $value \| printf "%.2f" }}`s. | ok | | 177ms ago | 329.2µs |
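The `HighAPILatency` rule estimates p95 latency with `histogram_quantile` over per-bucket rates. A simplified Python sketch of the linear interpolation Prometheus performs on cumulative (`le`) buckets — no extrapolation or NaN handling, and the bucket bounds and counts are made up:

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: sorted list of (upper_bound, cumulative_count), ending with
    the +Inf bucket, as in Prometheus *_bucket series with an `le` label.
    Linearly interpolates within the bucket containing the quantile,
    mirroring PromQL's histogram_quantile for classic histograms.
    """
    total = buckets[-1][1]          # +Inf bucket count = total observations
    rank = q * total                # target cumulative count
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                return prev_bound   # quantile falls in the +Inf bucket
            # interpolate between the bucket's lower and upper bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound

# Hypothetical request-duration buckets (seconds): 90 of 100 requests <= 1s
buckets = [(0.5, 60), (1.0, 90), (2.5, 99), (float("inf"), 100)]
p95 = histogram_quantile(0.95, buckets)  # falls in the (1.0, 2.5] bucket
```

The rule then compares this estimate against the 2-second threshold for 5 minutes before firing.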
**container-alerts** · last evaluation: 1.037s ago · evaluation time: 21.78ms

| Rule | State | Error | Last Evaluation | Evaluation Time |
| --- | --- | --- | --- | --- |
| **ContainerDown** (critical): `container_spec_image_tag == 0` for 1m. Summary: Container down: `{{ $labels.name }}`. | ok | | 1.037s ago | 334µs |
| **ContainerHighCPU** (warning): `rate(container_cpu_usage_seconds_total[5m]) * 100 > 90` for 5m. Summary: Container `{{ $labels.name }}` high CPU. Description: CPU usage: `{{ $value \| printf "%.1f" }}`%. | ok | | 1.036s ago | 1.548ms |
| **ContainerHighMemory** (warning): `container_memory_usage_bytes / container_spec_memory_limit_bytes > 0.9` for 5m. Summary: Container `{{ $labels.name }}` high memory. Description: Memory usage: `{{ $value \| printf "%.1f" }}`%. | ok | | 1.035s ago | 12.38ms |
| **ContainerRestarting** (warning): `changes(container_last_seen[5m]) > 5` for 1m. Summary: Container `{{ $labels.name }}` restarting frequently. | ok | | 1.023s ago | 7.425ms |
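`ContainerHighCPU` compares `rate(container_cpu_usage_seconds_total[5m])` against 90%. A simplified Python sketch of what `rate()` computes over a window of counter samples — it ignores Prometheus's counter-reset detection and window extrapolation, and the sample values are made up:

```python
def simple_rate(samples):
    """Per-second increase of a counter over a window.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Simplified: assumes no counter resets and at least two samples, and
    divides by the observed span rather than extrapolating to the full
    range as PromQL's rate() does.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

# Hypothetical 15s scrapes over 5 minutes: cumulative CPU-seconds consumed
samples = [(t, 0.95 * t) for t in range(0, 301, 15)]
cpu_rate = simple_rate(samples)          # CPU-seconds used per second
alert_would_fire = cpu_rate * 100 > 90   # the rule's `* 100 > 90` comparison
```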
**database-alerts** · last evaluation: 11.13s ago · evaluation time: 2.949ms

| Rule | State | Error | Last Evaluation | Evaluation Time |
| --- | --- | --- | --- | --- |
| **PostgresTooManyConnections** (warning): `pg_stat_activity_count > 80` for 5m. Summary: PostgreSQL connections high on `{{ $labels.database }}`. Description: Active connections: `{{ $value }}`. | ok | | 11.131s ago | 870.5µs |
| **PostgresLongRunningQueries** (warning): `pg_stat_activity_count{datname!~"template.*"} > 10` for 5m. Summary: Many active queries on `{{ $labels.database }}`. | ok | | 11.13s ago | 655.3µs |
| **PostgresDeadTuples** (warning): `pg_stat_user_tables_n_dead_tup > 10000` for 1h. Summary: High dead tuple count on `{{ $labels.database }}`. Description: Table `{{ $labels.relname }}` has `{{ $value }}` dead tuples. | ok | | 11.129s ago | 1.029ms |
| **PostgresExporterDown** (critical): `up{job=~"postgres.*"} == 0` for 5m. Summary: PostgreSQL exporter down for `{{ $labels.job }}`. | ok | | 11.128s ago | 364.1µs |
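Every rule in these groups carries a `for:` clause: the condition must hold continuously for that long before the alert moves from pending to firing. A minimal Python sketch of that behaviour — the evaluation interval and duration below are illustrative, and real Prometheus tracks wall-clock time rather than evaluation counts:

```python
def alert_states(condition_by_eval, for_duration, eval_interval):
    """Map a sequence of per-evaluation booleans to alert states.

    Mirrors Prometheus's `for:` handling: `pending` while the condition
    holds for less than for_duration, `firing` once it has held
    continuously for at least for_duration, `inactive` otherwise.
    """
    states, active_since = [], None
    for i, active in enumerate(condition_by_eval):
        now = i * eval_interval
        if not active:
            active_since = None       # any false evaluation resets the clock
            states.append("inactive")
        else:
            if active_since is None:
                active_since = now
            held = now - active_since
            states.append("firing" if held >= for_duration else "pending")
    return states

# Condition true for 7 consecutive evaluations at a 60s interval, `for: 5m` (300s)
states = alert_states([True] * 7, for_duration=300, eval_interval=60)
```

With these parameters the alert stays pending for the first five evaluations and fires from the sixth onward.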
**exporter-alerts** · last evaluation: 3.778s ago · evaluation time: 1.699ms

| Rule | State | Error | Last Evaluation | Evaluation Time |
| --- | --- | --- | --- | --- |
| **ExporterDown** (warning): `up == 0` for 5m. Summary: Exporter down: `{{ $labels.job }}`. Description: `{{ $labels.job }}` has been down for 5 minutes. | ok | | 3.778s ago | 948.9µs |
| **PrometheusTargetMissing** (critical): `up{job!="prometheus"} == 0` for 10m. Summary: Prometheus target missing: `{{ $labels.job }}`. Description: Target `{{ $labels.instance }}` has been down for 10 minutes. | ok | | 3.777s ago | 724.3µs |
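Both rules key off the synthetic `up` metric, which Prometheus sets to 1 or 0 per target per scrape. A small Python sketch of filtering an instant vector of `up` the way `up{job!="prometheus"} == 0` does — the label sets below are made up:

```python
def down_targets(up_samples, exclude_jobs=()):
    """Return label sets whose `up` value is 0, skipping excluded jobs.

    up_samples: list of (labels_dict, value) pairs, like an instant
    vector of the `up` metric. Mirrors `up{job!="..."} == 0`.
    """
    return [
        labels
        for labels, value in up_samples
        if value == 0 and labels.get("job") not in exclude_jobs
    ]

# Hypothetical scrape targets
up_vector = [
    ({"job": "prometheus", "instance": "localhost:9090"}, 1),
    ({"job": "postgres-exporter", "instance": "db:9187"}, 0),
    ({"job": "node-exporter", "instance": "host:9100"}, 1),
]
missing = down_targets(up_vector, exclude_jobs=("prometheus",))
```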
**nginx-alerts** · last evaluation: 5.415s ago · evaluation time: 622.3µs

| Rule | State | Error | Last Evaluation | Evaluation Time |
| --- | --- | --- | --- | --- |
| **NginxHigh5xxErrors** (critical): `rate(nginx_http_requests_total{status=~"5.."}[5m]) > 0.05` for 5m. Summary: High 5xx error rate on Nginx. Description: 5xx rate: `{{ $value \| printf "%.2f" }}`/s. | ok | | 5.415s ago | 374.6µs |
| **NginxHigh4xxErrors** (warning): `rate(nginx_http_requests_total{status=~"4.."}[5m]) > 0.5` for 5m. Summary: High 4xx error rate on Nginx. Description: 4xx rate: `{{ $value \| printf "%.2f" }}`/s. | ok | | 5.415s ago | 136.5µs |
| **NginxConnectionFailures** (warning): `rate(nginx_connections_failed[5m]) > 0.1` for 5m. Summary: Nginx connection failures. Description: Failed connections: `{{ $value \| printf "%.2f" }}`/s. | ok | | 5.415s ago | 79.92µs |
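The Nginx rules bucket requests by status class with regex matchers like `status=~"5.."`. PromQL anchors label regexes at both ends, so `5..` matches exactly the three-character 5xx codes. A Python sketch of the same matching over made-up per-status request rates:

```python
import re

def rate_for_status_class(requests_per_sec, status_regex):
    """Sum per-second rates whose status label matches an anchored regex.

    PromQL fully anchors label-matcher regexes, so "5.." matches "500"
    or "502" but never "150x"-style substrings, as in NginxHigh5xxErrors.
    """
    pattern = re.compile(r"\A(?:" + status_regex + r")\Z")  # PromQL-style anchoring
    return sum(rate for status, rate in requests_per_sec.items()
               if pattern.match(status))

# Hypothetical per-status request rates (req/s)
rates = {"200": 40.0, "404": 0.3, "500": 0.04, "502": 0.02}
err_rate_5xx = rate_for_status_class(rates, "5..")
alert_would_fire = err_rate_5xx > 0.05   # NginxHigh5xxErrors threshold
```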
**redis-alerts** · last evaluation: 6.562s ago · evaluation time: 1.04ms

| Rule | State | Error | Last Evaluation | Evaluation Time |
| --- | --- | --- | --- | --- |
| **RedisHighMemory** (warning): `redis_memory_used_bytes / redis_memory_max_bytes > 0.9` for 5m. Summary: Redis memory usage high. Description: Memory usage is `{{ $value \| printf "%.1f" }}`%. | ok | | 6.562s ago | 753µs |
| **RedisTooManyConnections** (warning): `redis_connected_clients > 100` for 5m. Summary: Redis connected clients high. Description: Connected clients: `{{ $value }}`. | ok | | 6.561s ago | 134.2µs |
| **RedisManyExpiredKeys** (info): `rate(redis_expired_keys_total[5m]) > 1000` for 5m. Summary: High rate of expired keys. | ok | | 6.561s ago | 128.1µs |
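`RedisHighMemory` divides used bytes by the configured maximum and alerts above 0.9. The same ratio check in Python — the byte values are made up, and note that the expression yields a 0-1 fraction, not a percentage:

```python
def memory_ratio(used_bytes, max_bytes):
    """Fraction of the configured Redis memory limit in use.

    Mirrors redis_memory_used_bytes / redis_memory_max_bytes; the
    resulting $value is a 0-1 fraction (e.g. 0.93), not a percentage.
    """
    return used_bytes / max_bytes

# Hypothetical instance: 930 MiB used of a 1 GiB maxmemory limit
ratio = memory_ratio(930 * 1024**2, 1024**3)
alert_would_fire = ratio > 0.9   # RedisHighMemory threshold
```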
**system-alerts** · last evaluation: 5.276s ago · evaluation time: 962.2µs

| Rule | State | Error | Last Evaluation | Evaluation Time |
| --- | --- | --- | --- | --- |
| **HighCPUUsage** (warning): `100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90` for 5m. Summary: High CPU usage on `{{ $labels.instance }}`. Description: CPU usage is `{{ $value \| printf "%.1f" }}`%. | ok | | 5.276s ago | 480.3µs |
| **HighMemoryUsage** (warning): `(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90` for 5m. Summary: High memory usage on `{{ $labels.instance }}`. Description: Memory usage is `{{ $value \| printf "%.1f" }}`%. | ok | | 5.276s ago | 132.3µs |
| **LowDiskSpace** (warning): `(1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})) * 100 > 85` for 10m. Summary: Low disk space on `{{ $labels.instance }}`. Description: Disk usage is `{{ $value \| printf "%.1f" }}`%. | ok | | 5.276s ago | 229µs |
| **HighLoadAverage** (warning): `node_load1 / count(node_cpu_seconds_total{mode="idle"}) > 1.5` for 10m. Summary: High load average on `{{ $labels.instance }}`. Description: Load average is `{{ $value \| printf "%.2f" }}`. | ok | | 5.276s ago | 96.62µs |
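`HighCPUUsage` derives busy CPU indirectly: 100 minus the average per-second idle rate times 100, since each CPU's idle rate is the fraction of that CPU spent idle over the window. A Python sketch of that arithmetic with made-up per-CPU idle rates:

```python
def cpu_usage_percent(idle_rates):
    """Busy-CPU percentage from per-CPU idle rates.

    idle_rates: per-second rate of node_cpu_seconds_total{mode="idle"}
    for each CPU; each value is the fraction of that CPU spent idle.
    Mirrors 100 - (avg(rate(...)) * 100).
    """
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100.0 - avg_idle * 100.0

# Hypothetical 4-CPU host: three CPUs 5% idle, one fully idle
usage = cpu_usage_percent([0.05, 0.05, 0.05, 1.0])
alert_would_fire = usage > 90   # HighCPUUsage threshold
```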