OOTB Application Monitoring, Alert & User workload monitoring

OOTB Application Monitoring, Alert & User workload monitoring

Prerequisite

Complete Deploy application to openshift with s2i
Go to your project (same as your username)
Open Web Terminal by click '>_' on top of OpenShift Web Console
use web terminal to run command line

OpenShift Default Monitoring

view defalt monitoring per deployment, click topology, click duke icon (backend deployment), in backend deployment, select observe tab
- view CPU usage,
- view Memory usage
view Project Monitoring, click Observe in left menu, select Dashboard Tab
this page will show Monitoring Information of current project (all resources in this project)
can filter by workload, click at dashboard dropdownlist and select Kubernetes/Compute Resources/Workload, type deployment, workload backend
select tab Metrics to view performance/metrics information by type, click select query dropdown list to select default metrics information such as cpu usage, memory usage, filesystem usage, etc.
select CPU usage, click check box 'Stacked'
change to another metrics such as memory usage.
OpenShift Monitoring base on Prometheus Technology, you can use PromQL for retrive metric information, in select query dropdown list, select Custom query and type 'cpu' and wail auto suggestion,
select 'pod:container_cpu_usage:sum' and type 'enter' button to view this metrics from PromQL
click Alerts tab to view all alert (the Alerting UI enables you to manage alerts, silences, and alerting rules, we will create alert in next step in this session)
click Events Tab to view All event in this project or filter by resource

Review Application Performance Metric Code

Developer can enable monitoring for user-defined projects in addition to the default platform monitoring. You can now monitor your own projects in OpenShift Container Platform without the need for an additional monitoring solution.

review application metric code

backend application use quarkus microprofile metrics libraly to generate application metrics
example code: https://raw.githubusercontent.com/chatapazar/openshift-workshop/main/src/main/java/org/acme/getting/started/BackendResource.java

example custom metrics in code:

@Counted(
    name = "countBackend", 
    description = "Counts how many times the backend method has been invoked"
    )
@Timed(
    name = "timeBackend", 
    description = "Times how long it takes to invoke the backend method in second", 
    unit = MetricUnits.SECONDS
    )
@ConcurrentGauge(
    name = "concurrentBackend",
    description = "Concurrent connection"
    )
public Response callBackend(@Context HttpHeaders headers) throws IOException {

review example metrics of backend application, go to web terminal

call default quarkus microprofile metrics example

oc exec $(oc get pods -l app=backend | grep backend | head -n 1 | awk '{print $1}') \
-- curl -s  http://localhost:8080/q/metrics

example result

...
# HELP vendor_memoryPool_usage_max_bytes Peak usage of the memory pool denoted by the 'name' tag
# TYPE vendor_memoryPool_usage_max_bytes gauge
vendor_memoryPool_usage_max_bytes{name="CodeHeap 'non-nmethods'"} 1352064.0
vendor_memoryPool_usage_max_bytes{name="CodeHeap 'non-profiled nmethods'"} 1018240.0
vendor_memoryPool_usage_max_bytes{name="CodeHeap 'profiled nmethods'"} 5218944.0
vendor_memoryPool_usage_max_bytes{name="Compressed Class Space"} 3856880.0
vendor_memoryPool_usage_max_bytes{name="Metaspace"} 3.1625864E7
vendor_memoryPool_usage_max_bytes{name="PS Eden Space"} 1.6777216E7
vendor_memoryPool_usage_max_bytes{name="PS Old Gen"} 1.8408896E7
vendor_memoryPool_usage_max_bytes{name="PS Survivor Space"} 5705344.0

call custom quarkus microprofile metrics example

oc exec $(oc get pods -l app=backend | grep backend | head -n 1 | awk '{print $1}') \
-- curl -s  http://localhost:8080/q/metrics/application

example result

# TYPE application_org_acme_getting_started_BackendResource_timeBackend_seconds summary
application_org_acme_getting_started_BackendResource_timeBackend_seconds_count 1.0
# TYPE application_org_acme_getting_started_BackendResource_timeBackend_seconds_sum gauge
application_org_acme_getting_started_BackendResource_timeBackend_seconds_sum 2.503457774
application_org_acme_getting_started_BackendResource_timeBackend_seconds{quantile="0.5"} 2.503457774
application_org_acme_getting_started_BackendResource_timeBackend_seconds{quantile="0.75"} 2.503457774
application_org_acme_getting_started_BackendResource_timeBackend_seconds{quantile="0.95"} 2.503457774
application_org_acme_getting_started_BackendResource_timeBackend_seconds{quantile="0.98"} 2.503457774
application_org_acme_getting_started_BackendResource_timeBackend_seconds{quantile="0.99"} 2.503457774
application_org_acme_getting_started_BackendResource_timeBackend_seconds{quantile="0.999"} 2.503457774

Add Application Performance Metric to OpenShift

create ServiceMonitor, go to Search in left menu,
in search page, in resources drop down, type 'servicemonitor' for filter, click 'SM ServiceMonitor'
Click Create ServiceMonitor button

in Create ServiceMonitor Page, input below YAML for create ServiceMonitor to backend application

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: backend-monitor
  name: backend-monitor
spec:
  endpoints:
  - interval: 30s
    port: 8080-tcp
    path: /q/metrics
    scheme: http
  - interval: 30s
    port: 8080-tcp
    path: /q/metrics/application
    scheme: http
  selector:
    matchLabels:
      app: backend

example:

click create and review your ServiceMonitor, click YAML tab to view your yaml code.
test call you backend application

go to web terminal. test call your backend 2-3 times

BACKEND_URL=https://$(oc get route backend -o jsonpath='{.spec.host}')
curl $BACKEND_URL/backend

click Monitor in left menu, select Metrics Tab
in select query, change to custom query, type 'app' and wait auto suggesstion
Remark: if you don't found metrics 'application*' in auto suggession, wait a few minute and retry again
select 'application_org_acme_getting_started_BackendResource_countBackend_total', type enter.
change your custom PromQL such as average tocal call backend service in 1 minute is
- type: 'rate(application_org_acme_getting_started_BackendResource_countBackend_total[1m])'
- enter
Optional: test call backend 2-3 times and check metrics change in Monitoring Pages

Add Alert to OpenShift

Check PrometheusRule

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: backend-app-alert
  namespace: <username>
  labels:
    openshift.io/prometheus-rule-evaluation-scope: leaf-prometheus
spec:
  groups:
  - name: backend
    rules:
    - alert: HighLatency
      expr: application_org_acme_getting_started_BackendResource_timeBackend_max_seconds>1
      labels:
        severity: 'critical'
      annotations:
        message: ' response time is  sec'

HightLatency Alert will fire when response time is greateer than 1 sec

Create backend-app-alert click add icon (+) to open yaml editor
input PrometheusRule yaml in editor and create (change namespace before run)

Test HightLatency , run k6 as pod on OpenShift, change user1 to your username

BACKEND_URL=https://$(oc get route backend -n <your username> -o jsonpath='{.spec.host}')/backend
curl -o load-test-k6.js https://raw.githubusercontent.com/rhthsa/openshift-demo/main/manifests/load-test-k6.js
oc run load-test -n <your username> -i \
--image=loadimpact/k6 --rm=true --restart=Never \
--  run -  < load-test-k6.js \
-e URL=$BACKEND_URL -e THREADS=25 -e DURATION=2m -e RAMPUP=30s -e RAMPDOWN=30s

25 threads
Duration 2 minutes
Ramp up 30 sec
Ramp down 30 sec

Check for alert in Developer Console by select Menu Observe then select Alerts

Next Step

Complex Cloud-Native Application with Live Flight Track Demo

Application Monitoring