WebLogic modernization on Oracle Cloud Infrastructure — Part 3
In this article, I am going to focus on monitoring a WebLogic domain and acting on issues in the environment. The WebLogic product team has developed a tool, WebLogic Monitoring Exporter, which captures runtime metrics for a specific WebLogic Server instance. WebLogic Monitoring Exporter is a Prometheus-compatible exporter. I will first deploy Prometheus and Grafana in our Kubernetes cluster, then deploy WebLogic Monitoring Exporter in the existing domain to capture server metrics, and finally create a dashboard in Grafana to view these metrics and set up alerts and actions. The exporter uses the WebLogic 12.2.1.x RESTful Management Interface to access runtime metrics.
The exporter is available in two forms:
- A web application that you can deploy to the servers from which you want to extract metrics.
- A separate (sidecar) container that runs alongside the server instance inside each WebLogic pod. This is only supported with WebLogic Kubernetes Operator versions 3.2 and later.
Make sure the WebLogic RESTful Management Interface is enabled in your source domain (the WLS Exporter sidecar uses the WLS RESTful services to pull metrics):
RestfulManagementServices:
    Enabled: true
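If you are not sure whether it is already enabled, a quick check against the administration server's management REST endpoint should return the server runtime data; a sketch, assuming the default listen port 7001 and your own WebLogic credentials:
curl --user <wls-user>:<wls-password> http://<admin-host>:7001/management/weblogic/latest/serverRuntime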
In this article, I will use the second deployment model (the exporter as a separate container inside the pod).
Solution Architecture
Here is a high-level architecture diagram of the solution.
In this architecture, I assume we have a WebLogic cluster deployed in Kubernetes (in my case, Oracle Container Engine for Kubernetes, OKE), with only one managed server inside the cluster.
Install Prometheus and Grafana in the Kubernetes Cluster.
The first step is the installation of the monitoring resources (Prometheus, Grafana, Alertmanager, and Prometheus Adapter).
You can use the following Ansible playbook to install the monitoring resources:
#Clone Prometheus Repository in the server
- name: Clone Prometheus Repository
  shell: "git clone {{prometheus_repo_url}}"
  args:
    executable: /bin/bash

#Install prometheus
- name: "Install prometheus"
  shell: kubectl create -f kube-prometheus/manifests/setup; until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done; kubectl create -f kube-prometheus/manifests; kubectl label nodes --all kubernetes.io/os=linux
  environment:
    OCI_CLI_AUTH: "{{ oci_auth }}"

#Provide external access to Grafana (32100)
- name: "Provide external access to Grafana (32100)"
  shell: "kubectl patch svc grafana -n monitoring --type=json -p '[{\"op\": \"replace\", \"path\": \"/spec/type\", \"value\": \"NodePort\" },{\"op\": \"replace\", \"path\": \"/spec/ports/0/nodePort\", \"value\": 32100 }]'"
  environment:
    OCI_CLI_AUTH: "{{ oci_auth }}"

#Provide external access to Prometheus (32101)
- name: "Provide external access to Prometheus (32101)"
  shell: "kubectl patch svc prometheus-k8s -n monitoring --type=json -p '[{\"op\": \"replace\", \"path\": \"/spec/type\", \"value\": \"NodePort\" },{\"op\": \"replace\", \"path\": \"/spec/ports/0/nodePort\", \"value\": 32101 }]'"
  environment:
    OCI_CLI_AUTH: "{{ oci_auth }}"

#Provide external access to alertmanager (32102)
- name: "Provide external access to alertmanager (32102)"
  shell: "kubectl patch svc alertmanager-main -n monitoring --type=json -p '[{\"op\": \"replace\", \"path\": \"/spec/type\", \"value\": \"NodePort\" },{\"op\": \"replace\", \"path\": \"/spec/ports/0/nodePort\", \"value\": 32102 }]'"
  environment:
    OCI_CLI_AUTH: "{{ oci_auth }}"
This Ansible playbook clones the kube-prometheus Git repository and installs the resources inside the Kubernetes cluster. Finally, it patches the created Kubernetes services to enable external access (using NodePort) to Prometheus, Grafana, and Alertmanager.
Here are the parameters used by the playbook:
oci_auth: instance_principal
user: opc
prometheus_repo_url: https://github.com/coreos/kube-prometheus.git
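After the playbook completes, a quick check confirms the stack is up and the NodePorts are in place (assuming the default kube-prometheus setup, which installs everything into the monitoring namespace):
kubectl get pods -n monitoring
kubectl get svc grafana prometheus-k8s alertmanager-main -n monitoring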
Add WebLogic Monitoring Exporter sidecar.
In this step, we will add the WebLogic exporter sidecar to the existing domain.
I am using the following Ansible playbook and parameters to add the exporter sidecar configuration to the domain resource.
#Copy exporter configuration to the server
- name: Copy exporter configuration to the server
  copy:
    src: ../files/exporter-config-sidecar.yaml
    dest: "/home/{{user}}/"
    owner: "opc"
    mode: '0755'

#Copy script to the server
- name: Copy script to the server
  copy:
    src: ../files/{{ python_script }}
    dest: "/home/{{user}}/"
    owner: "opc"
    mode: '0755'

#Update domain resource (add monitoringExporter sidecar)
- name: Update domain resource
  command:
    argv:
      - python3
      - "/home/{{user}}/{{ python_script }}"
user: opc
python_script: addWLSExporterSidecar.py
This Ansible playbook uses the following Python script to update the domain resource of the existing WebLogic domain and add the exporter sidecar.
import yaml
from yaml.loader import SafeLoader

def read_yaml(filename):
    # Load all YAML documents from <filename>.yaml
    with open(f'{filename}.yaml', 'r') as f:
        output = list(yaml.load_all(f, Loader=SafeLoader))
    return output

def write_yaml(filename, domain):
    # Write all YAML documents back to <filename>.yaml
    with open(f'{filename}.yaml', 'w') as f:
        yaml.dump_all(domain, f, sort_keys=False)

def updateSourceDomain():
    # Read domain resource yaml file
    domain = read_yaml('domain')
    # Read WLS exporter sidecar config
    monitoringExporter = read_yaml('exporter-config-sidecar')
    spec = domain[0]["spec"]
    if "monitoringExporter" in spec:
        spec["monitoringExporter"].update(monitoringExporter[0]["monitoringExporter"])
    else:
        spec["monitoringExporter"] = monitoringExporter[0]["monitoringExporter"]
    # Write WLS domain resource yaml file
    write_yaml("domain", domain)

updateSourceDomain()
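To run the script manually (outside Ansible), place domain.yaml (the exported domain resource) and exporter-config-sidecar.yaml in the same directory; the layout below is an assumption that mirrors where the playbook copies the files in this example:
cd /home/opc
python3 addWLSExporterSidecar.py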
And here is the WLS exporter config (it defines the WebLogic domain metrics that you want to export to Prometheus, including JVM, thread pool, and other metrics).
metricsNameSnakeCase: true
queries:
- key: name
  keyName: location
  prefix: wls_server_
  applicationRuntimes:
    key: name
    keyName: app
    componentRuntimes:
      prefix: webapp_config_
      type: WebAppComponentRuntime
      key: name
      values: [deploymentState, contextRoot, sourceInfo, openSessionsHighCount, openSessionsCurrentCount, sessionsOpenedTotalCount, sessionCookieMaxAgeSecs, sessionInvalidationIntervalSecs, sessionTimeoutSecs, singleThreadedServletPoolSize, sessionIDLength, servletReloadCheckSecs, jSPPageCheckSecs]
      servlets:
        prefix: weblogic_servlet_
        key: servletName
- JVMRuntime:
    prefix: wls_jvm_
    key: name
- executeQueueRuntimes:
    prefix: wls_socketmuxer_
    key: name
    values: [pendingRequestCurrentCount]
- workManagerRuntimes:
    prefix: wls_workmanager_
    key: name
    values: [stuckThreadCount, pendingRequests, completedRequests]
- threadPoolRuntime:
    prefix: wls_threadpool_
    key: name
    values: [executeThreadTotalCount, queueLength, stuckThreadCount, hoggingThreadCount]
- JMSRuntime:
    key: name
    keyName: jmsruntime
    prefix: wls_jmsruntime_
    JMSServers:
      prefix: wls_jms_
      key: name
      keyName: jmsserver
      destinations:
        prefix: wls_jms_dest_
        key: name
        keyName: destination
- persistentStoreRuntimes:
    prefix: wls_persistentstore_
    key: name
- JDBCServiceRuntime:
    JDBCDataSourceRuntimeMBeans:
      prefix: wls_datasource_
      key: name
- JTARuntime:
    prefix: wls_jta_
    key: name
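Once the script has merged the monitoringExporter section into domain.yaml, apply the updated domain resource so the operator can roll the pods and inject the sidecar (app-domain is the domain namespace used in this example):
kubectl apply -f domain.yaml -n app-domain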
After applying these changes, you can see that each WLS pod has two containers (weblogic-server and monitoring-exporter).
kubectl get pods -l "weblogic.domainUID=app-domain" -n app-domain
NAME READY STATUS RESTARTS AGE
app-domain-apps-adminserver 2/2 Running 0 13h
app-domain-apps-server-1 2/2 Running 0 13h
kubectl get pods app-domain-apps-server-1 -n app-domain -o jsonpath='{.spec.containers[*].name}'
monitoring-exporter weblogic-server
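You can also hit the sidecar directly to confirm it is serving metrics; a sketch, assuming the exporter's default port 8080 and the /metrics path, run from a helper curl pod inside the cluster (the same pod is used later in this article for session testing):
kubectl exec curl -n app-domain -- curl -s --user <wls-user>:<wls-password> http://app-domain-apps-server-1:8080/metrics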
Configure Prometheus and Grafana.
The final step is to configure Prometheus to pull metrics from the WebLogic domain.
I am using the following Ansible playbook and parameters to configure Prometheus to read metrics from the WebLogic domain. This includes creating a ClusterRole and ClusterRoleBinding to enable the Prometheus service account to read metrics, a Secret holding the WebLogic credentials, and a ServiceMonitor to pull metrics from the exporter sidecar.
#Copy prometheus configuration to the server
- name: Copy prometheus configuration to the server
  copy:
    src: ../files/prometheus-configuration.yaml
    dest: "/home/{{user}}/"
    owner: "opc"
    mode: '0755'

#Update prometheus configuration variables
- name: Update prometheus configuration variables
  ansible.builtin.replace:
    path: "/home/{{user}}/prometheus-configuration.yaml"
    regexp: '{{item.itemName}}'
    replace: '{{item.itemValue}}'
  with_items:
    - "{{prometheus_configuration_variables}}"

#Install prometheus configuration
- name: "Install prometheus"
  shell: kubectl apply -f "/home/{{user}}/prometheus-configuration.yaml"
  environment:
    OCI_CLI_AUTH: "{{ oci_auth }}"
  ignore_errors: true
Prometheus configuration:
#ClusterRole used to give Prometheus permission to access pods, services, etc.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: ##service_account_name##
  namespace: ##prometheus_namespace##
---
#Kubernetes Secret that keeps the WLS credentials used by the ServiceMonitor to access WLS metrics
apiVersion: v1
kind: Secret
metadata:
  name: wls-exporter-auth
  namespace: ##prometheus_namespace##
data:
  password: "##weblogic_credential##"
  user: "##weblogic_user##"
type: Opaque
---
#Prometheus uses a ServiceMonitor to auto-detect target pods based on a label selector
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: wls-exporter
  namespace: ##prometheus_namespace##
  labels:
    k8s-app: wls-exporter
    release: prometheus
spec:
  namespaceSelector:
    matchNames:
    - ##domain_namespace##
  selector:
    matchLabels:
      weblogic.domainUID: ##domain_name##
  endpoints:
  - basicAuth:
      password:
        name: wls-exporter-auth
        key: password
      username:
        name: wls-exporter-auth
        key: user
    port: metrics
    targetPort: ##exporter_port##
    relabelings:
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
      replacement: '$1'
    interval: 10s
    honorLabels: true
    path: ##exporter_path##
oci_auth: instance_principal
user: opc
prometheus_configuration_variables:
  - {itemName: "##weblogic_credential##", itemValue: "<Base64-encoded WLS password>"}
  - {itemName: "##weblogic_user##", itemValue: "<Base64-encoded WLS username>"}
  - {itemName: "##domain_namespace##", itemValue: "<WLS Domain namespace>"}
  - {itemName: "##domain_name##", itemValue: "<WLS Domain name>"}
  - {itemName: "##exporter_port##", itemValue: "<WLS Exporter Port Name>"}
  - {itemName: "##exporter_path##", itemValue: "/metrics"}
  - {itemName: "##prometheus_namespace##", itemValue: "<Prometheus namespace>"}
  - {itemName: "##service_account_name##", itemValue: "<Prometheus Service Account>"}
After executing this playbook, the following resources will be created:
- ClusterRole: This role grants Prometheus access to read pods, services, endpoints, and nodes.
- ClusterRoleBinding: This resource binds the created ClusterRole to the Prometheus service account.
- Kubernetes Secret: This secret stores the WebLogic username and password and is used by the Prometheus ServiceMonitor resource to read metrics from the WLS exporter.
- ServiceMonitor: Prometheus uses a ServiceMonitor to auto-detect target pods based on a label selector. We use the weblogic.domainUID label to detect the WLS pods.
- Prometheus resource: The Prometheus resource is used to associate the ServiceMonitor with the Prometheus Operator.
Finally, if you complete all the above steps successfully and connect to the Prometheus UI, you will see the WLS metrics (you can access the Prometheus UI using the node IP and NodePort: http://<Node IP>:<NodePort>).
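Instead of the UI, you can also query the Prometheus HTTP API directly; a sketch using the Prometheus NodePort (32101) configured earlier and one of the JVM metrics defined by the exporter configuration (the exact metric name depends on your exporter config and WebLogic version):
curl -s "http://<Node IP>:32101/api/v1/query?query=wls_jvm_heap_free_current" | jq .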
We can use these metrics to scale the WebLogic cluster automatically. For example, if the total number of open sessions exceeds a specific threshold, add a new managed server to the cluster.
I will use the Prometheus Adapter and the Horizontal Pod Autoscaler (HPA) to implement this solution. HPA can scale pods based on observed CPU utilization and memory usage, but we can also use other metrics to drive scaling, for example, the number of open sessions in the web application.
The Prometheus Adapter helps query and leverage custom metrics collected by Prometheus, which can then be used to make scaling decisions.
Please refer to the links provided in the References section for more information on the Prometheus Adapter and HPA.
In our example, I will use the Prometheus Adapter to collect the current number of open sessions ("webapp_config_open_sessions_current_count") for the application "SimpleApp" (webapp_config_open_sessions_current_count{app="SimpleApp"}). The adapter then converts it into a custom metric with a specific name ("open_sessions_count" in my example). Finally, HPA will use this custom metric to scale the cluster accordingly.
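Before wiring up the adapter, it is worth confirming that the source metric is visible in Prometheus for the application; a sketch, again assuming the Prometheus NodePort from earlier:
curl -s -G "http://<Node IP>:32101/api/v1/query" --data-urlencode 'query=webapp_config_open_sessions_current_count{app="SimpleApp"}' | jq .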
The following Ansible code and configuration files create these resources (Prometheus Adapter and HPA) in the Kubernetes cluster.
#Copy prometheus adapter values to the server
- name: Copy prometheus Adapter Values to the server
  copy:
    src: ../files/prometheus-adapter.yaml
    dest: "/home/{{user}}/"
    owner: "opc"
    mode: '0755'

#Update prometheus adapter values variables
- name: Update prometheus adapter values variables
  ansible.builtin.replace:
    path: "/home/{{user}}/prometheus-adapter.yaml"
    regexp: '{{item.itemName}}'
    replace: '{{item.itemValue}}'
  with_items:
    - "{{prometheus_adapter_variables}}"

#Add Prometheus Helm Repo URL
- name: "Add Prometheus Helm Repo URL"
  shell: helm repo add prometheus-community {{prometheus_helm_repo_url}}
  environment:
    OCI_CLI_AUTH: "{{ oci_auth }}"

#Install Prometheus Adapter
- name: "Install Prometheus Adapter"
  shell: helm install {{prometheus_adapter_chart_name}} prometheus-community/prometheus-adapter --namespace {{prometheus_namespace}} --values prometheus-adapter.yaml --set "prometheus.port={{prometheus_port}}" --set "prometheus.url={{prometheus_url}}"
  environment:
    OCI_CLI_AUTH: "{{ oci_auth }}"

#Copy wls hpa resource to the server
- name: Copy wls hpa resource to the server
  copy:
    src: ../files/wls-hpa.yaml
    dest: "/home/{{user}}/"
    owner: "opc"
    mode: '0755'

#Update wls hpa variables
- name: Update wls hpa variables
  ansible.builtin.replace:
    path: "/home/{{user}}/wls-hpa.yaml"
    regexp: '{{item.itemName}}'
    replace: '{{item.itemValue}}'
  with_items:
    - "{{hpa_variables}}"

#Install hpa resource
- name: "Install hpa resource"
  shell: kubectl apply -f "/home/{{user}}/wls-hpa.yaml"
  environment:
    OCI_CLI_AUTH: "{{ oci_auth }}"
  ignore_errors: true
Prometheus Adapter configuration values used by the adapter Helm chart to create the adapter resources:
# Copyright (c) 2022, Oracle and/or its affiliates.
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl.
rules:
  default: false
  custom:
  - seriesQuery: '{__name__=~"##metrics_name##",pod!="",namespace!="", app="##application_name##"}'
    resources:
      overrides:
        namespace:
          resource: namespace
        pod:
          resource: pod
    name:
      matches: ^(.*)
      as: "##custom_metrics_name##"
    metricsQuery: ##metrics_name##{<<.LabelMatchers>>,app='##application_name##'}
HPA Configuration:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ##hap_name##
  namespace: ##hpa_namespace##
spec:
  scaleTargetRef:
    apiVersion: weblogic.oracle/v1
    kind: Cluster
    name: ##cluster_name##
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
  minReplicas: ##min_cluster_size##
  maxReplicas: ##max_cluster_size##
  metrics:
  - type: Pods
    pods:
      metric:
        name: ##metrics_name##
      target:
        type: AverageValue
        averageValue: ##open_sessions_avg##
Ansible variable file used to replace the configuration variables:
oci_auth: instance_principal
user: opc
prometheus_helm_repo_url: https://prometheus-community.github.io/helm-charts
prometheus_adapter_chart_name: prometheus-adapter-wls-metrics
prometheus_namespace: monitoring
prometheus_port: 9090
prometheus_url: http://prometheus-k8s.monitoring.svc
domain_namespace: app-domain
domain_name: app-domain
cluster_name: app-cluster
prometheus_adapter_variables:
  - {itemName: "##metrics_name##", itemValue: "webapp_config_open_sessions_current_count"}
  - {itemName: "##custom_metrics_name##", itemValue: "open_sessions_count"}
  - {itemName: "##application_name##", itemValue: "<Application Name, in my example SimpleApp>"}
hpa_variables:
  - {itemName: "##metrics_name##", itemValue: "open_sessions_count"}
  - {itemName: "##hap_name##", itemValue: "wls-hpa"}
  - {itemName: "##hpa_namespace##", itemValue: "<Namespace to create HPA resource>"}
  - {itemName: "##cluster_name##", itemValue: "<WLS Cluster resource to scale>"}
  - {itemName: "##min_cluster_size##", itemValue: "1"}
  - {itemName: "##max_cluster_size##", itemValue: "3"}
  - {itemName: "##open_sessions_avg##", itemValue: "5"}
1- The Prometheus Adapter will query the Prometheus server to read the "webapp_config_open_sessions_current_count" metric (using the Prometheus URL and port provided) and expose a new custom Kubernetes metric, "open_sessions_count". You can query this custom metric for a specific pod using the following command (a check that lists all registered custom metrics follows the output below):
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/namespaces/app-domain/pods/app-domain-apps-server-1/open_sessions_count | jq .
{
"kind": "MetricValueList",
"apiVersion": "custom.metrics.k8s.io/v1beta1",
"metadata": {},
"items": [
{
"describedObject": {
"kind": "Pod",
"namespace": "app-domain",
"name": "app-domain-apps-server-1",
"apiVersion": "/v1"
},
"metricName": "open_sessions_count",
"timestamp": "2023-10-03T21:45:23Z",
"value": "0",
"selector": null
}
]
}
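To confirm the adapter registered the custom metric with the aggregated API (and to see every metric it exposes), you can list the resources of the custom metrics API; the jq filter is only for readability:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq '.resources[].name'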
2- The HPA resource will read this metric at a specific interval and change the cluster size accordingly (I have configured a minimum cluster size of one managed server and a maximum of three, and scaling happens if the average value of the custom metric is more than 5).
Before creating any application sessions, I can check the current size of the cluster and the HPA status with the following commands.
kubectl describe cluster app-cluster -n app-domain
Name:         app-cluster
Namespace:    app-domain
Labels:       weblogic.domainUID=app-domain
Annotations:  <none>
API Version:  weblogic.oracle/v1
Kind:         Cluster
Metadata:
  Creation Timestamp:  2023-10-03T07:30:57Z
  Generation:          4
  Resource Version:    34905903
  UID:                 0d080af6-9f5f-44dc-a76f-3ec9bd201487
Spec:
  Cluster Name:  app-cluster
  Replicas:      1
kubectl get pods -l "weblogic.domainUID=app-domain" -n app-domain
NAME READY STATUS RESTARTS AGE
app-domain-apps-adminserver 2/2 Running 0 13h
app-domain-apps-server-1 2/2 Running 0 13h
Now, I am running the following command (several times) to create new sessions in the WLS cluster (I have created a curl pod for this testing).
kubectl exec curl -n app-domain -- curl http://app-domain-apps-server-1:9073/SimpleApp/index.jsp
If I query the custom metric now, I can see the value has increased.
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/namespaces/app-domain/pods/app-domain-apps-server-1/open_sessions_count | jq .
{
"kind": "MetricValueList",
"apiVersion": "custom.metrics.k8s.io/v1beta1",
"metadata": {},
"items": [
{
"describedObject": {
"kind": "Pod",
"namespace": "app-domain",
"name": "app-domain-apps-server-1",
"apiVersion": "/v1"
},
"metricName": "open_sessions_count",
"timestamp": "2023-10-03T22:18:06Z",
"value": "9",
"selector": null
}
]
}
And after a few seconds, you will see HPA scale up the cluster.
kubectl get hpa -n app-domain -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
wls-hpa Cluster/app-cluster 9/5 1 3 2 19h
kubectl get pods -l "weblogic.domainUID=app-domain" -n app-domain
NAME READY STATUS RESTARTS AGE
app-domain-apps-adminserver 2/2 Running 0 14h
app-domain-apps-server-1 2/2 Running 0 14h
app-domain-apps-server-2 1/2 Running 0 8s
Now, we have two WebLogic managed servers inside the cluster.
WebLogic dashboard in Grafana
We can use the dashboard created for Grafana to view WebLogic metrics in an Oracle Enterprise Manager (OEM) style dashboard. First, we need to download the dashboard from this link and import it into Grafana. Before importing, make sure you have a Prometheus data source configured in Grafana.
Then update the WebLogic dashboard JSON file to use the same data source to connect to Prometheus.
Finally, import the dashboard JSON file into Grafana as below (you can also use the Grafana REST API to import the dashboard, as sketched after this paragraph).
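A sketch of the REST-based import, assuming Grafana's admin credentials, the Grafana NodePort (32100) configured earlier, and a locally downloaded dashboard file named weblogic_dashboard.json (the payload wraps the dashboard JSON in the format the dashboards API expects):
curl -s -X POST -u admin:<grafana-password> -H "Content-Type: application/json" -d "{\"dashboard\": $(cat weblogic_dashboard.json), \"overwrite\": true}" http://<Node IP>:32100/api/dashboards/db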
Conclusion
In this article, I have explained how to monitor WebLogic domain metrics and how to use these metrics to dynamically scale the cluster up and down as the workload increases or decreases. You can see how simply and quickly you can add a new managed server to a WebLogic cluster compared to a virtualized environment.
In the next articles, I will continue to explore other features, like viewing WebLogic logs in OpenSearch. Stay tuned.
References
Prometheus: https://prometheus.io/docs/introduction/overview/
Grafana: https://grafana.com/docs/grafana/latest/
Prometheus Adapter: https://github.com/kubernetes-sigs/prometheus-adapter/tree/master
WebLogic Monitoring Exporter: https://github.com/oracle/weblogic-monitoring-exporter
Horizontal Pod Autoscaler: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/