GCP Checklist 6: Logging, Monitoring and Alerting (maintaining reliability)
To keep your systems reliable, you first need to understand how they typically behave. Once you know what normal looks like, you can identify anomalies and act on them. This requires an appropriate framework for logging, monitoring and alerting.
Logging: you need to collect and analyse logs, both to spot application anomalies and to audit your application and environments.
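As a rough illustration of collecting logs in a form that is easy to analyse, here is a minimal sketch that writes one JSON object per log line using only the Python standard library. It assumes your runtime forwards stdout to Cloud Logging and that JSON fields such as severity and message are parsed as structured data there; check this for your environment. The names JsonFormatter, build_logger and order_id are hypothetical and exist only for this example.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "severity": record.levelname,  # e.g. INFO, WARNING, ERROR
            "message": record.getMessage(),
            "logger": record.name,
        }
        # order_id is a hypothetical application field, passed via extra=...
        if hasattr(record, "order_id"):
            entry["order_id"] = record.order_id
        return json.dumps(entry)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that emits JSON lines to the given stream (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream or sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

Calling build_logger("checkout").warning("payment retry", extra={"order_id": "A-1"}) would emit a single JSON line carrying the severity, the message and the extra field, which makes filtering and auditing by field much easier than parsing free text.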
Monitoring: this is closely related to logging, and the two usually go hand in hand. A typical monitoring solution consists of a way to collect metrics, dashboards to view the status of your systems and applications, and a way to send alerts. You need to instrument your system so that it provides meaningful metrics.
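To make the idea of instrumenting for a meaningful metric and alerting on it more concrete, here is a small sketch of an error-rate metric over a sliding window with a simple threshold check. It is an illustration in plain Python, not the Cloud Monitoring API; on GCP you would normally export the metric and define the threshold in an alerting policy instead. The class name ErrorRateMonitor and the default window and threshold values are assumptions made for this example.

```python
from collections import deque


class ErrorRateMonitor:
    """Track the error rate over the most recent `window` requests."""

    def __init__(self, window: int = 100, threshold: float = 0.05):
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def record(self, ok: bool) -> None:
        """Record one request outcome: 0 for success, 1 for failure."""
        self.samples.append(0 if ok else 1)

    def error_rate(self) -> float:
        """Fraction of failed requests in the current window."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)

    def should_alert(self) -> bool:
        """True when the error rate exceeds the configured threshold."""
        return self.error_rate() > self.threshold
```

In a request handler you would call record(True) or record(False) after each request and check should_alert() periodically. The threshold only becomes meaningful once you know the typical error rate of the system, which is why understanding normal behaviour comes first.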
GCP provides logging and monitoring services as part of the platform.
Here are some references that are a good place to start:
https://cloud.google.com/logging/docs/
https://cloud.google.com/monitoring/audit-logging
https://cloud.google.com/monitoring/docs/
https://cloud.google.com/monitoring/alerts/using-alerting-ui
https://cloud.google.com/solutions/design-patterns-for-exporting-stackdriver-logging
And here’s your checklist:
A list of all the checklists in the series can be found here