Monitoring is the act of observing and checking the behaviour and outputs of a system and its components over time.
Monitoring is a process that ensures stable operations by tracking performance and status and by swiftly detecting any issues.
Related: Understanding Kubernetes Resources, Kubernetes + Docker, Best Practice Production
How should we do monitoring?
- Important things for monitoring
- Observability
- a measure of whether the state of the system can be understood and explained, regardless of what state the system is in
- Imagine that you, as an engineer, become responsible for a particular service
How can we enhance observability?
- Understand the internal structure of the application
- Comprehend the system state of the application even when the application encounters problems
Key elements of observability
- Metrics
- System health status
- captured over time as time-series data
- Characteristics
- Numerical and measurable
- represent trends and patterns in the system
- Aggregatable
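
A minimal sketch of what "numerical, time-series, aggregatable" can look like in practice. The sample values, the 60-second bucket size, and the `aggregate_per_minute` helper are all assumptions made for illustration, not part of any particular monitoring product.

```python
from statistics import mean

# Hypothetical samples: (timestamp in seconds, CPU usage in percent)
samples = [
    (0, 20.0), (15, 30.0), (30, 40.0), (45, 50.0),
    (60, 80.0), (75, 90.0), (90, 70.0), (105, 60.0),
]


def aggregate_per_minute(points):
    """Group samples into 60-second buckets and average each bucket."""
    buckets = {}
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % 60)
        buckets.setdefault(bucket_start, []).append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


per_minute = aggregate_per_minute(samples)
print(per_minute)  # {0: 35.0, 60: 75.0}
```

Because each data point is just a number with a timestamp, the same samples could also be aggregated as a maximum or a percentile, which is what makes trends easy to chart.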
- Logs
- text-based records generated by systems and applications
- logs capture the history of events and provide visibility into the system state
- non-distributed events
- chronological order is very important
- Log levels
- Debug
- Provides detailed information for debugging the system and analyzing the root cause of an issue
- should be enabled only when necessary
- Info
- Information about the normal operation and progress of the system
- successful events
- Warning
- potential issue, but no actual error or failure has occurred
- Error
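
The log levels above can be tried out with Python's standard `logging` module. This is only a sketch: the logger name `payment-service` and the messages are made up, and the in-memory stream is used so the example is self-contained.

```python
import io
import logging

stream = io.StringIO()  # in-memory sink instead of a file or stdout
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("payment-service")  # hypothetical service name
logger.setLevel(logging.INFO)  # Debug is off by default
logger.addHandler(handler)
logger.propagate = False  # keep records out of the root logger

logger.debug("cart contents: %s", ["book"])  # dropped: below INFO
logger.info("order %s created", 42)
logger.warning("retrying payment, attempt %d", 2)
logger.error("payment failed for order %s", 42)

# Enable Debug only when it is needed, e.g. while investigating an issue
logger.setLevel(logging.DEBUG)
logger.debug("cart contents: %s", ["book"])

output = stream.getvalue()
print(output)
```

The first debug call produces nothing because the logger is set to Info; after switching the level to Debug, the same call is recorded.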
- Traces
- Request-scoped, distributed events
- Data that tracks and visualizes the flow of requests and processes within the system
- essential for identifying where delays or errors occur in distributed systems or microservices
- displayed as a flame graph
- Characteristics
- Record the flow of a request across the entire system
- highlight performance bottlenecks and error locations
- Collections of spans, each with a unique ID
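
To make "a collection of spans" more concrete, here is a rough sketch in plain Python. It does not use a real tracing library; the `Span` class, the service names, and the timings are all hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one request
    span_id: str              # unique per span
    parent_id: Optional[str]  # None for the root span
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def new_id() -> str:
    return uuid.uuid4().hex


trace_id = new_id()
root = Span(trace_id, new_id(), None, "GET /checkout", 0, 120)
auth = Span(trace_id, new_id(), root.span_id, "auth-service", 5, 25)
db = Span(trace_id, new_id(), root.span_id, "orders-db query", 30, 110)
spans = [root, auth, db]


def slowest_child(all_spans, parent):
    """Return the direct child span that took the longest."""
    children = [s for s in all_spans if s.parent_id == parent.span_id]
    return max(children, key=lambda s: s.duration_ms)


bottleneck = slowest_child(spans, root)
print(bottleneck.name, bottleneck.duration_ms)  # orders-db query 80
```

A flame graph is essentially this parent/child structure drawn with each span's width proportional to its duration, so the widest child stands out as the likely bottleneck.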
Datadog
Integrations
Datadog has many integrations for connecting with other applications; we can collect metrics, logs, and trace events from AWS or other services
Best practice
Output structured logs, for example as JSON or another key-value format
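
One possible way to emit structured logs is a small JSON formatter on top of Python's standard `logging` module. This is a sketch under assumptions: the `JsonFormatter` class and the field names (`level`, `logger`, `message`) are chosen for illustration and are not a required Datadog schema. A timestamp field is left out only to keep the output deterministic.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


stream = io.StringIO()  # in-memory sink so the example is self-contained
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("checkout")  # hypothetical service name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("order %s created", 42)

line = stream.getvalue().strip()
print(line)

parsed = json.loads(line)  # a log pipeline can parse the line the same way
```

Since every line is valid JSON, fields such as `level` can be filtered and aggregated without fragile text parsing.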
APM (Application Performance Monitoring)
- Provides deep visibility into the application
- helps monitor, trace, and troubleshoot
- Identifies performance bottlenecks and helps optimize application performance
- Compare the current with other