In my previous article, I talked about the importance of logs and the differences between structured and unstructured logging. Logs are easy to integrate into your application and provide the ability to represent any type of data in the form of strings.
Metrics, on the other hand, are a numerical representation of data. These are often used to count or measure a value and are aggregated over a period of time. Metrics give us insights into the historical and current state of a system. Since they are just numbers, they can also be used to perform statistical analysis and predictions about the system’s future behaviour. Metrics are also used to trigger alerts and notify you about issues in the system’s behaviour.
Format
Logs are represented as strings. They can be simple text, JSON payloads, or key-value pairs (as we discussed in structured logging).
Metrics are represented as numbers. They measure something (like CPU usage, number of errors, etc.) and are numeric in nature.
Resolution
Logs contain high-resolution data. This includes complete information about an event and can be used to correlate the flow (or path) that the event took through the system.
In case of errors, logs contain the entire stack trace of the exception, which allows us to view and debug issues originating from downstream systems as well. In short, logs can tell you what happened in the system at a certain time.
Metrics contain low-resolution data. This may include a count of parameters (such as requests, errors, etc.) and measures of resources (such as CPU and memory utilization). In short, metrics can give you a count of something that happened in the system at a certain time.
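To make the difference in resolution concrete, here is a minimal sketch that turns the same events into log lines and into metrics. The event fields, the service name in the error message, and the metric names `requests.count` and `errors.count` are all hypothetical, chosen only for illustration.

```python
import json
from collections import Counter

# Hypothetical request events; every field name and value is illustrative.
events = [
    {"ts": "2021-01-01T10:00:01Z", "path": "/orders", "status": 200, "request_id": "a1"},
    {"ts": "2021-01-01T10:00:02Z", "path": "/orders", "status": 500, "request_id": "a2",
     "error": "TimeoutError: inventory-service did not respond"},
    {"ts": "2021-01-01T10:00:03Z", "path": "/orders", "status": 200, "request_id": "a3"},
]


def to_log_lines(events):
    """High resolution: one JSON string per event, every field preserved."""
    return [json.dumps(event, sort_keys=True) for event in events]


def to_metrics(events):
    """Low resolution: only aggregated counts survive."""
    counts = Counter()
    for event in events:
        counts["requests.count"] += 1
        if event["status"] >= 500:
            counts["errors.count"] += 1
    return dict(counts)


log_lines = to_log_lines(events)
metrics = to_metrics(events)

for line in log_lines:
    print(line)
print(metrics)
```

The log lines still carry the request ID and the error message needed to debug the failing request, while the metrics only say that one request out of three failed.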
Cost
Logs are expensive to store. The storage overhead of logs also increases over time and is directly proportional to the increase in traffic.
Metrics have a roughly constant storage overhead. The cost of storage and retrieval of metrics does not increase much with the increase in traffic. It is, however, dependent on the number of variables (tags) we emit with each metric. For example, the same metric emitted with three different host values:
(name=pod.cpu.utilization, host=A)
(name=pod.cpu.utilization, host=B)
(name=pod.cpu.utilization, host=C)
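The three tuples above differ only in the value of the host tag. The sketch below estimates how the number of stored series grows with tags, assuming that every distinct combination of tag values is stored as its own series and that one data point is kept per series per reporting interval. The helper names, the extra `status` tag, and the 60-second interval are hypothetical, and real metric backends may compress or roll up data differently.

```python
from math import prod


def series_count(tag_values):
    """Upper bound on distinct series for one metric name:
    the product of the number of values each tag can take."""
    return prod(len(values) for values in tag_values.values())


def stored_points(tag_values, retention_seconds, interval_seconds):
    """Data points kept for one metric, at one point per series per interval."""
    return series_count(tag_values) * (retention_seconds // interval_seconds)


DAY = 24 * 60 * 60

# The three tuples above: one metric name, one tag with three values.
hosts_only = {"host": ["A", "B", "C"]}
# Adding a second (hypothetical) tag multiplies the series count.
with_status = {"host": ["A", "B", "C"], "status": ["2xx", "4xx", "5xx"]}

print(series_count(hosts_only), stored_points(hosts_only, DAY, 60))
print(series_count(with_status), stored_points(with_status, DAY, 60))
```

Neither function takes the number of requests as input: under these assumptions the stored volume depends on the tag combinations and the reporting interval, not on traffic.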
Golden signals
Golden signals are an effective way of monitoring the overall state of the system and identifying problems.
Resource metrics
Resource metrics are almost always made available by default from the infrastructure provider (AWS CloudWatch or Kubernetes metrics) and are used to monitor infrastructure health.
Business metrics
Business metrics can be used to monitor granular interaction with core APIs or functionality in your services.
This is the second part of my Microservice Observability series. Take a look at the first installment if you haven’t already. I’ll be adding links to the next articles when they go live. Stay tuned!