To help you get the most out of Loom, we've collected some best practices for logging.
#1 Understand your audiences
When discussing logs, the first thing to understand is that your application logs have two very different audiences: humans and machines.
Machines are good at processing large amounts of structured data quickly and automatically.
Humans are not as good at processing large amounts of data, and reading through logs takes them time. On the other hand, humans deal well with unstructured data.
In order to get the most out of your logs, you need to make your logs both readable for humans and structured for machines.
#2 Have a Standard Structure across all logs
A prerequisite for good logging is a standard log structure that is consistent across all log files.
Each log line should represent one single event, and contain at least the timestamp, the hostname, the service and the logger name.
Additional values can include the thread or process ID, event ID, session ID, and user ID.
Other important values may be environment-related, such as the instance ID, deployment name, application version, or any other key-value pairs related to the event.
Use a high-precision timestamp with at least millisecond resolution, and make sure the timestamp format includes timezone data.
Consistent fields and precise timestamps are important for tracking and correlating issues in different components and across your architecture.
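As a minimal sketch of such a structure, the following uses Python's standard `logging` module to emit one JSON object per event. The field names, the `JsonFormatter` class, and the `build_logger` helper are illustrative assumptions, not a format required by Loom.

```python
import json
import logging
import socket
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service                # hypothetical service name
        self.hostname = socket.gethostname()

    def format(self, record: logging.LogRecord) -> str:
        # record.created holds seconds since the epoch as a float.
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        event = {
            # ISO 8601 with millisecond resolution and an explicit UTC offset.
            "timestamp": ts.isoformat(timespec="milliseconds"),
            "hostname": self.hostname,
            "service": self.service,
            "logger": record.name,
            "level": record.levelname,
            "thread": record.threadName,
            "process": record.process,
            "message": record.getMessage(),
        }
        return json.dumps(event)


def build_logger(name: str, service: str, stream=None) -> logging.Logger:
    """Return a logger that writes structured events to `stream` (stderr by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

Calling `build_logger("orders", "checkout").info("order %s placed", 42)` would then write one machine-parseable line whose `message` field remains readable for humans.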
#3 Understand Metrics
A core concept in logging is metrics.
A metric is the value of a specific property at a specific time, usually measured at regular intervals.
There are all kinds of different metric types:
- Meter - measures the rate of events
- Timer - measures the time taken to process an event
- Counter - increments and decrements an integer value
- Gauge - measures an arbitrary value
As mentioned above, each metric describes the state of some property of the system.
The cool thing about metrics is having lots of them, and being able to correlate different metrics together.
For example, if we find that whenever users of our application call the "Get Cat Photo" method, the "Time Spent On Web Page" metric increases, we can infer that our users prefer cat photos over other photos.
Once you understand that metrics are important, you will start writing logs which contain metrics, or just export metrics separately.
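To make the four metric types concrete, here is a small, dependency-free sketch. The class and method names are assumptions chosen for illustration; real metrics libraries offer richer versions (for example, moving-average rates and percentile timers).

```python
import time
from contextlib import contextmanager


class Counter:
    """An integer that can be incremented and decremented."""

    def __init__(self):
        self.value = 0

    def inc(self, n=1):
        self.value += n

    def dec(self, n=1):
        self.value -= n


class Gauge:
    """Reads an arbitrary value on demand through a callback."""

    def __init__(self, read):
        self._read = read

    @property
    def value(self):
        return self._read()


class Meter:
    """Average rate of events per second since the meter was created."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self.count = 0

    def mark(self, n=1):
        self.count += n

    def rate(self):
        elapsed = self._clock() - self._start
        return self.count / elapsed if elapsed > 0 else 0.0


class Timer:
    """Records how long each timed block took, in seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.durations = []

    @contextmanager
    def time(self):
        start = self._clock()
        try:
            yield
        finally:
            self.durations.append(self._clock() - start)

    def mean(self):
        return sum(self.durations) / len(self.durations) if self.durations else 0.0
```

A request handler could wrap its work in `with timer.time():` and call `meter.mark()` once per request; the resulting values can be written into log lines or exported separately.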
#4 Reporting alerts and exception handling
If your code detects a condition and already knows for sure what happened, and perhaps what should be done about it, don't log it and then set an alarm on that specific log line. Instead, fire an alert directly from within the code.
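A minimal sketch of alerting at the source might look like this. The `check_disk_space` function, the `disk_space_low` alert name, and the `send_alert` callable are all hypothetical stand-ins for your own code and alerting channel.

```python
import logging

logger = logging.getLogger("storage")


def check_disk_space(free_bytes: int, minimum_bytes: int, send_alert) -> bool:
    """Return True if there is enough space; alert immediately if not.

    `send_alert(name, message)` is a hypothetical callable standing in for
    whatever alerting channel you use.
    """
    if free_bytes < minimum_bytes:
        # The cause and the remedy are both known here, so alert at the source
        # rather than relying on an alarm that pattern-matches a log line.
        send_alert("disk_space_low", f"{free_bytes} bytes free, need {minimum_bytes}")
        # The event is still logged so it appears in the history with everything else.
        logger.warning("Disk space low: free=%d minimum=%d", free_bytes, minimum_bytes)
        return False
    return True
```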
Also, when logging an exception, keep in mind that while the stack trace is useful, it's hard to read. Use a library such as Apache Commons Lang's ExceptionUtils to summarize the stack trace and make it easier to consume.
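The same idea can be sketched in Python with the standard `traceback` module. This is an analogous illustration, not the ExceptionUtils API; the `summarize_exception` helper is an assumed name.

```python
import traceback


def summarize_exception(exc: BaseException) -> str:
    """One-line summary: the outermost error, its root cause, and where that was raised."""
    root = exc
    # Follow explicit (`raise ... from ...`) and implicit chaining to the root cause.
    while root.__cause__ is not None or root.__context__ is not None:
        root = root.__cause__ or root.__context__
    frames = traceback.extract_tb(root.__traceback__)
    # The last frame is the innermost Python function on the root cause's traceback.
    where = f"{frames[-1].name}:{frames[-1].lineno}" if frames else "unknown"
    summary = f"{type(exc).__name__}: {exc}"
    if root is not exc:
        summary += f" (root cause: {type(root).__name__}: {root})"
    return f"{summary} at {where}"
```

The summary can go into the log message itself, while the full stack trace is kept in a separate field or at a lower level for anyone who needs it.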
#5 Use Log severity levels
Different events have different severity implications. Severity levels make it possible to differentiate severe and important events from irregular or even regular ones.
Do not dismiss lower-severity events; they can serve as data points when you are trying to establish a baseline for the application's behavior.
Your log files should contain mostly Debug, Info, and Warn messages, and very few Error messages.
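One way to check that distribution, and to collect the baseline data points mentioned above, is to count records per level. The sketch below assumes Python's standard `logging` module; `LevelCountingHandler` is a hypothetical helper name.

```python
import logging
from collections import Counter


class LevelCountingHandler(logging.Handler):
    """Counts records per severity level instead of writing them anywhere."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.counts = Counter()

    def emit(self, record: logging.LogRecord) -> None:
        self.counts[record.levelname] += 1

    def error_ratio(self) -> float:
        """Share of ERROR and CRITICAL records among everything seen so far."""
        total = sum(self.counts.values())
        errors = self.counts["ERROR"] + self.counts["CRITICAL"]
        return errors / total if total else 0.0
```

Attached next to the regular handlers, it gives a quick signal when the share of Error messages drifts away from what is normal for the application.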
#6 Always provide context
Developers write logs inline with the code. This means that developers base each log message on the context of the surrounding code. Unfortunately, the person reading the log doesn’t have that context, and sometimes doesn’t even have access to the source code.
For example, let's compare the following two log lines:
- "The database is down"
- "Failed to get user preferences for user id=1. Configuration database not responding. Will retry in 5 minutes."
Reading the second log line, we easily understand what the application was trying to do, what component failed, and if there's some kind of resolution for this issue.
Each log line should contain enough information to make it easy to understand exactly what was going on, and what was the state of the application during that time.
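Context that is known once, such as the user or the component, can be attached automatically so that no log call forgets it. The sketch below uses `logging.LoggerAdapter` from Python's standard library; the `ContextAdapter` class, the logger name, and the context keys are assumptions for illustration.

```python
import logging


class ContextAdapter(logging.LoggerAdapter):
    """Appends key=value context to every message logged through it."""

    def process(self, msg, kwargs):
        # Caveat: context values are placed into the format string, so they
        # should not contain "%" characters when the call also passes arguments.
        context = " ".join(f"{key}={value}" for key, value in self.extra.items())
        return f"{msg} [{context}]", kwargs


log = ContextAdapter(
    logging.getLogger("preferences"),
    {"user_id": 1, "component": "config-db"},
)
log.error(
    "Failed to get user preferences. Configuration database not responding. "
    "Will retry in %d minutes.",
    5,
)
```

Every message sent through `log` now carries the user ID and component, which also turns them into key-value pairs a machine can extract.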
#7 Use a standard logging framework and use its advanced features
Do not try to roll your own logging framework. There are plenty of excellent logging libraries for every programming language you may be using.
Logging frameworks enable you to set up different appenders, each with its own output format and custom log pattern.
Other standard features include automatically adding the logger name and a timestamp, support for multiple severity levels and filtering by these levels.
Logging frameworks also have the following advanced features that you should be using:
- Different log-level thresholds for different components in your code.
- Lossy appenders, which drop lower-level events when their queues are full.
- Log-summarizing appenders, which log "the following message repeated X times" instead of repeating the message multiple times.
- A threshold on the log level, configured to also output the last N lower-level log lines when a higher-severity event occurs.
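Two of these features can be sketched with Python's standard `logging` module: per-component thresholds, and a buffer that releases recent lower-level lines only when an error arrives. The logger names and the `RecentContextHandler` class are assumptions; it builds on `logging.handlers.MemoryHandler`, which on its own also flushes whenever the buffer fills up.

```python
import logging
import logging.handlers

# Per-component thresholds (the logger names are hypothetical).
logging.getLogger("app.db").setLevel(logging.DEBUG)
logging.getLogger("app.http").setLevel(logging.WARNING)


class RecentContextHandler(logging.handlers.MemoryHandler):
    """Keeps the last N lower-level records in memory and forwards them,
    followed by the triggering record, when an ERROR or higher arrives."""

    def __init__(self, target: logging.Handler, context_lines: int = 5):
        super().__init__(
            capacity=context_lines,
            flushLevel=logging.ERROR,
            target=target,
            flushOnClose=False,   # do not dump leftover context at shutdown
        )

    def shouldFlush(self, record: logging.LogRecord) -> bool:
        # Flush on severity only, never just because the buffer is full.
        return record.levelno >= self.flushLevel

    def emit(self, record: logging.LogRecord) -> None:
        # Drop the oldest buffered record so that at most `capacity`
        # lower-level lines are kept as context.
        if not self.shouldFlush(record) and len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
        super().emit(record)
```

With `context_lines=2`, three Debug messages followed by an Error produce the last two Debug lines and then the Error line on the target handler; if no error ever occurs, nothing is written.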
#8 Log a lot and then log some more
For us humans, it can be quite frustrating to search for some specific log messages in huge log files, and we may be reluctant to write a lot of logs.
When we process logs automatically, having more logs to process and more data is a good thing. When you have lots of log data you can analyze your application better and get better insights.
It is important to routinely inspect which parts of your application have too many or too few logs.
Review your application logs from time to time and decide which components should log more and which should be configured to log less.
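A quick way to start such a review is to count log lines per component. The sketch below assumes a hypothetical whitespace-separated line layout of timestamp, level, logger name, then message; adjust `logger_field` to match your own format.

```python
from collections import Counter


def lines_per_logger(lines, logger_field=2):
    """Count log lines per logger name.

    Assumes whitespace-separated fields with the logger name at index
    `logger_field` (timestamp, level, logger, message by default).
    """
    counts = Counter()
    for line in lines:
        fields = line.split(maxsplit=logger_field + 1)
        if len(fields) > logger_field:   # skip blank or truncated lines
            counts[fields[logger_field]] += 1
    return counts
```

Calling `lines_per_logger(open_file).most_common(10)` on a log file lists the noisiest components first; components that are missing from the result altogether may need more logging.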