Ever been working on a project that fails, and you just don’t know why? Good logging practice can be the breadcrumb trail that shows you what went wrong and how to fix it fast.
In an effort to continually improve our logging practices, Product Traction’s engineering team has landed on several key methodologies. Fundamentally, it’s important to use the right tool for the job, enforce clear and consistent log formats, and use log levels properly.
For tools, our team recommends Grafana or Datadog. Both centralize logs and offer a suite of features for filtering, categorizing, and graphing your output. Using dashboards, for example, your team can get a clear picture of success rates by sorting on HTTP status, surface critical errors by filtering on log level, or hunt down known issues by searching for a specific error message.
It’s also critical to be consistent in how you log. Entry and exit points, such as the beginning and end of an API request, should each be logged once. It’s also a good idea to tag logs with where they come from, as it makes parsing and searching them much simpler. For example, a tag naming the originating component tells a reader at a glance where the logged event took place. Where applicable, a child logger can also be created for a component and given a unique name. And, as always, the principle of LOWYN, “Log Only What You Need”, should be adhered to at all times. We always include a static and unique message, with no dynamic data, to make finding logs in your code as easy as Ctrl+F. Determining exactly what to include beyond that can be a challenge.
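As a minimal sketch of these conventions, the Python example below uses the standard `logging` module. The logger names `api` and `orders`, the `context` field, the `JsonFormatter` class, and the `handle_order` function are all hypothetical names chosen for illustration, not part of our actual codebase. It shows a named child logger acting as the "where" tag, a static message that is easy to search for, dynamic data kept in separate fields, and a single log line each for entry and exit.

```python
import io
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,           # acts as the "where" tag
            "message": record.getMessage(),  # static text, easy to search for
        }
        # Dynamic data travels in separate fields, not in the message.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


def build_logger(stream):
    """Create the hypothetical parent logger "api" writing to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("api")
    logger.handlers.clear()   # keep repeated calls from stacking handlers
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # do not duplicate output on the root logger
    return logger


def handle_order(log, request_id):
    """Log the entry and exit of one request exactly once each."""
    start = time.perf_counter()
    log.info("order request started", extra={"context": {"request_id": request_id}})
    # ... real request handling would go here ...
    elapsed_ms = round((time.perf_counter() - start) * 1000, 3)
    log.info(
        "order request finished",
        extra={"context": {"request_id": request_id, "elapsed_ms": elapsed_ms}},
    )


if __name__ == "__main__":
    out = io.StringIO()
    orders_log = build_logger(out).getChild("orders")  # child logger "api.orders"
    handle_order(orders_log, "req-123")
    print(out.getvalue(), end="")
```

Because the message text never changes, searching the codebase for `order request finished` leads straight to the line that produced it, while the request ID and timing remain available as filterable fields in whichever log tool you use.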
A good first step in deciding what context to include in a log is to evaluate its log level. Levels typically include TRACE, DEBUG, INFO, WARN, and ERROR, and what each one carries can vary drastically. By default, logs will likely be INFO and include only the most essential context, such as a request ID and processing time. Below this is DEBUG, where you can include further information such as a request object. At the TRACE level you should be logging the control flow of your system, such as the start and end of functions. WARN and ERROR are special cases: they should be used when something goes wrong, with the level reflecting the severity of the problem. It’s also important to be able to switch log levels easily, preferably via an external tool. Log levels aren’t helpful if changing them means redeploying the entire codebase.
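The level scheme above can be sketched in Python as follows. A few assumptions are worth calling out: Python’s standard `logging` module has no TRACE level, so the example registers one at the numeric value 5 (below DEBUG), and it names WARN as `WARNING`. The `LOG_LEVEL` environment variable, the `levels_demo` logger name, and the `configure` and `process` functions are hypothetical and stand in for whatever external switch your team actually uses.

```python
import io
import logging
import os

# Assumption: register a custom TRACE level below DEBUG (which is 10).
TRACE = 5
logging.addLevelName(TRACE, "TRACE")


def configure(stream, level_name=None):
    """Build a logger whose threshold is supplied from outside the code."""
    # LOG_LEVEL is a hypothetical environment variable name.
    level_name = (level_name or os.environ.get("LOG_LEVEL", "INFO")).upper()
    logger = logging.getLogger("levels_demo")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    logger.setLevel(level_name)  # accepts registered level names as strings
    return logger


def process(logger):
    """Emit one illustrative line per level; only lines at or above the threshold appear."""
    logger.log(TRACE, "process entered")      # control flow
    logger.debug("request payload captured")  # extra detail for debugging
    logger.info("request processed")          # essential context only
    logger.warning("retrying upstream call")  # recoverable problem
    logger.error("upstream call failed")      # failure
    logger.log(TRACE, "process exited")       # control flow


if __name__ == "__main__":
    out = io.StringIO()
    log = configure(out)     # INFO unless LOG_LEVEL says otherwise
    process(log)
    log.setLevel("DEBUG")    # raise verbosity at runtime, no redeploy
    process(log)
    print(out.getvalue(), end="")
```

With the threshold at INFO, only the INFO, WARNING, and ERROR lines are written; switching to DEBUG adds the payload line, and TRACE adds the entry and exit lines. An environment variable still requires a restart to take effect, so a setup that truly avoids redeploys would call `setLevel` from something like an admin endpoint or a watched config value, which is left out of this sketch.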
Our team has found these practices helpful in our own logging. They should not be viewed as strict rules, but rather as flexible guidelines for improving logging practices. We’ve found that using the right tools, keeping log formats consistent, and using log levels properly has made debugging easier, offered insight into critical areas of code, and made our overall development experience faster and more efficient.