It seems like most IT professionals today are so busy keeping up with day-to-day tasks, not to mention the myriad unexpected issues that arise, that project planning and time management often take a back seat. Besides, “planning” sounds like such an open-ended thing that it is frequently deemed a waste of time and lost in the shuffle.
In fact, a little proactive planning can save a lot of time and stress in the future, and it does not have to take up a lot of time. Take, for example, planning how to deal better with production issues.
Another production issue!
Ever notice how many hours a single production issue consumes? When there is a production issue, there is a tendency to get everybody involved, including the technical team, the management team, and even people who are only on the periphery of the issue, just to show support or to speculate about how to fix the problem. Not really a good use of time and resources. The longer it takes to detect, isolate and recover from an issue, the more productive hours are wasted. The key is to plan ahead: establish an effective way to detect issues and a procedure to recover quickly with minimal time and resources.
Production issues are at times unpredictable, but they can frequently be prevented, minimized or even avoided in the future. A small investment in post-mortem analysis and planning pays big dividends in the long run.
Use your application logs effectively
The key to proactive planning is log management. Log data is an underutilized yet extremely important data resource for troubleshooting issues and supporting broader business objectives. By effectively managing and monitoring application logs, IT professionals put themselves in an optimal position to detect issues and resolve them more quickly and efficiently.
Here are some suggestions:
|Document the cause of the issue and what was done to recover from it||15-30 minutes|
|If it is an application issue, identify the relevant errors or warning messages in the application logs||30-60 minutes|
|Even if it is not an application issue, at least identify the early symptoms of the problem when they surface. Are there any EMS messages that should be monitored?|||
|Identify the recovery procedure when the symptom is detected: Notify someone? Stop or start a process? Raise an alarm?||30-60 minutes|
|Review and refine the procedure||Variable|
So, with less than half a day’s time invested in analysis and preparation, a plan is in place to deal with production issues through early detection and quick resolution. Not a bad investment to save yourself many hours of stress and wasted effort in the future.
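One way to keep the outcome of that analysis usable is to write it down as data rather than only as prose. The Python sketch below is purely illustrative: the `Rule` class, the `RUNBOOK` list, the `actions_for` function, the message patterns and the action names are all hypothetical, and you would replace them with the errors, warnings and recovery steps identified in your own post-mortem.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str     # short label for the documented symptom
    pattern: str  # regular expression to look for in a log line
    action: str   # recovery step, e.g. "notify", "restart" or "alarm"


# Hypothetical entries; real ones come from your own post-mortem notes.
RUNBOOK = [
    Rule("db-timeout", r"ERROR .*database timeout", "notify"),
    Rule("queue-full", r"WARN(ING)? .*queue (is )?full", "restart"),
    Rule("disk-space", r"ERROR .*disk .*full", "alarm"),
]


def actions_for(line, runbook=RUNBOOK):
    """Return the (rule name, action) pairs whose pattern matches the line."""
    return [(r.name, r.action) for r in runbook if re.search(r.pattern, line)]
```

Calling `actions_for` on each log line gives the list of documented recovery steps that apply to it, and an empty list for lines that match nothing. Keeping the rules in one list also makes the "review and refine" step a matter of editing entries rather than rewriting a procedure.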
Automate log monitoring
To be even better prepared, find out how TIC LogWatch can help you automate this log monitoring process. TIC LogWatch is a Guardian program that “watches” log files for new entries. LogWatch captures log records, looks for configurable patterns, and generates alerts when anomalies are detected.
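To make the general idea concrete, here is a minimal Python sketch of watching a log file for new entries and checking them against configurable patterns. It is not how TIC LogWatch is implemented and does not use any NonStop or Guardian API. It assumes a plain text, newline-terminated, UTF-8 log file that is polled periodically, and the function names `read_new_entries` and `find_alerts` are hypothetical.

```python
import re


def read_new_entries(path, offset=0):
    """Return (new_lines, new_offset) for complete lines added after `offset` bytes."""
    with open(path, "rb") as f:
        f.seek(0, 2)           # jump to the end to learn the current size
        size = f.tell()
        if size < offset:      # file shrank (truncated or rotated): start over
            offset = 0
        f.seek(offset)
        data = f.read()
    # Keep only complete lines; a partial last line waits for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def find_alerts(lines, patterns):
    """Return the lines that match at least one of the configured patterns."""
    compiled = [re.compile(p) for p in patterns]
    return [line for line in lines if any(c.search(line) for c in compiled)]
```

A monitoring loop would call `read_new_entries` at a fixed interval, remember the returned offset for the next call, and pass the new lines to `find_alerts` with the patterns identified during the post-mortem analysis. What happens with each match (an email, a page, a restart) is left out of the sketch on purpose.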
Learn more about how LogWatch can save you time and effort.
Phil Ly is the president and founder of TIC Software, a New York-based company specializing in software and services that integrate NonStop with the latest technologies, including Web Services, .NET and Java. Prior to founding TIC in 1983, Phil worked for Tandem Computer in technical support and software development.