Problems cause incidents, which in turn interrupt a company's IT services and result in losses.
Problem management in ITIL deals with resolving the underlying issue in the IT infrastructure that causes incidents in the first place. It also works to prevent problems from recurring and to minimize the losses caused by incidents that can’t be avoided.
It is the responsibility of problem management to document all the details of an incident so that they can support other problem management activities in the future.
Problem management is classified into 2 types:
Reactive: This type of problem management is set in motion as a response to an incident that has already occurred, such as an IT service failure.
After restoring the systems to normal functioning levels, the incident management team passes information about the incident to the problem management team, which in turn tackles the source to ensure that the incident doesn’t recur.
For example: Implementation of a data backup server in response to a data loss incident.
Proactive: This type of problem management is set in motion as a preventive measure to make sure that a potential problem does not materialize.
It draws on trending threats, analysis of prior incidents, industry information, data analysis and intuition to predict problems before they surface.
For example: Implementing enhanced security measures in anticipation of a potential virus attack.
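The "analysis of prior incidents" part of proactive problem management can be illustrated with a minimal sketch. It assumes incident records are available as simple dictionaries; the field names (ci, category), the sample data and the threshold of three occurrences are hypothetical choices for illustration, not part of ITIL.

```python
from collections import Counter

def candidate_problems(incidents, min_occurrences=3):
    """Return (configuration item, category) pairs that recur often enough
    to be worth investigating as a potential underlying problem."""
    counts = Counter((i["ci"], i["category"]) for i in incidents)
    # most_common() yields pairs ordered from most to least frequent
    return [key for key, n in counts.most_common() if n >= min_occurrences]

# Hypothetical incident history
incidents = [
    {"ci": "mail-server", "category": "disk full"},
    {"ci": "web-server", "category": "timeout"},
    {"ci": "mail-server", "category": "disk full"},
    {"ci": "db-server", "category": "slow query"},
    {"ci": "mail-server", "category": "disk full"},
    {"ci": "db-server", "category": "slow query"},
]

print(candidate_problems(incidents))
```

With this sample data, only the recurring "disk full" incidents on the mail server reach the threshold, so that pair would be raised for investigation before it causes a larger failure.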
An event is any discernible change in state that has some significance for the management of an IT service or a configuration item.
Events are identified through the notifications created by a monitoring tool, a configuration item or an IT service. IT personnel then act on these events, and any resulting incidents are logged in the system.
Event management in ITIL deals with the events that take place in an organization. Its main goal is to manage each event through its lifecycle by analyzing it and determining the appropriate process for dealing with it.
An incident is an unplanned interruption or a sudden reduction in the performance of an IT service. If an incident occurs, it means that something is already wrong.
An event is a change in the state of a system or service in the IT infrastructure. An event may indicate a possible future failure, but it does not necessarily mean that anything is wrong.
All incidents are events, but not all events are incidents.
Events that do not require any follow-up action are called information events. They are generally notifications, updates or generated data such as logs, and they simply pass on information.
A warning event signifies that something differs from normal operating conditions. It indicates that it is time to take action before the condition of the system worsens and approaches a point of failure. For example, a warning can show that the computer systems are taking longer than usual to process data, which may be an indication of an upcoming system failure.
An exception event indicates that something has gone wrong in the IT system and that it is affecting the business activities of an organization. For example, a computer system failure is an exception, whereas a system running with lower performance is a warning.
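The three event types can be sketched as a small classification function. This is only an illustration under stated assumptions: the inputs (whether the service is up and its response time) and the 500 ms warning threshold are hypothetical, and a real monitoring tool would apply its own rules.

```python
from enum import Enum

class EventType(Enum):
    INFORMATION = "information"
    WARNING = "warning"
    EXCEPTION = "exception"

def classify(service_up: bool, response_ms: float,
             warn_above_ms: float = 500.0) -> EventType:
    """Classify a monitoring reading into one of the three event types."""
    if not service_up:
        # The service has failed: business activity is already affected
        return EventType.EXCEPTION
    if response_ms > warn_above_ms:
        # Still working, but slower than the assumed normal operating range
        return EventType.WARNING
    # Normal operation: nothing to follow up on
    return EventType.INFORMATION

print(classify(True, 120.0).value)
print(classify(True, 900.0).value)
print(classify(False, 0.0).value)
```

The order of the checks mirrors the severity of the types: a failure is tested first, so a system that is down is never reported as merely slow.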
The main purpose of event management is to identify the occurrence of an event by detecting the change of state in an IT service.
Warning and exception events can be used to automate routine activities.
A mechanism should be provided which permits early detection of events.
Event management can be applied to the facets of service management which can be automated.
Event notifications are received from monitoring tools or configuration items.
The event must be logged in the system so that there is a proper record of the cause, effect and solutions implemented.
Notified events should be filtered, as some events do not need any action while others need immediate attention.
Filtration and prioritization ensure that the most important events are responded to first.
An action plan should be formulated to control the event appropriately, and the plan should be conveyed to the relevant departments.
After the required action is taken and the solution is implemented, the event has to be officially closed by logging the necessary details and the measures taken.
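The steps above (notification, logging, filtering, prioritization, response and closure) can be sketched as a short pipeline. This is a simplified illustration, not a prescribed implementation: the Event fields, the PRIORITY ordering and the process function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

# Assumed priority order: a lower number is handled first
PRIORITY = {"exception": 0, "warning": 1, "information": 2}

@dataclass
class Event:
    source: str                 # monitoring tool or configuration item
    kind: str                   # "information", "warning" or "exception"
    message: str
    status: str = "open"
    notes: list = field(default_factory=list)

def process(notifications):
    """Log, filter, prioritize, act on and close a batch of events."""
    event_log = list(notifications)          # log every notified event
    actionable = []
    for event in event_log:                  # filter out events needing no action
        if event.kind == "information":
            event.notes.append("no action required")
            event.status = "closed"
        else:
            actionable.append(event)
    actionable.sort(key=lambda e: PRIORITY[e.kind])   # most important first
    for event in actionable:                 # respond, record the measure, close
        event.notes.append(f"action plan conveyed for {event.source}")
        event.status = "closed"
    return event_log, actionable

batch = [
    Event("backup-job", "information", "nightly backup completed"),
    Event("app-server", "warning", "response time above normal"),
    Event("core-switch", "exception", "network connection down"),
]
event_log, handled = process(batch)
print([e.source for e in handled])
```

Every event stays in the log with its notes, so there is a record of what was done even for events that were filtered out as needing no action.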
A few examples of events are listed below.
A user launching a particular application from an unauthorized system
A system crashing during normal operations
The network connection going down
The server experiencing power outages
Thus, problem management in ITIL resolves the underlying issues in the IT infrastructure that cause incidents and prevents them from recurring, minimizing the resulting losses. Event management identifies the occurrence of an event by detecting a change of state in an IT service, thus enabling timely intervention.