How about ‘Predictive Analytics’ for IT System failure alerts, before they actually happen?

‘Predictive Analytics’ uses complex mathematical formulas to predict events such as hurricanes and other weather conditions. With the processing power and technology available today, these algorithms achieve a fair degree of accuracy. Why not use ‘Predictive Analytics’ to predict IT system/network failures a couple of hours before they actually happen? If you are managing a large IT set-up, you might want to have a look at this technology.

How are IT Systems/ Network faults identified today?

Multiple disparate monitoring systems are in place today, each covering an individual function of the IT set-up. For example, the network has a Network Management System (NMS). Similarly, virtualization systems have their own management module, and servers, databases/applications and storage systems each have their own separate monitoring software.

So, in any large organization, people need to continuously watch all of these disparate management modules individually. More importantly, IT system/network faults are reported only after they happen. Sometimes, people get to know about IT issues only after the help-desk starts getting calls from users saying that something is not working.

Of course, manual thresholds can be set and monitored, but most of the time these alerts are either ignored, or the fault is caused by a totally different parameter that was overlooked, or the threshold levels were set wrongly. And there are just too many parameters to monitor.

What is Predictive Analytics?

Predictive Analytics uses mathematical algorithms, such as regression modeling techniques, to describe the relationships between the various variables that contribute to the functioning of a system. These algorithms learn how the variables behave under normal circumstances and then monitor them continuously to find out if there are significant changes from that normal behavior. More precisely, they look for certain behavior patterns that tend to precede major trouble-causing scenarios.
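To make the regression idea concrete, here is a minimal sketch in Python. It assumes just two variables (requests per second and CPU utilization), made-up sample values, and a simple straight-line fit. The names `fit_line` and `is_unusual` and the factor `k` are illustrative assumptions; a real product would model many more variables and likely use more sophisticated techniques.

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical "normal" history: requests per second vs. CPU utilization (%).
requests = [100, 150, 200, 250, 300, 350, 400]
cpu = [21, 30, 41, 50, 61, 69, 80]

# Learn the normal relationship between the two variables.
slope, intercept = fit_line(requests, cpu)

# How far the normal samples scatter around the fitted line.
residuals = [c - (slope * r + intercept) for r, c in zip(requests, cpu)]
spread = stdev(residuals)

def is_unusual(req, cpu_pct, k=3.0):
    """True if CPU is more than k spreads away from what the load predicts."""
    expected = slope * req + intercept
    return abs(cpu_pct - expected) > k * spread

print(is_unusual(200, 40))  # CPU in line with the load
print(is_unusual(200, 75))  # CPU far higher than the load explains
```

The point of modeling the relationship, rather than watching CPU alone, is that 75% CPU may be perfectly normal at 400 requests per second but a warning sign at 200.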

For example, when deciding whether or not to give loans to certain applicants, banks use predictive analytics to find out the relationship between various variables and the risks involved. They check whether applicants of a certain age, marital status, credit history, employment profile, etc. are more prone to defaulting on loans than others, and then decide if they want to lend to them.

Predictive analytics is also used in fields like risk assessment for insurance, sales forecasting, demand forecasting, weather forecasting, etc. Can this technology be used in IT systems?

How does Predictive Analytics help to detect faults and downtime in IT Systems?

When applied to an IT scenario, the predictive analytics system can integrate with existing monitoring tools (NMS, etc.) and collect data on all the variables they monitor, such as CPU, disk space, primary memory utilization, I/O activity, etc. Based on this data, it automatically determines the normal operating behavior of these variables and keeps analyzing live data to determine whether any of them deviate significantly from their normal behavior, in a pattern that might indicate performance problems in the near future.
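As a rough illustration of "learn the normal behavior, then watch live data for a sustained deviation", the sketch below keeps a rolling baseline for a single metric and raises an alert only after several consecutive samples fall outside it. The class name `MetricBaseline`, the window size, the factor `k`, the `sustain` count and the sample values are all assumptions chosen for the example, not the method of any particular product.

```python
from collections import deque
from statistics import mean, stdev

class MetricBaseline:
    """Rolling baseline for one metric, learned from recent normal samples."""

    def __init__(self, window=60, k=3.0, sustain=3, warmup=10):
        self.history = deque(maxlen=window)  # recent normal-looking samples
        self.k = k                # how many standard deviations count as a deviation
        self.sustain = sustain    # consecutive deviations needed before alerting
        self.warmup = warmup      # samples to collect before judging anything
        self.streak = 0

    def observe(self, value):
        """Feed one live sample; return True once the deviation is sustained."""
        deviates = False
        if len(self.history) >= self.warmup:
            mu = mean(self.history)
            sigma = stdev(self.history) or 1e-9  # avoid a zero-width baseline
            deviates = abs(value - mu) > self.k * sigma
        if deviates:
            self.streak += 1
        else:
            self.streak = 0
            self.history.append(value)  # only learn from normal-looking samples
        return self.streak >= self.sustain

# Hypothetical CPU utilization samples (%): steady load, then a sudden climb.
baseline = MetricBaseline()
samples = [40, 41, 42, 39] * 5 + [90, 91, 92]
alerts = [baseline.observe(v) for v in samples]
print(alerts.index(True))  # position of the first alert in the sample stream
```

Requiring a sustained deviation is one simple way to avoid alerting on a single spike; in practice one such baseline would be kept per variable collected from the monitoring tools.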

So, predictive analytics gathers as much data as possible from various sources (for the system it needs to monitor) and uses mathematical algorithms to understand the relationships between the variables in the current state. Based on this information, it can forecast what is likely to happen next, including any potential trouble-causing situations. This way, it tries to identify network downtime, IT system malfunctions, etc. hours before they actually happen.

The main advantage of predictive analytics is that none of this data needs to be entered manually, nor is there a requirement to set manual thresholds. Predictive Analytics systems claim to do this automatically.

Of course, the system needs to integrate with the monitoring tools currently running in the organization. One way to test a predictive analytics system is to feed it the actual values of the variables over a certain period in the past and check whether it is able to predict the major faults that actually happened during that period. This can indicate, to an extent, how well a predictive analytics system will work within a particular environment.
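The replay test described above could be sketched roughly as follows. The function `backtest`, the `lead_window` parameter, the sample data and the incident times are hypothetical, and the `detector` here is a deliberately trivial stand-in for whatever predictive system is being evaluated.

```python
def backtest(samples, incident_times, detector, lead_window):
    """Replay historical (time, value) samples through a detector.

    Returns (caught, missed, false_alarms):
      caught       - incidents preceded by an alert within lead_window
      missed       - incidents with no alert in that window
      false_alarms - alerts not followed by any incident within lead_window
    """
    alert_times = [t for t, value in samples if detector(value)]
    caught = [inc for inc in incident_times
              if any(inc - lead_window <= a < inc for a in alert_times)]
    missed = [inc for inc in incident_times if inc not in caught]
    false_alarms = [a for a in alert_times
                    if not any(a < inc <= a + lead_window
                               for inc in incident_times)]
    return caught, missed, false_alarms

# Hypothetical hourly disk latency readings (ms) and known past outages.
history = list(enumerate([5, 11, 6, 5, 12, 14, 5, 5, 5, 5, 5, 5]))
outages = [6, 10]  # hours at which faults actually occurred

# Stand-in detector for illustration only; the real system would go here.
def detector(latency_ms):
    return latency_ms > 10

caught, missed, false_alarms = backtest(history, outages, detector, lead_window=3)
print("caught:", caught)
print("missed:", missed)
print("false alarms at:", false_alarms)
```

Counting missed faults and false alarms side by side gives a more honest picture than counting caught faults alone.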

Predictive Analytics can also help to forecast IT systems capacity. For example, it can predict the number of servers needed for a cloud-based data center or a large organization, based on past and present trends in application utilization.
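A very simple version of such a capacity forecast is to fit a straight-line trend to past peak load and extrapolate it. The sketch below assumes made-up monthly peak figures, a fixed per-server capacity and a linear growth pattern; the function names `linear_trend` and `servers_needed` are illustrative, and real workloads often need seasonal or non-linear models.

```python
import math
from statistics import mean

def linear_trend(ys):
    """Least squares slope and intercept of ys against 0, 1, 2, ..."""
    xs = list(range(len(ys)))
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def servers_needed(monthly_peak_load, months_ahead, capacity_per_server):
    """Extrapolate the trend and convert projected load into a server count."""
    slope, intercept = linear_trend(monthly_peak_load)
    future_month = len(monthly_peak_load) - 1 + months_ahead
    projected_load = slope * future_month + intercept
    return math.ceil(projected_load / capacity_per_server)

# Hypothetical peak requests per second for the last six months.
peaks = [1000, 1100, 1200, 1300, 1400, 1500]

# Assume, for illustration, that one server handles 250 requests per second.
print(servers_needed(peaks, months_ahead=6, capacity_per_server=250))
```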

Of course, Predictive Analytics can never be 100% accurate and tends to produce some false positives. But for large data centers and geographically dispersed organizations, where even a short downtime in IT systems can cause huge financial or reputational losses, this technology might be worth a try. There is at least one company involved in developing Predictive Analytics software for IT/network systems.

excITingIP.com
