Why prioritization of service incidents is valuable
By Lakshmi Nagarajan, Director of Engineering, Big Data Analytics in Ruckus Networks, CommScope
Wednesday, 01 July, 2020
This is the second in a two-part blog series on analytics, specifically for network administrators. The first part was posted on May 12, 2020.
Let’s begin with incident prioritization. What is incident prioritization, and why is it valuable?
From a network administrator’s point of view, certain issues are more critical than others, and will therefore be deemed more severe and appropriately prioritized. When troubleshooting manually, network administrators typically consider several factors to determine the severity of an issue, such as:
- What parts of the network are/were impacted?
- How many clients are/were impacted?
- If not ongoing, how long did the incident last?
Of course, these considerations can differ depending on the industry, the time of year and/or the location where the network is deployed.
Determining the severity of network incidents is often an incredibly complex process. Perhaps not surprisingly, it is equally challenging to create and train algorithms that can replicate that process automatically and accurately. In particular, supervised machine learning is not a practical way to determine a severity score: such a model would require vast amounts of labeled training data, and the cost of building it makes the approach infeasible.
Unsupervised Machine Learning Is the Key for Incident Analytics
This is why RUCKUS Analytics uses unsupervised ML techniques to determine the severity of a network incident by analyzing specific factors relevant to the network administrator for a particular issue. Based on the severity, RUCKUS Analytics automatically prioritizes the issue, using AI capabilities to rapidly draw the user’s attention to the most serious network issues.
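To make the idea of a label-free severity score concrete, here is a minimal sketch in Python. It is an illustration only and not the algorithm used by RUCKUS Analytics, which this post does not describe in detail. It assumes each incident is summarized by three numeric factors echoing the questions above (the field names `clients_impacted`, `duration_minutes` and `aps_impacted` are hypothetical), and it scores a new incident by how it ranks against previously seen incidents, so no human-labeled severities are needed.

```python
from bisect import bisect_right

# Hypothetical factor names, chosen for illustration only.
FACTORS = ("clients_impacted", "duration_minutes", "aps_impacted")


def percentile_rank(history_values, value):
    """Fraction of historical values that are <= value (between 0.0 and 1.0)."""
    ordered = sorted(history_values)
    return bisect_right(ordered, value) / len(ordered)


def severity_score(incident, history):
    """Mean percentile rank across all factors; requires no labeled data."""
    ranks = [
        percentile_rank([past[f] for past in history], incident[f])
        for f in FACTORS
    ]
    return sum(ranks) / len(ranks)


def prioritize(incidents, history):
    """Return incidents ordered from most to least severe."""
    return sorted(incidents, key=lambda i: severity_score(i, history), reverse=True)


# Made-up example data.
history = [
    {"clients_impacted": 5, "duration_minutes": 10, "aps_impacted": 1},
    {"clients_impacted": 50, "duration_minutes": 30, "aps_impacted": 4},
    {"clients_impacted": 200, "duration_minutes": 120, "aps_impacted": 20},
    {"clients_impacted": 20, "duration_minutes": 5, "aps_impacted": 2},
]
minor = {"clients_impacted": 3, "duration_minutes": 2, "aps_impacted": 1}
major = {"clients_impacted": 300, "duration_minutes": 200, "aps_impacted": 30}

for incident in prioritize([minor, major], history):
    print(incident, round(severity_score(incident, history), 2))
```

A rank-based score like this adapts to each network's own history, which fits the point that severity considerations vary by industry, season and location. A production system would likely weigh the factors differently for different incident types.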
The second valuable piece of incident intelligence enabled by CommScope’s RUCKUS Analytics is scope. Every incident is detected at the highest level/scope possible to help administrators understand how widespread the issue is. For example, in a SmartZone deployment, APs are organized into several logical groups: AP group, zone, domain, and so on. There are also other soft groups such as WLAN (APs having the same WLAN) and DHCP server (APs connecting to the same DHCP server). When several APs belonging to one logical group are impacted by the same issue, the issue should be presented at the group level, instead of at the individual AP level.
For those familiar with hierarchical clustering, this approach resembles agglomerative (bottom-up) clustering: the incident starts at the AP level and is promoted upward until it reaches the highest level at which it is observed. This incident “roll-up” brings immense benefit to the network administrator, because it reveals patterns that would be nearly impossible to spot with a manual troubleshooting approach, where the issue would appear only as separate AP-level problems scattered across various access points.
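The roll-up idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the RUCKUS Analytics implementation: the hierarchy is limited to AP group, zone and domain, the `membership` mapping and the `threshold` parameter are hypothetical, and a group is promoted when at least a given fraction of its APs show the same issue.

```python
from collections import defaultdict

# Levels from lowest to highest; a simplified, assumed hierarchy.
LEVELS = ("ap_group", "zone", "domain")


def roll_up(membership, impacted, threshold=0.6):
    """Report each impacted AP at the highest scope the data supports.

    membership: {ap_name: {"ap_group": ..., "zone": ..., "domain": ...}}
    impacted:   set of AP names showing the same issue
    threshold:  assumed fraction of a group's APs that must be impacted
                before the incident is promoted to that group
    Returns the set of (level, name) scopes to present to the administrator.
    """
    # Every impacted AP starts as its own AP-level incident.
    scope = {ap: ("ap", ap) for ap in impacted}
    for level in LEVELS:
        # Collect all APs (impacted or not) belonging to each group at this level.
        members = defaultdict(set)
        for ap, groups in membership.items():
            members[groups[level]].add(ap)
        # Promote the incident wherever enough of a group's APs are impacted.
        for name, aps in members.items():
            hit = aps & impacted
            if len(hit) / len(aps) >= threshold:
                for ap in hit:
                    scope[ap] = (level, name)
    return set(scope.values())


# Made-up deployment: two zones in one domain.
membership = {}
for ap in ("ap1", "ap2", "ap3"):
    membership[ap] = {"ap_group": "g1", "zone": "z1", "domain": "d1"}
for ap in ("ap4", "ap5"):
    membership[ap] = {"ap_group": "g2", "zone": "z1", "domain": "d1"}
for ap in ("ap6", "ap7", "ap8", "ap9"):
    membership[ap] = {"ap_group": "g3", "zone": "z2", "domain": "d1"}

impacted = {"ap1", "ap2", "ap3", "ap4", "ap6"}
print(roll_up(membership, impacted))
```

In this example, four of the five APs in zone `z1` are impacted, so they are reported as a single zone-level incident, while `ap6` stays an AP-level incident because too few of its neighbors are affected. Soft groups such as WLAN or DHCP server could be handled the same way by adding them as extra grouping keys.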
CommScope’s RUCKUS Analytics uses a blend of all three aspects—incident detection, prioritization, and scope—to ensure network administrators have full visibility into their network and the issues that occur therein. Our goal is to ensure administrators have a worry-free network management experience by using autonomous networks that leverage machine learning and artificial intelligence techniques to automatically identify and resolve network issues.
If you would like to learn more about the incident analytics capabilities of CommScope’s RUCKUS Analytics, you can view this short screen capture demo video.
Or, you can access our product walkthrough video for a quick look at a broader range of features.