Distributed network monitoring is essential for multi-site enterprises that run a Wide Area Network (WAN). You may be asking yourself how large your WAN should be to justify such a solution to your manager. What I have generally seen is that if you are supporting five or more network locations, you will probably spend a considerable amount of time troubleshooting network and application issues. If that’s the case, you have a clear indication that you need distributed network monitoring to cut the time you spend troubleshooting. In fact, distributed network monitoring is a great way to detect genuine network and application issues quickly and to isolate faults, which is an important step in the troubleshooting process.
Distributed Network Monitoring: Definition
Distributed network monitoring is a monitoring strategy that relies on the information provided by multiple observation points, also called agents or sensors, about a specific monitored object, or target, with the goal of determining the real status of the target independently from conditions that may affect the measurement of one or more agents.
Thanks to this network monitoring strategy, you can quickly test whether users can access a specific application. If they can’t, you can determine whether it’s a network or an application issue. This assessment is simple and immediate because all you have to do is see what percentage of agents are reporting the application down. If some agents are able to connect to the application but others are not, then blame the network. On the other hand, if all the agents are reporting the application down, then it’s time to call the application’s tech support.
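The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not NetBeez code: the function name isolate_fault, the agent names, and the input format (a mapping from agent name to whether that agent reached the application) are all assumptions made for the example.

```python
from typing import Dict


def isolate_fault(reports: Dict[str, bool]) -> str:
    """Classify a problem from per-agent reachability reports.

    `reports` maps a (hypothetical) agent name to True if that agent
    could reach the application and False if it reported it down.
    """
    if not reports:
        raise ValueError("at least one agent report is required")

    down = [agent for agent, reachable in reports.items() if not reachable]
    down_ratio = len(down) / len(reports)

    if down_ratio == 0:
        return "healthy"      # nobody sees a problem
    if down_ratio == 1:
        return "application"  # every agent agrees: call the app's support
    return "network"          # only some agents fail: suspect their paths


if __name__ == "__main__":
    # Hypothetical agents: one site cannot reach the application.
    print(isolate_fault({"paris": True, "pittsburgh": False, "london": True}))
```

A real deployment would likely look at more than a single reachability flag per agent, but the percentage of agents reporting the application down is the core of the decision.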
Use Cases
Let’s take three simple use cases in NetBeez to show what distributed network monitoring is all about. In NetBeez, agents are called beez. There are different types of agents: wired, wireless, and software/virtual.
No problems detected
On the left of the following image, you see a wired beez located in the Paris office reporting no problems.
This is where you want to be: the network is running fine and no applications are down. You can easily determine that because all thirteen ping, thirteen DNS, seven HTTP, and thirteen traceroute (TRCRT) tests are green. The test counters represent the number of tests that the agent is running in real time. The tests are conducted on specific targets defined by the network administrator, which could be internal or cloud applications.
Performance issues
In the center, you see a wireless beez installed in the Pittsburgh office. In this case, the beez is detecting performance degradation on the wireless network. You can easily see this because all thirteen ping tests are yellow.
If I drill down on this problem, I can select one of these ping tests and review the alerts that were triggered by that specific test running on that wireless agent:
As you can see, this ping test, like the other twelve, has very high packet loss. That’s a clear indication of a network issue. However, how can I be sure that this is a wireless problem and not an issue on the local LAN or WAN link?
Thanks to distributed network monitoring, I can review the status of these same tests run by a wired beez at the same location, which is connected to the network only via an Ethernet interface:
If you then inspect one of the thirteen ping tests running on that wired beez, you can see that there’s no packet loss detected. Because the wired beez sits at the same location and uses the same WAN link as the wireless one, this points to the wireless network as the source of the problem. Check out the real-time graph below:
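The wired-versus-wireless comparison in this use case can also be written down as a small decision function. The sketch below is a simplification under stated assumptions: the function name locate_packet_loss, the labels it returns, and the 5% loss threshold are all hypothetical and are not taken from NetBeez.

```python
def locate_packet_loss(wireless_loss_pct: float,
                       wired_loss_pct: float,
                       threshold_pct: float = 5.0) -> str:
    """Compare packet loss seen by a wireless and a wired agent at one site.

    The threshold is an arbitrary value chosen for this example; both
    agents are assumed to run the same ping test against the same target.
    """
    wireless_bad = wireless_loss_pct > threshold_pct
    wired_bad = wired_loss_pct > threshold_pct

    if wireless_bad and not wired_bad:
        return "wireless"    # wired path is clean, so suspect Wi-Fi
    if wireless_bad and wired_bad:
        return "lan_or_wan"  # both agents suffer: shared LAN or WAN link
    if wired_bad:
        return "wired"       # only the wired agent is affected
    return "none"            # no significant loss on either agent


if __name__ == "__main__":
    # Hypothetical readings: heavy loss on wireless, none on wired.
    print(locate_packet_loss(wireless_loss_pct=40.0, wired_loss_pct=0.0))
```

The value of the second agent is exactly this: with only the wireless reading, the first two branches could not be told apart.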
Application issue
Take a look at this case of the network monitoring target salesforce.com:
Here, we are looking at a Target, rather than a beez. A Target is a monitored object that groups all the tests run by different beez deployed at different network locations. In this case, we are monitoring Salesforce with ping, DNS, HTTP, and traceroute tests. As you can see, the ping and DNS tests are green, meaning that there’s network connectivity from the beez to the website salesforce.com. However, all twenty agents are reporting failed HTTP tests. Having all the beez confirm the same condition is strong evidence that the problem lies with the application rather than the network. This gives the tech support team a clear follow-up action: open a ticket with Salesforce. It also gives the network administrator a head start in dealing with support tickets that may hit the help desk.
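The reasoning applied to this Target can be sketched as follows. Again, this is a hedged illustration rather than how NetBeez evaluates a Target: the function name diagnose_target, the result labels, and the input format (one pass/fail value per agent for each test type) are assumptions made for the example.

```python
from typing import Dict, List


def diagnose_target(results: Dict[str, List[bool]]) -> str:
    """Interpret per-agent results for one monitored target.

    `results` maps a test type ("ping", "dns", "http") to a list with
    one entry per agent: True if the test passed, False if it failed.
    """
    for kind in ("ping", "dns", "http"):
        if not results.get(kind):
            raise ValueError(f"missing results for {kind!r}")

    connectivity_ok = all(results["ping"]) and all(results["dns"])
    http_ok = all(results["http"])
    http_all_failed = not any(results["http"])

    if connectivity_ok and http_ok:
        return "healthy"
    if connectivity_ok and http_all_failed:
        # Every agent reaches the site but none gets a valid HTTP answer.
        return "application"
    return "needs investigation"  # mixed picture: look at individual agents


if __name__ == "__main__":
    # Mirrors the case above: twenty agents, ping and DNS pass, HTTP fails.
    agents = 20
    print(diagnose_target({
        "ping": [True] * agents,
        "dns": [True] * agents,
        "http": [False] * agents,
    }))
```

Anything that does not match the two clear-cut patterns is deliberately left as "needs investigation", since a mixed result calls for drilling down into individual agents as in the previous use case.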
Conclusion
I hope that this blog post was useful in introducing the concept of distributed network monitoring. It would be great to hear your feedback in the comments section. I would be very interested to know if you currently have distributed network monitoring implemented in your network and, if so, what benefits it provides. Also, if you want to learn more about this in a private session, please don’t hesitate to request a demo here.