How to Monitor Hybrid Cloud Performance

Hybrid Cloud Networks

A hybrid cloud is an environment in which an organization’s on-prem network is connected to a public cloud to provide end-to-end reachability to services and applications. This architecture, which has gained traction over the last five years or so, lets organizations move compute workloads between the two environments. When implementing a hybrid cloud strategy, organizations often select more than one public cloud provider, generally two. This architecture is referred to as hybrid multi-cloud. At the moment, the most popular public cloud providers are Amazon Web Services (AWS), Microsoft Azure, Google Cloud, and Alibaba Cloud. However, other providers, such as Oracle and IBM Cloud, also offer public cloud services.

Interconnects with Public Clouds

To extend a corporate network into a public cloud, organizations generally have two options: a VPN or a direct connect link. Since a VPN connection is established across the Internet, all network traffic exchanged between the on-prem and public cloud hosts is encrypted and tunneled. In most cases, VPN connections are suitable for applications that require relatively low bandwidth and can tolerate higher latency. On the other hand, if the enterprise has high bandwidth (e.g. 10 Gbps or more) and/or low latency requirements, then a direct connect link is the best option. Each cloud provider has its own name for this service. For instance, AWS calls it Direct Connect, while Microsoft Azure calls it ExpressRoute (see the example diagram below).

[Diagram: ExpressRoute public cloud]

Monitoring Hybrid Cloud Performance

If your organization is managing a hybrid cloud, it’s very important to monitor the key network performance metrics that can impact data transfers to and from the cloud. This applies whether your interconnect is based on a VPN or a direct connect link. The same network performance metrics that are monitored in on-prem environments should be analyzed in public cloud deployments. The following table recaps the four key performance metrics that need to be monitored.

Metric        Type        Depends on
Latency       Primary     Distance, Congestion
Packet Loss   Primary     Congestion
Jitter        Secondary   Latency
Throughput    Secondary   Latency, Packet Loss

More specifically, latency and packet loss are the primary metrics that need to be constantly monitored, and applying anomaly detection rules to them will detect changes in their performance. The secondary metrics, jitter and throughput, are directly impacted by changes in the primary ones. So, for instance, to ensure that the throughput between on-prem and cloud is optimal, it’s necessary to verify that the end-to-end latency and packet loss don’t cross specific values. Refer to the Mathis equation from a previous blog post that I wrote about packet loss.
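To make the dependency of throughput on latency and packet loss concrete, here is a minimal Python sketch of the Mathis approximation for a single TCP flow. The function name, the default constant of sqrt(3/2), and the sample values are assumptions chosen for illustration. The result is a rough upper bound, not a prediction of what a real transfer will achieve.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float,
                          c: float = math.sqrt(3 / 2)) -> float:
    """Approximate upper bound on single-flow TCP throughput, in bits per second.

    Mathis approximation: rate <= (MSS / RTT) * (C / sqrt(p)).
    C = sqrt(3/2) is the commonly quoted constant and is an assumption here.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


# Hypothetical on-prem to cloud path: 1460-byte MSS, 50 ms round-trip time.
for loss in (0.0001, 0.01):
    bound_mbps = mathis_throughput_bps(1460, 0.050, loss) / 1e6
    print(f"loss={loss:.2%}  bound={bound_mbps:.2f} Mbps")
```

Because the bound scales with 1/RTT and 1/sqrt(loss), doubling the latency halves it, and a hundredfold increase in packet loss cuts it by a factor of ten.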

How NetBeez Monitors Hybrid Clouds

NetBeez is a distributed network monitoring solution that runs end-to-end performance tests. On-prem and cloud agents constantly monitor the primary performance metrics, such as latency and packet loss. Then, anomaly detection rules verify that incoming results follow a baseline or don’t cross user-defined thresholds. To proactively detect changes in network performance, NetBeez adopts two methods:

  • Performance alerts based on SLAs, such as packet loss < 3% or latency < 150 ms.
  • Performance alerts based on comparing a short-term average with a long-term baseline.
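The two alerting methods above can be sketched in a few lines of Python. This is an illustrative approximation, not NetBeez's actual implementation: the function and class names, the window sizes, and the 2x ratio are all hypothetical, while the 150 ms and 3% limits come from the SLA example in the list.

```python
from collections import deque
from statistics import mean


def violates_sla(latency_ms: float, loss_pct: float,
                 max_latency_ms: float = 150.0, max_loss_pct: float = 3.0) -> bool:
    """Return True when a sample breaks the SLA (latency < 150 ms, loss < 3%)."""
    return latency_ms >= max_latency_ms or loss_pct >= max_loss_pct


class BaselineAlert:
    """Flag samples whose short-term average drifts above a long-term baseline."""

    def __init__(self, short_window: int = 5, long_window: int = 60,
                 ratio: float = 2.0) -> None:
        # Window sizes and ratio are hypothetical tuning values.
        self.short = deque(maxlen=short_window)
        self.long = deque(maxlen=long_window)
        self.ratio = ratio

    def add(self, value: float) -> bool:
        """Record one sample and return True if it should raise an alert."""
        self.short.append(value)
        self.long.append(value)
        if len(self.long) < self.long.maxlen:
            return False  # not enough history to form a baseline yet
        baseline = mean(self.long)
        return baseline > 0 and mean(self.short) > self.ratio * baseline


detector = BaselineAlert()
for _ in range(60):
    detector.add(20.0)            # steady 20 ms latency builds the baseline
print(detector.add(200.0))        # a sudden spike to 200 ms
print(violates_sla(latency_ms=160.0, loss_pct=0.5))
```

The fixed threshold catches outright SLA breaches, while the baseline comparison can catch a degradation that is still under the SLA limit but unusual for that particular path.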

Below is a screenshot of the anomaly detection section in NetBeez.

[Screenshot: network performance alert]

The secondary metrics, throughput and jitter, are also monitored but don’t generate alerts. As explained above, the primary metrics are used for proactive detection because they directly impact the results of the secondary ones. The screenshot below shows the test results of a UDP-based throughput test between an on-prem NetBeez agent and a cloud one.

[Screenshot: UDP throughput test results]
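As a rough illustration of how a secondary metric such as jitter can be derived from primary measurements, the Python sketch below summarizes a series of round-trip probes. The function name and the sample values are hypothetical. Jitter is computed here as the mean absolute difference between consecutive received samples, which is one simple definition among several and not necessarily the one NetBeez uses.

```python
from statistics import mean
from typing import Optional, Sequence


def summarize_probes(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize probe results, where None marks a lost probe."""
    if not rtts_ms:
        raise ValueError("at least one probe result is required")
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    latency_ms = mean(received) if received else None
    # Jitter: mean absolute change between consecutive received samples.
    # Lost probes are skipped, so neighbors across a gap are still compared.
    deltas = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter_ms = mean(deltas) if deltas else None
    return {"latency_ms": latency_ms, "loss_pct": loss_pct, "jitter_ms": jitter_ms}


# Hypothetical probe series in milliseconds, with one lost probe.
print(summarize_probes([20.0, 22.0, None, 21.0, 25.0]))
```

Since jitter is computed entirely from the latency samples, any change in latency behavior shows up in it automatically, which is why alerting on the primary metrics is sufficient.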

Conclusion

Monitoring hybrid cloud network performance is not much different from monitoring traditional on-prem networks. Organizations that are extending their infrastructure to the cloud should consider a distributed network monitoring approach that lets them monitor the cloud the same way they monitor their WAN or corporate network. If you’re an engineer or IT executive who’s working on a hybrid cloud project, talk to us, and we’d be happy to share more relevant solutions with you.