The network is often blamed for poor application performance, even if the network is not the culprit. Network engineers therefore need to know how to determine the cause of the problem, even if it’s in the application itself. Below, you’ll find the most common questions we get from our clients on this topic, and how you should address them – whether you’re working with wireless or in a cloud environment.
Q: How do you know if it’s your network or your application?
A: This is the first question you should figure out when users are complaining about a poor application response time. We’ve gone into great detail on this subject, but here are some initial indicators to help you prove that the network is not the culprit:
Packet-level monitoring will show you the conversation between a client and a poorly performing application. If a user request is followed by a quick network acknowledgement (ACK) but a delayed data response, it means it is an application issue. If the ACK is delayed or missing, then the network is to blame. You can get this information clearly visualized in OmniPeek’s Compass dashboard, using the “2-Way Latency” setting, which displays the network and application latencies, with drill-down on a per-node or even per-flow basis.
If you’re streaming Expert events into a log management system, network issues are shown through slow acknowledgements, TCP slow segment recovery, slow and frequent re-transmission, and low throughput. Application problems manifest in slow application response times.
Q: Does application monitoring change in a virtual environment?
A: While network monitoring may be more difficult in a virtual environment due to the introduction of overlay networks and virtual switches which often aren’t controlled by the network team, the fundamental analysis techniques are still valid. A capture only has to be in the packet path between the client and the server in order to get diagnostic info and answer the basic question: is the problem in the network or in the application?
Q: How should I address application problems if they are housed in a cloud environment?
A: Cloud is generally hostile to packet capture, since there’s no network visibility if you don’t control the network. In this environment, we recommend focusing on the end user experience. If there are complaints or concerns about the application performance, capture on a client machine to see what the traffic pattern reveals. We’ve found that many of our customers appreciate using the OmniPeek Remote Assistant capture agent, as it’s a lightweight capture tool with a simple user interface to capture packets from a Windows client. The encrypted capture file can then be sent back to the network team for analysis.
Q: How should I handle issues with real-time applications like VoIP?
A: Sadly, with VoIP it often is a network problem. Latency is often the root cause. Sometimes it’s a transitory problem, like routing reconvergence after a link goes down (or up). Sometimes, the packets are simply routed through inappropriate equipment, like a proxy which doesn’t do any VoIP analysis, but which still adds latency.
There is a pair of tools we recommend for VoIP. First, use the built-in VoIP analysis in OmniPeek Enterprise to measure the MOS scores and determine how widespread the problem is. Second, use Multi-Segment Analysis (MSA) to capture at multiple points simultaneously in the packet path, to determine whether there are any significant sources of latency in the network, and where they are.
Q: How do you know if application performance is sufficient?
A: This is subjective to the end user. We usually suggest examining the application response time. This measures the time it takes an application to respond to a specific user request on a per-request or per-flow basis. The Expert dashboards in OmniPeek and OmniEngine Enterprise will give you these numbers very easily.
Our products also assign one of three basic levels of performance: satisfied, tolerating and frustrated. We dive into deeper detail about how you should measure and report this here.
Q: What if my application has a bug in it? How do I know and how should I solve the problem?
A: Once you’ve demonstrated that there is a problem in the application, the next steps may not be obvious to the sysadmins or developers. Most modern applications are highly modularized, split into multiple layers across many different servers, and if a back-end service is slow to respond, that delay will propagate all the way to the user. Packet capture can provide insight here as well: use the application performance analysis techniques on a capture taken on the front-end server to see whether there’s a dependency on other servers, and what the application response looks like from those remote systems. This works even if the connections are SSL or TLS encrypted, as it will be clear which packets are simple ACKs and which are application-layer responses. Repeat until you find which server in the distributed application is causing the major slowdown.
As always, please let us know if you have any additional questions!