We get a lot of questions from our customers about how they should prepare themselves for different technologies (cloud, 10G, etc.). For this blog we wanted to answer some of the common questions that we receive – mainly from network engineers/administrators – regarding both specific networking trends and larger technology trends.
We have Jay Botelho, Director of Product Marketing at WildPackets, answering these questions. He’s been in the networking business for over 25 years. If you have any additional questions for our networking guru please be sure to let us know and we’ll address them in our next Q&A blog.
Q: What are some best practices for troubleshooting in a virtual environment?
A: Virtualization creates “blind spots” in your network, making it difficult to monitor with traditional techniques, such as spanning a switch port from a physical Ethernet switch to collect packet-based data for complete root-cause analysis.
Here’s a common scenario. A user is experiencing abnormally long delays while working with a specific application. You know from your network architecture that the app is running on a VM, but you’re not the application engineer, so you’re not familiar with all of the nuances of the application’s operations (what data sources it accesses, under what conditions, etc.). You’re able to start a packet capture session on a switch just upstream from the virtual server running the app, and after filtering and watching the user connection to the app you can confirm long delays for some operations, but it is clear the delay is NOT between the user and your capture point. The delay is within the VM, in your blind spot, where communication between the application and the database is virtualized on the same VM.
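One way to reason about where the delay sits, sketched in Python. This is a minimal illustration, not a feature of any particular product: the `Packet` record, the packet labels, the sample trace, and the 100x threshold are all assumptions made for the example. The idea is to compare the TCP handshake round trip seen at the capture point (which approximates the network path from the capture point to the server) against the time the server takes to answer each request. When a response takes far longer than the handshake did, the time is being spent behind the capture point.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    ts: float   # capture timestamp in seconds
    kind: str   # hypothetical labels: "SYN", "SYN-ACK", "REQUEST", "RESPONSE"

def handshake_rtt(packets):
    """Time from the client's SYN to the server's SYN-ACK at the capture point."""
    syn = next(p.ts for p in packets if p.kind == "SYN")
    syn_ack = next(p.ts for p in packets if p.kind == "SYN-ACK" and p.ts >= syn)
    return syn_ack - syn

def response_times(packets):
    """Delay between each request and the first response that follows it."""
    delays = []
    pending = None
    for p in sorted(packets, key=lambda p: p.ts):
        if p.kind == "REQUEST":
            pending = p.ts
        elif p.kind == "RESPONSE" and pending is not None:
            delays.append(p.ts - pending)
            pending = None
    return delays

# Made-up trace for one user connection, captured just upstream of the server.
trace = [
    Packet(0.000, "SYN"),
    Packet(0.001, "SYN-ACK"),
    Packet(0.010, "REQUEST"),
    Packet(0.012, "RESPONSE"),
    Packet(1.000, "REQUEST"),
    Packet(3.500, "RESPONSE"),
]

rtt = handshake_rtt(trace)
# Assumed rule of thumb: a response slower than 100x the handshake RTT
# is not explained by the network between the capture point and the server.
slow = [d for d in response_times(trace) if d > 100 * rtt]
print(f"handshake RTT: {rtt * 1000:.1f} ms, slow responses: {len(slow)}")
```

In this made-up trace the second request takes about 2.5 seconds to answer while the handshake took about a millisecond, which points at the server side of the capture point rather than the network.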
To address this issue, you must collect data from the virtual switch(es) within the VM in order to get visibility between the application and the database. There are several techniques to achieve this. First, you can use OmniVirtual, a packet-capture software probe specifically designed to run on VMs. You will need to allocate space on the VM to run OmniVirtual, just as you would any other application. Once running, OmniVirtual will have access to all data crossing any of the virtual switches on the VM.
A second alternative is to use a virtual tap, available from several tap vendors. These virtual taps install at the VM layer, acting like a traditional tap, providing access to data crossing virtual switches on the VM to a host of solutions, including network analysis and troubleshooting solutions like the Omni Distributed Analysis Platform.
Q: How will migrating to a public Cloud affect my job?
A: There’s some very good data available from industry analysts predicting that you will be busier as your company migrates to the Cloud, so don’t worry about your job! But your role will change: instead of managing only your own infrastructure, you will also be responsible for overall service availability and performance in the Cloud. Cloud computing merely shifts your application servers from your facility to a third party. Issues like bottlenecks, bandwidth hogs, and unauthorized protocol usage will still adversely affect application traffic. Thus, more diligence must be applied in monitoring application performance and making sure that your service provider is living up to its promises.
Q: Why do issues with VoIP continue to persist?
A: This is a question that we consistently get from both potential customers and customers looking to deploy VoIP. The core issue is that networks are really not that friendly to real-time data (RTP, or Real-time Transport Protocol, which is used by VoIP and video). Most networks are optimized to carry TCP/IP data traffic, which is much more tolerant of latency, packet loss, and jitter than RTP is. To compensate for this, networks require additional configuration for VoIP, including the use of Quality of Service (QoS) tagging at a minimum, or even dedicated VLAN or MPLS segments to segregate and give priority to RTP traffic. If you either have or are planning a transition to VoIP, be sure you are using a network analysis solution that treats VoIP like any other data type on the network, since that’s exactly what it is. VoIP problems often spike during times of heavy network usage, so you need a solution that can see everything at once and allow you to correlate the activity of all traffic on your network simultaneously.
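To make "jitter" concrete, here is a minimal Python sketch of the interarrival jitter estimator described in RFC 3550 (section 6.4.1), which is the figure RTP receivers report. The function name and input format are assumptions for illustration. The sketch works in seconds rather than RTP timestamp units, assumes an 8000 Hz media clock (as used by G.711), and ignores RTP timestamp wraparound.

```python
def interarrival_jitter(packets, clock_rate=8000):
    """Running jitter estimate in seconds.

    packets: iterable of (arrival_time_seconds, rtp_timestamp) in arrival order.
    clock_rate: RTP media clock in Hz (8000 is an assumption that fits G.711).
    """
    jitter = 0.0
    prev_transit = None
    for arrival, rtp_ts in packets:
        # Relative transit time: arrival clock minus sender's media clock.
        transit = arrival - rtp_ts / clock_rate
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            # RFC 3550 smoothing: move 1/16 of the way toward the new sample.
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter

# Evenly paced 20 ms packets: transit time never changes, so jitter stays ~0.
steady = [(i * 0.020, i * 160) for i in range(50)]

# Same stream, but the second packet arrives 16 ms late.
one_late = [(0.000, 0), (0.036, 160)]

print(interarrival_jitter(steady))
print(interarrival_jitter(one_late))
```

The 1/16 gain means a single late packet nudges the estimate only slightly, while sustained variation during busy periods drives it up steadily, which is why jitter is worth correlating against overall network load.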
Q: How will my network analysis needs change as we roll out 10G?
A: 10G is a game-changer for network analysis and troubleshooting. The still oft-used “break/fix” method of network troubleshooting – where packet-based network analysis is only performed after a problem is reported – is no longer effective at 10G. With more data consolidated through fewer resources, the number of problems per segment increases, and the increased network speeds make it far more challenging to reproduce problems or wait for them to happen again. At 10G you need to monitor and record packet-level data on an ongoing basis, arming yourself with a recording of all activity on these highly utilized segments. If real-time monitoring indicates a negative trend, or if problem reports are rolling in, you can simply “rewind” the network to the troublesome period of time and analyze exactly what was going on. No waiting for it to happen again, and no need for Herculean efforts to reproduce the problem. You have all the data you need to solve the problem – immediately.
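How far back you can “rewind” depends on how much storage sits behind the recording. The following Python sketch is back-of-the-envelope arithmetic only: the function names, the 25% average utilization, and the 32 TB disk size are assumptions for illustration. It counts one direction of the link and ignores per-packet capture metadata.

```python
def capture_bytes(link_bps, utilization, seconds):
    """Bytes of packet data crossing one direction of a link over a period."""
    return link_bps / 8 * utilization * seconds

def retention_hours(disk_bytes, link_bps, utilization):
    """Hours of traffic a disk can hold before the oldest data is overwritten."""
    return disk_bytes / capture_bytes(link_bps, utilization, 3600)

TEN_G = 10e9  # 10 Gbps in bits per second

# One hour at full line rate: 1.25 GB/s * 3600 s = 4.5 TB.
per_hour_full = capture_bytes(TEN_G, 1.0, 3600)

# Hypothetical 32 TB of capture storage on a link averaging 25% utilization.
hours = retention_hours(32e12, TEN_G, 0.25)

print(f"{per_hour_full / 1e12:.1f} TB per hour at line rate")
print(f"{hours:.1f} hours of history at 25% utilization")
```

At full line rate a single direction of a 10G link produces 4.5 TB per hour, so the retention window you want effectively sets the storage you need.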
This wraps up our first Q&A session. Please keep your questions coming. We’re always up for a challenge, and let’s face it, we picked the softballs this time around…