
Packet vs. Flow-based Network Analysis – the Markets Speak

When flow-based metrics were introduced nearly a decade ago, a debate began over whether to use flow-based metrics or deep packet inspection for network monitoring and analysis. Before then, deep packet inspection was the go-to solution for both overall network monitoring and detailed network analysis. But as NetFlow data became more and more available from network devices, the market saw a definite shift towards flow-based data, especially for overall network monitoring.

And because NetFlow is so readily available, and so many IT organizations jumped on the NetFlow bandwagon, the use cases for NetFlow reporting expanded, with many organizations looking to NetFlow to be their only network monitoring and analysis solution.

But packet-based network monitoring and analysis solutions did not fade into the woodwork. Those who are ultimately responsible for troubleshooting complex network problems (you know who you are) never overestimated the capabilities of NetFlow and never underestimated those of deep packet inspection (DPI). Even though NetFlow-based solutions gained ground in the NOC, DPI solutions were not displaced. Over time the urge of IT management to make NetFlow solutions fit for every situation subsided, and both NetFlow and DPI solutions now coexist within the IT management infrastructure, each with its place. Flow-based solutions for monitoring; packet-based solutions for detailed analysis and troubleshooting.

And this week the markets validated what network engineers have known all along. DPI solutions are alive and well, and have a solid future within the portfolio of IT management solutions. How did the markets show this? With the announcement that JDSU plans to acquire Network Instruments for $200M. According to an online article on Enterprise Network Planet, Network Instruments' revenue for the past 12 months was approximately $40M. That means Network Instruments sold for 5x its current annual revenue, a very strong showing indeed, and one that reflects very nicely on the overall strength of the packet-based network analysis market.

So let’s stop looking at flow-based vs. packet-based as a debate, or an either-or decision. They both have their place in the IT infrastructure, and both serve a very valuable, albeit different, function. When planning for overall network visibility, packet-based network analysis must be part of the solution.

The Basics of Flow Analysis

When it comes to enterprise network monitoring, flow-based solutions are by far the most popular, with 30-40 major flow-based network monitoring solutions on the market today. With that many solutions, how do they differentiate from one another, and which one will be best for your network? To determine this, let’s start at the beginning, with the basics. How does a flow-based solution work?

The Data Source

Switches and routers are the primary sources of flow data. Since every packet traverses the device, it is relatively easy for the device to extract key data from the packets, although doing so requires extra processing. Depending on the flow protocol in use and the current load on the router or switch, sampling may be employed, and this can lower the accuracy of the data being reported. Flow-based reporting protocols typically categorize packets into a flow based on the following seven characteristics: source IP address, destination IP address, source port, destination port, layer 3 protocol type, TOS byte, and input logical interface. The device keeps track of all the flows, storing the information in available RAM, and once every configured interval packages the data into a stream of UDP packets that follow a predefined format (like NetFlow or sFlow) and transmits these packets to a user-configured IP address, known as the Collector.
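
To make the idea of a flow key concrete, here is a minimal Python sketch of a flow cache keyed on the seven characteristics listed above. It is only an illustration of the concept, not how any particular router implements it; the names FlowKey, FlowCache, observe, and export are made up for this example, as are the addresses and ports.

```python
from collections import namedtuple

# The seven key fields named in the text; the field names are illustrative.
FlowKey = namedtuple(
    "FlowKey",
    "src_ip dst_ip src_port dst_port protocol tos input_if",
)


class FlowCache:
    """Toy in-memory flow table, standing in for the device's RAM cache."""

    def __init__(self):
        self.flows = {}  # FlowKey -> [packet_count, byte_count]

    def observe(self, key, length):
        """Account for one packet of `length` bytes under its flow key."""
        counters = self.flows.setdefault(key, [0, 0])
        counters[0] += 1
        counters[1] += length

    def export(self):
        """Return all flow records and clear the cache, as at the end of an interval."""
        records = [(key, pkts, byts) for key, (pkts, byts) in self.flows.items()]
        self.flows.clear()
        return records


cache = FlowCache()
web = FlowKey("10.0.0.5", "192.0.2.10", 51512, 443, 6, 0, 1)   # TCP to port 443
dns = FlowKey("10.0.0.5", "192.0.2.53", 40000, 53, 17, 0, 1)   # UDP to port 53
cache.observe(web, 1500)
cache.observe(web, 400)
cache.observe(dns, 80)
records = cache.export()
```

Two packets that share all seven fields collapse into one record with a packet count and a byte count, which is exactly why flow data is so compact, and also why the payload detail is gone by the time it reaches the Collector.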

The Data Collector

Once the UDP data stream with the flow-based information leaves the switch or router, the device purges that information and forgets it. It is now the responsibility of the Collector to receive, process, and store the flow-based information. Keep in mind that the original delivery to the Collector is over UDP, which is not a reliable transport, so dropped packets from the switch to the Collector can be a problem (a protocol analyzer like the OmniPeek Network Analyzer can help to identify if this is an issue on your network). Also, the packet stream from the switch adds to the traffic load on your network, so this should be taken into consideration. Each packet typically contains information on five to ten flows, so a busy network segment can generate a significant number of packets. The frequency of data pushed from the switch to the Collector is configured on the switch, and is typically set to one minute, though you may find a different interval works best in your specific environment.

The Collector becomes the central repository for all data from that switch or router, and from many others, because a single Collector is designed to support multiple data sources. A Collector employs either a proprietary data structure or a database to store the large volume of data that accumulates from the flow-based sources, and retains the data for long periods of time (months, at least) for reporting. A flow-based monitoring solution is a combination of a Collector, or set of Collectors, and a central server which processes user requests, communicates with Collectors, and returns the desired results to the user.

What is the difference between Flow-Based Solutions?

Differences between network monitoring solutions based on flow data come in two forms. The first is the type of flow data. Different network device vendors support different flow-based protocols. The most common protocols are NetFlow (Cisco), sFlow (Foundry), J-Flow (Juniper), and IPFIX – a proposed industry standard. Each protocol deals with the generation of flow records just a bit differently, with the major difference centered on whether or not sampling is used and how aggressively it is used. The other difference in flow-based network monitoring solutions is in how the vendor presents (displays) the data, and in any unique ways each vendor finds to process the data to provide unique results. Unique data processing and presentation is really the only way for vendors to differentiate themselves, since the source and format of the data are essentially the same regardless of the underlying flow-based protocol.

What solution would you find most helpful for your company and why? We always suggest that enterprises have something greater than just a flow-based solution, as flow-based solutions tend to lack all the details required for root-cause analysis on your network. If you are interested in learning more about these issues, check out our blog post, “Is A Flow-Based Solution, A Whole-Based Solution?”.

Is A Flow-Based Solution, A Whole-Based Solution?

NetFlow and other flow-based technologies, like sFlow, J-Flow, and IPFIX, have become increasingly popular because they leverage existing resources (network switches and/or routers) to obtain data that is already being processed by these devices. However, as the old adage goes, “There is no free lunch.” Flow-based solutions lack the ability to solve specific problems experienced by the end-user, and can not only stress your network with additional data, but also lose key data when your network is most heavily utilized.

End-User Frustration: Why packets and payloads are so important

Problem: You receive a call from an end-user who is experiencing significant application performance problems. Of course the immediate blame goes to the network. How do you quickly fix this problem with a flow-based solution?

Answer: Without access to the packets and payloads, the network engineer would have to perform a great deal of experimentation with the user. Additionally, without help from the application engineer, the network engineer would be hard pressed to figure out the issue if it’s not a simple network issue. Now you have three players that are experiencing frustration: end-user, network engineer, and application engineer.

Remedy: Get a solution that is able to troubleshoot and provide you with access to both the payload and packet information.

In this video Jay Botelho, Director of Product Management at WildPackets, walks through the process of determining the issues centered on a particular user and a particular application, as discussed in the above scenario, and shows how simple and short this process can be when using a solution that provides visibility into your packets and payloads.

Too Much Traffic on Your Network? Flow-based technologies generate traffic of their own

Flow-based analysis generates additional network traffic, with the volume of traffic proportional to the size of the network segment being monitored. The typical export packet size is around 1500 bytes, which is relatively large. These packets come in spurts ranging from tens of kilobytes to several megabytes of traffic for each reporting interval, depending on how many flows are monitored by the switch or the router for that given interval. A NetFlow packet usually contains data for 5 to 10 flows.
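
As a rough back-of-the-envelope check on those numbers, the short Python sketch below estimates the export traffic for one reporting interval. The function name export_load and the figure of 20,000 flows per interval are assumptions chosen for illustration; the 1500-byte packet size, the 10 flows per packet, and the one-minute interval come from the figures quoted in these posts.

```python
import math


def export_load(flows_per_interval, flows_per_packet=10,
                packet_bytes=1500, interval_seconds=60):
    """Estimate export packets, bytes, and average bits/s for one interval."""
    packets = math.ceil(flows_per_interval / flows_per_packet)
    total_bytes = packets * packet_bytes
    avg_bps = total_bytes * 8 / interval_seconds
    return packets, total_bytes, avg_bps


# Hypothetical busy segment: 20,000 active flows in a one-minute interval.
packets, total_bytes, avg_bps = export_load(20000)
print(f"{packets} packets, {total_bytes} bytes, {avg_bps:.0f} bit/s on average")
```

Under these assumptions a single device sends 2,000 export packets, or about 3 megabytes, per interval. The average rate looks modest, but the data arrives in a spurt rather than spread evenly across the minute, and every additional exporting device adds its own spurt.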

On highly utilized network segments, this added traffic could cause undesirable results.

Flow-based technologies break down under pressure: How flows react to a heavily utilized network

All flow-based solutions share hardware resources with the prime directive of your router or switch – forwarding packets. If your router or switch is heavily utilized, it will focus first on its prime directive, compromising flow-based reporting. This can create intermittent inaccuracies in your monitoring and reporting that are very difficult to detect, affecting your ability to collect essential information from your network at a time when it is needed most.

In addition, the flow records sent from the switch or the router to the flow-based processor are based on UDP packets, an unreliable transport mechanism. There are no acknowledgments with UDP, so dropped packets result in missing and inaccurate flow-based data. Remember, each NetFlow packet reports on 5 to 10 flows, so for each dropped packet many flows are ignored. And this is most likely to happen when the network is busiest, compounding your inability to get an accurate picture of the current state of the network.
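
One way to see how many flow records were lost in transit is to watch the sequence counter in the export header. The Python sketch below assumes NetFlow version 5 export datagrams, whose 24-byte header carries a record count and a running flow_sequence counter, and it assumes datagrams arrive in order from a single exporter. The helper names and the three-datagram test stream are invented for this example; a real Collector would also have to handle reordering and exporter restarts.

```python
import struct

# NetFlow v5 header: version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval (24 bytes).
V5_HEADER = struct.Struct("!HHIIIIBBH")


def parse_v5_header(datagram):
    """Return (count, flow_sequence) from a NetFlow v5 export datagram."""
    fields = V5_HEADER.unpack_from(datagram)
    version, count, flow_sequence = fields[0], fields[1], fields[5]
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram")
    return count, flow_sequence


def count_lost_flows(datagrams):
    """Sum the flow records missing between consecutive in-order datagrams."""
    lost = 0
    expected = None
    for datagram in datagrams:
        count, sequence = parse_v5_header(datagram)
        if expected is not None:
            lost += (sequence - expected) % 2**32  # counter wraps at 32 bits
        expected = (sequence + count) % 2**32
    return lost


def make_header(count, sequence):
    """Build a header-only test datagram (illustrative helper)."""
    return V5_HEADER.pack(5, count, 0, 0, 0, sequence, 0, 0, 0)


# The datagram that should have started at sequence 20 never arrived.
stream = [make_header(10, 0), make_header(10, 10), make_header(10, 30)]
lost = count_lost_flows(stream)
```

Here one dropped datagram shows up as a gap of 10 flow records. The Collector can tell that data is missing, but with no acknowledgments or retransmission in UDP it has no way to get those records back.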

Sampling is another important factor to consider when evaluating flow-based analysis systems. The default configuration for NetFlow is to monitor and develop flow records for 100% of the packets, with no sampling. But it can be configured to “1 out of k” static sampling, or the network device itself can switch to a sampling mode when network traffic gets heavy. Sampling leads to inaccuracies in reporting, and these inaccuracies can vary substantially, since it all depends on which flows are being ignored through the sampling.
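
The effect of “1 out of k” sampling is easy to demonstrate with a toy example. The Python sketch below uses deterministic sampling (every k-th packet) and scales the sampled byte counts by k, which is one simple way to estimate totals and an assumption of this example rather than a description of any vendor's method. The traffic mix, flow labels, and function names are all hypothetical.

```python
from collections import Counter


def sample_one_in_k(packets, k):
    """Keep every k-th packet (deterministic 1-out-of-k sampling)."""
    return packets[k - 1::k]


def estimate_bytes(sampled, k):
    """Scale sampled byte counts by k to estimate per-flow totals."""
    totals = Counter()
    for flow, length in sampled:
        totals[flow] += length * k
    return totals


def true_bytes(packets):
    """Exact per-flow byte totals, as unsampled accounting would report."""
    totals = Counter()
    for flow, length in packets:
        totals[flow] += length
    return totals


# Hypothetical traffic: 17 bulk packets with 3 small DNS packets mixed in.
packets = [("bulk", 1000)] * 20
for i in (0, 5, 10):
    packets[i] = ("dns", 100)

actual = true_bytes(packets)
estimate = estimate_bytes(sample_one_in_k(packets, 4), 4)
```

With k set to 4, the bulk flow is overestimated (20,000 bytes reported against 17,000 actual) and the DNS flow vanishes entirely, because none of its three packets happened to fall on a sampled position. Which flows disappear depends purely on timing, which is what makes the resulting inaccuracies so hard to predict.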