Before we address this question, we must answer an even
more basic one: "What is NetFlow?" NetFlow, and other flow-based technologies
like sFlow, J-Flow, and IPFIX, are simply specifications for collecting certain
types of network data for monitoring and reporting. The data sources are
network devices themselves, like switches and routers, the idea being to
leverage existing resources in the network to provide data that is for the most
part already being processed by these devices. As a result, flow-based systems
provide an economical source of network monitoring data.
All flow-based systems start with flows as their basic
element. A flow is a sequence of packets that share the same seven
characteristics: source IP address, destination IP address, source port,
destination port, Layer 3 protocol type, ToS byte, and input logical interface.
By definition, a flow is unidirectional. Flows are processed and stored by
supported network devices as flow records, and it is these flow records that
vary from specification to specification; a NetFlow flow record, for example,
does not take quite the same form as an sFlow flow record. Each flow-based
specification therefore requires its own parsing and processing techniques. It
is at this step, where flow records are consumed, that the term NetFlow
Analyzer is introduced.
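To make the flow definition concrete, here is a minimal Python sketch that groups packets into flows by the seven characteristics listed above. The dictionary-based packet format and the names FlowKey, flow_key, group_into_flows, and make_packet are assumptions made for illustration; they are not part of any NetFlow specification or product.

```python
from collections import defaultdict, namedtuple

# The seven fields that define a flow, as listed in the text.
FlowKey = namedtuple(
    "FlowKey",
    "src_ip dst_ip src_port dst_port protocol tos input_if",
)

def flow_key(packet):
    """Build the seven-field key for one packet (a plain dict in this sketch)."""
    return FlowKey(*(packet[field] for field in FlowKey._fields))

def group_into_flows(packets):
    """Count packets and bytes per flow; packets with equal keys share a flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for packet in packets:
        record = flows[flow_key(packet)]
        record["packets"] += 1
        record["bytes"] += packet["length"]
    return dict(flows)

def make_packet(src_ip, dst_ip, src_port, dst_port, input_if, length):
    """Hypothetical helper: a TCP packet (protocol 6) with a ToS byte of 0."""
    return {
        "src_ip": src_ip, "dst_ip": dst_ip,
        "src_port": src_port, "dst_port": dst_port,
        "protocol": 6, "tos": 0, "input_if": input_if,
        "length": length,
    }

packets = [
    make_packet("10.0.0.1", "10.0.0.2", 51000, 80, 1, 60),    # client to server
    make_packet("10.0.0.1", "10.0.0.2", 51000, 80, 1, 500),   # client to server
    make_packet("10.0.0.2", "10.0.0.1", 80, 51000, 2, 1500),  # server to client
]

flows = group_into_flows(packets)
for key, record in flows.items():
    print(key.src_ip, "->", key.dst_ip, record)
```

Because a flow is unidirectional, the single conversation in this example produces two flows: one for the client-to-server packets and one for the reply.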
Basic flow analysis is a multistep process, requiring
several different elements to be present. Packets enter a switch or router,
just as they would as part of normal network operation. If the network device
is flow-enabled and the feature is active, additional processing will take
place to identify individual flows in the packet stream per the seven
characteristics mentioned above. Depending on the configuration of the network
device and how busy the network is at any given time, this processing may be
done on every packet, or just a sampling of the packets. As flows are
identified, flow records are created per the specification supported by the
network device (for our purposes, NetFlow), and the records are stored locally in
the network device. As flows are completed, the records associated with those
flows are exported to an external NetFlow Collector, where they are archived
for further analysis and reporting. Once the flow record leaves the network
device, it is deleted from memory to make room for other flow records. NetFlow
is efficient in that the network device must process the packets anyway, but it
still puts an additional strain on the device: it requires processing beyond
what switching or routing alone demands, and it requires additional storage on
the device for the flow records being processed and exported.
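The device-side steps described above (optional sampling, local storage of flow records, export on completion, and deletion after export) can be sketched in a few lines of Python. This is a simplified model, not how any particular device implements NetFlow: the FlowCache class, its parameters, and the use of an idle timeout as the only signal that a flow is "completed" are all assumptions made for illustration.

```python
class FlowCache:
    """Toy model of the flow records a device holds in memory."""

    def __init__(self, sample_every=1, idle_timeout=15.0):
        self.sample_every = sample_every  # 1 = inspect every packet
        self.idle_timeout = idle_timeout  # seconds of silence before export
        self.seen = 0                     # packets seen, sampled or not
        self.records = {}                 # flow key -> flow record

    def observe(self, key, length, now):
        """Account for one packet, unless sampling skips it."""
        self.seen += 1
        if self.seen % self.sample_every != 0:
            return  # this packet was not sampled
        record = self.records.setdefault(
            key, {"packets": 0, "bytes": 0, "first": now, "last": now}
        )
        record["packets"] += 1
        record["bytes"] += length
        record["last"] = now

    def export_completed(self, now):
        """Return idle flows and delete them to free memory for new flows."""
        done = [
            key for key, record in self.records.items()
            if now - record["last"] >= self.idle_timeout
        ]
        return [(key, self.records.pop(key)) for key in done]

cache = FlowCache(sample_every=1, idle_timeout=15.0)
cache.observe("flow-A", 60, now=0.0)
cache.observe("flow-A", 1500, now=1.0)
cache.observe("flow-B", 60, now=10.0)

# At t=16 flow-A has been idle for 15 seconds; flow-B is still active.
exported = cache.export_completed(now=16.0)
print(exported)
print(sorted(cache.records))
```

Setting sample_every to a value greater than 1 reduces the processing done per packet, at the cost of flow records that count only the sampled packets.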

A NetFlow Analyzer includes the NetFlow Collector, which
accepts and stores the completed flow records; a storage system to allow for
long-term storage of large volumes of flow-based data; and analysis software to
mine, aggregate, and report on the collected data per user requests through a
customized UI, often web-based but sometimes client-server. The NetFlow
Analyzer can be software-only or appliance-based, but most systems are
appliance-based, and the system often includes multiple appliances.
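As a rough illustration of the "mine, aggregate, and report" step, the Python sketch below totals bytes per source address across collected flow records and lists the busiest senders. The record layout (dictionaries with src_ip and bytes fields) and the top_talkers function are assumptions for this example, not the format of any particular NetFlow Collector.

```python
from collections import Counter

def top_talkers(flow_records, n=3):
    """Sum bytes per source IP and return the n largest senders."""
    totals = Counter()
    for record in flow_records:
        totals[record["src_ip"]] += record["bytes"]
    return totals.most_common(n)

# Hypothetical flow records as they might sit in the collector's storage.
collected = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "bytes": 1200},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.0.9", "bytes": 300},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.7", "bytes": 800},
    {"src_ip": "10.0.0.3", "dst_ip": "10.0.0.9", "bytes": 50},
]

report = top_talkers(collected, n=2)
print(report)
```

The same pattern extends to any of the seven flow characteristics, for example totals per destination port or per input interface.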
So what are the advantages? NetFlow data comes "for free"
from NetFlow-enabled network devices, eliminating the need for additional
network probes to collect the flow-based data. But remember, it's not entirely
free, since it requires processing and storage resources on the network device,
competing with the device's primary job: routing packets. Given the seven
characteristics of a flow, NetFlow Analyzers can provide a relatively
detailed set of network performance data, and given enough storage this data
can be archived for quite a long time, providing a long-term record of network
behavior.
But there's no such thing as a free lunch. NetFlow Analyzers
may not always be 100% accurate, since the flow data may come from sampled
packets rather than an analysis of each and every packet. NetFlow Analyzers also
create additional network traffic by moving flow records from the network device
to the NetFlow Collector, possibly impacting performance on an already busy
network. And NetFlow Analyzers can report on nothing more than the information
they can infer from the seven flow characteristics, making them excellent
network monitors but poor network analysis solutions because they often lack
the data to perform root-cause analysis once a network anomaly is detected.
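To see why sampled data may be inaccurate, consider a device that inspects one packet in every N and scales the result by N. The Python sketch below uses a deliberately contrived traffic pattern and strictly periodic sampling, both assumptions chosen to make the effect obvious; real traffic and real sampling schemes generally produce smaller errors. The estimate_bytes function is hypothetical.

```python
def estimate_bytes(packet_sizes, sample_every, offset=0):
    """Estimate total bytes from every Nth packet, scaled up by N."""
    sampled = packet_sizes[offset::sample_every]
    return sum(sampled) * sample_every

# Contrived pattern: one large packet followed by three small ones, repeated.
sizes = [1500, 60, 60, 60] * 25

true_total = sum(sizes)
overestimate = estimate_bytes(sizes, sample_every=4, offset=0)
underestimate = estimate_bytes(sizes, sample_every=4, offset=3)

print("true total:   ", true_total)
print("overestimate: ", overestimate)
print("underestimate:", underestimate)
```

With every packet inspected (sample_every=1) the estimate equals the true total; with 1-in-4 sampling, the result depends entirely on which packets happen to be picked.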
Network analysis systems that derive data from independent
interrogation of each and every packet, like the OmniPeek Distributed Analysis
Solution, provide all the data necessary not only for detailed network
reporting, but for advanced, root-cause analysis as well. No sampling, no need
to move data across the network for storage and analysis. All analysis is done
at the source, by tapping into a network device and processing all the data
locally.
Each system has its place, but when the time comes for
root-cause analysis, and it always does, a packet-based analysis solution like
the OmniPeek Distributed Analysis Solution is what you need.