The network is like a circulatory system, with packets as the blood cells that carry markers indicating when one of the subsystems isn’t doing what it’s supposed to be doing. In the case of a network, the symptoms seen by the broader organization are typically slow response times or interruptions in accessibility.
The best technology solutions allow network administrators to rein in their networks and maximize uptime by providing definitive visibility coupled with an efficient workflow. This coupling requires tight integration between monitoring and troubleshooting capabilities. A clear, graphical presentation of monitoring data is an ideal starting point, but the real value lies in the ability to drill down to lower levels of detail when difficult problems demand it. Further, real-time information must be paired with a sufficiently rich historical data store. When this is done, the conditions that led up to an incident can be included in the analysis, so intermittent problems can be found and studied in sufficient detail without waiting for a recurrence before analysis and resolution can begin.
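As a rough illustration of pairing real-time data with a historical store, the sketch below keeps recent response-time samples in a bounded buffer and, once a threshold is crossed, pulls out the samples that led up to the incident. The class name `MetricHistory`, the helper `first_breach`, the 100 ms threshold, and the sample values are all hypothetical, chosen only for this example; a real monitoring product would use its own storage and alerting.

```python
from collections import deque


class MetricHistory:
    """Bounded store of (timestamp, value) samples (hypothetical helper)."""

    def __init__(self, max_samples=1000):
        # deque with maxlen silently discards the oldest sample when full
        self.samples = deque(maxlen=max_samples)

    def record(self, timestamp, value):
        self.samples.append((timestamp, value))

    def context_before(self, incident_time, window_seconds):
        """Return samples in the window leading up to (and including) the incident."""
        start = incident_time - window_seconds
        return [(t, v) for t, v in self.samples if start <= t <= incident_time]


def first_breach(history, threshold):
    """Timestamp of the first sample above the threshold, or None."""
    for t, v in history.samples:
        if v > threshold:
            return t
    return None


# Made-up response times in milliseconds, one sample per minute
history = MetricHistory()
for i, value in enumerate([20, 22, 21, 35, 60, 140, 25]):
    history.record(i * 60, value)

incident = first_breach(history, threshold=100)          # assumed 100 ms limit
context = history.context_before(incident, window_seconds=180)
print(incident, context)
```

With the made-up data, the context window shows response times climbing over the three minutes before the breach, which is the kind of lead-up that would otherwise be lost with real-time data alone.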
There are many technology options for performance monitoring and management. Some start with high-level data, typically traffic volume or quality metrics, and stay there. The result is grand vistas of the managed environment, but limited help with where to turn when performance problems occur. Others start with the lowest level of data, such as network packet traces, and stay there. The result is that you often get everything you need to analyze a problem, but only if you happen to be capturing a trace in the right place when it occurs. The trick is to find a solution that provides enough detail to act on, while still letting you see the larger managed network environment and how it all fits together.
The next consideration when planning your network monitoring system is comprehensive coverage: both breadth and depth are needed to deliver the best possible value. A performance monitoring architecture with adequate breadth must draw data from multiple points across the service delivery infrastructure. In topological terms, this means establishing measurement points in the core, distribution, and access layers. In architectural terms, this means instrumenting data centers, WAN provider edge points, internet and customer connection points, and branch facilities. Within core networks, and in particular within data centers, solutions must be able to support very high-capacity technologies, including Gigabit and 10 Gigabit Ethernet.
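One simple way to reason about breadth is to compare an inventory of measurement points against the layers and locations listed above. The sketch below does this with set differences; the inventory format, the probe names, and the label strings (such as `wan_edge`) are assumptions made for illustration, not part of any particular product.

```python
# Layers and locations named in the text; the label strings are hypothetical
REQUIRED_LAYERS = {"core", "distribution", "access"}
REQUIRED_SITES = {"data_center", "wan_edge", "internet_edge",
                  "customer_edge", "branch"}

# Hypothetical inventory of deployed measurement points
probes = [
    {"name": "probe-dc-1", "layer": "core", "site": "data_center"},
    {"name": "probe-wan-1", "layer": "distribution", "site": "wan_edge"},
    {"name": "probe-inet-1", "layer": "distribution", "site": "internet_edge"},
]


def coverage_gaps(probes):
    """Return (missing_layers, missing_sites) as sorted lists."""
    layers = {p["layer"] for p in probes}
    sites = {p["site"] for p in probes}
    return sorted(REQUIRED_LAYERS - layers), sorted(REQUIRED_SITES - sites)


missing_layers, missing_sites = coverage_gaps(probes)
print("Uninstrumented layers:", missing_layers)
print("Uninstrumented sites:", missing_sites)
```

In this made-up inventory the access layer, branch facilities, and customer connection points have no measurement point, so problems originating there would go unobserved.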
Wireless access networks deserve special mention here. With security concerns steadily declining due to improvements in technology and practice, wireless network access is becoming the norm. What this means for the average operations team is that they need to become savvy in managing performance within the wireless realm. Wireless poses challenges for organizations that have tools for rolling out and administering access points but no means of troubleshooting issues that occur within this new access layer. The answer is to find products designed to bring to the wireless realm the same network performance viewpoint that is readily available in the wired world today. Ideally, those tools should be the same ones you are using on the wired side of your networks, so there is no discontinuity or learning curve when moving from one domain to the other.
Lastly, it is essential to adopt performance monitoring and management strategies that include every type of traffic present within the service delivery infrastructure at any point in time. This means not just Web traffic, but also file transfers, routing protocol updates, IP voice traffic, client-server traffic, database queries, transactions, video streaming, and protocols specific to industry verticals, such as financial trading or utility infrastructure controls. Without such comprehensive views, the interactions between network-attached devices and applications cannot be fully understood, and the ability to troubleshoot performance problems is reduced accordingly.
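To make the point about inclusiveness concrete, the sketch below tallies bytes per traffic class from a list of flow records and reports how much traffic falls outside the known classes. Classifying by destination port alone is a deliberate simplification (real tools typically inspect much more than the port), and the flow tuples, class names, and byte counts are made up for illustration. The port numbers used are the standard assignments for HTTP, HTTPS, FTP, BGP, SIP, MySQL, and RTSP.

```python
from collections import Counter

# Port-to-class map; class names are hypothetical labels for this example
PORT_CLASSES = {
    80: "web", 443: "web",        # HTTP, HTTPS
    21: "file_transfer",          # FTP control
    179: "routing",               # BGP
    5060: "voice_signaling",      # SIP
    3306: "database",             # MySQL
    554: "video_streaming",       # RTSP
}


def traffic_mix(flows):
    """Sum bytes per traffic class; unknown ports count as 'unclassified'."""
    totals = Counter()
    for port, nbytes in flows:
        totals[PORT_CLASSES.get(port, "unclassified")] += nbytes
    return dict(totals)


# Made-up flow records: (destination port, bytes)
flows = [(443, 5000), (80, 1000), (179, 200),
         (5060, 300), (3306, 1500), (9999, 2000)]

mix = traffic_mix(flows)
unclassified_share = mix.get("unclassified", 0) / sum(mix.values())
print(mix)
print(f"Unclassified share: {unclassified_share:.0%}")
```

A large unclassified share is a sign that part of the traffic, perhaps a vertical-specific protocol, is invisible to the monitoring view and cannot be accounted for during troubleshooting.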