In a previous blog post I explained that Intent Based Networking not only introduces new technology, but also requires a change in the way we operate network infrastructures. It is in fact a journey, but how do you get started on that journey with your existing network infrastructure? It’s impossible to throw away every piece of equipment and design and start greenfield.
And that is true: most of us working as system engineers, network designers and architects will face the challenge of beginning this journey on existing hardware, with existing designs and the well-trodden path of operating and troubleshooting networks.
Before I continue with my tip, I will first share a paradigm story about why network infrastructures are important to each and every organisation that exists today. And although any network engineer understands the process of encapsulation, decapsulation, and the 7 OSI layers of abstraction, users and managers do not. So I use this paradigm to explain why these abstract network
Every organisation can be seen as a building or a house, each uniquely built and designed. On the top of the building we see all the applications that are required for the business to be able to run. This could be your Office 365, Google Mail, an ERP system like SAP, but also your office applications and a fileserver to store all your documents. It doesn’t matter if these applications run
And at the first step of the building stands the employee or contractor who wants to get access to those applications. Here you can see the physical connection to the corporate network in the office building: the physical network cable that you plug into the wall, or the virtual network cable, using WiFi technology, that connects your computer to an access point.
All these cables, virtual and physical, are connected to switches, access points, wireless controllers and routers within a single branch office. With that I mean that there is some sort of connection locally to the physical location. But there is still no access to the applications.
And that is because there are four distinct pillars that build the logical connection from that connection hub in the branch location to the applications that generally live in the datacenter or the cloud.
The first pillar that is required is Dynamic Host Configuration Protocol, better known as DHCP. It is the process that makes sure that every device in your network can get the proper IP address configuration and other network related settings. This allows your computer to connect to all these different networks without you having to enter
The second pillar is DNS, or Domain Name System. There are only a few network engineers that know the IP addresses of the servers where applications are published by heart and can use those IP-
A third pillar is security, which is key to any proper and good modern day network infrastructure design. Network Access Control (who is allowed to do what with which device),
And last but not least, the fourth pillar is the logical connection from the branch office to the headquarters, either via a Virtual Private Network (tunnel) or via a circuit that a service provider delivers. It allows that local branch office to communicate with the datacenters.
These pillars, combined with the first two steps, form the network infrastructure or, in other words, the foundation of the enterprise. If a single one of these components is not working as expected, users cannot access those enterprise applications and the enterprise stops working. This paradigm is easy to draw on a whiteboard to explain the importance of networks and connections in any organisation.
And as most users are not aware of this foundation, they tend to blame all issues on the network. Here are some examples of the issues that come in to network operations very frequently:
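To make the "one broken component stops everything" point concrete, here is a minimal sketch of the building as an ordered chain of checks. The layer names and the stubbed check functions are my own illustration, not part of any product; in practice each check would be a real test such as a ping, a DHCP lease lookup or a DNS query.

```python
from typing import Callable, Dict, Optional

# Ordered from the user's desk towards the applications.
# These names are illustrative assumptions based on the two steps and four pillars.
LAYERS = [
    "physical or wireless link",
    "local switching",
    "DHCP",
    "DNS",
    "security",
    "branch-to-headquarters connection",
]


def first_broken_layer(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Walk the foundation bottom-up and return the first layer whose check fails.

    Returns None when every layer passes, meaning the foundation is intact
    and the problem is probably elsewhere (for example in the application).
    """
    for layer in LAYERS:
        if not checks[layer]():
            return layer
    return None


# Stubbed checks: everything works except DNS.
checks: Dict[str, Callable[[], bool]] = {layer: (lambda: True) for layer in LAYERS}
checks["DNS"] = lambda: False

print(first_broken_layer(checks))
```

Walking the layers in order matters: a failure low in the building makes every check above it meaningless, so the first failure is the one to report.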
- “I cannot access application X, I could access it yesterday”
- “Application Y is slow, it’s the network. I know that for sure, it’s that uplink we had some issues with 3 years ago”
- “I cannot connect to the wireless network, and it’s not my computer because it works at home”
- “I can’t logon to the computer, it’s the network’s fault”
- “Application Z is crashing, it must be the network that disconnects me”
And I can continue with this list of examples for a long time, but for this blog post it’s sufficient. Usually context is left out when users call the problem in: they forget to mention that they just bought a new HP laptop or updated Windows 10 (there are quite a few interesting behaviours with Windows 10, wireless and HP, by the way), that they changed their password recently, or that they forgot to read that application downtime notification message.
So what happens inside the average network operations team when a call comes in? We take our backpack of tools and utilities, respond to the incident and start investigating the issue. All network engineers have their own backpack of tools and utilities that they prefer to use. And we need to be thorough to be able to convince the user (and, if it takes too long, their manager and so on) that it is not the network but a different issue.
So the average network operations team is responding to incidents in the same manner as firefighters are responding to fire notifications or fire alarms that come in.
In summary all the operation staff is putting out fires (incidents) in the enterprise and everybody is busy with responding to incidents. And now, how can you change that for your journey to
Because the network engineers in your team are primarily responding to incidents, the result is a backlog in project work; they are buried up to their necks in work. And let’s be honest, if you’re so busy that every issue is important or a priority, there is absolutely no way that an engineer can take a step back, see from a distance what is happening, and consider which structural changes would be required to prevent incidents and, with that, improve the quality of the network and services.
And with that in mind, how can you reduce the work in such a way that the work load on the engineer is reduced so he can take a step back? Quite simple, by creating
And that is exactly my first tip: Gain visibility before control
With that I mean that before you can gain any control over your network or processes, you need to have visibility into what’s happening. And how do you accomplish this visibility? With Cisco DNA-Center in Assurance mode.
If you look at Cisco’s Digital Network Architecture (check out my perspective on blogs.cisco.com for more information), the Cisco DNA-Center (DNAC) appliance fulfils three distinct roles: a service management tool, an automation tool, and an analytics tool that obtains data from network devices. And the latter is precisely what you can use to gain visibility and, after that, control.
DNAC Assurance uses a specific architecture to combine different sources of information to provide the visibility. The diagram below provides an overview of that architecture.
Image courtesy of Cisco Systems.
In a future blog post I will write a bit more about the technical and configuration aspects of DNAC Assurance, but for now it is sufficient to know that DNAC Assurance uses SNMP, SNMP traps, Syslog, NetFlow (limited) and model-driven telemetry to get information from the network devices. DNAC then integrates other information it has (such as device configuration, AAA, etc.) and provides a single pane of view for DNAC Assurance.
That single pane of view, visible from the DNAC Assurance tab, is then split into two types of assurance: client assurance and network assurance. It uses roughly 100 advanced metrics (already 66 on wireless alone, in categories like onboarding, wireless and infrastructure) to analyse the received data and give the operator enough information to determine whether a problem is within the network, a client that misbehaves or just a DHCP server that is not responding. The configuration of areas can also help the operator to quickly see whether a problem affects a single client or a complete area.
As can be seen from the screenshot above, the locations in the Netherlands have a severe problem in the wireless network, whereas only a few wireless clients face a problem in the USA.
Within DNAC Assurance there are already a number of handy features that are worth exploring a bit more. To name a few:
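As a small illustration of what one of those sources carries, here is a sketch that decodes the priority field of a classic syslog message. The rule that the priority equals facility times 8 plus severity comes from the syslog standard; the sample message and the function name are invented for this example and say nothing about how DNAC itself processes syslog.

```python
import re
from typing import NamedTuple, Optional

# Severity names in numeric order 0..7, as defined by the syslog standard.
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]


class SyslogPriority(NamedTuple):
    facility: int
    severity: int
    severity_name: str
    message: str


_PRI = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_priority(line: str) -> Optional[SyslogPriority]:
    """Split '<PRI>rest' into facility, severity and the remaining message."""
    match = _PRI.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid value: facility 23, severity 7
        return None
    facility, severity = divmod(pri, 8)
    return SyslogPriority(facility, severity, SEVERITIES[severity], match.group(2).strip())


# Invented sample line; facility 23 is local7, severity 5 is notice.
sample = "<189>sw1 example: uplink interface changed state"
print(parse_priority(sample))
```

Filtering on severity before anything else is a cheap way to separate the handful of messages worth a look from the background noise.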
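The difference between "a few clients" and "a whole location" comes down to simple aggregation. The sketch below is my own illustration of that idea with made-up client records and an arbitrary 50% threshold; it is not how DNAC computes its health scores.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each record is (site, client id, healthy?). The data below is made up.
ClientRecord = Tuple[str, str, bool]


def unhealthy_ratio_per_site(clients: Iterable[ClientRecord]) -> Dict[str, float]:
    """Return the fraction of unhealthy clients for every site."""
    total: Dict[str, int] = defaultdict(int)
    unhealthy: Dict[str, int] = defaultdict(int)
    for site, _client_id, healthy in clients:
        total[site] += 1
        if not healthy:
            unhealthy[site] += 1
    return {site: unhealthy[site] / total[site] for site in total}


def classify(ratio: float, site_threshold: float = 0.5) -> str:
    """Label a site; the threshold is an assumption chosen for this example."""
    if ratio == 0:
        return "healthy"
    if ratio >= site_threshold:
        return "site-wide problem"
    return "individual clients"


clients = [
    ("Netherlands", "nl-1", False), ("Netherlands", "nl-2", False),
    ("Netherlands", "nl-3", False), ("Netherlands", "nl-4", True),
    ("USA", "us-1", True), ("USA", "us-2", True),
    ("USA", "us-3", False), ("USA", "us-4", True),
]

for site, ratio in unhealthy_ratio_per_site(clients).items():
    print(site, ratio, classify(ratio))
```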
- Client 360
With Client 360 it is possible to look at the network from the client perspective. What did the client see, which AP was the client joined to and what other APs were visible? How was the IP address obtained, etc.
- Device 360
A tool similar to Client 360, but from the device perspective. Which clients are connected, what do the routing tables look like, and who are the device’s neighbours?
- Path trace
With Path Trace it is possible to check, based on configuration and running information, what path a specific packet from a client to a server (or vice versa) is taking. Path Trace also takes into account CAPWAP tunnels, the underlying infrastructure, etc. It provides a nice graphical overview of how a packet would flow across the network.
- Time machine
This is a really powerful feature. As DNAC saves the raw telemetry for some time, it is possible for you to go back in time and check how the network was behaving around the time that the user was experiencing a problem. It can really give you insight into what happened.
- Apple fast-track
As Cisco and Apple work together, there is now also a feature called fast-track in Wireless. With iOS 11 devices, specific wireless information (like why a client disconnected from the wireless network) is shared with the network. These extra bits of context and info can give the operator more insight into potential network-related problems.
In summary, DNAC Assurance provides the operator with more insights, context and proposed actions to determine faster whether it is really a network problem and, if so, some suggestions are provided to solve the problem. That way, the operations team will get more time available, which is needed for your journey to software defined networking.
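The time machine idea from the list above boils down to keeping timestamped observations and pulling out the ones around the moment the user complained. Here is a minimal sketch of that lookup, assuming the samples are kept sorted by time; the data and the function name are hypothetical and this is not a DNAC API.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

# (unix timestamp in seconds, observation). The values are made up.
Sample = Tuple[float, str]


def window(samples: List[Sample], around: float, seconds: float) -> List[Sample]:
    """Return all samples within +/- `seconds` of `around`.

    `samples` must already be sorted by timestamp.
    """
    times = [timestamp for timestamp, _ in samples]
    low = bisect_left(times, around - seconds)
    high = bisect_right(times, around + seconds)
    return samples[low:high]


history: List[Sample] = [
    (100.0, "client associated"),
    (160.0, "all healthy"),
    (220.0, "access point rebooted"),
    (280.0, "client rejoined"),
    (400.0, "all healthy"),
]

# The user reports a problem around t=250; look one minute either side.
for timestamp, observation in window(history, around=250.0, seconds=60.0):
    print(timestamp, observation)
```

Because the history is sorted, the binary search finds both edges of the window without scanning the samples one by one.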