Many people are pitting the CCIE against network automation as if it were a crossroads where you have to make a “once in a lifetime” decision and then live with the consequences. One path leads to misery and the other to people rejoicing and singing kumbaya around a campfire with a million dollars in the bank.
After explaining my view of this “debate” maybe someone will go: “It’s easy for Daniel to have an opinion like this. He is already certified at the expert level and doesn’t have to make a choice between the two in his career”. Well, thank you. The view is astounding up here in the ivory tower. We still have a few rooms to spare… Let me explain why it’s not a choice between the two.
Knowledge still matters
You can’t automate what you don’t know. I’m not suggesting that everyone needs to be a CCIE or that the only path to expert level knowledge is the CCIE, but you can’t automate what you don’t know. If you don’t know how the protocols work, how can you verify? How can you know when you have found an anomaly? How do you handle the exceptions? It makes sense to start with the low-hanging fruit and automate all of the simple things. That might get you 80% of the way. For the remaining 20% you have to decide whether it’s worth automating, depending on how complex it is.
So let’s say that you have two BGP peerings with two different providers. Someone with CCNA level skills could probably get the peerings up and running, perhaps with a little help from a colleague or by just googling some stuff. The peerings are up and running just fine. All traffic is flowing as expected through Provider 1.
At a later stage, one of the peers starts flapping. Where is the problem coming from? Physical layer? BGP? Maybe the provider’s router was updated with new software and it’s sending a malformed update, or the TCP authentication is off so the peering can’t come up. Maybe the provider starts sending more or fewer prefixes than normal. How would you detect this? Maybe traffic shifts to the secondary provider. Would you notice? Maybe you did maintenance on your end and shifted traffic to the secondary provider but after the maintenance the traffic won’t shift back. What happened? (Google “BGP wedgie”.)
The point here is that automating the configuration is easy but the person creating the templates still needs knowledge of BGP. Someone still needs knowledge of how BGP operates. Someone needs to understand BGP communities, attributes and traffic engineering. This does not go away with automation.
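To make the detection questions above a bit more concrete, here is a minimal sketch in Python of a check that flags a peer that is not established or whose received prefix count drifts away from a known baseline. Everything in it is an assumption for illustration: the PeerSample fields, the check_peer function, the 20% tolerance and the peer names are hypothetical, and how you actually collect the state and prefix counts (CLI scraping, SNMP, an API) is left out on purpose. Note that choosing a sensible baseline and tolerance is exactly the part that requires BGP knowledge.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PeerSample:
    """One observation of a BGP peer (hypothetical, collected by your own tooling)."""
    peer: str
    state: str               # e.g. "Established", "Idle", "Active"
    prefixes_received: int


def check_peer(sample: PeerSample, baseline: int, tolerance: float = 0.2) -> list[str]:
    """Return human-readable warnings for a single peer sample.

    baseline is the prefix count you normally expect from this peer and
    tolerance is the allowed relative deviation (0.2 means 20%).
    """
    warnings = []
    if sample.state != "Established":
        # A session that is down makes the prefix count meaningless.
        warnings.append(f"{sample.peer}: session is {sample.state}")
        return warnings
    if baseline > 0:
        deviation = abs(sample.prefixes_received - baseline) / baseline
        if deviation > tolerance:
            warnings.append(
                f"{sample.peer}: received {sample.prefixes_received} prefixes, "
                f"expected about {baseline}"
            )
    return warnings


if __name__ == "__main__":
    samples = [
        PeerSample("provider1", "Established", 500),
        PeerSample("provider2", "Idle", 0),
    ]
    for sample in samples:
        for warning in check_peer(sample, baseline=1000):
            print(warning)
```

A check like this only tells you that something changed. Working out whether the cause is the physical layer, authentication or a policy change on the provider side is still up to someone who understands the protocol.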
Certifications are a learning path
Most people overestimate how much of the knowledge in the CCIE is vendor specific. Most of the knowledge can be used on any vendor’s equipment. Sure, you spend a lot of time learning the CLI, knobs and behavior of that vendor’s equipment, but if you think that’s all you learn then you either haven’t tried the CCIE or you took the wrong approach while studying for it (sorry). The CCIE is about developing an expert level understanding of protocols and how to implement them. Knowing the CLI is just a byproduct of this. Personally I made sure I spent a lot of time studying TCP/IP, the history of Ethernet, messing around with STP and how BPDUs get formatted depending on what type of link is in use etc. I followed the blueprint but I went deeper where I felt necessary and I allowed myself to “mess around” outside of the blueprint to learn things that I expected someone at the expert level should know. Yes, it took me a lot longer than some people to pass the thing but in the end, who cares? I got the job done and learned a ton by doing so. This knowledge helps me in everything new I study as I have learned how to study efficiently and I have a good understanding of protocols and algorithms.
Automation today vs automation tomorrow
Many people argue that the “normal” networking knowledge is becoming obsolete. They argue about vendor CLI and implementations but at the same time have no problem putting all of their eggs in the baskets of Ansible, Chef, Puppet etc. Going open source is not the same as not having any lock-in. You always have some form of lock-in when you learn a product and develop all of your tooling around that product. I don’t expect that Python and Ansible will be the main automation tools forever. That doesn’t mean that the knowledge is lost, just like the knowledge from learning older networking protocols was not a waste. In the end we will see more and more vendor solutions that are the “total package”. Think Cisco SDA. Think Apstra. These are “turn key” solutions where you don’t need to have Python and Ansible skills to run the products. That doesn’t mean that you can’t leverage that knowledge to extend these products but it’s not going to be required to have those skills to operate them. That’s why I don’t believe in the saying “every neteng must become a coder”. It’s still useful knowledge. It’s just not how I see the market turning out.
Let experts do what they do best
If you read the interview I did with Ivan recently, you will know that he doesn’t think that every neteng has to become a coder. Quoting Ivan: “there are other people out there that are way better programmers than you are, so focus on what you’re doing best (= networking) and let other experts do what they do best.” This doesn’t mean that you can’t help by developing pseudo code, writing proofs of concept or writing some scripts. However, when you put things into production, if you wrote it, you are responsible for it. That means that you have to take responsibility for fixing bugs, developing the code, adding new features etc. Not everyone enjoys this part of coding, “the grind”. They are more into creating things and the amazement you get when you see something running and it works. It’s an entirely different story to be responsible for the quality of the code, testing it and having to live with the code for a long time (code never dies). So in some cases it’s best to leave the coding to the experts. It all depends on the size and structure of your organization.
You thought deleting the wrong VLAN on one switch stack was bad? How about doing it on 100 switches at the same time? That could be disastrous and a resume generating event. When automating things the blast radius is much larger because it’s so much easier and faster to deploy things than when you enter all of the commands manually or copy and paste them in. When you do it manually you can often notice when something goes bad, maybe the TTY hangs or you catch something in the logs. When deploying the change en masse you aren’t logged in yourself so you might not catch what’s on the TTY or in the logs. Of course you should do testing but how many have a testing environment that matches their production? A typo or the wrong command can have a huge impact when the blast radius is so large. For this reason it’s even more important to have someone knowledgeable who understands what needs to be done in a change, the risk of implementing the change and how the change can be rolled back if needed. That person also needs to understand how errors can be detected, what data needs to be gathered in that case and how to recover gracefully (if possible). This requires expertise in protocols, implementation and possibly CLI (if not using APIs). This all comes back to the point above regarding “you can’t automate what you don’t know”. Even if there isn’t an exact testing replica of production, large changes should be deployed in smaller pockets of the network first. These should be selected based on having the least impact on the organization if something goes wrong.
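The idea of deploying in smaller pockets first can be sketched roughly like this in Python. This is only an illustration under stated assumptions: staged_rollout, apply_change and is_healthy are hypothetical names, the device names are made up, and the real work (pushing the change and deciding what “healthy” means for it) is hidden behind the two callables you would have to write yourself.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def staged_rollout(
    batches: Sequence[Sequence[str]],
    apply_change: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> Tuple[List[str], Optional[str]]:
    """Apply a change one batch at a time and stop at the first unhealthy device.

    batches should be ordered from least to most impact on the organization.
    Returns the devices that were changed and the first device that failed
    its health check (None if every batch passed).
    """
    changed: List[str] = []
    for batch in batches:
        for device in batch:
            apply_change(device)
            changed.append(device)
        # Verify the whole batch before touching anything more critical.
        for device in batch:
            if not is_healthy(device):
                return changed, device
    return changed, None


if __name__ == "__main__":
    # Hypothetical pockets: lab first, then a small branch, then the core.
    batches = [["lab-sw1"], ["branch-sw1", "branch-sw2"], ["core-sw1", "core-sw2"]]
    applied: List[str] = []
    changed, failed = staged_rollout(
        batches,
        apply_change=applied.append,                  # stand-in for pushing config
        is_healthy=lambda device: device != "branch-sw2",  # simulated failure
    )
    print("changed:", changed)
    print("stopped at:", failed)
```

In this simulated run the core switches are never touched because the branch pocket fails its check first. The sketch deliberately does not roll anything back, since how to recover depends on the change and is a decision for someone who understands it.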
It’s not CCIE or automation. It’s CCIE and automation. I’m not saying everyone needs expert level knowledge. People with CCNA or CCNP level knowledge can become great netengs, and combined with automation skills they will become really attractive and successful in the market. There’s still going to be a demand for experts though. Maybe not as many as today because implementation will not be the main factor, knowledge will be. You have to decide if you want to be the person that goes super deep and enjoys knowing the protocols inside out or if you are satisfied with knowing enough to work more with automation and looking up details as you go. Having a CCIE and automation skills will of course make you super attractive in the market. We also have to remember that not all organizations can or are willing to automate at this time for various reasons such as organizational structure, costs or fear of what’s new. There’s nothing wrong with enjoying working on the CLI. If that’s what you like, then do so. It might not last you a lifetime but hey, people are still doing COBOL… So don’t think that knowledge doesn’t matter. It does. More than ever. There are new job roles though and the market is changing (as it always does) so go for the thing that is most rewarding and interesting to YOU. Don’t forget about the fundamentals though… Or you might end up repeating mistakes of the past and blasting yourself out of orbit 🙂 Don’t be afraid to go for an expert level certification if you want to and if you don’t want to, that’s fine… Good luck!