Automation isn't as hands off as you think
If you’ve ever taken advantage of automation, you know just how much it can help ease your workload. But if you’re new to automating, what you might not yet know is the disaster it can also create if left unattended. Let’s take a few minutes to talk about why this happens and how you can avoid it.
But before we get started, and since I’m about to shine a light on the dark side of automation, I think it’s important to share with you that I’ve staked my brand on automation—so much so that I’ve been referred to as “Adam, the Automator.” I blog about automation. I’ve written magazine articles on automation and have authored Pluralsight courses on PowerShell in which I show how to automate things. Yet here I am, ready to tell you that it’s great—until it’s not.
Automation: A quick overview
At its core, automation simply means creating a hands-off task that behaves in a standardized manner; it’s intended to replace a task or series of tasks that would otherwise take much longer. Sounds wonderful, right? It certainly can be if you put in the time and effort to automate an insanely complicated task so that it becomes predictable and takes far less time. But let’s back up a minute to that whole “hands-off” thing.
To be hands-off means having no interaction. It implies autonomy, less control, robots...ah! All joking aside, automation that is left unchecked can cause serious problems. I’ve got a firsthand example I’d like to share to show you exactly what I mean.
Automation gone wrong
As an IT professional, I managed a System Center Configuration Manager (SCCM) environment for a healthcare organization. Let me give you an example of a situation where automation went awry during a routine software deployment.
SCCM has a method that allows you to deploy software to one, 10 or 10,000 machines at once. It’s a master of getting software installed on thousands of machines within minutes. Software deployments are major time savers and, in this case, allowed just a couple of people to manage 5,000 desktops. That wouldn’t have been possible without a tool like SCCM, so what went wrong? To be frank, I got too comfortable.
One night, after I had spent all day packaging up a new piece of software that needed to be rolled out to the entire enterprise, the time came to push it out to all 5,000 desktops. I believed it was a routine software install. It was a simple MSI that, during testing, took only a few seconds to install and completed successfully every time. I wasn’t too concerned about this little MSI. I had pushed out much larger packages before, but what I didn’t realize was that size doesn’t actually matter.
The clock struck 3 a.m. and it was maintenance window time. Being on top of things already, I had pre-staged all of the clients and they were all installing at 3 a.m. sharp. The installs went great…until I noticed the helpdesk starting to light up. People were calling about seemingly random problems with all kinds of different applications. This was a hospital, so there were still plenty of people working at that hour, and they weren’t happy. After pinpointing the problem, I noticed that the deployment had accidentally been targeted to servers as well, which was not my intention. Even so, that would have been OK, but a particular file was in use on these servers and the MSI decided to initiate a reboot!
I tried to stop things, but it was too late. I had rebooted nearly 75 percent of the 700 or so servers in the data center. As you might imagine, it was not a good day at all. The next day I explained to management exactly what had happened, and that it was completely my fault. I was lucky enough to work for an understanding client, and I got to keep my job. This was a case of automation’s bad side.
Automation requires respect
While being able to install the software so quickly was great, what I failed to realize was this: Once that ball starts rolling, there’s no stopping it. A wise man once told me that automation is the only way to fail at scale, and it’s the most truthful statement I’ve heard in a long time. If you don’t treat automation with respect, it can turn a situation that might be an inconvenience into an all-out three-alarm emergency!
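As a rough illustration of what that respect could look like in practice, here is a minimal Python sketch of a pre-flight check that refuses to proceed when a target list contains anything other than desktops, plus a small pilot wave before the full rollout. It is not SCCM’s API and not what I ran that night; `Machine`, `preflight`, `rollout_waves` and the `role` field are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    role: str  # hypothetical inventory field: "desktop" or "server"


def preflight(targets, allowed_roles=frozenset({"desktop"})):
    """Return the targets as a list, or raise if any has an unexpected role."""
    unexpected = [m for m in targets if m.role not in allowed_roles]
    if unexpected:
        sample = ", ".join(m.name for m in unexpected[:5])
        raise ValueError(
            f"{len(unexpected)} target(s) outside {sorted(allowed_roles)}, e.g. {sample}"
        )
    return list(targets)


def rollout_waves(targets, pilot_size=10):
    """Split targets into a small pilot wave and the remainder."""
    targets = list(targets)
    return targets[:pilot_size], targets[pilot_size:]


if __name__ == "__main__":
    fleet = [Machine(f"PC{i:04d}", "desktop") for i in range(50)]
    fleet.append(Machine("SRV0001", "server"))  # the kind of mistake to catch
    try:
        preflight(fleet)
    except ValueError as err:
        print(f"Deployment blocked: {err}")
```

The idea is simply that the check fails loudly before anything is pushed, and that a pilot wave gives you a chance to notice an unexpected reboot on ten machines rather than on hundreds.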
Like I said, I’ve built my brand around automation and the last thing I want to do is scare you into not automating. If there’s anything I hope you’ll take away from my own automation emergency, it’s this: Treat automation with respect. This involves shifting your mindset away from single, one-off events and toward managing workloads en masse. It’s a completely different way of thinking and one that can certainly help you avoid the pitfall of 500+ rebooted servers and hundreds of angry customers.