Before you automate: Things to consider before you decide on letting your machine do it

A significant chunk of my work involves automation using PowerShell. Speaking as an infrastructure specialist, I can say that automation has gained a lot of traction. Automation has several benefits, such as:

Reduction in human errors.
Higher speed.
Higher efficiency.
Reduction in effort.
No boring tasks.

This is all good and shiny. However, can everything be automated?

Here are some things you should consider before you automate anything, whether it is an odd user account termination process, or disaster recovery of your datacenter.

Standardisation

Automation is a process in which you make machines (or software) do things for you. Machines and software are not inherently smart. They do tasks as you instruct them to. For instance, consider a simple scenario:

You are part of the IT environment in a small advertising agency. Your agency has some fifty employees. Your environment does not have its own email domain, and therefore, everyone creates a mailbox with an email provider of their choice (Gmail, Outlook, AOL, Yahoo, etc.), and uses it for business. The provider list is not restricted to the aforementioned providers; let’s say a tech-savvy person in your agency has his own email domain, and hosts his own email. Can you create an account termination process wherein you remove the mailbox associated with the terminated user?

It’s not impossible. You can write a script that makes API calls to the email providers that allow such a thing, but as the diversity of email providers increases, the code will keep getting longer, or even more complicated.
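A rough sketch of how that growth looks. The provider names come from the scenario above; the function name `Get-MailboxRemovalStep` and the step descriptions it returns are hypothetical placeholders, not real provider APIs.

```powershell
# Hypothetical sketch: every provider needs its own branch, and anything
# unrecognised falls back to a manual step.
function Get-MailboxRemovalStep {
    param(
        [Parameter(Mandatory)]
        [string]$EmailAddress
    )

    # Treat the part after the last '@' as the provider's domain.
    $domain = ($EmailAddress -split '@')[-1].ToLowerInvariant()

    switch ($domain) {
        'gmail.com'   { 'Call the (hypothetical) Gmail removal routine' }
        'outlook.com' { 'Call the (hypothetical) Outlook.com removal routine' }
        'yahoo.com'   { 'Call the (hypothetical) Yahoo removal routine' }
        'aol.com'     { 'Call the (hypothetical) AOL removal routine' }
        default       { "Manual step: no routine exists for '$domain'" }
    }
}

Get-MailboxRemovalStep -EmailAddress 'designer@gmail.com'
Get-MailboxRemovalStep -EmailAddress 'admin@selfhosted.example'
```

Every new provider adds a branch, and the self-hosted mailbox still ends up as a manual step.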

Therefore, one of the most important requirements for automation is standardisation. If your process isn’t standardised, automating it may be possible, but may not be fruitful.

More than anything else, automating a non-standardised process is likely to introduce more errors than the manual process would, and could call for roughly triple the time in troubleshooting the issues and rolling back actions. You do not want to do this, your employer does not want to pay for it, and your end-users do not want to wait for it.

Effort comparison

This is another aspect of paramount importance. Here is a story:

The good candidate

In the past, I worked on an inventory audit report that took eight hours every quarter to produce manually. The logic was rather straightforward, involving VMware vSphere, Excel and Active Directory. The process, however, went something like this:

The engineer logs into the different vCenters, and:
1. Lists out only those servers that are powered on.
2. Copies the table, pastes it in Excel.
3. Copies the server names alone and pastes the list in a text file.
The engineer then runs a VBScript, which:
1. Connects to Active Directory and extracts a list of all servers.
2. Opens the Server Inventory (an Excel workbook) and gets data from there.
3. Reads the input file created with the servers from vSphere.
4. Creates four lists with confusing names, by comparing the three input lists.
5. Attaches these lists to an email and sends the email to the engineer.
The engineer then parks his rear in his chair for the next eight hours and compares the lists, performs some manual actions, cleans up things, adds notes, and at the end of his shift, is finished creating the audit report.

This audit report is then sent to the Infrastructure Management team, who then make modifications to the environment by cleaning up servers or making entries in any or all of the three systems in question.

In a nutshell, eight hours were being spent every quarter on only creating the inventory audit report. That is thirty-two hours in a year, without counting the “internal review”, which involved another four hours per quarter, spent by another pair of eyes to check and ensure that there are no manual errors—in all, forty-eight hours in a year.

This was an ideal candidate for automation.

We spent two shifts building the logic, writing the script, thoroughly testing it (unit testing, integration testing and system testing), and finally, implementing the change. After the script was deployed, this is what the engineer had to do:

Run the script.
Go, get a cup of coffee.

Seven minutes per run, compared to twelve hours per run (eight for the report and four for the review). The script can be scheduled on a server, reducing manual effort to almost zero and making the total run time immaterial. Sixteen hours were spent creating the script. Had this not been automated, 144 hours would have been spent in the last three years; the automation saved 128 hours. Over the next five years, the cumulative saving would reach 368 hours.

In other words, 46 man-days (at eight hours a day) would be saved in eight years.
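The arithmetic above can be checked with a few lines of PowerShell. This is a minimal sketch: the function name `Get-AutomationSaving` and its parameter names are made up for illustration, and it treats the seven-minute run time as immaterial, as the text does.

```powershell
# Hypothetical helper: compares cumulative manual effort with the one-off
# cost of building the automation.
function Get-AutomationSaving {
    param(
        [double]$ManualHoursPerRun,
        [int]$RunsPerYear,
        [double]$BuildHours,
        [int]$Years,
        [double]$HoursPerWorkday = 8
    )

    $manualHours = $ManualHoursPerRun * $RunsPerYear * $Years
    $savedHours  = $manualHours - $BuildHours

    [pscustomobject]@{
        Years         = $Years
        ManualHours   = $manualHours
        SavedHours    = $savedHours
        SavedWorkdays = $savedHours / $HoursPerWorkday
    }
}

# Twelve hours per run, four runs a year, sixteen hours to build.
3, 8 | ForEach-Object {
    Get-AutomationSaving -ManualHoursPerRun 12 -RunsPerYear 4 -BuildHours 16 -Years $_
}
# Years 3 -> ManualHours 144, SavedHours 128, SavedWorkdays 16
# Years 8 -> ManualHours 384, SavedHours 368, SavedWorkdays 46
```

Running the same numbers for a candidate task before writing any code is a quick way to see whether the automation will ever pay for itself.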

The flip side

I also see potential candidates for automation that involve simple tasks, such as mapping another user’s mailbox in Outlook. Why would you need a script for this? If you are using Exchange 2010 or later, all you would have to do is:

Add-MailboxPermission -Identity SOMEUSER -User REQUESTOR -AccessRights FullAccess

There. If REQUESTOR has Outlook 2010 or later, the mailbox would automatically be mapped to his profile. But that was the catch: the environment that requested this primarily uses Outlook 2007.

‘Then, why not create a script for the mapping part?’ I was asked.

‘Very well, how long does it take to map a user’s mailbox?’
‘About thirty seconds.’
‘How many such requests do you get in a year?’
‘About twenty, I guess.’

Ten minutes in a year. That is the time that this team would spend mapping mailboxes in Outlook.

‘Any idea on how long you plan to use Office 2007?’
‘Another nine months, I guess; we plan to move to Office 365.’

So, I would have to research Office hooks in .NET (PowerShell is not built for Office tasks), script the necessary calls, maybe throw in the mailbox mapping cmdlet as well (because, hey, might as well), test the script with at least a couple of users, and deliver it to the team. The team would then spend some time telling the client this is being done, create a slide deck, hold a meeting, create a change request, implement the change …

Compare this with 450 seconds: about fifteen requests over the remaining nine months, at thirty seconds each. Worth it?

Case closed.

Code reuse

Always write code that can be reused. I see many PowerShell scripts out in the world that are pretty much monoliths. Monoliths are hard to manage, hard to troubleshoot, and hard to reuse.

Write PowerShell scripts as functions; use PowerShell as a programming language. Understand that PowerShell outputs objects. Therefore, if you would like the script to do four things, write the script as four functions. Chain the functions together to achieve your end goal. This way, if you write another script that uses one of the four capabilities, you could simply call this function.
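A minimal sketch of that idea, using made-up sample data in place of a real inventory source. All three function names and the `Name`/`PoweredOn` properties are hypothetical; the point is only that each function does one thing and hands objects to the next.

```powershell
# Hypothetical stand-in for a real inventory query; returns objects, not text.
function Get-ServerRecord {
    [pscustomobject]@{ Name = 'web01'; PoweredOn = $true }
    [pscustomobject]@{ Name = 'db01';  PoweredOn = $false }
    [pscustomobject]@{ Name = 'app01'; PoweredOn = $true }
}

# Passes through only the servers that are powered on.
function Select-PoweredOnServer {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [pscustomobject]$Server
    )
    process {
        if ($Server.PoweredOn) { $Server }
    }
}

# Turns a server object into one line of report text.
function ConvertTo-AuditLine {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [pscustomobject]$Server
    )
    process {
        '{0} is powered on' -f $Server.Name
    }
}

# Chain the pieces; any one of them can be reused in another script.
Get-ServerRecord | Select-PoweredOnServer | ConvertTo-AuditLine
```

A different report that only needs the powered-on filter can call `Select-PoweredOnServer` on its own, without touching the other two functions.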

A suggestion that I usually give is to wrap all the functions you create into a module. This way, you would have to write every function only once, with no need to copy-paste, and could call them as necessary in other functions that require them.
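As a small illustration, here is an in-memory module built with `New-Module`. In practice the same body would normally live in a `.psm1` file that you import; the module name `AuditTools` and both function names are hypothetical.

```powershell
# A dynamic, in-memory module. On disk, this script block's contents
# would be the body of a .psm1 file.
$null = New-Module -Name AuditTools -ScriptBlock {
    # Private helper: not exported, so callers cannot come to depend on it.
    function Format-Name {
        param([string]$Name)
        $Name.Trim().ToLowerInvariant()
    }

    # Public function: the only thing the module exposes.
    function Get-NormalisedServerName {
        param(
            [Parameter(Mandatory, ValueFromPipeline)]
            [string]$Name
        )
        process { Format-Name -Name $Name }
    }

    Export-ModuleMember -Function Get-NormalisedServerName
}

' WEB01 ', 'Db01' | Get-NormalisedServerName
```

Exporting only the public function keeps the helper free to change without breaking the scripts that use the module.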

Scalability

Write scripts that are scalable. I will write another post on a solution I implemented for a client, which involved updating a certain all-company distribution list using a feed from Oracle Identity Manager. We first created this for one group. The management liked the solution, and wanted it expanded to ten more distribution groups, which were location-specific, based on a certain filter parameter available in the OIM feed. We had not known the client would ask for this, but we had built the script the way we always build them.

Had the solution not been built with scalability in mind, we would have ended up adding at least one more script to the inventory, not to mention the effort of writing it (now with new logic), followed by testing, presenting, implementing and so on. But no; only some fifteen lines were added to the script to scale it, most of them in the param() block.

Give some thought to scripts before you write them. Think of how data can be passed to and from the functions; use those doors and windows, so objects can flow in and out. Considerations could be as simple as whether you would like the function to act on one object or several. Think of these aspects in advance and build your functions accordingly. As a rule, a function may take several inputs as parameters (with only one of them bound by value from the pipeline), but should return only one object (or an array of one kind of object) as output.
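A sketch of what such a function might look like, loosely modelled on the distribution-group story. The function name `Select-GroupMember`, the `Mail`, `Location` and `Dept` properties, and the sample feed are all assumptions made for illustration; they are not the real OIM feed schema or the script we delivered.

```powershell
function Select-GroupMember {
    [CmdletBinding()]
    param(
        # Feed records arrive one at a time through the pipeline.
        [Parameter(Mandatory, ValueFromPipeline)]
        [pscustomobject]$Person,

        # Optional filter; leave it out to keep everyone (the all-company case).
        [string[]]$Location
    )
    process {
        $wanted = -not $PSBoundParameters.ContainsKey('Location') -or
                  $Person.Location -in $Location
        if ($wanted) {
            # Always emit the same shape of object, whatever the input had.
            [pscustomobject]@{
                Mail     = $Person.Mail
                Location = $Person.Location
            }
        }
    }
}

# Made-up sample feed.
$feed = @(
    [pscustomobject]@{ Mail = 'a@example.com'; Location = 'Pune';   Dept = 'IT' }
    [pscustomobject]@{ Mail = 'b@example.com'; Location = 'London'; Dept = 'HR' }
    [pscustomobject]@{ Mail = 'c@example.com'; Location = 'Pune';   Dept = 'HR' }
)

$feed | Select-GroupMember                   # everyone
$feed | Select-GroupMember -Location 'Pune'  # one location-specific group
```

Because the filter is a parameter rather than hard-coded logic, adding another location-specific group means another call, not another script.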

If you learn to handle the data, and to build the logic around it, you will soon be writing much less, and doing much more. That is the goal of scalability.

Upgrades

When you automate anything, have the infrastructure roadmap in mind. Ask questions such as, ‘How long will we use this application?’, ‘Is an upgrade/phase-out planned for this application in the near future?’, etc.

At the same time, research how your scripts would work on future versions of a certain product. For instance, Microsoft has combined all of Azure’s modules into a single module. Now, all you have to do is install that single module from the PowerShell Gallery. Knowing Microsoft, I had anticipated this would happen somewhere down the line; I did not expect it to happen so soon. (Of course, this is a step in the right direction.) Not only that, but the cmdlets are now prefixed with Az instead of AzureRm. Sure, there is backward compatibility in this case, but not every case will be like this; not everyone is Microsoft.

Of course, it is impossible to know whether your scripts will work in future versions of a product. This makes documentation important. A little knowledge of regular expressions and the right tools (like Visual Studio Code) will help as well: you can modify standardised cmdlet names in a single sweep. This is another case for standardisation of names: name your functions in a similar, standard way. The names should make sense, and follow the PowerShell naming conventions.
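For instance, a single regular expression can rename the AzureRm-prefixed cmdlets in a script to their Az-prefixed form. This is a minimal sketch on a made-up two-line script; it assumes the rename is a plain prefix swap, which is not guaranteed for every cmdlet, so review the result before relying on it.

```powershell
# Made-up sample script text using the old naming.
$oldScript = @'
$vm = Get-AzureRmVM -ResourceGroupName 'rg01' -Name 'vm01'
Stop-AzureRmVM -ResourceGroupName 'rg01' -Name 'vm01'
'@

# Capture the verb and the noun, and swap the AzureRm prefix for Az.
$newScript = $oldScript -replace '\b([A-Za-z]+)-AzureRm([A-Za-z]+)\b', '$1-Az$2'

$newScript
```

The same pattern works in the find-and-replace box of an editor that supports regular expressions, which is where consistent naming pays off.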

Document every automation you implement. Document the logic you use. Implement every automation with a change request that contains the explanation of your code—what it does, how it works, etc. Perhaps even make a flow chart.

And as much as possible, use native capabilities (for instance, do not use dsget in your AD PowerShell scripts).

Finally, keep best practices in mind. An example would be making your code readable with the right level of indentation, comments in the right places, and so on. Remember, if you script, and your scripts achieve complex tasks, you are a programmer. Programming best practices apply to your work as well. Treat your code with respect; if you don’t find your work respectable, work harder to make it respectable.

We talk about this and other similar good practices to consider before scripting, and the way to build logic when scripting, in our book, [PowerShell Core for Linux Administrators Cookbook]({{ site.book }}). Amazon has a sample. Go ahead, give it a try. Though it says “Linux” in its title, the skills are transferable to Windows as well.
