Infrastructure as Code: The Engine at the Heart of DevOps

TechBeacon – by Christopher Null

"Infrastructure as code" (IaC) doesn't quite trip off the tongue, and its meaning isn't always clear. But IaC has been with us since the beginning of DevOps—and some experts say DevOps wouldn't be possible without it.

What is infrastructure as code and why does it matter?

As the name suggests, infrastructure as code is the practice of managing your operations environment the same way you manage applications or other code destined for general release. Rather than making configuration changes by hand or adjusting infrastructure with one-off scripts, you manage the operations infrastructure under the same rules and strictures that govern code development, particularly when new server instances are spun up.

That means that the core best practices of DevOps—like version control, virtualized tests, and continuous monitoring—are applied to the underlying code that governs the creation and management of your infrastructure. In other words, your infrastructure is treated the same way that any other code would be.

"The basic principle is that operators (admins, system engineers, etc.) should not log in to a new machine and configure it from documentation," says Boyd Hemphill, director of evangelism at StackEngine. "Rather, code should be written to describe the desired state of the new machine. That code should run on the machine to converge it to the desired state. The code should execute on a cadence to ensure the desired state of the machine over time, always bringing it back to convergence."

Hemphill adds, "This IaC thinking, more than any other single thing, is what enabled the cloud revolution, because a single ops person can start 100 machines at the press of a button, and also have them properly configured. The elasticity of the cloud paradigm and disposability of cloud machines could truly be leveraged."
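
The convergence idea Hemphill describes can be made concrete with a small sketch. The Python below is an illustration only, not the behavior of any particular tool: the Machine class, the package names, and the configuration key are all hypothetical stand-ins for a real host and its real settings.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Toy stand-in for a server; a real tool would inspect the actual host."""
    packages: set[str] = field(default_factory=set)
    config: dict[str, str] = field(default_factory=dict)


# Desired state: a description of what the machine should look like,
# not a list of commands to run. All names here are hypothetical.
DESIRED_PACKAGES = {"nginx", "openssl"}
DESIRED_CONFIG = {"worker_processes": "4"}


def converge(machine: Machine) -> list[str]:
    """Bring the machine to the desired state and return the actions taken."""
    actions = []
    for pkg in sorted(DESIRED_PACKAGES - machine.packages):
        machine.packages.add(pkg)
        actions.append(f"install {pkg}")
    for key, value in sorted(DESIRED_CONFIG.items()):
        if machine.config.get(key) != value:
            machine.config[key] = value
            actions.append(f"set {key}={value}")
    return actions


machine = Machine(packages={"openssl"})
first_run = converge(machine)    # fixes whatever is missing or has drifted
second_run = converge(machine)   # already converged, so nothing to do
print(first_run)   # ['install nginx', 'set worker_processes=4']
print(second_run)  # []
```

Running converge on a cadence is what keeps the machine in its desired state over time: a run that finds no drift takes no action, and a run that finds drift repairs only the drifted part.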

IaC isn't just automation

In the current literature, IaC is often wrapped up with the topic of automation, and many IaC best practices involve smarter deployment of scripts and the automation of manual processes. But IaC extends beyond simple infrastructure automation. It requires applying DevOps practices to automation scripts to ensure that they are free of errors, can be redeployed on multiple servers, can be rolled back in case of problems, and can be used by both operations and development teams. Tools like Ansible and Puppet are designed to make IaC environments accessible to anyone with basic knowledge of modern coding techniques and structures.

Hemphill summarizes four best practices of IaC:

  • Manage infrastructure via source control, thus providing a detailed audit trail for changes.
  • Apply testing to infrastructure in the form of unit testing, functional testing, and integration testing.
  • Avoid written documentation, since the code itself will document the state of the machine. This is particularly powerful because it means, for the first time, that infrastructure documentation is always up to date.
  • Enable collaboration around infrastructure configuration and provisioning, most notably between dev and ops.
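
To make the testing practice in the list above concrete, here is a minimal sketch of unit tests run against an infrastructure definition. It assumes the definition is plain data returned by a hypothetical web_server_spec function; the field names and values are invented for illustration, and a real project would test whatever format its own tooling uses.

```python
import unittest


def web_server_spec(environment: str) -> dict:
    """Hypothetical infrastructure definition, kept in source control."""
    return {
        "packages": ["nginx"],
        "open_ports": [80, 443],
        "instance_count": 4 if environment == "production" else 1,
    }


class WebServerSpecTest(unittest.TestCase):
    def test_only_web_ports_are_open(self):
        spec = web_server_spec("production")
        self.assertEqual(set(spec["open_ports"]), {80, 443})

    def test_production_has_redundancy(self):
        spec = web_server_spec("production")
        self.assertGreaterEqual(spec["instance_count"], 2)

    def test_staging_stays_small(self):
        self.assertEqual(web_server_spec("staging")["instance_count"], 1)


# Load and run the tests explicitly so the script does not exit afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(WebServerSpecTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the definition lives in source control alongside its tests, a change that opens an extra port or drops production redundancy fails before it reaches a server.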

Infrastructure as code versus configuration management

IaC is in some ways still an emerging concept, and many organizations are still working out how best to fit the practices listed above into their existing DevOps framework. As a result, the idea is often confused with, and folded into, existing configuration management (CM) technologies. So how does a company that already embraces CM implement IaC?

"I don't know if anyone has really come up with the right way [to implement infrastructure as code] yet," says Cliff Moon, CEO of Opsee, "but there's definitely a wrong way. Trying to retrofit IaC onto last-generation configuration management tools, as many of the CM vendors are doing, will be a recipe for pain."

Moon explains the issue as a question of declarative versus imperative modeling. "Most configuration management systems are not declarative, and for the most part they can't really compute over the state of a deployment. At their simplest, all CM systems are built on feedback loops; they look for conditions like 'Is X installed? Is the configuration for Y up to date?' And they take remediating action when those conditions fail. This happens in a loop until convergence, which is when all of the conditions report back as being true."

A ScriptRock post explains the difference between the declarative and imperative approaches fairly succinctly: "Imperative focuses on how and declarative focuses on what. In a software engineering context, declarative programming means writing code to describe what the program should do as opposed to how it should do it."
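
The how-versus-what distinction is easier to see side by side. In the sketch below, both functions produce the same package list, but the first hard-codes each step while the second takes a description of the end state and lets a generic routine work out the steps. The package names and the "present"/"absent" vocabulary are assumptions made for the example.

```python
def install_imperative(installed: list[str]) -> list[str]:
    """Imperative: the caller's intent is buried in explicit steps (the how)."""
    result = list(installed)
    if "nginx" not in result:
        result.append("nginx")
    if "telnet" in result:
        result.remove("telnet")
    return result


# Declarative: the desired end state is plain data (the what).
DESIRED = {"nginx": "present", "telnet": "absent"}


def apply_declarative(installed: list[str], desired: dict[str, str]) -> list[str]:
    """A generic engine that works out the steps from the description."""
    result = [p for p in installed if desired.get(p) != "absent"]
    result += [p for p, state in desired.items()
               if state == "present" and p not in result]
    return result


current = ["telnet", "curl"]
print(install_imperative(current))          # ['curl', 'nginx']
print(apply_declarative(current, DESIRED))  # ['curl', 'nginx']
```

Changing the declarative version means editing the DESIRED data, which can be reviewed and compared like any other document; changing the imperative version means editing and re-testing the steps themselves.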

Performance implications with the imperative model

Moon notes, "The problem with the imperative approach as it relates to cloud deployment is that it requires you to enumerate how it will do the remediation in a piecemeal fashion. So if you're trying to do IaC through Chef, every Chef run could cause an outage until convergence is reached, unless you're really, really careful."

Moon offers an example involving the migration of a back-end system from Amazon Web Services Elastic Load Balancing to NGINX. This is not necessarily a straightforward transition, he says. "There might be three or four different steps, dependencies between the steps, and an ordering of those steps which results in a clean cutover. CM tools are not built to compute that ordering. In addition, CM tools don't carry a model of the current state of your deployment, meaning that they must query all conditions multiple times per run."

The result, says Moon, is a performance hit and slow convergence whenever changes are made. In contrast, he says, true "IaC tools model the state of the infrastructure internally and typically only need to touch things that change."
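
One way to picture a tool that models state and touches only what changes is a simple diff between two state models. This is a sketch of the general idea rather than a description of any specific product; the resource names and attributes are hypothetical, loosely following Moon's load balancer example.

```python
def plan(current: dict[str, dict], desired: dict[str, dict]) -> dict[str, list[str]]:
    """Compare two state models and list only the resources that need work."""
    return {
        "create": sorted(desired.keys() - current.keys()),
        "delete": sorted(current.keys() - desired.keys()),
        "update": sorted(
            name for name in desired.keys() & current.keys()
            if desired[name] != current[name]
        ),
    }


# Hypothetical resources and attributes.
current_state = {
    "app-server": {"size": "medium"},
    "database": {"size": "large"},
    "elb": {"listeners": [80]},
}
desired_state = {
    "app-server": {"size": "medium"},   # unchanged, so it is never touched
    "database": {"size": "xlarge"},
    "nginx-proxy": {"listeners": [80]},
}

print(plan(current_state, desired_state))
# {'create': ['nginx-proxy'], 'delete': ['elb'], 'update': ['database']}
```

Because the comparison runs against a stored model, nothing has to be queried on the unchanged app server at all, which is the source of the performance difference Moon describes.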

Direct versus indirect automation

Another way to look at the declarative/imperative concept comes from Eli Feldman, CTO of advanced technology at EPAM Systems. Feldman says that the key to proper IaC practice is to first consider the language in which the routines are written. This is informed by whether the type of automation required is direct or indirect.

"Direct automation routines, which are strictly meant to automate infrastructure, are typically written in Descriptive Scripting Language," says Feldman. "Indirect routines are typically written in the language of the application itself. Such indirect routines will align infrastructure to its business logic in real time.

"The direct infrastructure automation use case is a best practice on its own. Programmatic deployment of an application is as important as the programmatic nature of the application itself. Quality, performance, stability, and reusability are all equally critical for both. Deployment must be considered an integral part of the application delivery process. Quality assurance must cover the entire delivery pipeline. In a continuous deployment model, both the application and the deployment are tested continuously and failure of either renders the release as a failure. Direct automation using IaC is broadly applicable to any application."

Contrast that with indirect automation. Says Feldman, "Indirect automation is geared toward applications that require continuous change in their infrastructure environment at runtime. That is, applications required to support rapid changes in workload type or volume and designed with self-healing and self-scaling in mind."

To summarize: The type of applications you deploy informs the type of infrastructure you need to manage, and that infrastructure in turn informs the type of automation you use as part of an IaC strategy. In most environments, direct automation will be appropriate.
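
For a sense of what an indirect routine might look like, consider an application that decides at runtime how many workers it needs. The sketch below is an assumption-laden illustration: the function name, the capacity of 50 jobs per worker, and the limits of 2 and 20 workers are all invented, and the step that would actually request the machines is left out.

```python
import math


def desired_workers(queue_depth: int, jobs_per_worker: int = 50,
                    min_workers: int = 2, max_workers: int = 20) -> int:
    """Return how many workers the application wants for its current backlog."""
    needed = math.ceil(queue_depth / jobs_per_worker)
    # Clamp to a floor (for availability) and a ceiling (for cost).
    return max(min_workers, min(max_workers, needed))


for depth in (0, 120, 5000):
    print(depth, desired_workers(depth))
# 0 2
# 120 3
# 5000 20
```

The point is that the sizing logic lives in the application's own language and reflects its business logic, which is what separates indirect automation from a deployment script written ahead of time.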

A declarative approach to IaC

Puppet Labs is at the forefront of the declarative direction of IaC and configuration management. Carl Caum, technical marketing manager at Puppet Labs, explains that Puppet is designed simply to "look for problems in your code, then fix it by bringing it into the correct state."

Puppet doesn't run scripts or execute code on your infrastructure; it doesn't know how. Rather, it builds a graph of what your infrastructure code base is supposed to look like, then recreates a model of that desired end state. "In Puppet there are no resources," says Caum. "There is only the graph. It is impossible to break out of that model."

Caum explains that imperative CM tools mainly get into trouble when multiple scripts, knowingly or unknowingly, execute against the same piece of code. This can lead to an unstable model, with scripts running "on the side" that the CM tool doesn't know about.

"In Puppet, we understand everything before we do anything," says Caum. "If you have two pieces of code managing the same resource, Puppet won't allow it and won't run until the situation is resolved." The result is greater assurance about the validity of the infrastructure's converged state than an imperative CM tool can provide.

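The "understand everything before we do anything" rule can be illustrated with a compile step that collects every declaration first and refuses to proceed on a conflict. This is a sketch of the idea only and does not reproduce Puppet's implementation; the compile_catalog function, the error class, and the tuple format are assumptions made for the example.

```python
class DuplicateResourceError(Exception):
    """Raised when two declarations try to manage the same resource."""


def compile_catalog(
    declarations: list[tuple[str, str, dict]],
) -> dict[tuple[str, str], dict]:
    """Build the full catalog first; refuse to continue on any conflict."""
    catalog = {}
    for kind, name, attributes in declarations:
        key = (kind, name)
        if key in catalog:
            raise DuplicateResourceError(f"{kind}[{name}] is declared more than once")
        catalog[key] = attributes
    return catalog


declarations = [
    ("package", "nginx", {"ensure": "installed"}),
    ("file", "/etc/nginx/nginx.conf", {"mode": "0644"}),
    ("file", "/etc/nginx/nginx.conf", {"mode": "0600"}),  # conflicting second owner
]

try:
    compile_catalog(declarations)
except DuplicateResourceError as err:
    print(f"refusing to run: {err}")
# refusing to run: file[/etc/nginx/nginx.conf] is declared more than once
```

Nothing is applied to any machine until the whole catalog compiles, so a conflict surfaces as an error up front instead of as two scripts quietly overwriting each other.
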
Best practices for infrastructure as code

With the fundamentals of IaC covered, it's now time to consider some best practices surrounding its implementation.

Exercise caution when extending IaC tools to novices

By design, IaC makes deploying and reconfiguring server environments painlessly simple, but that's a double-edged sword. While novices can spin up a hundred instances in just a few minutes, they can also do an incredible amount of damage in a short amount of time.

Says Moon, "Just like SQL gives people without deep knowledge of data storage and processing techniques the ability to process large amounts of data, IaC allows people without deep knowledge of infrastructure the ability to set up relatively complicated stacks of infrastructure quickly. Like SQL, it is an abstraction, and all abstractions are leaky. Inevitably, once the user wants to do something more complicated than many of the designed-for use cases, they will have to look under the hood and figure out how to get the engine to do exactly what they want."

Go slow when rolling out IaC to the DevOps rank and file, and ensure that users have supervision and guidance, particularly when trying something new.

The stricter the better

Sujatha Kashyap, vice president of technology at Robin Systems, stresses that the more strictly you define everything in your environment, the fewer problems you'll encounter. "Specify the environment as strictly as possible, leaving little to chance," she says. "Be as specific as possible about the infrastructure requirements, including network bandwidth and storage I/O operations per second, if possible. This is often overlooked."

She adds, "The application developer knows the factors that affect application behavior and performance the best. Involve the developers in writing the IaC specifications for the infrastructure elements and runtime environments. Use monitoring and feedback information to tweak your configuration management scripts for continuous improvement."
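
Kashyap's advice to leave little to chance can be enforced in code by making every requirement a mandatory, validated field. The sketch below assumes a hypothetical EnvironmentSpec with four illustrative fields and units; a real specification would list whatever the application's developers know to matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentSpec:
    """Hypothetical strict specification; field names and units are illustrative."""
    cpu_cores: int
    memory_gib: int
    network_mbps: int
    storage_iops: int

    def __post_init__(self):
        # Reject anything that is not a positive whole number.
        for name, value in vars(self).items():
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")


spec = EnvironmentSpec(cpu_cores=8, memory_gib=32, network_mbps=1000, storage_iops=3000)
print(spec)

try:
    EnvironmentSpec(cpu_cores=8, memory_gib=32, network_mbps=1000, storage_iops=0)
except ValueError as err:
    print(err)  # storage_iops must be a positive integer, got 0
```

Since none of the fields has a default, omitting network bandwidth or storage IOPS fails at construction time, so the often-overlooked requirements cannot be silently skipped.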

IaC tools are evolving at different rates

"One thing to keep in mind is that platforms like AWS and Windows Azure are evolving very quickly with new features and services," says Feldman. "Third parties very scarcely keep up with the velocity of changes." In other words, a third-party tool can easily end up incompatible or behind the times when these platforms are updated. "So, if you're trying to stay at the cutting edge on these platforms, accept vendor lock-in or use open source libraries and contribute your support for new features back to the community."

Containers point the way forward

As noted earlier, IaC is still evolving, and Hemphill says that we're in for further evolution as containerization and IaC collide. The question in the future isn't going to be how to manage 100 clones of a single machine, but rather how to manage 100 machines that may have a variety of different configurations.

"On the cutting edge of infrastructure are containers—most notably Docker containers," he says. "Docker's base tooling arguably obviates the need for configuration management tools. The configuration management problem of yesterday roughly translates to a simple Dockerfile today (a short shell script). The complexity of the problem is moved from the state of a single machine to the state of multiple heterogeneous services and the relationships between them. This is commonly called 'orchestration.' In other words, how do we now define a distributed application or set of services spanning multiple machines, their relationships, and desired state in code?"

The solution potentially lies in the Docker Compose tool. "This tool describes the run time state of each service and the relationship between them via a simple but powerful data format, YAML," says Hemphill. "Like a Chef cookbook or a Puppet manifest, this YAML file can be managed via source control with all the inherent advantages described above. Docker Compose, however, assumes all containers are on a single machine today." (Docker Swarm should pick up the slack as it emerges from beta.)
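
The orchestration question, how to describe services and the relationships between them, can be sketched in a few lines. The dictionary below mirrors the general shape of a Compose file's services section (the image and depends_on keys come from the Compose file format), but it is written as a Python dict rather than YAML, the service names and image tags are hypothetical, and the ordering logic is a generic topological sort, not Compose's own scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each entry names an image and the services it needs.
services = {
    "web": {"image": "example/web:1.0", "depends_on": ["api"]},
    "api": {"image": "example/api:1.0", "depends_on": ["db", "cache"]},
    "db": {"image": "postgres:16", "depends_on": []},
    "cache": {"image": "redis:7", "depends_on": []},
}


def start_order(services: dict[str, dict]) -> list[str]:
    """Return an order in which every service starts after its dependencies."""
    graph = {name: set(definition["depends_on"])
             for name, definition in services.items()}
    return list(TopologicalSorter(graph).static_order())


order = start_order(services)
print(order)
```

Whatever order is printed, db and cache come before api, and api comes before web. The relationships are ordinary data, so they can be versioned, reviewed, and tested like the rest of the code base.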

Infrastructure as code makes DevOps possible

In simple terms, IaC is a framework that takes proven coding techniques and extends them to your infrastructure directly, effectively blurring the line between what is an application and what is the environment. In a sense, this is the same thing DevOps is doing with the staff in charge of these two worlds, melding developers and operations staff into a single entity with a portmanteau of a name.

"Infrastructure as code" may not be as catchy as "DevOps." So..."Envirocation," anyone?


Original publication is here.