Understanding Cloud Native
Mastering the Cloud with Cloud Native
By now, it’s apparent that the cloud, sporting nearly infinite computing power, networking and storage, enables fundamentally new ways of delivering both business value and IT agility. Equally evident, however, is the divide separating those organizations that treat the cloud as, in effect, an extension of their data centers from those that understand how to truly take advantage of the cloud (who know, in short, how to use the cloud as it was intended). The rewards for those who embrace the cloud — in terms of IT responsiveness, cost management and even competitive differentiation — can be significant.
Cloud-native enterprise computing introduces new paradigms that no longer dictate monolithic, three-tier (or even n-tier), tightly coupled applications built on relational databases and ACID transactions. Rather, modern applications rely on a massively distributed, scalable, highly resilient ecosystem of cooperating services backed by both relational and non-relational persistent stores.
This piece describes the core attributes, architectural principles and technologies behind this new approach to enterprise computing.
Cloud-Native Architecture: Simplifying Applications with Microservices
Primarily because of the expense of hardware, applications from a decade or more ago were monolithic or at best “three-tier,” having separate presentation, business logic and storage components. In both cases, as requirements grew, applications became cumbersome to maintain, complex and time-consuming to upgrade, limited in scale and flexibility, and brittle — that is, susceptible to even the most benign of failure conditions.
A new application architecture, popularized by James Lewis and Martin Fowler in a 2014 article, sought to address these issues. Microservices are discrete, separable, loosely coupled software components that each implement a particular business function, such as a shopping cart or payment system. Communicating with one another via APIs and message buses, microservices can be both replicated (i.e., run as many instances) and distributed in order to meet scalability and resiliency needs.
Among the many advantages of the microservice paradigm is that it enables small, independent development teams to each focus on one specific business function, rather than requiring a single team to own the entire application. In addition, different functions can scale independently at runtime. For example, early in a holiday season, the browsing function of an eCommerce application may see heavy traffic; later, as the holiday approaches, the shopping cart and payment functions will see the heaviest demand.
Microservices are not a technology or a product; rather, they represent an architectural pattern that can be implemented in many ways, such as with containers or serverless functions. For several reasons, microservices are well suited to cloud applications. The following sections describe key attributes of the microservices pattern:
Domain-Driven Design (DDD)
Using the idea of a “bounded context” to subdivide systems along business and organizational — rather than technological — boundaries, DDD defines coherent, self-contained pieces of business functionality, meaning that microservices can be designed with minimal external dependencies.
High Cohesion, Loose Coupling
Microservices communicate via well-defined interfaces, and support flexible data schemas to avoid brittleness as data definitions evolve. Inter-microservice messaging uses any of a number of approaches, including RPC, event grids and publish-and-subscribe message buses. Finally, asynchronous processing (sometimes called the async/await model) is essential for scale because services can no longer afford to block on or wait for responses.
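As an illustration of the asynchronous, message-driven style described above, the following Python sketch uses an in-process `asyncio.Queue` as a stand-in for a real message bus; the service names and event fields are invented for the example.

```python
import asyncio

# Two hypothetical microservices ("orders" and "payments") communicating
# asynchronously. Neither service blocks waiting on the other: the producer
# publishes and moves on, and the consumer awaits events as they arrive.

async def orders_service(bus: asyncio.Queue, results: list) -> None:
    # Publish an event without waiting for any consumer to process it.
    await bus.put({"event": "order_placed", "order_id": 42})
    results.append("orders: published order_placed")

async def payments_service(bus: asyncio.Queue, results: list) -> None:
    # Await the next event (async/await model, no busy waiting).
    event = await bus.get()
    results.append(f"payments: handled {event['event']} for order {event['order_id']}")

async def main() -> list:
    bus: asyncio.Queue = asyncio.Queue()
    results: list = []
    await asyncio.gather(orders_service(bus, results), payments_service(bus, results))
    return results

if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

In a real deployment, the in-process queue would be replaced by a durable broker (e.g., a publish-and-subscribe message bus), but the non-blocking structure of the services stays the same.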
Zero-Trust Security
In a cloud-native environment where “perimeters” have little meaning, identity becomes the core unit of security. Cloud-native systems employ the principle of zero trust: all services run with the fewest privileges possible. Identity is transmitted between systems, either via the flow of tamper-proof tokens representing real-world identity or by using trusted subsystems. All data is encrypted at rest and in transit (and, in some instances, in use); public-key cryptography is used to manage encryption, with private keys carefully stored in dedicated, secure vaults.
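To make the idea of a tamper-proof identity token concrete, here is a minimal, hypothetical sketch using an HMAC signature from the Python standard library. Production systems use standards such as JWT and OAuth 2.0, with keys held in a secrets vault; the secret and claim names below are illustrative assumptions.

```python
import base64
import hashlib
import hmac
import json

# Assumption: in a real system this secret would be fetched from a vault,
# never hard-coded.
SECRET = b"demo-secret-kept-in-a-vault"

def issue_token(claims: dict) -> str:
    # Encode the claims, then sign them so any tampering is detectable.
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig

def verify_token(token: str) -> dict:
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids timing side channels when checking signatures.
    if not hmac.compare_digest(sig, expected):
        raise ValueError("token has been tampered with")
    return json.loads(base64.urlsafe_b64decode(body))

token = issue_token({"sub": "service-a", "role": "reader"})
print(verify_token(token))  # round-trips to the original claims
```

The receiving service trusts the claims only because the signature verifies; a modified token fails verification, which is what makes identity, rather than network location, the unit of trust.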
Event-Driven Architecture
As defined in the Reactive Manifesto, event-driven or reactive systems respond in near-real time to events, such as user requests. To achieve this goal, they rely on a combination of architectural approaches, including asynchronous message-driven communication, robust failure handling and elastic scale.
Eventual Consistency
Not all workloads require the high levels of data integrity and consistency afforded by ACID transactions. For example, many applications — such as social media — can support independent updates on distributed replicas that periodically reconcile changes. Such eventual consistency increases the overall availability of a system, since it avoids the complexity and potential deadlocks of distributed transactions.
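The reconciliation step can be sketched with a toy last-writer-wins rule keyed on a logical timestamp. Real systems often use vector clocks or CRDTs; this simplified model only illustrates how two replicas that accepted independent writes converge on one state.

```python
# Each replica maps a key to (value, logical_timestamp). Reconciliation
# merges two replicas, keeping the write with the newer timestamp.

def reconcile(replica_a: dict, replica_b: dict) -> dict:
    merged = {}
    for key in replica_a.keys() | replica_b.keys():
        a = replica_a.get(key, (None, -1))
        b = replica_b.get(key, (None, -1))
        merged[key] = a if a[1] >= b[1] else b  # newer write wins
    return merged

# Two replicas of a user profile, updated independently while disconnected:
a = {"bio": ("likes hiking", 3)}
b = {"bio": ("likes hiking and sailing", 5), "avatar": ("cat.png", 1)}
merged = reconcile(a, b)
# After reconciliation, both replicas converge on the same merged state.
```

Note what is traded away: during the window before reconciliation, the two replicas return different answers, which is acceptable for a social-media profile but not for, say, an account balance.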
Polyglot Persistence
Different microservices will have different storage requirements. Some may require ACID transactions, others simple append-only stores. Each microservice can choose the storage technology that best suits its needs. Cloud data warehouses and data lakes can then combine data from many types of stores in order to perform analytics and reporting.
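A small sketch of this "right store for the job" idea: a hypothetical orders service uses a relational store (SQLite standing in for a cloud database) because it needs aggregate queries, while a sessions service uses a plain key-value store because it only needs fast lookups. The service and table names are invented for the example.

```python
import sqlite3

class OrderStore:
    """Relational store: the orders service needs SQL aggregates and joins."""
    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE orders (id INTEGER, total REAL)")

    def add(self, order_id: int, total: float) -> None:
        self.db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))

    def total_revenue(self) -> float:
        return self.db.execute("SELECT SUM(total) FROM orders").fetchone()[0]

class SessionStore:
    """Key-value store: the sessions service only needs get/put by key."""
    def __init__(self) -> None:
        self.kv: dict = {}

    def put(self, key: str, value) -> None:
        self.kv[key] = value

    def get(self, key: str):
        return self.kv.get(key)

orders = OrderStore()
orders.add(1, 19.99)
orders.add(2, 5.00)
sessions = SessionStore()
sessions.put("user-7", {"cart": [1]})
```

Because each service owns its data behind an API, swapping SQLite for a managed cloud database, or the dict for a hosted key-value service, changes only that service's internals.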
Cloud-Native Technologies: Implementing the New Models
The benefits of cloud native are, of course, founded on the emergence of remarkable new cloud technologies, which continue to evolve at an ever-increasing pace. Four fundamental principles underlie these technologies:
Elasticity & Scalability
With near limitless resources in the cloud, microservice-based applications can scale elastically. As demand rises, CPU, memory, storage, network and, increasingly, GPU, scale up automatically to meet the demand and, as significantly, scale down in periods of low demand to reduce costs.
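The scale-out decision itself is simple to state. The sketch below models target-utilization scaling, the approach documented for Kubernetes' Horizontal Pod Autoscaler: replicas grow or shrink with the ratio of observed to target utilization. The thresholds and bounds here are illustrative assumptions.

```python
import math

def desired_replicas(current: int, utilization: float, target: float = 0.6,
                     floor: int = 1, ceiling: int = 20) -> int:
    """Return the replica count that brings utilization back toward target.

    current      -- replicas currently running
    utilization  -- observed average utilization (e.g., 0.9 = 90% CPU)
    target       -- desired utilization per replica
    floor/ceiling-- configured scaling bounds (illustrative values)
    """
    raw = math.ceil(current * utilization / target)
    return max(floor, min(ceiling, raw))

# Demand spikes: 4 replicas at 90% CPU against a 60% target -> scale out.
# Demand falls: 4 replicas at 15% CPU -> scale in to cut costs.
```

The same formula drives scale-down in quiet periods, which is where the cost savings of elasticity come from.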
Container-based platforms — in particular, the industry standard, Kubernetes — provide a distributed cloud operating system spanning hundreds or thousands of servers, supporting tens of thousands of container instances.
While Kubernetes-based systems require management of underlying infrastructure, serverless technologies let developers focus on writing stateless application logic, deployed as free-standing functions, using any of the modern programming languages and runtimes. Thus, developers can focus solely on providing business value.
Additionally, a wide variety of massively scalable data management platforms — relational, column-oriented, in-memory, NoSQL and document-based databases, files, large binary objects (BLOBs), key-value pairs, event streams and persistent queues — are all readily available on cloud platforms. The solution can thus be tailored to the problem at a relatively low cost of change. Vast volumes of data can be stored cheaply, enabling deep analytics and machine learning applications not previously practical on premises.
Finally, a wealth of highly secure networking options, ranging from virtual private networks to dedicated lines connecting cloud data centers to one another and to on-premises data centers, results in high-bandwidth, low-latency, low-contention solutions.
Monitoring & Observability
As microservice-based applications become increasingly componentized and distributed (using Kubernetes or serverless), new approaches to systems management are required to manage this complexity. Thus, cloud-native systems adopt a “measure everything” posture. Cloud platforms make it easy to instrument code and capture detailed information about the behavior of the system, providing monitoring (telling you something is wrong) and observability (using logs, metrics and other data to infer what went wrong).
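A "measure everything" posture usually starts with instrumenting code paths so that every call emits both a metric and a structured log line. The following sketch uses only the Python standard library; the metric and operation names are invented, and the in-memory dict stands in for a real metrics backend.

```python
import logging
import time
from collections import defaultdict

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("checkout")

# Stand-in for a real metrics backend (e.g., a time-series database).
METRICS: dict = defaultdict(list)

def instrumented(name: str):
    """Decorator: record call latency as a metric and emit a log line."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                METRICS[f"{name}.latency_seconds"].append(elapsed)
                log.info("op=%s latency=%.6fs", name, elapsed)
        return inner
    return wrap

@instrumented("charge_card")
def charge_card(amount: float) -> dict:
    return {"charged": amount}

charge_card(10.0)
```

Monitoring consumes the metric stream (is latency above its alert threshold?), while observability uses the correlated logs to work out *why* it is.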
Centralized infrastructure and standard log formats make it easy to collate and query these logs across multiple sources. Built-in distributed tracing (using technologies such as OpenTracing) across cloud-native systems — from application to runtime and networking to storage — allows developers to track down bottlenecks and problems quickly.
Centralized, scalable metrics collection and querying (using technologies like Prometheus and its query language, PromQL) provide time-series monitoring and observability, and generate the alerts critical for reliability, decision making and running the business.
Resiliency
In cloud data centers containing millions of servers, routers, networks, racks and storage devices, failures happen, and microservices should be designed accordingly, with an expectation of redundancy and failover.
In order to continue operating in an inherently unreliable environment, microservices take advantage of cloud features supporting resiliency. These include redundancy at all levels (compute, network and storage), configurable based upon the required service-level agreement (SLA). Other cloud capabilities, such as availability zones and georedundancy, help microservice-based applications survive hardware failures; retries and circuit breakers help microservices survive network overloads and cyberattacks.
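The two resiliency patterns just named can be sketched in a few lines: retries with exponential backoff absorb transient faults, while a circuit breaker stops hammering a dependency that keeps failing. The class names and thresholds below are illustrative, not a reference implementation.

```python
import time

class CircuitBreaker:
    """Opens (fails fast) after a run of consecutive failures."""
    def __init__(self, failure_threshold: int = 3) -> None:
        self.failures = 0
        self.threshold = failure_threshold

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def record(self, ok: bool) -> None:
        self.failures = 0 if ok else self.failures + 1

def call_with_retry(fn, breaker: CircuitBreaker, attempts: int = 3,
                    base_delay: float = 0.01):
    """Call fn, retrying transient failures with exponential backoff."""
    if breaker.open:
        raise RuntimeError("circuit open: failing fast")
    for attempt in range(attempts):
        try:
            result = fn()
            breaker.record(ok=True)
            return result
        except Exception:
            breaker.record(ok=False)
            if attempt == attempts - 1:
                raise  # retries exhausted; let the caller handle it
            time.sleep(base_delay * 2 ** attempt)  # 10ms, 20ms, ...
```

Failing fast while the circuit is open protects both the caller (no wasted waits) and the struggling dependency (no added load while it recovers); production breakers also add a timed "half-open" probe, omitted here for brevity.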
Additionally, automated, intelligent systems management, relying on tracing and logs, can automatically detect and respond to failures, errors and changes in demand, track metrics and their trends, and take preventive action before a failure occurs. Finally, platforms such as Kubernetes can provision new instances of resources and launch them to replace failed ones.
Automation & Provisioning
DevOps, combining development and operations, implements new, faster, automated processes, streamlines the development process, and makes deployments more secure, reliable and predictable.
Continuous integration and continuous deployment (CI/CD) pipelines automate everything that happens, from code check-in to deployment, including invoking automated testing, security vulnerability analysis, system integration and deployment to the cloud.
At the heart of a CI/CD pipeline are declarative descriptions of configurations, saved and versioned just like application code (hence the term “infrastructure as code”). If a new configuration fails, it can easily be rolled back to a known working version. Updates to live instances are discouraged because of the risk of configuration “drift”; that is, the runtime configuration in production diverging from the original source. Hence, the principle of immutable infrastructure dictates that any change to the production version requires a rebuild from the original source, which, assuming the steps in the pipeline are largely automated, should not be an onerous task.
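The declarative model can be illustrated with a toy reconciler: the desired configuration is just versioned data, and deployment means computing the actions that move the actual state toward it. Rolling back is re-running the plan against an earlier revision. The resource names and structure here are invented for the example.

```python
# Two versioned revisions of a desired configuration ("infrastructure as
# code" in miniature): v2 scales web up and removes the cache entirely.
DESIRED_V1 = {"web": {"replicas": 3}, "cache": {"replicas": 1}}
DESIRED_V2 = {"web": {"replicas": 5}}

def plan(actual: dict, desired: dict) -> list:
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name, spec in desired.items():
        if actual.get(name) != spec:
            # Immutable infrastructure: replace the resource from source
            # rather than patching the live instance.
            actions.append(("apply", name, spec))
    for name in actual.keys() - desired.keys():
        actions.append(("delete", name))
    return sorted(actions, key=str)

actual = {"web": {"replicas": 3}, "cache": {"replicas": 1}}
# plan(actual, DESIRED_V2) lists the work; plan(actual, DESIRED_V1) is empty,
# and a rollback is simply planning against the earlier version again.
```

Because the plan is derived, not hand-written, the same mechanism detects drift: any divergence between actual and desired state shows up as pending actions.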
Cloud-Native Computing: Mastering a Whole New World
Without question, the emergence of the cloud-native computing model has revolutionized not just the mechanics, but the very core of how we think about enterprise software. In just a few short years we have evolved from mainframe, client-server and three-tier models, to a massively scalable, distributed, microservice-based paradigm, changing both the very character of applications and how we develop them.
With any great change comes an initial period of complexity as organizations seek to master new architectures and technologies. Yet, by adhering to these principles, organizations can quickly overcome the complexity and realize the technical and business benefits that the cloud promises.