“DevOps is the union of people, process, and products to enable continuous delivery of value to our end users.”

— Donovan Brown, Microsoft

Is DevOps Real

Definition of Terms: The phrase OODA loop mentioned in the video refers to the decision cycle, Observe, Orient, Decide, and Act, developed by military strategist and United States Air Force Colonel John Boyd. Boyd applied the concept to the combat operations process, often at the strategic level, in military operations. It is now often applied to describe commercial operations and learning processes. This approach favors agility over raw power in dealing with human opponents in any endeavor.

Fast Delivery Cycles

One of the core values of DevOps is a shortened release cycle. John Allspaw, who pioneered much of the DevOps thought process while at Flickr and is now a leading thinker at Etsy, created the following visual to demonstrate the difference between a slow and a fast delivery cadence:


The chart on the left shows the rate of change for a slow delivery cadence. If you look along the time vector, you see a long time between deliveries. Maybe this is three months. Maybe this is three weeks. Whatever the time frame, if we wait a long time and then deliver into production, we will deploy a massive amount of change, including many lines of code, lots of features, and lots of things that can go wrong.

Oftentimes, organizations think this is the right way to deliver because stretching out delivery times lets them perform extensive regression and full testing and ensure that everything is complete. Infrequent change feels safe. This is not the case. As you lengthen the delivery cycle and increase the amount of churn, it becomes very difficult to go back and fix anything bad that happened in the release. There is so much change that it becomes difficult to identify where the problems occurred, resulting in a lot of dysfunction. The more time we spend in regression and testing, the more time we spend waiting and trying to make the deployment absolutely perfect.

If we move to a fast delivery cycle, where we deploy very frequently, the changes are smaller at each individual release. The goal is faster and faster release cycles with less and less change, so that we ensure that what goes into production has limited impact and we can test very specifically. Consequently, we can get things into production quickly, get feedback rapidly, and learn from our customers at a much faster cadence.

DevOps Lifecycle

Over the past 15 years, agile software development has gained immense popularity among software development teams. The agile process accelerates software development and forces a faster release cadence.

Traditional IT operations teams struggle with a fast release cadence because their practices make the release and operations processes reliable, but not agile. Additionally, the disconnect between development and operations increases the likelihood of mistakes and lead time when issues occur.

Within a DevOps culture, all team members who are involved in creating, delivering, and monitoring software work together closely to ensure high-quality releases at increasing frequencies.


State of DevOps Report

Puppet Labs publishes the State of DevOps Reports based on responses from over 20,000 tech professionals worldwide. They conclude that organizations embracing DevOps practices consistently outperform their peers. For example,

  • Companies with high-performing IT organizations are twice as likely to exceed their profitability, market share, and productivity goals.
  • High-performing IT organizations experience 3 times lower change failure rates and recover 24 times faster than their lower-performing peers. They also deploy 200 times more frequently, with 2,555 times faster lead times.

For more information, see the following video and the 2016 State of DevOps Report.

DevOps Practices and Habits

The video in the next unit provides further information about the following seven practices and seven habits of a successful DevOps culture.

Seven Key DevOps Practices:

  • Configuration Management
  • Release Management
  • Continuous Integration
  • Continuous Deployment
  • Infrastructure as Code
  • Test Automation
  • Application Performance Monitoring

Seven DevOps Habits:

  • Team Autonomy and Enterprise Alignment
  • Rigorous Management of Technical Debt
  • Focus on Flow of Customer Value
  • Hypothesis Driven Development
  • Evidence Gathered in Production
  • Live Site Culture
  • Manage Infrastructure as a Flexible Resource

See also, Sam Guckenheimer’s talk on the 7 habits of successful DevOps.

Build and Release Pipeline

DevOps emphasizes the automation of builds and deployments. Conceptually, a release pipeline is a process that dictates how you deliver software to your end users. In practice, a release pipeline is an implementation of that pattern. The pipeline begins with code in version control and ends with code deployed to production. In between, a lot can happen. Code is compiled, environments are configured, many types of tests are run, and finally, the code is considered “done.” By done, we mean that the code is in production. For some organizations, a release pipeline strategy and structure need to cover services, database, web, and mobile application components, and include:

  • binary package consumption
  • continuous integration builds
  • package publishing
  • automated testing
  • continuous delivery

It is important to recognize that mobile deployments are different from web and server deployments. How you get your mobile app out for feedback is different from the process for server application feedback.
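To make the pipeline concept concrete, here is a minimal Python sketch (stage names and pass/fail stubs are hypothetical) of an ordered pipeline that halts at the first failing stage, so broken code never advances toward production:

```python
# Hypothetical sketch of a release pipeline as an ordered list of stages.
# Each stage is a function returning True (pass) or False (fail);
# the pipeline stops at the first failure.

def run_pipeline(stages):
    """Run stages in order; return (completed stages, failed stage or None)."""
    completed = []
    for name, stage in stages:
        if not stage():
            return completed, name  # a failed stage halts the pipeline
        completed.append(name)
    return completed, None

stages = [
    ("restore packages", lambda: True),       # binary package consumption
    ("CI build + unit tests", lambda: True),  # continuous integration
    ("publish package", lambda: True),
    ("automated tests", lambda: False),       # simulate a test failure
    ("deploy", lambda: True),                 # continuous delivery
]

completed, failed = run_pipeline(stages)
print(completed)  # the three stages that passed
print(failed)     # "automated tests" - deploy never runs
```

A real pipeline would run these stages on a build server, but the ordering and fail-fast behavior are the essential idea.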

Binary package consumption

Package management allows you to reuse trusted components. An example is NuGet feeds. If you consume and reuse packages consistently, then you can keep your application fresh as those packages are updated. This is particularly important for security and compliance as the packages can be scanned, uniquely identified, and, in the event of new vulnerabilities, automatically updated.

Packages should not usually be included in version control because they can significantly increase the size of version control repositories. It is better to store packages in a package management system, such as internal NuGet feeds.

Continuous integration builds

Continuous integration builds are builds that are triggered upon every check-in of code. They should be quick to complete and include automated tests (such as unit tests). They may or may not drop artifacts upon completion.
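The gate a continuous integration build provides can be sketched as follows (a simplified illustration, not a real build system): the build is green only if compilation succeeds and every fast test passes.

```python
# Simplified sketch (hypothetical) of a CI gate: every check-in triggers
# a build step and a fast unit-test step; the build is red if either fails.

def ci_build(compile_ok, test_results):
    """Return 'green' only if compilation succeeds and all tests pass."""
    if not compile_ok:
        return "red"
    return "green" if all(test_results) else "red"

assert ci_build(True, [True, True]) == "green"
assert ci_build(True, [True, False]) == "red"   # one failing unit test
assert ci_build(False, []) == "red"             # compile error
```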

Package publishing

Package publishing may tie into binary package consumption. You should use a version that matches the build and deployment number for full transparency (if a deployment breaks, you should be able to find the matching package quickly). The package should be hosted somewhere accessible by projects, such as NuGet feeds.
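For example, a versioning scheme that embeds the build number might look like the following sketch (the format shown is an illustration, not a required standard), so a broken deployment can be traced back to its exact package:

```python
# Hypothetical versioning scheme: the package version embeds the build
# number, so the package matching any build or deployment is easy to find.

def package_version(major, minor, build_number):
    return f"{major}.{minor}.{build_number}"

v = package_version(2, 1, 347)
print(v)  # "2.1.347" - build 347 produced exactly this package
```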

Automated testing

Testing should include unit, integration, automated UI, web performance, and load tests. Tests should be able to run independently and as often as possible without breaking, such as in continuous integration builds or automated deployments.
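As a small illustration of independently runnable tests, each test below creates its own inputs rather than relying on shared state (the function under test is invented for the example), so the suite can run in any order and as often as needed without breaking:

```python
def apply_discount(price, percent):
    """Return price after a percentage discount, rounded to cents."""
    return round(price * (1 - percent / 100), 2)

# Each test builds its own fixture (no shared state), so it can run in
# any order: in a CI build, after a deployment, or on a schedule.
def test_ten_percent():
    assert apply_discount(100.0, 10) == 90.0

def test_zero_percent():
    assert apply_discount(50.0, 0) == 50.0

test_ten_percent()
test_zero_percent()
print("all tests passed")
```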

Continuous delivery

Continuous delivery means that deployments are triggered automatically after a build completes. The process should include picking up a compiled package of code, setting the target environment in the desired configuration state, and deploying automatically to environments. It may include approvals into QA or production environments.
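The flow above can be sketched as follows (the environment names and the approval mechanism are simplified assumptions): the same package promotes through environments in order, pausing where a human approval is required.

```python
# Hypothetical sketch of continuous delivery with approval gates:
# the build output promotes through environments in order, but QA and
# production require an explicit approval before deployment continues.

def deploy(package, environments, approvals):
    deployed = []
    for env in environments:
        if env in ("qa", "production") and not approvals.get(env, False):
            break  # wait for a human approval before continuing
        deployed.append(env)
    return deployed

envs = ["dev", "qa", "production"]
print(deploy("build-347", envs, {"qa": True}))
print(deploy("build-347", envs, {"qa": True, "production": True}))
```

The first call stops before production because that approval is missing; the second promotes the same package all the way through.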

Backlog

One of the primary cultural shifts required for adopting DevOps practices is at the backlog level. The novel, The Phoenix Project, designates the following types of work:

  • Business projects: work that comes from business initiatives.
  • Internal IT or engineering projects: infrastructure or operations projects spawned by business projects, or internally generated improvement projects. These items should be tracked in the same backlog.

Note: To continuously improve the system of work and adaptability, it is recommended that at least twenty percent of development and operations teams’ time is allocated towards nonfunctional requirements (security, infrastructure, etc.) to make the process significantly better and more stable. If time is not allocated to this work, technical debt will accumulate and create more unplanned work.

  • Unplanned/recovery work: operational incidents tacked onto planned work, which may cause bottlenecks and confusion with handoffs. This work may also cause long lead times with high utilization, and is the result of not removing technical debt or improving practices.

Version Control

Most organizations use some form of version control to manage their source code. However, not all organizations use version control effectively. Two version control strategies that are sometimes forgotten are (1) setting up an appropriate branching strategy and (2) enabling transparency by linking work items to code.

One approach is to have a single branch (such as the master or main branch) that gets built in an automated continuous integration build, then deploys to a development environment upon every commit or check-in that occurs to the branch. This way, it is easiest to know which changes have been released into the development environment, and eventually into production. The changes automatically deployed into the development environment will then get promoted or copied to the test or staging environment, and then into production. The same compiled code will get deployed to production so that there is never a situation where unchecked code gets promoted to production. Before deploying to any environment, it is important to ensure the changes made align to the business strategy.

An easy way to verify that deployed code has value and is related to work on the backlog (rather than code that does not tie back to any work items) is by linking work items to code changes so that metadata is kept with the changes, and then with build and deployment. By connecting work in progress to code, you can validate feedback from users as well as the changes that were made in the code.
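One simple convention, sketched here with an invented “#&lt;id&gt;” reference format, is to parse work item references out of commit messages so every change can be traced back to the backlog:

```python
# Hypothetical convention: commit messages reference backlog work items
# as "#<id>", so each build and deployment can be traced to backlog value.

import re

def linked_work_items(commit_messages):
    """Extract the set of work item IDs referenced in commit messages."""
    ids = set()
    for msg in commit_messages:
        ids.update(int(m) for m in re.findall(r"#(\d+)", msg))
    return sorted(ids)

commits = [
    "Fix login timeout #1042",
    "Refactor session cache #1042 #1055",
    "Bump dependencies",          # no work item reference
]
print(linked_work_items(commits))  # [1042, 1055]
```

Tools such as work item trackers typically do this linking automatically; the sketch just shows the metadata being recovered from the changes.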

Some organizations use different branching strategies, depending on the type of application being deployed, the degree of coupling, the organizational structure, and the cadence of deployments. A general rule-of-thumb is to promote the simplest branching strategy possible to keep maintenance low and sustainability high. If you have long-lived branches that are infrequently merged, you are accumulating a subtle form of technical debt.

If your organization does not currently use version control, it is important to start using it to version your code. A good way to start is to learn more about Git repositories, and distributed vs. centralized version control.

Continuous Testing

Continuous testing is the execution of tests repeatedly against a code base and deployment environment. In practice, continuous testing is the most difficult part of a continuous delivery pipeline to keep up to date. Continuous testing provides quality gates throughout the pipeline and increases confidence in code long before production.

Automation is the key: automated unit, integration, coded UI, and load tests are common in continuous testing.

Use an iterative and incremental approach: depth of testing often progresses as an environment gets closer to production.

Beta Testing and Progressive Exposure

Beta testing and progressive exposure are important strategies in DevOps to receive critical feedback in production.

Beta testing: A form of external user acceptance testing where beta versions of an application are released to a limited audience, known as beta testers. Versions are released to groups of people for more testing, and sometimes made available to the public for more feedback.

Progressive exposure: Switching small numbers of users over to a new version of software for feedback, then progressively exposing more users to the features over time.

Both of these techniques involve a small group of users who test new versions of an application. Rapid feedback means you will quickly know which features are relevant and have the information you need to strategize and implement ways to improve the application. Any performance issues will only affect the small number of users testing the new version. These topics are covered in more depth in the section on application performance monitoring later in this course.
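A common way to implement progressive exposure, sketched here as an assumption rather than any specific product's mechanism, is to hash each user into a stable bucket and compare it against the rollout percentage; raising the percentage exposes more users over time without flipping anyone back and forth:

```python
# Sketch of progressive exposure: each user hashes to a stable bucket
# (0-99); users whose bucket falls under the rollout percentage see the
# new version. Widening the rollout never removes an already-exposed user.

import hashlib

def sees_new_version(user_id, rollout_percent):
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket per user
    return bucket < rollout_percent

users = ("alice", "bob", "carol")
exposed_at_5 = {u for u in users if sees_new_version(u, 5)}
exposed_at_50 = {u for u in users if sees_new_version(u, 50)}
assert exposed_at_5 <= exposed_at_50  # rollout only ever widens
```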

Compliance in DevOps

If you work at a financial institution, healthcare company, government agency, or any other regulated industry, compliance is often the first concern when thinking about moving to DevOps. From Sarbanes-Oxley (SOX) and Health Insurance Portability and Accountability Act (HIPAA) to Criminal Justice Information Services (CJIS), some organizations must be more careful when making process decisions.

A primary benefit of many deployment solutions now is that they offer secrets management, for example, Azure Key Vault. A secrets store gives you the ability to abstract passwords, connection string information, and any other variables from all users so that there is no direct access to sensitive data in databases or on machines. Administrators only set up access for the deployment solution to the machine and abstract the passwords.
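The pattern can be sketched as follows; the variable name and the simulated injection are illustrative, and in a real pipeline the secret would come from a store such as Azure Key Vault rather than from the code:

```python
# Sketch: the application never hard-codes a connection string. The
# release pipeline injects it (e.g. from a secrets store) as an
# environment variable, and the code only ever reads the variable.

import os

def get_connection_string():
    value = os.environ.get("DB_CONNECTION_STRING")  # name is illustrative
    if value is None:
        raise RuntimeError("secret not provided by the deployment pipeline")
    return value

# Simulate the pipeline injecting the secret at deployment time:
os.environ["DB_CONNECTION_STRING"] = "Server=example;Database=app"
print(get_connection_string())
```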

An automated release pipeline not only ensures consistent versions across assemblies, packages, builds, and deployments, using automated tests, infrastructure as code, and frequent releases, but also makes compliance more predictable and straightforward. With fast feedback loops, if there are noncompliance issues, corrections can be made more quickly, and the changes can be tracked automatically.

Ken Cheney, from Chef, states “the key to making compliance an advantage is to specify compliance requirements as code, allowing it to be tested just like any other piece of code in the software development pipeline. Previously, manual verification tasks—often tracked through spreadsheets or other arduous methods—can now be proactively addressed as embedded tests in an automated workflow. Security risks are brought to the surface early for faster remediation, so out-of-date software is identified and updated quickly.”
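A compliance requirement expressed as code might look like the following sketch (the two rules are invented examples, not taken from any real standard); it runs as a test in the pipeline, replacing a manual spreadsheet verification:

```python
# Illustrative "compliance as code": policy checks expressed as a
# function that runs in the pipeline and returns a list of findings.

def check_compliance(config):
    findings = []
    if config.get("tls_min_version", 0) < 1.2:
        findings.append("TLS version below 1.2")
    if config.get("admin_password") == "admin":
        findings.append("default admin password in use")
    return findings

# A compliant configuration produces no findings; a non-compliant one
# surfaces its risks early, before the release proceeds.
assert check_compliance({"tls_min_version": 1.2, "admin_password": "x7!"}) == []
print(check_compliance({"tls_min_version": 1.0, "admin_password": "admin"}))
```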

Every organization treats compliance differently, so different tools may be needed depending on requirements. Whichever tools you adopt for compliance, DevOps practices offer predictability and simplification towards maintaining compliant practices.

For more information, see http://devops.com/2016/03/18/devops-help-hinder-compliance/.

Security in DevOps

Along with compliance, security is often another fear when considering DevOps practices. It can be concerning to imagine more automation and fewer manual security checks. According to Wired magazine, “ultimately, DevOps will turn the IT business model on its head with shorter cycle times, automation, and deep cross-functional integration to deliver the next great idea.” By bringing processes such as configuration management and automated testing into focus earlier in the deployment pipeline, fast and predictable releases are possible. Security can be introduced earlier in the process as well.

DevOps can improve security in the following ways:

  1. Component packages can be automatically scanned from a trusted registry.
  2. Using automation and operational tools, security can be addressed when development begins with code analysis, rather than as an afterthought. After fixing the code, the code that enters production is sure to have already gone through security scans and remediation. Failures can break the build.
  3. If a vulnerability is discovered, an automated pipeline can immediately remediate by deploying a new component package.
  4. Using the public cloud creates a dynamic infrastructure. Unlike a static data center architecture, there is no place for persistent threats to hide.
  5. Another option is to automate simulated attacks and stress on the system before it goes to production to validate the code’s responses. If these attacks are added into the build pipeline (such as calling a script and exiting if failed), these tasks can automatically fail the build, and the release won’t get deployed to the next environment.
  6. Once in the production environment, automating tests for security and continuously monitoring will ensure that the application is secure.
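A build-pipeline security gate of the kind described in point 5 might be sketched as follows (the probes here are stubs; a real gate would run actual attack simulations against the system under test):

```python
# Hypothetical security gate: run simulated attacks and fail the build
# if any succeeds, so an insecure release never reaches the next
# environment.

import sys

def run_security_gate(probes):
    """Each probe returns True if the attack succeeded (a vulnerability)."""
    return [name for name, probe in probes if probe()]

probes = [
    ("sql injection probe", lambda: False),    # stub: attack blocked
    ("oversized payload probe", lambda: False),
]
failures = run_security_gate(probes)
if failures:
    sys.exit(1)  # a non-zero exit fails the build step
print("security gate passed")
```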

It’s important to note, however, that security specialists are still required for monitoring and adding security in development to DevOps. There are never any guarantees, but by automating more processes and establishing predictable pipelines and processes, good security practices can be even more consistent.

DevOps in the Cloud

When deploying to different environments in a DevOps culture, one of the first questions that should be addressed is: “what type of environment are we deploying our code to?” The answer depends on the type of application and the current deployment process. There are many options for cloud providers, whether public or private (such as for government), and on-premises is still a common scenario. There are advantages and disadvantages to each, and some actions are easier to accomplish on one type and harder on another.

Easy in the cloud, difficult on-premises:

  • Scale: scale vertically or horizontally to as many machines as you need, limited only by what you are willing to pay. This is not always possible on-premises, where you may be blocked by the physical memory available for VMs or by the hoops you must jump through to acquire new machines.
  • Store data: although pricing varies by cloud provider, storage accounts in the cloud may be much cheaper than hosting your own SAN with enough space.
  • Reliability: many cloud providers offer a service-level agreement of 99.9% or higher, which is not necessarily achievable on-premises.
  • Time and people needed: on-premises, the cost of the time and people involved in setting up and maintaining servers is easy to overlook. Provisioning a machine on-premises may take days; in the cloud, it may take only minutes. Instead of assigning roles to stand up and maintain machines, that time can be spent improving cloud infrastructure or innovating to make the DevOps process better.
  • Security: cloud providers like Microsoft, although they are larger targets for attacks, constantly look for breaches in security with aggressive white-hat hackers. They may well be more secure than your on-premises machines.
  • Cost: cost can be optimized for variable loads, depending on your application architecture and requirements. The cloud makes it very economical to scale out, though not necessarily to scale up.

Easy on-premises, difficult in the cloud:

  • Rework: If major rework is required to get the application into the cloud, it may be too challenging to justify the time and resources needed.
  • Culture: Although not often considered, the culture of an organization greatly affects the move to the cloud. If a rigid organizational culture favors on-premises machines, it may be challenging to move to the cloud.

Change Agents for DevOps

Making the cultural shift to DevOps is difficult. You can be an effective change agent for your organization to successfully guide this shift by understanding business problems and connecting the technical measures that will solve them, and by tracking metrics to align with suitable business results.

For example, when attempting to convince your business stakeholders to automate deployments, instead of starting the conversation with, “we want to automate our release pipeline,” try driving the conversation with: “by focusing on our deployment speed, we have faster time to mitigate issues and faster time to value. Doing this allows us to work in small batches so that we can test for usage and get validated learnings out of every deployment. It’s typical that requirements increase business value about a third of the time, decrease business value about a third of the time, and make no difference a third of the time. We want to fail fast on the ineffective two thirds and do more of the one third that generate improvements. With DevOps, we can work faster to know that we’re developing the right features or pivot if necessary. Ultimately, we can improve and adapt based on being constantly informed by data.”

Drawing the connection between how your organization performs some action and why it does so must come first and foremost. If you can tie the business strategy to technical implementations, you will be able to overcome almost any objection to DevOps.


Case Study: Microsoft Developer Division Moves to DevOps

Over seven years, the Microsoft Developer Division (DevDiv) embraced Agile practices. The division achieved a 15-times reduction in technical debt through solid engineering practices, drawn heavily from XP. They trained everyone on Scrum, multidisciplinary teams, and product ownership across the division. They significantly focused on the flow of value to customers. By the time they shipped Visual Studio 2010, the product line achieved a level of customer recognition that was unparalleled.

After they shipped Microsoft Visual Studio 2010, the team knew that they needed to begin converting Team Foundation Server into a software as a service (SaaS) offering. The SaaS version, now called Visual Studio Online (VSO), would be hosted on Microsoft Azure, and to succeed with that they needed to adopt DevOps practices.

That meant that the division needed to expand their practices from Agile to DevOps. A tacit assumption of Agile was that the Product Owner was omniscient and could groom the backlog correctly. In contrast, when you run a high-reliability service, you can observe how customers are actually using its capabilities in near real-time. You can release frequently, experiment with improvements, measure, and ask customers how they perceive the changes. The data that you collect becomes the basis for the next set of improvements you make.

In this way, a DevOps product backlog is really a set of hypotheses that become experiments in the running software and allow a cycle of continuous feedback. DevOps grew from Agile based on four trends:

[Figure: the DevOps lifecycle, a continuous loop of arrows with two smaller feedback loops branching off and rejoining, and the team at its center.]

Unlike many “born-in-the-cloud” companies, Microsoft did not start with a SaaS offering. Most customers were using the on-premises version of the software (Team Foundation Server). When VSO started, the division determined that it would maintain a single code base for both the SaaS and “box” versions of the product, developing cloud first.

When an engineer pushes code, it triggers a continuous integration pipeline. At the end of every three-week sprint, they release to the cloud, and after four to five sprints, they release a quarterly update for the on-premises product, as illustrated below:

[Figure: the Visual Studio release cadence, showing a continuous vNext cloud cycle alongside sprint cycles that roll up into quarterly on-premises updates (Update 1, Update 2, Update n).]

To learn more about specific aspects of this journey, from source control to live-site culture, and from branching to open source to alignment, see Our Journey to Cloud Cadence: Lessons Learned at Microsoft Developer Division.

To be continued

 
