What is DevOps?

It’s not all about the tools

Created by Laurence J MacGuire a.k.a Liu Jian Ming

ThoughtWorks Xi’An, 2014/12/04

Creative Commons License

The “5 W’s”

  • Who?
  • What?
  • Where?
  • When?
  • Why?
  • … and how?

Waterfall project management

It makes perfect sense

Perfect Sense

The Client

Marketing

BA/PM

The Programmers

And Then The QAs

Meanwhile, as a PM/Marketing

Op will surely deliver

As a Client

The Blurst of Times

Agile Project Management

The Same Roles

The Client

Marketing - BA - Programmer - QA

Harmony / Cooperation (和谐 / 合作)

Such Harmony

Positive Changes

  • Unit Testing
  • Behaviour Testing
  • Integration Testing

TADA!

Continuous Integration

Oh, Wait

Code that only lives in Git is dead.

And where’s the Ops guy?

Developers

Responsibilities

  • Write code
  • Fix bugs
  • Test things

Developers

Requirements

  • Tools (IDEs, editors, debuggers, etc)
  • The Internet (LOL, HuaWei)
  • Clear direction from “Business roles”
  • An Ops team to run their software

Developers

Wants

  • Build cool stuff
  • Use cutting/bleeding edge tools
  • And ship fast, move on to the next cool thing
  • No bugs

Operations

“SysAdmins” & “NetAdmins”

REA’s Operations people

  • Jesse (Network)
  • Javier (General infrastructure)
  • Wigs (Security)

Operations

Network Admins

Network geeks

Operations

Systems Admins

Operations

Security

Stuff they do

  • Set up Active Directory
  • Install/Repair servers
  • Install and maintain network equipment (800+ users)
  • Plan for huge networks
  • Install phone systems
  • Know Linux/Windows/Cisco IOS by heart
  • (All that command-line magic)
  • Restart whole data-centers at a time
  • Last line of defense for any IT related issue

Also

Beer

Operations Redux

Responsibilities

  • Keep systems up and running
    • Up-to-date
    • Secure
    • Working
  • Provision systems
    • Because of growth
    • Because of new requirements

Operations

Requirements

  • Stable/Predictable software
  • Clear/Consistent growth/change plan

Operations

Wants

  • Replace themselves w/ a script
  • Never hear the pager go off

Conflict

  • Cutting/Bleeding edge is NOT stable/predictable software
  • Rushed releases are NOT a clear change/growth plan

Conflict!

  • Dev: “I want {mongodb-2.Xalpha/amazing-non-acid-compliant-‘database’-beta}”
  • Sysadmin: “Nope. It’s unstable”

Crying

Conflict!

  • SysAdmin: “There’s a bug in production”
  • Dev: “It works on my machine”

Poker Face

Conflict!

  • Dev: “Fixed the permissions problem w/ chmod -R 777 /”
  • SysAdmin: “)(!*@#)(^)(@#)(”

FUUUUUUUUUUUUU

Conflict!

  • Dev: “We want to ship this software.”
  • SysAdmin: “Ok. Our next change window is in 18 days.”

Okay

Battle of (two) roles

  • SysAdmins very protective of their turf
  • Developers relinquish accountability

It’s bad. It’s a self-reinforcing cycle and everybody loses.

Continuous Delivery

An Extension of Agile

Cut delivery time (each iteration, or less)

Stability through constant/incremental change

Increase visibility/feedback

We can do that w/ Delivery

But Wait! That fence!

What is DevOps?

  • DevOps is NOT a tool.
  • DevOps is NOT someone.
  • DevOps is NOT a team.

Right. But what is DevOps?

Development & Operations

It’s bridging the gap between development and operations.

DevOps is a way of doing things. It’s an attitude.

Positive Changes

  • Moore’s Law: The cost of virtualisation is minimal
  • Better support for all things distributed

Primitives

  • Server vs Instance
  • Hard-drive vs Persistent Storage
  • Data-center vs VPC

Awesome

  • Composable tooling
  • We can program this stuff

Paradigm Shift

Infrastructure barrier much lower

  • Callable through APIs
  • Requires mostly theoretical knowledge
  • Much less practical experience

Paradigm Shift

Traditional SysAdmin offerings have changed

  • Support these primitives
  • Virtualisation is isolation

We can leverage their work. And we must!

Ten Types of People

There are 10 types of people. Those who can read binary, and those who can’t.

– The internet

There’s more to it

[Good] ---------(Larry)-------------(Internet Explorer)- [Evil]
[Noodles] --(ShanXi)-------(Larry)---(GuangXi)---------- [Rice]
[LvRouHuoShao] ----(Larry)------------------------- [RouJiaMou]

And it’s all good.

Everyone is different.

No one is better.

Everyone has something to contribute.

Eliminating Binaries and Barriers

  • People fall in their own (comfortable) place
  • Some are still at the extremes
  • On average, everyone is closer

How do you think this makes us better, as human beings? As a team?

In Practice

Everyone is more aware of others.

  • Their preferences
  • What annoys them
  • What they know, what they don’t
  • How they can help us, and we them

Bridging the Gap

[Ops] (Javier)----(Colin)-(Karel)---(Larry)-------(WenBo)---- [Dev]

We’ve all shifted towards each other.

Development heavy people

  • Write code, write tests, debug, blah blah
  • Bring software craftsmanship to the ‘opsy’ people’s tools
    • Testability
    • Code quality & metrics
    • Code reviews & pair programming
    • an Agile process

Operations heavy people

  • Maintain infrastructure, etc
  • Bring Operations experience and knowledge to devs
    • Architecture due-diligence
    • Handling scale & resiliency
    • General Ops capability e.g., a bit of paranoia

Collaboration & Experience

Both sides contribute. No one pulls. Everyone is better off.

Enabling Teams

  • Let teams be flexible: Software & versions
  • Let teams be in control: How & when they deliver software

In True Agile Fashion

Continuous Improvement

  • Identify Pain-Points
  • Identify Tech-Debt
  • Refactor & Simplify

Simplify even more. Reduce the barrier, to include more people.

That ‘DevOps’ Role

Ops people that code/Code people that Ops.

Useful in large organisations

  • Follow the 80/20 rule
  • Assigned in development teams
  • Offload work from the real Ops people
  • Hack code vs Hack servers
  • Facilitate role transitions

We Don’t do DevOps

Karel, Lauchlin, DanX, Larry don’t do DevOps

We make our teams better at it.

Is there anything technical?

All you’ve talked about is human resources & project management.

How it’s done. 3 tenets.

Repeatability: Automation.

Visibility: Logging, Monitoring & Alerting.

Flexibility: Adequate response to change.

Repeatability

Repeatability is the ease with which a process can be re-done.

It’s important because it means a process and its dependencies have been thoroughly understood and distilled to a very simple form.

It’s executable documentation.

Repeatability

$ ftp production-site.com
> put index.php
> put lib/something.php
$ ssh ...

Error prone.

$ RAILS_ENV=production rake deploy

Not error prone. Anyone can run it, and see if it succeeds.

Makes few, if any, assumptions.
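To make the contrast concrete, here is a minimal sketch of what a one-command deploy could look like underneath. It is not the talk's actual `rake deploy` task: `run_step` and `deploy` are hypothetical helpers, and each step runs the placeholder command `true` where the real upload and restart commands would go.

```shell
# run_step NAME COMMAND...: run one named step and report clearly
# whether it succeeded or failed.
run_step() {
    name=$1
    shift
    printf '==> %s\n' "$name"
    if "$@"; then
        printf '    ok: %s\n' "$name"
    else
        rc=$?
        printf '    FAILED: %s (exit %s)\n' "$name" "$rc" >&2
        return "$rc"
    fi
}

# deploy: the whole process written down in order.
# The first failing step stops the run, so nothing is half-applied silently.
deploy() {
    # Placeholder commands (true); replace with the real steps.
    run_step "check working tree" true || return 1
    run_step "upload files" true || return 1
    run_step "restart application" true || return 1
    printf 'deploy succeeded\n'
}
```

Calling `deploy` prints each step as it runs, so anyone on the team can run it and see whether it worked.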

Repeatability

Plenty of tools exist

  • AWS
  • Ansible
  • Puppet
  • Chef
  • Terraform
  • SaltStack
  • CFEngine
  • Docker
  • Vagrant
  • Git

Tools: Does it matter?

No!

What it is

Does your tool fit the following criteria?

  • Is it simple?
  • Does the same thing every time?
  • Clearly tells you if something goes wrong?

Good
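"Does the same thing every time" is often called idempotency. As a rough illustration (not taken from any of the tools listed), the hypothetical helper `ensure_line` below adds a line to a file only when it is missing, reports what it did, and fails loudly if it cannot write.

```shell
# ensure_line FILE LINE: make sure LINE is present in FILE,
# adding it only if it is missing. Safe to run any number of times.
ensure_line() {
    file=$1
    line=$2
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        printf 'unchanged: %s\n' "$file"
    else
        printf '%s\n' "$line" >> "$file" || {
            printf 'error: cannot write to %s\n' "$file" >&2
            return 1
        }
        printf 'changed: %s\n' "$file"
    fi
}
```

Running `ensure_line app.conf "max_connections=100"` twice (file name and setting are made up) reports `changed` the first time and `unchanged` the second, and the file ends up the same either way.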

Visibility

Visibility is the ease with which one can extract valuable information.

It is important because we need to know if something goes wrong, and be able to debug it to fix it.

Visibility

Too much information. Hard to extract.

$ cat /dev/random > /dev/console

Too little information. No value.

$ rake deploy > /dev/null 2>&1

Better: Organised, extractable

Nov 30 17:08:45 cnmlarry NetworkManager[1151]:  Activation (wlan0) Stage 4 of 5 (IPv6 Configure Timeout) scheduled...
Nov 30 17:08:45 cnmlarry NetworkManager[1151]:  Activation (wlan0) Stage 4 of 5 (IPv6 Configure Timeout) started...
Nov 30 17:08:45 cnmlarry NetworkManager[1151]:  Activation (wlan0) Stage 4 of 5 (IPv6 Configure Timeout) complete.
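A script can produce the same kind of organised output with very little effort. The sketch below is an assumption-laden example, not part of the original deck: `log` is a hypothetical helper, `APP_NAME` is a made-up variable that defaults to `deploy`, and the messages are invented.

```shell
# log LEVEL MESSAGE...: print one line with a timestamp, a program tag,
# the process id and a level, so lines can be filtered later.
log() {
    level=$1
    shift
    printf '%s %s[%s]: <%s> %s\n' \
        "$(date '+%b %d %H:%M:%S')" "${APP_NAME:-deploy}" "$$" "$level" "$*"
}

# Invented example messages.
log info "upload started"
log warn "retrying upload (attempt 2)"
log info "upload complete"
```

Because every line has the same shape, pulling out just the warnings is a one-liner such as `grep '<warn>'` on the captured output.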

Visibility

Plenty of tools exist

  • CloudWatch
  • New Relic
  • Dynatrace
  • Nagios
  • Sensu
  • Riemann
  • Splunk
  • Loggly
  • Logstash (ELK)
  • etc …

Tools: Does it matter?

No!

What it is

Does your tool fit the following criteria?

  • Is it simple?
  • Identifies when something goes wrong?
  • Tells the right people when something goes wrong?
  • Lets you determine what went wrong?
  • Lets you determine why it went wrong?

Good
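The first few criteria can be sketched in a handful of lines. This is only an illustration under stated assumptions: `check_threshold` and `notify` are hypothetical helpers, the numbers are made up, and `notify` just prints where a real setup would page or message someone. The return codes follow the convention used by Nagios plugins (0 OK, 1 WARNING, 2 CRITICAL).

```shell
# check_threshold NAME VALUE WARN CRIT
# Prints a one-line status saying what was measured and which limit was hit.
# Returns 0 (OK), 1 (WARNING) or 2 (CRITICAL).
check_threshold() {
    name=$1
    value=$2
    warn=$3
    crit=$4
    if [ "$value" -ge "$crit" ]; then
        printf 'CRITICAL: %s is %s (limit %s)\n' "$name" "$value" "$crit"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        printf 'WARNING: %s is %s (limit %s)\n' "$name" "$value" "$warn"
        return 1
    fi
    printf 'OK: %s is %s\n' "$name" "$value"
    return 0
}

# notify MESSAGE: hypothetical stand-in for paging, e-mail or chat.
notify() {
    printf 'ALERT -> on-call: %s\n' "$1" >&2
}

# Made-up reading: 93% used, warn at 80, critical at 90.
message=$(check_threshold "disk usage %" 93 80 90) || notify "$message"
```

The check identifies the problem, the notification reaches a person, and the message itself says what went wrong. Finding out why still needs the logs.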

Flexibility

The tools you use, are they flexible enough to …

  • Fulfill current business needs?
  • Fulfill imaginable/plausible business needs?

Good

The Bottom Line

It’s about making IT delivery

  • As painless as possible
  • As safe as possible
  • As fast as possible

Questions? Comments? Rotten tomatoes?