In the last post, I described the ways in which DevOps and Continuous Integration/Continuous Delivery (CI/CD) enable Agile development, and highlighted some of the many benefits of building a DevOps organization. Here, I’ll take the conversation a step further and focus on what’s required to build a successful DevOps organization once you recognize and accept that DevOps is right for you.
In the last post, I focused on different definitions of DevOps and RO’s opinionated point of view that DevOps involves closer interface and connection among all facets of product development and all members of the product team. Here, I’ll highlight some similarities between DevOps and Agile development, the relationship between DevOps and Continuous Integration/Continuous Delivery (CI/CD), and some overall benefits of a DevOps implementation.
In this post, I’ll provide a breakdown of what many people think DevOps is and what ReactiveOps (RO) thinks it is (“the RO Solution”). I’ll go further to explain why DevOps matters, what cultural and philosophical changes are needed to implement DevOps effectively and how you can know whether your organization will embrace the shift in thinking required. My goal is to provide some insight that might help ensure you implement the DevOps solution that best serves your business.
Everyone wants zero downtime, fully automated deployments, logging, metrics, alerting and more. The list is long, but everyone’s list is basically the same. So we set out to see if we could solve these issues in a repeatable way and grow a business that could scale by templating complex infrastructure for the large group of companies that are too big for Heroku but too small to be Netflix.
Logs are great. They’re easy to handle when you’re running on one or two boxes, harder in a modestly distributed environment, and harder still once you’ve embraced microservices and containers. At ReactiveOps we use Kubernetes for our clients. Centralized logs are critical for them to understand their environments and for us to assist in troubleshooting. Sometimes a client has existing logging tools or services they would like us to use. That’s the case with my current client, who uses Logentries. I’ve compiled some notes and thoughts on getting Logentries running in a Kubernetes environment.
Scaling and automation only work when detailed monitoring is put in place. Monitoring is critical, but a lot of companies don’t make time for it. Monitoring can be complicated, and it’s difficult (and not fun) to think about risk management. What can go wrong? A lot. The very thought can be overwhelming.
You need experienced people to support your infrastructure and solve complex problems. That said, it makes less sense to hire in-house experts to simply push the same buttons to do daily upgrades and to grow your infrastructure on an as-needed basis. And what happens one day when those in-house experts decide to leave? That’s why automated infrastructure is key.
Most businesses grow and shrink over time, and the business you have today isn’t the business you’ll have in six months, let alone in one, three or five years. These business tides – surges and downturns – are key reasons why it’s important to scale infrastructure the right way. The goal is to make sure your infrastructure fits where you are at any particular moment. With today’s technology, that kind of flexibility can be built in.
Whether root volumes on AWS need to be encrypted is a subject of debate. An encrypted AMI is all about protecting data at rest. Some of our clients don’t keep the kind of sensitive information that mandates encryption, while others are entrusted with such data and operate under regulatory or compliance mandates that demand it.
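For clients who do need it, one common approach is to copy an existing AMI with encryption enabled, so the root volume snapshots behind the new image are encrypted at rest. A minimal sketch with the AWS CLI – the image ID and name below are illustrative, not from a real account:

```shell
# Copy an AMI, encrypting its backing EBS snapshots in the process.
# ami-0123456789abcdef0 and the name are placeholders; substitute your own.
aws ec2 copy-image \
  --source-region us-east-1 \
  --source-image-id ami-0123456789abcdef0 \
  --region us-east-1 \
  --name "base-image-encrypted" \
  --encrypted \
  --kms-key-id alias/aws/ebs
```

Launching instances from the copied image then gives you encrypted root volumes without changing anything about how the image itself was built.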
Ansible’s simple requirements make it very easy to get started. Overall, it works extremely well, but once you get a bit deeper, some things can start to cause discomfort. Here are three things I’ve learned (or re-learned) about Ansible the hard way.
In the last kops post, kops 101, I covered what kops is and why it’s the right choice for professional-grade Kubernetes installations. This week, I’ll cover the ReactiveOps Way of setting up Kubernetes with kops.
rok8s-scripts is a set of Kubernetes scripts that help manage the deployment of applications to Kubernetes via a continuous integration environment. rok8s-scripts is ReactiveOps’ fork of the original k8s repository created by Ross Kukulinski. Here, I’ll provide some more context about what rok8s-scripts is, what it does and why you should consider using it.
In my last blog post, I focused on why Docker is beneficial to use and why it offers significant value in a broad range of use cases. In part II of this series, I’ll focus on why using it is much less difficult than you might think. I’ll share some basic commands and examples to show you how easy Docker is to use.
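To give a taste of how little ceremony is involved, here’s a minimal build-and-run sketch; the `hello-app` image name and the trivial Dockerfile are illustrative, not from the post:

```shell
# Write a two-line Dockerfile: base image plus the command to run.
cat > Dockerfile <<'EOF'
FROM alpine:3.18
CMD ["echo", "hello from a container"]
EOF

docker build -t hello-app .   # build an image from the Dockerfile
docker run --rm hello-app     # run it; --rm cleans up the container on exit
docker images hello-app       # confirm the image exists locally
```

That’s the whole loop: describe the environment once, then build and run it identically anywhere Docker is installed.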
Docker is one of the most talked-about technologies of the past year, and adoption rates are increasing rapidly – for good reason. For some developers and operations engineers, Docker may seem confusing and hard to use, but I’d argue that it’s just misunderstood. Here, I’ll explain what Docker is, then walk through why it’s beneficial to use for development and operations.
Amazon Elasticsearch Service is a managed service intended to make it easy to deploy, operate and scale Elasticsearch clusters in the AWS Cloud. When we first looked at Amazon Elasticsearch Service shortly after it was released in October 2015, we weren’t very impressed. I recently took another look, and much of what I found surprised me. Below, I’ll highlight some of these observations, including supported versions, access controls and expanded dedicated master choices, along with a few additional features.
kops (Kubernetes Operations) is the one-stop, open source solution for deploying Kubernetes clusters from the command line. kops was designed to make installation of secure, highly available clusters easy and automatable on AWS. (As the project grows, support for other cloud providers continues to improve as well.) kops currently focuses on full-cycle provisioning – from networking and security to the installation of the software on the instances that will make up your cluster.
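As a sketch of what that full cycle looks like in practice – the cluster name, zones and S3 state-store bucket below are hypothetical:

```shell
# kops keeps cluster specs in an S3 "state store"; bucket name is a placeholder.
export KOPS_STATE_STORE=s3://example-kops-state
export NAME=demo.k8s.example.com

# Generate the cluster spec (networking, instance groups, etc.) in the state store.
kops create cluster --zones us-east-1a,us-east-1b,us-east-1c "$NAME"

# Apply the spec: provision the AWS resources and install Kubernetes on them.
kops update cluster "$NAME" --yes

# Wait for nodes and system components to report healthy.
kops validate cluster
```

Because the spec lives in the state store rather than on someone’s laptop, the same commands can recreate or modify the cluster repeatably.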
Developers, operations teams, and business users have always been nervous about deployments because maintenance windows had a tendency to expand, causing unwanted downtime. It’s no surprise then that developers have always wanted visibility into deployments. After all, it’s their sweat and pride on the line. Operations teams, in turn, have traditionally guarded their territory so no one would interfere with their ability to get their job done. But the days of secret back-office deployments are gone.
When I started ReactiveOps almost two years ago, we set out to create a Ruby on Rails-like framework for cloud infrastructure on AWS – to stitch together the best open source components and write as little of our own code as possible, thus providing something greater than the sum of its parts. Our goal was to offer that infrastructure to small to mid-sized companies, enabling them to leverage the platform rather than hiring an in-house DevOps team and building their own.
Why outsource operations if you can simply hire a bunch of operations engineers and administrators and build and maintain your own infrastructure? It’s an important question. Let’s take a deeper look at the top 4 concerns small to mid-sized SaaS, web and eCommerce companies have about outsourcing DevOps work.
High availability is the expectation that a system will operate continuously for a significant span of time. For example, with 8,760 hours in a year, 99% availability means over seven hours of downtime a month – nearly 88 hours over the course of that year. In turn, 99.9% availability (“three nines”) adds up to over eight hours of unplanned downtime a year, while 99.99% (four nines) translates into under an hour.
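The arithmetic behind those “nines” is simple enough to script. A quick sketch (the helper function is just for illustration, assuming a non-leap year of 8,760 hours):

```shell
# Convert an availability percentage into expected downtime per year and month.
availability_downtime() {
  awk -v pct="$1" 'BEGIN {
    yearly = 8760 * (1 - pct / 100)   # hours of downtime per year
    printf "%g%% -> %.1f h/year (%.1f h/month)\n", pct, yearly, yearly / 12
  }'
}

availability_downtime 99      # 99% -> 87.6 h/year (7.3 h/month)
availability_downtime 99.9    # 99.9% -> 8.8 h/year (0.7 h/month)
availability_downtime 99.99   # 99.99% -> 0.9 h/year (0.1 h/month)
```

Each extra nine cuts the downtime budget by a factor of ten, which is why the cost of availability climbs so steeply.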
Lost data can mean lost business, so how can you make sure the cloud doesn’t bring you down? By keeping close tabs on what information has changed, who changed it and when. That means identifying all of the failure or disaster situations that can occur and their potential business impact – in advance.
Picture this: the operations team has spent months developing a hard-won understanding of technical bottlenecks and application-specific performance issues, only to discover that this knowledge is no longer valid after a new feature launch fundamentally alters the way the app runs. The result? Slowdowns, errors and crashes. What do you do?
We’re all familiar with the most common threat vector – a hacker or other bad actor gains access to your cloud infrastructure to exploit system vulnerabilities. A less talked-about threat vector is the purposeful or accidental deletion of data in house. Either way, sooner or later the threat is sure to become a reality. So how can you be prepared when that time comes?
Wherein we address the challenges in determining the kind of partner you should choose when seeking outside help with your cloud infrastructure.
5 Takeaways from KubeCon: Kubernetes is quickly becoming the essential Docker container orchestration tool in the DevOps world.
If you don't monitor and log all the things, then you'll always be playing at least partially in the dark.
Leveraging a Platform as a Service (PaaS) is a great way to quickly build, innovate, and deploy a new product or service.
I’ve had my eye on Amazon’s Lambda and API Gateway services for a few months. Running a web application without the costs or headaches of maintaining servers is attractive. Of course, no platform is without tradeoffs.
Terraform doesn’t have support for conditionals on resources. There’s nothing like Ansible’s `when` statement to conditionally create Terraform resources based on a boolean variable value. At least not yet.
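The workaround most people reach for is to abuse a resource’s `count`: drive it with a variable set to 0 or 1, so the resource is created only when the flag is on. A sketch, with illustrative resource and variable names:

```hcl
variable "create_bucket" {
  default = 0    # flip to 1 to create the bucket
}

resource "aws_s3_bucket" "logs" {
  count  = "${var.create_bucket}"   # 0 copies or 1 copy of this resource
  bucket = "example-log-bucket"     # illustrative name
}
```

One consequence: anything referencing a counted resource’s attributes has to use the splat syntax (e.g. `aws_s3_bucket.logs.*.id`), since there may be zero instances.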
Riffs on microservices, docker, AWS, and serverless architecture/AWS Lambda.