If they’re not seeing it, it does not exist


This long hiatus on the blog allows me to pick up this old draft and introduce its idea in a very organic way. Looking at the activity on these articles, one might think that I completely forgot about my website. The reality is that I have a dozen half-written drafts, three different code projects and some opinion articles that I did not dare to publish (more on that at a later date, perhaps). However, from a reader’s point of view, this site has been dead and abandoned since 2017.

This is something that I see happening a lot in big organizations with DevOps adoption efforts. When a chance for improvement is spotted, and it affects either the tools available or a repeatable process, it is usually followed by a Proof-of-Concept. I am not referring to a company-wide push or a strategic decision, but rather to some quality-of-life change, or perhaps an attempt at automating something when there’s no established framework available for it, especially given the current culture of the company. Whether this effort succeeds or not does not matter: the problem is that the findings from the exercise are kept within the team instead of being shared, because they are not mission-critical, or out of fear that they might be seen as a waste.

When the effort is successful, a small part of the company suddenly becomes more agile, their KPIs improve, and they grow steadily. However, if this improvement is not made known openly, not everyone has the chance to take advantage of the new asset.

In all likelihood, the things not openly discussed are failures. The business world tends to see failure as negligence, as the result of incompetence or as the consequence of a disaster. Yet an unsuccessful attempt at improving something might indicate that either there is no margin for progress or, more likely, that there is a block or an oversight. This issue can be related to technology, or to immovable procedures related to business logic. And unless someone involved in management is participating in the experiment, the effort will go nowhere.

The bigger the company is, the bigger the chance that your department or group is missing out on potential enhancements. So, how can we help distribute this knowledge? I have previously talked about the concept of Communities of Practice. Associating coworkers this way provides the perfect space to discuss the small improvements that can turn into a differentiating value in the long run.

Starting a community in your workplace


You may not be familiar with the concept of a “community of practice” (CoP from now on), so let me start this entry by giving a short overview. Essentially, we will be talking about a group of people with a trade or craft in common who meet regularly with the aim of sharing knowledge, best practices, and experience. Although this is a very “startup” thing to do, it has been gaining presence within larger companies that aim to improve in more ways than just on the technological level.

And this is where I come in. I want to convince you to start this movement within your business. Let’s focus on three big questions: why you should start a set of CoPs, who should attend, and where these meetings would take place.

Why: creating communities is more than just plain networking, or socializing. The main benefit is that you are empowering your workers with means and tools to share expertise and transfer knowledge.

Who: this is a tricky issue. Don’t make a community a mandatory activity. Instead, openly announce the creation of such groups, and let the most enthusiastic employees gather together. To achieve this, a representative moderator should be chosen as the figurehead of the community: someone who has strong knowledge and skills on the issue at hand. The community moderator should be a team player, because every member of the CoP brings a different skill and point of view to the table.

Where: regular in-person meetings are preferred, so a specific, recurring room would work best. If this cannot be arranged, prepare an online space in a collaborative tool such as Confluence, or use social media to let all members stay in touch and interact.

Although they might not be formalized yet, chances are high that CoPs are already taking place in your organization, either at a coffee break or during lunch. Take the chance to enhance such structures and allow your people to improve their productivity.

Dealing with distractions as a DevOps

Being a transverse department, every DevOps must learn how to deal with issues. It’s a common pitfall that, embracing the agile philosophy, the team ends up accepting word of mouth as a tolerable method to track or report problems. Although having a chat can be a good initial contact, it has the potential to turn any work environment into a toxic one. Let’s take a look at practical approaches to avoid these productivity killers.

Prioritize business demands

Not every task is born equal. Most ticketing systems allow you to define a level of priority and/or impact when reporting a concern or issue. Having a view that lists all these tasks sorted by their urgency allows for easy dispatching of the most pressing matters. Solving an environment block does not carry the same gravity when it affects a regular feature at the beginning of a sprint as when that very same issue affects a critical hotfix.
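
As a rough illustration of such a sorted view, the sketch below orders a handful of tickets by urgency. The `Ticket` fields, the 1-to-4 scales and the sample tickets are all assumptions made up for this example; a real ticketing system will have its own names and levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    """A hypothetical ticket; the field names are illustrative only."""
    key: str
    summary: str
    priority: int  # 1 = most urgent, 4 = least urgent (assumed scale)
    impact: int    # 1 = blocks a release, 4 = cosmetic (assumed scale)


def dispatch_order(tickets):
    """Return the tickets sorted by priority first, then by impact."""
    return sorted(tickets, key=lambda t: (t.priority, t.impact))


backlog = [
    Ticket("OPS-3", "Environment block on a regular feature", 3, 3),
    Ticket("OPS-1", "Environment block on a critical hotfix", 1, 1),
    Ticket("OPS-2", "Slow build agent", 2, 3),
]

for ticket in dispatch_order(backlog):
    print(ticket.key, ticket.summary)
```

The same environment block shows up twice in this made-up backlog, and the one tied to the hotfix is dispatched first.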

Learning where to relieve pressure is just another tool in the DevOps toolbox (and, actually, in any other department).

Timebox non-focus tasks

Set a maximum allotted time to deal with those tasks that do not belong to the current effort. Chances are high that you will have to deal with obstacles belonging to any of the main DevOps branches (development, quality and infrastructure), but as a team you will also have your own goals and objectives. Make time for both internal and external issues, and obey these restrictions responsibly.

Reconcile with the idea that there will always be work to do

In my experience, developers and QAs who make the jump to DevOps have the most trouble with this point. For someone used to dealing with a small task backlog that is strictly defined in intervals, moving to a position where the focus can potentially change several times a day can be stressful. The impact of small blocker tasks can be underestimated, and thus they can feel that their own effort is useless.

Learn to let go. You must be a reliable part of the team, dealing with your share of work. But that does not mean that you must overextend and take care of everything by yourself.

Educate your coworkers

And my last point will be about what can easily become the biggest irritation: the usually impatient workers that, finding their work blocked by difficulties out of their reach, buzz around waiting and hoping that you deal with it. They stop by, breaking your attention and demanding action. The following comic illustrates what they are actually doing when they come by your desk and request help:


Originally posted on http://heeris.id.au/2013/this-is-why-you-shouldnt-interrupt-a-programmer/

I actually use this strip to let them know what happens when they interrupt us without a genuine cause. In a nonchalant way, we must strive to teach them to follow the usual procedures, without letting ourselves be dragged down into the inflexibility of bureaucracy. Talk to your colleagues, and let them know when and what is acceptable, putting emphasis on whether the impact of the issue is worth the interruption.

In conclusion, learning how to manage your own time to help others is as important as knowing how much time you must set aside to perform other duties.

Improve your daily work using gitflow


Gitflow, as originally written by Vincent Driessen, has become one of the most used version control work models. Using gitflow lets you apply a pattern of best practices when developing software… which is actually something that any other workflow model does. The main advantage that gitflow offers over its competitors is that it makes daily work easy. Naming conventions are clear, and a glance at your latest pushes gives you a clear overview of who works on what, and what the status of a release is.

Reduced to its essentials, gitflow uses two branches: develop and master. The first one is used to prepare your next release, while the second one holds your deployments to production. The premise is that all code pushed to master is as stable as it can be. New branches are created and then merged back to any of these two branches. Features grow from develop, while hotfixes are created from your master branch. Finally, your release candidate is created from develop, and it’s your transition from “new iteration” to “current version”.
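
The branch rules above can be written down as a small lookup table. This is only a sketch: the merge targets follow Driessen’s original write-up, while the `feature/`, `release/` and `hotfix/` name prefixes are an assumed naming convention, and `rules_for` is a helper invented for this example.

```python
# Where each kind of gitflow branch starts, and where it is merged back
# once finished. The "kind/" name prefixes are an assumed convention.
GITFLOW_RULES = {
    "feature": {"starts_from": "develop", "merges_into": ["develop"]},
    "release": {"starts_from": "develop", "merges_into": ["master", "develop"]},
    "hotfix": {"starts_from": "master", "merges_into": ["master", "develop"]},
}


def rules_for(branch_name):
    """Look up the rules for a branch named like 'feature/login-form'."""
    kind, _, _ = branch_name.partition("/")
    if kind not in GITFLOW_RULES:
        raise ValueError(f"{branch_name!r} is not a gitflow supporting branch")
    return GITFLOW_RULES[kind]


print(rules_for("hotfix/1.0.1"))
```

Note that both releases and hotfixes are merged into two branches: master receives the code that goes to production, and develop receives the same changes so the next iteration does not lose them.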

Most branching models advocate holding a single main branch, and then tagging specific changes to mark “releases”. While it’s true that gitflow adds a layer of technical complexity, I believe that it improves the readability of a project, as it offers a different perspective. Think of develop as a waiting room, a different space from master, inside which you store changes that will eventually get released. Hotfixing issues in the already released code gives you isolation from the current sprint, which is something critical in any fast-paced environment.

Git allows for very comfortable collaboration between developers. Changes can be shared without using a centralized repository, giving room to tinker with proofs of concept or to try different approaches for a single feature. If something needs to be scrapped, it can be done without cluttering the main line’s logs: it’s as simple as not merging the changes back to the parent branch.

An unexpected added value that comes from using this model is how seamlessly it fuses with CI/CD approaches. Watching for pushes on certain branches makes it simple to decide when code changes are meant to be deployed to integration, pre-production, and production environments. Artifacts for Continuous Deployment can be built up with new features until they are tagged fit for release.
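
A minimal sketch of that branch-watching idea follows. The mapping itself (develop to integration, release and hotfix branches to pre-production, master to production) is an assumption chosen for illustration, as are the `release/` and `hotfix/` prefixes; your pipeline may well route things differently.

```python
from typing import Optional


def target_environment(branch: str) -> Optional[str]:
    """Pick the environment a push should be deployed to, if any."""
    if branch == "master":
        return "production"
    if branch.startswith(("release/", "hotfix/")):
        return "pre-production"
    if branch == "develop":
        return "integration"
    # Anything else (feature branches, for instance) is not deployed.
    return None


for pushed in ["develop", "release/2.0.0", "master", "feature/search"]:
    print(pushed, "->", target_environment(pushed))
```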

Gitflow can be adjusted to big teams as easily as to small ones. It’s intuitive, has a small learning curve, and allows for harmless adoption without heavily altering your workspaces. It is my belief that any semantic-versioned project can benefit from these guidelines, and I would like to encourage you to try it for yourselves.

Culture change does not happen overnight

One of a development department’s most critical endeavours is Change Management. Even in a technological environment, people tend to use the tools and methodologies that they are most comfortable with. Stepping into the unknown throws your workers off-balance, affects their confidence and increases the chance that something might go wrong somewhere.

Switching paradigms is most often the biggest issue. It’s not just about a change of tools, but a transition between ways of thinking: suddenly, the old way of doing things doesn’t cut it anymore. There will be a steep learning curve while your teams learn what works and what does not. And chances are high that they’ll be going back to their code time and again, refactoring what was considered a “closed feature”, since their lack of familiarity with the new culture might have brought an increase in technical debt.

It’s important to allow your people to adjust and learn to do things consistently better. A strong foundation will reduce unforeseen consequences in the future. If you are using an agile approach such as scrum, consider halving your velocity for a few sprints. Yes, that much! This will allow room for tinkering, experimenting, investigating and fixing mistakes. A “sprint zero” of sorts is highly encouraged. Something as simple as a freeform week to test new stuff will do wonders for the outcome of your migration.

Take things slowly. If there are several teams or projects, work on your transfer one step at a time. Turning the department upside down with innovation could bring things to a halt, which we obviously do not want. Instead, turn your workers into “ambassadors” of change. A team that has successfully undergone its change process can support the rest of their coworkers. A meetup or knowledge transfer space smooths the bumps in the road, as some other colleague might have run into the same issue that is blocking a team’s effort. If they work together, they are building a strong sense of community, and actively creating business culture.

After the whole ordeal is behind you, your duty is not yet done. Now it’s time to gather feedback. You should be able to get some KPIs, or retrieve hard data that backs up what these improvements were trying to accomplish. Are things back on track? What are the rougher spots? Are further actions or refinements needed? These are the questions that you should find answers for.

If things are not going as expected, you should ask yourself: what to do if things do not get better? There is no easy answer to this question, but your own workers will help you figure it out. Listen to their feedback to find the missing pieces in your puzzle, and prepare a new plan to tackle those issues. Change management does not happen overnight, nor does it ever truly end!

The problems of legacy systems


Many companies do not understand that they are actually “technological companies”. The fact that their “product” (meaning whatever item or service they base their business on) is not directly tied to software development weighs so heavily on their mindset that all development efforts are pushed down their priority list.

However, the reality of current corporate models makes all financial efforts dependent on the programs they work with. Nowadays we cannot imagine a bank succeeding without strong online support, or a retail company with no e-commerce in place making any profit at all.

Cutting corners in their infrastructure only limits their possible growth in the future. Startups got it right: software must adapt to the business needs, and not the other way around. But that insight is not something that larger and older companies could afford to put to the test a few years ago. Faced with an urgent need, they invested in systems that are now obsolete, and today they struggle to keep up with the new development paradigms and architecture strategies.

Slowly, those companies are now understanding that it was a mistake. Let’s take a look at what I consider the most pressing issues of legacy systems:

  • Reliant on outdated hardware: the first one is pretty obvious. Having a special snowflake in your server cluster or virtualizing a specific setup in order to run your programs is a pretty good indicator that you need to upgrade your software.
  • Poor integration capabilities: if adding new pieces to your architecture puzzle becomes a struggle, and you must invest more hours in making new programs work together with your existing ones than you spent building the new software, you’re wasting your team’s effort. Money that could be spent on new features is instead thrown down the drain validating quirks, bugs and unusual behaviours.
  • Lost knowledge: due to either bad documentation, lack of knowledge transfer, or both, you’ve lost track of how certain parts of the system work. These applications become some “strange magic” that does work and nobody knows how or why. While this may feel like a lesser issue, you’re building upon a house of cards. It might be a good exercise to take into account what could be done if one day, for whatever reason, this program stops working.
  • Frankenstein’s Monster: this is a personal nickname for those heterogeneous systems that rely on different bits, technologies, languages and architectures to solve a single requirement. Two different databases tied to each other, along with a compiled program that exposes an API that a front-end then parses to show data, for example. Spinning up servers to hold these chimeras may not be worth it in the long run, and designing a brand-new application to do the same thing might be cheaper than you think.
  • Bad performance: finally, for some businesses, your application performance can be tied to your profit. Users are whimsical, and dealing with services that do not meet their expectations will have an impact on your image.

If some of these issues, or all of them, apply to your company, change is a necessity. The harsh reality is that the cost to fix your legacy system will only increase with time.

But don’t get depressed by this message: change is always possible. There is talent available that can help through the process of transformation, allowing you to start reducing your technical debt before it is too late.

Define a valid Rollback strategy

How do you deal with broken builds? Applying methods like Continuous Integration or Continuous Delivery usually makes the lower development environments (typically tagged “integration” and “UAT”) unstable. As several teams merge their changes to the software, the end-to-end tests might fail because of unforeseen circumstances: datasets may not be valid for a certain version, the models linked to the database might not reflect the state of a table, or miscommunication between members could produce software that just plainly refuses to start.

It just happens. We have to keep in mind that detecting errors as soon as possible means that we have the time to fix and polish the product. We actually want those environments to be broken as soon as possible.

However, that could also imply downtime for the working teams. Breaking a development environment is no big deal, as long as we have a mechanism in place to reinstall a working version of our programs as soon as possible. We call that “a rollback”. You might have heard the term being applied to databases. After executing a script to change the data or structure, if something goes wrong, this operation returns your information to a working state. The concept is just the same, in a different context.

There are several ways to deal with rollbacks. The most common involves a blue/green strategy: you install each version of your application on its own path, and access it through an indirection, such as a symbolic link, that can be switched to the new destination. If something goes wrong, your rollback becomes as easy as changing this symbolic link back.
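
The link switch can be sketched in a few lines. This assumes a POSIX system, where renaming one link over another is atomic; the `activate` helper and the directory layout in the usage note are invented for this example.

```python
import os
from pathlib import Path
from typing import Optional


def activate(release_dir: Path, current_link: Path) -> Optional[Path]:
    """Point current_link at release_dir and return the previous target."""
    previous = None
    if current_link.is_symlink():
        previous = Path(os.readlink(current_link))
    # Build the new link under a temporary name, then rename it over the
    # old one, so readers never find the link missing (POSIX behaviour).
    tmp_link = current_link.with_name(current_link.name + ".tmp")
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(release_dir, tmp_link, target_is_directory=True)
    os.replace(tmp_link, current_link)
    return previous
```

A deployment would call something like `previous = activate(Path("/srv/app/releases/1.1.0"), Path("/srv/app/current"))` (hypothetical paths), and a rollback is just `activate(previous, Path("/srv/app/current"))`.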

A repository with tagged artifacts works just as well. Each change in your source code generates a new version that is stored somewhere in your infrastructure. Then, you provide some tool that can request a specific version and overwrite the one currently installed on your server.
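
As a toy version of that idea, the sketch below keeps one package file per version in a directory and copies the requested one over the installed file. The `ArtifactRepository` class, the `app-<version>.pkg` naming and the single-file package are assumptions made for the example, not a description of any particular artifact manager.

```python
import shutil
from pathlib import Path


class ArtifactRepository:
    """A directory holding one package file per tagged version."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, version: str) -> Path:
        return self.root / f"app-{version}.pkg"

    def publish(self, version: str, package: Path) -> None:
        """Store a build under its version tag; tags are never reused."""
        if self._path(version).exists():
            raise FileExistsError(f"version {version} is already published")
        shutil.copyfile(package, self._path(version))

    def install(self, version: str, installed: Path) -> None:
        """Overwrite the installed package with the requested version."""
        if not self._path(version).exists():
            raise LookupError(f"version {version} was never published")
        shutil.copyfile(self._path(version), installed)
```

Rolling back is then the same operation as deploying: ask `install` for the last version that worked.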

Talk with your teams to find the most suitable strategy for your business. A quick way to solve issues means that you are making sure that downtime is as short as it can be, so that your talent can be focused on what really matters: your product.

How to recruit a DevOps


In my blog I usually talk about technology on a “management” level. My aim has been to explain technical concepts without the “technical” part. Why should any company invest money in something, unless they fully understand what its benefits are? Today I am going to drift away from this tendency, and offer a helping hand to recruiters. What makes a good DevOps? How to find out if someone fits in this role? Let’s talk about those things.

First, if you haven’t read this entry, I would recommend that you do so. It’s a brief introduction to what DevOps as a role is. I talk about silos, about communication, and how there is no specific “role description” for a DevOps unless you understand the business culture that the position will fit into. It is important to understand this bit: “DevOps”, as it is currently understood, is a verb, not a noun. Someone “does” DevOps.

Most often, DevOps positions are related to deployment pipelines, release management, architecture and provisioning. Creating packages and deploying releases is one of the most widely accepted responsibilities of such an occupation. This is because the process of creating and publishing a release is the main intersection between the Development and Operations departments. Thus, DevOps eases the joint effort to move new code forward.

Environment handling is also a burden on a DevOps’s shoulders. Keeping the machine settings consistent for both the developers and testers collaborating on a release is critical for any mission. Again, the DevOps role works toward helping two units work together.


Going back to the post, let’s focus again on the image that accompanies the text. Specifically, take into account that it shows the three different departments we mentioned so far and brings them together: Development, Quality and Operations. Thus, now we can answer the first question.

What makes a good DevOps? Experience with these three branches of software development. Skills related to this process. I am even going to boldly state that you only need a good grasp of any two of these requirements. A developer with an infrastructure background could work. A QA tester with knowledge of automation and scripting will do. An operations agent who writes their own scripts to provision and deploy code from source control can be a perfect fit.

Now, on to the second question: How to find out if someone fits in this role? The answer is the final ingredient in the mix of skills that I just brushed over. A DevOps is a team player. Find a collaborator, an enthusiast. Someone passionate, driven to reach their objectives. In my opinion, that is the perfect fit for a DevOps.

Everything as code

Your version control tool keeps your application’s source code safe by storing all revisions and changes that were done previously. This makes it easy to roll back to a previous state where things worked correctly in the case that a modification puts a release at risk.

It is possible that you are already keeping the structure and data files that form your database in your version control tool as well – so you have in place “database as code”. Storing your schema definition and minimum data in a centralized tool allows your team to deploy several instances easily in an automated, repeatable way. We already talked about versioning your database previously.
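
To make “database as code” a bit more concrete, here is a small sketch using SQLite from the Python standard library. The table, the seed row and the `create_instance` helper are invented for the example; in a real project the two scripts would be files in your repository rather than inline strings.

```python
import sqlite3

# In a real project these two scripts would live as files in version
# control; they are inlined here to keep the sketch self-contained.
SCHEMA_SQL = """
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
"""

SEED_SQL = """
INSERT INTO customer (id, name) VALUES (1, 'Test customer');
"""


def create_instance(path=":memory:"):
    """Build a fresh database instance from the versioned scripts."""
    connection = sqlite3.connect(path)
    connection.executescript(SCHEMA_SQL)
    connection.executescript(SEED_SQL)
    connection.commit()
    return connection
```

Calling `create_instance()` twice gives two independent databases with the same structure and minimum data, which is the repeatable part of the deal.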

Thus, if it’s possible to store your scripts to execute them on demand, what else can we apply this model to?

Defining and scripting your infrastructure could be the next step. Describing the requirements for the machines that will execute your application turns your dedicated fixed server instances into a dynamic environment where machines can be discarded and re-instantiated in minutes. This means that, if a specific configuration is troublesome, there’s no need to open a ticket and wait for someone to fix the mess, blocking your team while this happens. Instead, issues can be detected easily, allowing room to experiment with new values.
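
The gist of “describe, discard and re-instantiate” can be sketched without any real cloud involved. Everything below is hypothetical: the `MachineSpec` fields, the `plan` function and the action names only illustrate the idea of comparing a written description against what is currently running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineSpec:
    """Desired state of one machine; all fields are illustrative."""
    name: str
    image: str
    cpus: int
    memory_gb: int


def plan(desired, running):
    """Compare desired specs with running machines and list the actions.

    Both arguments map a machine name to its MachineSpec. A machine whose
    spec drifted is discarded and re-created instead of patched in place.
    """
    actions = []
    for name, spec in desired.items():
        if name not in running:
            actions.append(("create", name))
        elif running[name] != spec:
            actions.append(("recreate", name))
    for name in running:
        if name not in desired:
            actions.append(("destroy", name))
    return actions
```

Since the description is just a file in version control, trying a new value means editing a field, running the plan again, and reverting the commit if the experiment does not work out.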

Automating your tools grants the chance to try new things. If they don’t work you can discard them, whereas before time had to be invested to fix and roll back modifications.

After “infrastructure as code”, we can talk about “configuration as code”. Since not every environment should respond the same way to certain parameters, these differences must be documented somewhere. Otherwise, you risk losing this information. Including this description as part of your code makes sense if you are booting virtual machines from descriptor files, as most of these systems already have tools in place to apply variables such as these.
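
One simple way to document those differences is to keep shared defaults plus a short list of overrides per environment. The keys, values, host names and the `settings_for` helper below are all made up for this sketch.

```python
# Defaults shared by every environment (all keys and values are made up).
DEFAULTS = {"log_level": "INFO", "db_host": "localhost", "replicas": 1}

# Only the differences are written down per environment.
OVERRIDES = {
    "integration": {"log_level": "DEBUG"},
    "uat": {"db_host": "uat-db.internal"},
    "production": {"db_host": "prod-db.internal", "replicas": 3},
}


def settings_for(environment):
    """Merge the shared defaults with one environment's overrides."""
    if environment not in OVERRIDES:
        raise KeyError(f"unknown environment: {environment}")
    return {**DEFAULTS, **OVERRIDES[environment]}


print(settings_for("production"))
```

Because the overrides are stored next to the code, every change to an environment goes through the same review and history as any other commit.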

Code, database, infrastructure and configuration as code models allow you to benefit from a fully automated setup for your development teams. This reduces the time necessary for bug fixing and issue solving, and empowers every employee to have full control of the environment they work on.

The benefits of automated testing

Read this statement carefully: if your development pipeline does not include any degree of automated testing, you are wasting money. Don’t lose your temper, breathe slowly… Now read that again. Yes: you are losing money. Testing is burning up your budget with each iteration and, somehow, you are accepting this as “adequate”. I can give you two choices about it: you may find comfort in knowing that you are not alone at all (the amount of manual testing that is still being performed in the industry is astounding), or you could do something about it.

I will lay out three simple points on which automated testing surpasses human inspection. And after that, I’ll leave you to make a choice.

First: performing regression tests the old-fashioned way may not cover all possible scenarios. And take note of that “may”, because that’s precisely the key to this issue. “May” or “may not”, because no hand-operated process is fully reliable. Humans are prone to errors. Companies tend to mitigate this concern by duplicating: what could slip through the fingers of one operator might be caught by another. You may even add a third redundant check, just to make sure, in the name of quality.

If this is the road we are travelling, then every team has two, or even three, assigned testers working on their QA. Or perhaps a cross-team pool of testers was conceived to work on different projects at the same time. Whatever the specific choice, you have too many employees engaged in what is essentially the same task. This is not fail-safing, but profit squandering.

How long does your proofing phase last? A week? Two? How many additional environments do you need to run those checks? Testing, pre-production, demo… Is your testing team applying regression more than once on each environment for some reason? Must specific data-sets be provided for a release inspection? How much time do these data-sets take to be prepared? Every additional step, every extra degree, is a potential delay to your time-to-market. What happens when something does not fit the requirements? A bug found on the fourth day of testing means that the whole release goes back to the board, and must be re-scheduled. This is the second concern I am raising about not having automated testing.

Covering the basic steps of your product with automated testing shortens your reaction time to trouble. And if you’ve been in this business long enough, by now you know that there is always trouble of some sort. Taking a big load off the shoulders of your QAs will increase the amount of effort that they bring to the table, so they can focus on the specifics instead of the routine.

I will now make my final point, the one that I believe hurts the most: manual testing makes the effort of your specialists worthless. Sooner or later a new release will be created, and they will have to start from scratch. Their work is not creating any lasting value at all. Knowing that, like mindless drones, they have to do the same thing again and again damages their morale, and quickly burns out any employee. Testing should be increasingly automated because of this: with each iteration a bigger percentage of the functionality is covered. Each new automated check is repeatable, runs the same way every time, and improves your safety net.
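
For readers who have never seen one, this is roughly what a tiny automated regression suite looks like, using the `unittest` module that ships with Python. The `shipping_cost` rule and its numbers are a made-up stand-in for a piece of your product; the point is only that the checks are written once and then re-run on every release.

```python
import unittest


def shipping_cost(order_total):
    """Hypothetical business rule: free shipping from 50, otherwise 4.99."""
    if order_total < 0:
        raise ValueError("order total cannot be negative")
    return 0.0 if order_total >= 50 else 4.99


class ShippingCostRegression(unittest.TestCase):
    def test_small_orders_pay_shipping(self):
        self.assertEqual(shipping_cost(49.99), 4.99)

    def test_threshold_gets_free_shipping(self):
        self.assertEqual(shipping_cost(50), 0.0)

    def test_negative_totals_are_rejected(self):
        with self.assertRaises(ValueError):
            shipping_cost(-1)


def run_suite():
    """Run the regression suite and report whether every check passed."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(ShippingCostRegression)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Each new feature adds a few more test methods, and `run_suite()` can be called by your pipeline on every push instead of by a person on the fourth day of a testing phase.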

If you’re with me so far, this leaves you to make a choice. This is usually when all the doubts and excuses pop up. “Requires specific workers”. “No technology covers all of my needs”. “It’s too late to start now with something new”. Well, the fact is that you don’t have to go fully automated on your first sprint. Start small, then go big. That’s how business is done and how success is achieved, isn’t it?