Lead an autonomous DevOps team at Scale: a true story

Microsoft Ignite 2016 conference

Jose Rady Allende and Matthew Manela provide the next update to the Microsoft ALM story, sharing screenshots of how they refined their use of Kanban boards as their agile experience and DevOps maturity grew.

Synopsis and Lessons Learnt

Agile / Kanban / Scrum

  • In the beginning, development and testing were separate teams, and the Kanban board had separate development and testing columns. All testing was manual. No testing on a user story could start until all of its development was completed (and deployed).
  • When the development and testing teams merged into a single ‘engineering’ team, some unit tests were developed in parallel with the code, so there was no longer a need to wait for all the development of a story to be completed before any testing started. The Kanban board had a single ‘engineering’ column where the team developed and tested a story. However, there were very few automated tests and not everything could be fully tested, so there was still a need for a ‘validation’ column to run the manual regression tests once each story was fully complete and deployed.
  • Over time, test automation at lower levels increased and the team significantly cut the time spent in ‘validation’. DevOps practices were adopted, such as automated builds and deployments and feature toggles, and the code branching strategy was improved. A pull request column was inserted between implementation and validation to improve visibility of work that is complete but waiting for code (and test) review. There is still a validation step after the code is merged and deployed, although the time spent in this column is much shorter than it used to be.
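
The talk summary does not say how the team implemented feature toggles. As a rough illustration only, here is a minimal sketch that assumes flags are read from environment variables. The `FEATURE_` prefix, the function names, and the `new_checkout` flag are all hypothetical and are not taken from the talk.

```python
import os

# Values that count as "enabled" in this sketch (an assumption, not a standard).
TRUTHY = {"1", "true", "on", "yes"}


def is_enabled(flag_name, environ=None):
    """Return True if the hypothetical FEATURE_<NAME> variable is switched on."""
    env = os.environ if environ is None else environ
    value = env.get("FEATURE_" + flag_name.upper(), "off")
    return value.strip().lower() in TRUTHY


def checkout_flow(environ=None):
    """Pick a code path at run time instead of at merge time."""
    if is_enabled("new_checkout", environ):
        return "new checkout"   # merged and deployed, but only live when toggled on
    return "old checkout"       # default path stays in place until the flag flips
```

A toggle like this lets unfinished work be merged and deployed while staying dark, which is one way a team can shorten the wait between a pull request and validation.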

Dealing With Unpredictable Team Velocity / Productivity

  • Early on, the team often delivered late as they struggled with incidents and unplanned work. They now reserve capacity for this work up front when they take work into the sprint, and are more reliable at delivering what they planned.
  • Team members used to work on multiple things at the same time, and frequent interruptions or context switching caused each person to spend a little extra time refocusing on the new priority. Even if it's only half an hour, it adds up every day. Rather than trying to do four things at once, the team now focuses on two. It is more efficient and more effective to do two things really well than to try to do four things at once.
  • When the team had eight things in the sprint at the same time, everyone had different objectives. Now that the team focuses on two things, everyone shares a common goal and the team collaborates much more closely.
    I have heard some agile coaches recommend setting a WIP limit of half the number of team members to force members to collaborate instead of working individually.
  • The team had a single source of knowledge, and so a single point of failure, for some activities such as DBA work or UI design. When critical staff were on leave, the rest of the team struggled to deliver. They later split the team into two sub-teams. The support sub-team was dedicated to incidents and UAT, enabling the core team to focus on delivering the sprint features. The support sub-team was also responsible for gathering feedback from the production environment and stakeholders, and spent effort improving the internal engineering systems. To spread knowledge, members rotate between the core and support teams every few days.
  • Pair programming has accelerated the communication and collaboration between team members and significantly cut rework and poor design decisions.
    Pair programming has had significant success in many other organizations.
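
The arithmetic behind the first few points above can be sketched in a few lines. This is only an illustration: the team size, sprint length, focus hours, and the 25% reserve for unplanned work are made-up numbers, not figures from the talk. The half-hour switching cost and the 50% WIP limit are the only values taken from the text.

```python
def plannable_hours(team_size, sprint_days, focus_hours_per_day, unplanned_fraction):
    """Hours left for planned stories after reserving capacity for unplanned work."""
    total = team_size * sprint_days * focus_hours_per_day
    return total * (1 - unplanned_fraction)


def context_switch_cost(team_size, sprint_days, hours_lost_per_day):
    """Total hours the whole team loses to refocusing over one sprint."""
    return team_size * sprint_days * hours_lost_per_day


def wip_limit(team_size, fraction=0.5):
    """WIP limit as a fraction of team size, never lower than one item."""
    return max(1, int(team_size * fraction))


# Hypothetical team: 8 people, 10-day sprint, 6 focused hours per day.
print(plannable_hours(8, 10, 6.0, 0.25))   # hours available for planned work
print(context_switch_cost(8, 10, 0.5))     # half an hour per person per day
print(wip_limit(8))                        # the 50% rule of thumb
```

With these assumed numbers, half an hour of refocusing per person per day costs the team 40 hours over the sprint, which is why cutting the number of parallel work items matters.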