Agile Delivery at Scale — How to leverage the Power of Agile for Large Projects

Introduction
In this article I would like to present my point of view on the topic of ‘Doing Agile at Scale’. It is based on my collective experience from multiple projects I have worked on over the past decade or so. Most of my work has been in the domain of Financial Services, which means that some of the idiosyncrasies of the domain will inevitably creep into my point of view. Nevertheless, I believe that the core learnings are transferable to other industries.
To set expectations, this article does not get into the details of ‘What is Agile?’, ‘Why Agile?’ etc. It assumes that the reader is familiar with the Agile methodology. Building on this assumption, the article describes the familiar problems faced by Agile practitioners when the methodology is scaled up to meet the demands of large project delivery.
Agile methodology
Agile methodology comes in a myriad of flavours. Practically every firm and team has deviated from the Agile Manifesto to some extent and has adapted and evolved the methodology based on its own needs and experiences. My personal flavour of choice has been the Agile Scrum methodology. In this format there are only three roles in a Scrum team: the Product Owner, the Scrum Master and the Team. I have had the privilege of wearing all three hats.
In this article I have not taken the view of any specific role, but I believe the contents will be useful to every Agile practitioner in whatever role they take up.
Large Financial Services companies like banks, insurance companies, pension funds etc. typically have a hierarchical structure. They have top-down command and control mechanisms that are very slow to respond to the changing business environment. When these firms take up large transformational projects in-house, they usually end up missing deadlines, going over budget and losing top talent.
Planning — Most Waterfall projects are subject to the constraints imposed by the ‘Iron Triangle of Planning’ (the three sides being Scope, Resources and Timelines). They are fixed in Scope, with freedom to modify the resources and the timelines.
In contrast, truly Agile projects have no fixed scope but have fixed resources and time. This pure-form version of Agile has time and again proven to work at smaller scales but fails to deliver the promised results when scaled up. The main issue is the flexibility of Scope.
Large institutions need to define scope because there is a whole legacy machine that needs to be kept running: Finance, HR, Reporting and so on. One example within the Financial Services industry is Regulatory Projects. In these types of projects, scope is a given; it is non-negotiable. How does one still remain Agile while being squeezed by all three constraints simultaneously?
Unfortunately there is no magic formula. It is an expertise that is developed over time and hence needs practice. However, there are a few key structural set-ups that increase the probability of success, and I would like to sketch those out briefly below.
The structure of the content follows the ‘life-cycle’ of a generic Agile-delivery project. To provide some context, let us assume that your project is a large transformational project at a large Retail Bank, involving multiple stakeholders and third-party vendors whose systems need to be integrated.
Your firm is responsible for end-to-end Delivery. From an end user perspective, the outcome is a customer facing banking application in Web and Mobile form-factors. You have multiple Product Owners who have unique goals. However, the end product needs to seamlessly intertwine these goals to create a singular customer experience.
Genesis Era
If one were to draw a plot of Uncertainty vs Time, then this is the period when uncertainty is close to infinite. You have Developers, Architects, QAs (Quality Assurance), BAs (Business Analysts), Scrum Masters and Product Owners in the room and the looming question is ‘What should we all do now?’
I like to approach this stage of the project as a sort of sandbox for experimenting with the various Technology and Process choices that could be made.
It is important to make sure that you deliver a working product (demo ready) right from the first sprint.
It is usually a POC (Proof of Concept), it could be something as basic as a mock-up of what a home page could look like. In one of the projects I worked on, we demoed screen outlines on a white-board!
Discipline — I recommend using this time as a sort of an ‘indoctrination’ period into the discipline of Agile by having daily Stand-up meetings, adhering to Sprint based timelines, Sprint goals, Sprint demos etc. Create Tech-spikes and Design-spikes during these periods and let technical team members explore the realm of technology possibilities and limitations. BAs and Product Owners collaborate during this period in developing and defining the various themes and Epics.
People — Large firms have long ramp-up times and getting people assigned to a program can be a multi-week exercise. Start sprints with a core group. If new members join the group in the middle of a sprint, do not add them to the Sprint team immediately. Allow them to explore the documentation if any, have informal chats, figure out seating and other logistical and infrastructure issues. Bring them on into the subsequent sprint.
Infrastructure — One of the cornerstones of Agile is Co-Location. I cannot emphasize enough how game changing co-location can be. Realistically speaking, co-location is not a blocker for small Agile teams of fewer than 10 people. But doing Agile at scale often means a total program population of more than 25. Co-locating that many people can be a challenge even at large organizations with financial resources. Added to this, there are other related infrastructure issues like networking and cyber security.
One approach to the problem that has worked well is renting out external co-working spaces. These work-spaces are professionally run and managed and can be a great change driver.
The mere fact of working from a different space gives the teams the feeling that they don’t have to follow the stifling constraints of the old way. In my experience, teams become more open and perform better. One word of caution though: Financial Firms that deal with material customer information may not be very comfortable with this approach. One way I have seen this addressed is through the use of strict physical and network access controls.
So, how long should this era last? Well, it depends on multiple factors like the size of the program and the availability of resources (Financial, Human Capital, Infrastructure etc). As a heuristic, this phase should be < 4 weeks for a large transformation program (~1 Million Person-Hours).
Here you are at the end of the Genesis Era, but there is still confusion in the air. Stakeholders are getting antsy and often ask you ‘What value?’ they are getting out of their investments and ‘When?’ that value will be delivered. It is hard to answer those questions in the Agile model. Fear not, this is nothing new and most large organizations implementing large Agile programs face these challenges.
Although truly Agile projects do not have a fixed scope, it is necessary to set some high-level scope boundaries. This is a compromise that is made in large programs. The Iron Triangle of Planning mentioned earlier in the article needs to be made slightly flexible.
MVP — Minimum Viable Product. It is essentially a set of features that deliver value to the end client. Defining the MVP fixes scope and resources, leaving time as the only variable. So, you can go ahead and share the details of ‘What value?’ with the stakeholder group but hold back on the ‘When?’. I have often seen that a reasonable estimate, with caveats and assumptions that are clearly documented, goes a long way in calming anxious stakeholders.
Now that you have won over your stakeholders, how do you go about defining the MVP?
The Planning intermission
For the purpose of defining the MVP it is important to do the unthinkable. Drop everything! Preferably after the last demo day of the last sprint of the Genesis Era.
Have a dedicated off-site workshop for at least one whole day with every member of the program. The aim of this workshop is to come up with a shared understanding of what constitutes MVP.
In this workshop, Product Owners take turns ‘pitching’ the high-level features of the product or the product component that they own. They could speak to it, use props like charts and sticky notes or give PowerPoint presentations.
Once the initial round of presentations is done, the Product Owners display the features using whatever means they deem fit. In my experience, sticky notes on a chart garner more interest and interaction than PowerPoint or Visio diagrams.
This format has a road-show or School Science Fair atmosphere, where Business and Technical members of the team go around asking questions and expressing interest in working on the components that appeal to them.
The outcome of this exercise is the shared understanding of the MVP. It is achieved by identifying and defining the high-level program themes, or Workstreams as I call them, and by enumerating the list of Epics and corresponding features. More importantly, it helps socialize the goals of multiple Product Owners, which builds the foundational understanding required to create a ‘Singular Customer Experience’.
The Workstream Era
Workstreams — Workstreams are logical groupings of features under one or more Themes. Workstream teams are the Agile teams dedicated to the delivery of the features within that Workstream. (In this article, as in the industry, the terms Workstream and Workstream team are used interchangeably.)
These teams are the fundamental units for doing Agile at scale. They are known by several other names in the industry, for example Pods, Groups, Teams or Squads. Essentially, a Workstream has all the necessary team members, whose collective skills are sufficient to deliver a full-fledged product on their own.
Each Workstream has a Product Owner, a Scrum Master and Team Members with Business Analysis, UI (User Interface), Back-end, QA, Automation QA, DevOps, Design and other skills. These Workstream teams are self-organized based on the Agile Scrum model. One question that comes up is how many Workstreams a program should have. The answer is purely judgement-based. Although there is no theoretical upper limit, a heuristic of fewer than 10 Workstream teams per program is a good model to follow.
Sprint Cycles — One of the most important disciplines to practice is the definition of the sprint cycles and the alignment of those cycles between all the Workstreams.
Depending on the program requirements, the sprint cycles or iterations can be one to four weeks in duration. Two-week iterations have worked well in my experience, as this pace is sustainable and does not cause too much stress or slack.
Every Workstream team should have the same sprint start and end dates. This alignment constraint is necessary, and its importance becomes apparent towards the end of any program, when different components have to align and integrate.
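As a rough illustration of this alignment, the sketch below generates one sprint calendar and hands the same calendar to every Workstream team. The start date, the two-week length and the team names are assumptions made up for the example, not values from any real program.

```python
from datetime import date, timedelta


def sprint_calendar(first_day: date, sprint_weeks: int, sprint_count: int) -> list[tuple[date, date]]:
    """Return (start, end) dates for consecutive sprints of equal length."""
    length = timedelta(weeks=sprint_weeks)
    calendar = []
    for index in range(sprint_count):
        start = first_day + index * length
        end = start + length - timedelta(days=1)  # last calendar day of the sprint
        calendar.append((start, end))
    return calendar


# One calendar shared by every Workstream team (team names are hypothetical).
shared = sprint_calendar(date(2024, 1, 8), sprint_weeks=2, sprint_count=3)
workstreams = {name: shared for name in ("Payments", "Onboarding", "Accounts")}

for start, end in shared:
    print(start.isoformat(), "to", end.isoformat())
```

Because every team reads from the same list, no Workstream can drift onto its own cadence, which keeps the combined demos and later integration work on a common rhythm.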
It is likely, especially during the early stages of the program, that many of our delivery partners will not yet have the technical components that we need to integrate with. This, however, should not hamper progress. Negotiate the interface point definitions and proceed with mocked data.
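A minimal sketch of what proceeding with mocked data can look like is shown here. The interface name, its single method and the account data are all hypothetical; a real interface would be whatever is negotiated with the delivery partner.

```python
from decimal import Decimal
from typing import Protocol


class AccountBalanceService(Protocol):
    """Hypothetical interface point agreed with a delivery partner."""

    def get_balance(self, account_id: str) -> Decimal: ...


class MockAccountBalanceService:
    """Stand-in that returns canned data until the partner's component exists."""

    def __init__(self, balances: dict[str, Decimal]) -> None:
        self._balances = balances

    def get_balance(self, account_id: str) -> Decimal:
        return self._balances[account_id]


def render_balance(service: AccountBalanceService, account_id: str) -> str:
    """Feature code depends only on the agreed interface, not on the mock."""
    return f"Account {account_id}: {service.get_balance(account_id):.2f}"


mock = MockAccountBalanceService({"ACC-001": Decimal("1250.50")})
print(render_balance(mock, "ACC-001"))
```

When the partner's real component arrives, it only has to satisfy the same interface, and the feature code that was built and demoed against the mock should not need to change.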
Scrum of Scrums — Agile Scrum has scalability built into it, which makes it a good choice for scaling Agile. I recommend, at a minimum, a weekly Scrum of Scrums.
This is essentially an abstracted scrum stand-up. A representative from each of the Scrum teams (Workstream teams) attends these meetings. The key difference is the focus of the discussions. Instead of providing status updates, the focus is on surfacing key dependencies and blockers between the Workstream teams. Just like a Daily Scrum stand-up, these meetings have to be strictly time-boxed.
Demo Days — It is important to have a single combined demo across the whole program. Individual Workstream teams get a chance to show off the work they have done to other Workstreams. It helps build connections and is an important investment that bears fruit in the later stages of the program, when integration becomes increasingly complex and important.
So, how long does this Era last? It depends a lot on the scope of the MVP defined. It is not unusual to have 20 or more sprint cycles in this Era. Also, note that time is the only variable side of the Iron Triangle in this phase, so in practice a 20% time creep in delivering the MVP is not unusual.
The Kanban Era
Kanban — It is a word that is of Japanese Origin. It literally translates to ‘Sign-board’. It originated from the manufacturing industry during the period of the Japanese quality revolution. In the manufacturing world it displays work items in various stages of manufacturing. When grouped by stage, every stage has a collection of items and the number of items in a stage is used to drive a lean inventory management system like JIT (Just in Time).
Analogously, in the IT world, think of it as a giant display of all the work items in various stages of the Software Delivery Life-cycle.
After every iteration, the product is incrementally refined with more features becoming available. You are gradually finishing up your MVP with the backlog rapidly drying up.
This also means that the balance of work done in the Workstream Teams moves more towards testing and bug-fixing. Not only that, the nature of the bugs requires working across Workstream Teams and partners as features are more and more interdependent.
This increased inter-dependency and the reduced number of MVP backlog items mean that the Workstreams that were logically separated can now be merged into one team.
But first there are a few prerequisites that have to be fulfilled.
DevOps — It is very important to have a fully streamlined DevOps set-up, with all of the kinks in the Continuous Integration and Deployment pipeline ironed out. The reason is that deployment cycles will become increasingly rapid, with a simultaneous increase in the number of different Development, Test and Production environments.
Backlog Freeze — A prerequisite that is more of a discipline is to curb the temptation to add new features to the backlog. At this stage it is easy to spot a few enhancements or features that could be added. I recommend freezing the backlog so that the attention of the team is focused on delivering the MVP rather than ideating on features.
Having said this, I have personally taken a slightly lenient approach of reasonable accommodation. If the ask is very minor, I have chosen to implement rather than litigate.
It is tempting to ramp down your human capital as there is no new development at this point, but I would advise against it. The collective knowledge is at its peak in this stage and is highly effective in reducing turnaround times on the defects that will inevitably arise.
Work items — A singular view of the total outstanding items is created. I have seen that there is usually a sense of panic when all the work items across all the workstreams are pooled into a single view/board.
This is typically visualized over the several stages of a Software Development life-cycle. You could also think of it as parallel queues where items traverse from left to right. For example, the columns or queues could be ‘Ready for Development’, ‘Development in Progress’, ‘Development Complete’, ‘Testing in Progress’, ‘Testing Complete’. There would be items loaded on to each of these queues.
In this Era most of the outstanding items would be features in some stage of the iterative QA cycle, bugs that arise from integration and regression testing, technical housekeeping tasks etc.
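The single board can be pictured as a set of ordered queues. The sketch below is only an illustration, using the example column names given above; the class, its methods and the item keys are invented for the example and do not represent any particular tool.

```python
COLUMNS = [
    "Ready for Development",
    "Development in Progress",
    "Development Complete",
    "Testing in Progress",
    "Testing Complete",
]


class KanbanBoard:
    """One queue per column; items move from left to right."""

    def __init__(self) -> None:
        self.queues: dict[str, list[str]] = {column: [] for column in COLUMNS}

    def add(self, item: str) -> None:
        """New items always enter the left-most queue."""
        self.queues[COLUMNS[0]].append(item)

    def advance(self, item: str) -> str:
        """Move an item one column to the right and return its new column."""
        for position, column in enumerate(COLUMNS):
            if item in self.queues[column]:
                if position == len(COLUMNS) - 1:
                    raise ValueError(f"{item!r} is already in the last column")
                self.queues[column].remove(item)
                target = COLUMNS[position + 1]
                self.queues[target].append(item)
                return target
        raise KeyError(item)

    def counts(self) -> dict[str, int]:
        """Number of items per column, the figure a board makes visible."""
        return {column: len(items) for column, items in self.queues.items()}


board = KanbanBoard()
for item in ("BUG-101", "BUG-102", "TASK-7"):  # hypothetical work items
    board.add(item)
board.advance("BUG-101")
board.advance("BUG-101")
board.advance("TASK-7")
print(board.counts())
```

The per-column counts are the useful signal here: a column that keeps growing while its neighbours stay empty points to a bottleneck at that stage.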
In the Kanban Era there are no Sprint cycles. This means that the Scrum Ceremonies are no longer applicable. In its pure form, Kanban is completely voluntary in the sense that people pick tasks from a queue.
However, in large programs this is hard to execute. One compromise that has worked in my experience is to have a queue manager. The former Scrum Masters are best suited to take on this new role of Kanban Queue Manager.
Queue Manager — The Queue Managers work with the Product Owners and BAs to prioritize items within the queues. They are also responsible for managing dependencies between queues by working with other Queue Managers. They also interface with the technical members to ensure an appropriate division of workload.
So, how long does the Kanban Era last?
The Kanban Era goes on right until Go-Live. The number of outstanding issues increases at first but then steadily declines. Typically this Era lasts about 20% of the time taken in the Workstream Era. However, this can vary with the complexity of the product being developed, major blockers etc.
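To make the heuristics in this article concrete, here is a back-of-the-envelope calculation. It assumes the upper bound of 4 weeks for the Genesis Era, 20 two-week sprints with a 20% time creep for the Workstream Era, and a Kanban Era of 20% of the actual Workstream Era duration. The function and its parameter names are made up for illustration, and the figures are rules of thumb rather than a forecast.

```python
def program_timeline_weeks(
    sprints: int,
    sprint_weeks: int,
    genesis_weeks: float = 4.0,   # assumed upper bound for the Genesis Era
    creep: float = 0.20,          # assumed time creep in the Workstream Era
    kanban_share: float = 0.20,   # Kanban Era as a share of the Workstream Era
) -> dict[str, float]:
    """Rough duration of each Era in weeks, rounded to one decimal place."""
    workstream = sprints * sprint_weeks * (1 + creep)
    kanban = workstream * kanban_share
    total = genesis_weeks + workstream + kanban
    return {
        "genesis": round(genesis_weeks, 1),
        "workstream": round(workstream, 1),
        "kanban": round(kanban, 1),
        "total": round(total, 1),
    }


timeline = program_timeline_weeks(sprints=20, sprint_weeks=2)
for era, weeks in timeline.items():
    print(f"{era}: {weeks} weeks")
```

Under these assumptions the Workstream Era comes to 48 weeks, the Kanban Era to roughly 9.6 weeks and the whole program to a little under 62 weeks.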
Ideally, one would want to close out all outstanding defects before going live. However, in practice this is hard to do, as there will be some defects that are hard to debug.
Examples include infrastructure-related defects that occur only intermittently, or bugs that depend on fixes from third-party platforms like Android and iOS. I would recommend taking the pragmatic approach of going live once no critical or high severity defects remain open. The marginal effort and resources required to address the last 1% of edge-case issues are not worth delaying the Go-Live.
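This pragmatic rule can be written down as a simple release gate. The sketch below assumes a four-level severity scale and a made-up defect list; real programs would take both from their own defect tracker.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Assumed four-level scale; higher numbers are more severe."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Defect:
    key: str
    severity: Severity
    is_open: bool = True


def go_live_blockers(defects: list[Defect]) -> list[Defect]:
    """Open defects of high or critical severity block the Go-Live."""
    return [d for d in defects if d.is_open and d.severity >= Severity.HIGH]


def ready_for_go_live(defects: list[Defect]) -> bool:
    return not go_live_blockers(defects)


# Hypothetical defect list for illustration only.
defects = [
    Defect("DEF-1", Severity.CRITICAL, is_open=False),  # fixed and closed
    Defect("DEF-2", Severity.LOW),                       # intermittent edge case
    Defect("DEF-3", Severity.MEDIUM),
]
print(ready_for_go_live(defects))
```

Low and medium severity items stay on the board and are worked after launch; only an open high or critical defect holds the release.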
Also, in the true spirit of Agile, the work is never done. There is always room for continuous improvement.
The Go-Live Era
Hurrah! After months of high-paced work your product is finally in the hands of the end customer!
You can happily Go Live (as in living a life) hereafter! It is a reason to celebrate, but it is not all over yet. Sure, this is the Era of stability, but that does not mean that it is static. In spite of testing your product thoroughly, you should still expect to see issues, especially in the early weeks of the launch.
You should expect most of the issues to be either data issues (such as data migration or data sync) or infrastructure issues (such as scaling the performance of key components to the demands of production volumes).
In my personal experience I have seen end users vent their frustrations on Social Media Platforms like Twitter, Facebook, Reddit etc. I recommend taking it in your stride and logging it as a defect. One of our products also got media coverage on a News Network. It can be overwhelming, but it is not unusual.
To manage the initial chaos of the Go-Live Era one could retain the Kanban team for the first few weeks of the Go-live Era. Eventually, when things calm down, this team would be ramped down to leave behind a relatively small Agile Scrum team. This dedicated Agile Scrum team would support the product through its life-cycle.
After the application stabilizes in production, the product backlog is ‘unfrozen’. The Product Owners constantly look at user feedback and usage data, and add new features or deprecate existing ones. New features are added to the backlog, groomed, developed, tested and deployed. The product sees small, regular, incremental changes.
The team continues to use the robust DevOps pipeline to deploy versions of the product at regular iterations.
Conclusion
Doing Agile at scale is an expertise and an organizational skill that requires practice. I have had the chance to apply Scaled Agile principles to large transformational programs and have seen unprecedented levels of success, beating the metrics implicit in the ‘Iron Triangle of Planning’: on time, below cost and with greater scope. I believe the difference between success and failure hinges a lot on the participation of individuals in creating high-performance teams. I have observed that regular Rewards and Recognition can go a long way in helping deliver above-par results.
All the very best for doing your Agile Project at Scale!