*“Everything fails, all the time”* by Amazon’s CTO Werner Vogels.
## Little Story: The Downtime and the Disaster…
I must confess: living in this fast-paced world just as you do, I really don’t know how writing a book seemed appealing at first. When the “long” twenty-four hours of a day are still not enough to handle everything, setting aside a significant amount of time to structure complex ideas can be challenging and, for some, even discouraging. But I like to think that, besides being an experienced tech guy, I am also a good storyteller, and the possibility of combining these two traits made the mission of “writing a book” seem less like a self-imposed sacrifice and more like a mix of hobby and intellectual necessity.
So, let’s begin our journey with a little story that I’m sure many of you can relate to (although it is pure fiction).
Meet Brendon Cosby, 36, Capricorn sun (analytic and reasonable as it should be) and Incident Manager at a company called ACME.com. Last Friday night around 11:30 p.m., Brendon received a call from the Command Center about a production incident affecting all users on the platform. After a quick first assessment, he realized:
1. The event started at around 9 p.m.;
2. There is a list of 37 changes scheduled to occur simultaneously in the same system, with 23 from the infrastructure team and 14 from the engineering teams;
3. The e-commerce system is out of reach, and customers are receiving a TypeError message after they log in;
4. The teams are already gathered to determine which change is affecting the e-commerce system, and none of them are willing to roll back their changes until Brendon proves which change is causing the error;
5. Brendon needs to find a solution in the shortest possible time.
**12:02 a.m.** – After some time discussing and analyzing monitoring and observability data, our fellow manager discovers that all infrastructure changes were implemented correctly; none of the infrastructure alarms are red and all components seem to be behaving properly. He also realizes that 100% of the users are currently impacted and that he must find the Single Point of Failure (SPOF) responsible for the incident.
**12:23 a.m.** – Brendon spends around 20 minutes chatting with the engineering change leaders and troubleshooting to figure out what is really happening on the user’s shopping page. He eventually decides to roll back all engineering changes, one by one, in an order based on how likely he judges each one to be the culprit.
**03:17 a.m.** – All engineering changes have been rolled back and, to our manager’s surprise, the error is still there! What the heck!!! Well, Brendon then decides to call all infrastructure change leaders and bring them to the table. But time is running out: in four hours the e-commerce page must be working.
**03:42 a.m.** – When almost all infrastructure leaders are on the call, Brendon eventually notices that some of the changes were not described correctly and that most of them had some hidden tasks, like applying software and operating system patches.
**04:25 a.m.** – Brendon starts rolling back the infrastructure changes.
**05:37 a.m.** – Change 23789 -> Upgrade the WebSphere servers -> Immediately after Brendon rolls back this change, the e-commerce system is alive and kicking again. Phew!
We are all happy and proud of Brendon’s expertise and efficiency, but of course ACME.com will suffer the following collateral effects:
* **Image Impact** – Some users failed to purchase products during the breakdown, resulting in numerous negative comments about ACME.com on Twitter and Facebook;
* **Financial Impact** – Users found alternative places to make their purchases and may not return to ACME.com in the near future. A large team of IT employees worked overnight, resulting in extra hours and additional costs;
* **Business Impact** – Some of the planned changes involved introducing new products and features, designed to attract more users and increase sales on the e-commerce platform. Now, the product owners will have to wait two more weeks until the next change window opens.
**Info:** In 2014, IDC, in research sponsored by AppDynamics, estimated that unplanned application downtime costs the Fortune 1000 between $1.25 billion and $2.5 billion per year[^1].
As I’ve told you before, this is not a real story, but it could be. Some of you might think, “Yes, I’ve experienced something like this at least once in my life!” Situations like Brendon’s were much more common if you worked in IT in the ’90s or 2000s, but they persist even today.
## Legacy Companies in the Modern Market
Diving deeper into the past, in the ’80s, technology was considered an expense by most companies. During that decade, the trend was reducing costs, personnel, and even investment: reduce to perform. However, at the end of the ’90s a new type of company emerged, breaking down market barriers and creating a new paradigm. Meet the so-called Exponential Organizations or ExOs[^2].
Overnight, linear businesses saw their market share and profits shrink, and ultimately disappear, as ExO companies grew. Traditional at their core and failing to understand the importance of reinventing their strategies, these companies were puzzled: why are we not competing on equal terms with these new ExOs?
According to Salim Ismail, Indo-Canadian serial entrepreneur, angel investor, and technology strategist, ExOs grow ten times faster than linear/traditional companies, regardless of revenue, market share, size, or reputation. ExOs have truly revolutionized markets because they are more sustainable, cost-efficient, reliable, and customer-centric. We could spend days listing exemplary ExOs, such as Uber, Airbnb, AWS, and Google.
But one question remains: how have these ExOs grown so fast and become so flexible and adaptable? Two key frameworks can explain this: SCALE and IDEA. According to Ismail, these letters stand for important traits that help ExOs stay innovative and quick to handle new challenges.
**SCALE**
* **S**taff on demand allows ExOs to flexibly manage human resources, scaling up or down based on immediate needs without the constraints of traditional employment;
* **C**ommunity & crowd harnesses the power of large groups outside the traditional organizational boundaries, enabling innovation and scaling through collaboration;
* **A**lgorithms help automate processes and make data-driven decisions, enhancing efficiency and effectiveness;
* **L**everaged assets means that ExOs minimize their own asset ownership, reducing costs and increasing operational agility by using assets owned by others;
* **E**ngagement involves techniques such as gamification, incentive prizes, and digital feedback loops to actively engage customers and users, creating a compelling, interactive experience that drives loyalty and growth.
**IDEA**
* **I**nterfaces are carefully designed protocols and APIs that allow ExOs to manage interactions between the core systems and the external community efficiently.
* **D**ashboards provide real-time, actionable insights through analytics, helping to monitor performance and guide decisions.
* **E**xperimentation is encouraged within ExOs to continuously test new ideas and approaches, thereby fostering a culture of innovation and resilience.
* **A**utonomy is granted to teams and individuals, empowering them to make decisions and respond quickly to market changes. This decentralized approach is critical for maintaining the speed and flexibility that characterize Exponential Organizations.
Based on ExOs’ structures and cultures, many methodologies were created to assist legacy companies, teaching them new ways to think about, build, create, and deliver IT. These companies began to reinvent themselves in pursuit of agility, faster time-to-market, customer-centricity, data-driven insights, and product improvements, adopting approaches such as:
* Agility, using SAFe, Scrum, and other frameworks
* DevOps Culture
* Site Reliability Engineering (SRE)
* Management 3.0
It’s safe to say that structural changes like these bring immense challenges. While many companies have thrived by implementing new cultures, governance, and processes, others have failed along the way over the last decade. Even those that successfully implemented these changes still feel some kind of gap, something that prevents them from achieving the same pace as the ExOs, making them ask themselves: “So, what now? What is missing in my organization?”
Of course, there isn’t a one-size-fits-all answer: each company, each operation, and each business infrastructure requires a careful and highly specific analysis. However, what we can affirm is the undeniable role of modern applications: they help organizations maintain a rapid pace of innovation, achieving multiple resilient and cost-efficient deployments. Just like every rock band requires the drummer + guitarist + bass player + vocalist combo to produce harmonious and melodious music, an IT department needs modern applications to ensure smooth and harmonious deployments. Remove any component and the process will likely become dissonant and ineffective.
## What are Modern Applications?
Stimulated by what I like to call “The ExOs Paradigm”, many legacy companies have been striving for better agility and code quality to keep up with these new players. Many of them have overcome the odds and transformed their culture, but for many more, something is still missing. Even with new frameworks, these traditional companies always see ExOs walking one step ahead, gaining more ground by continuously developing, improving, and creating what can be termed “modern applications.” Let’s start unraveling the term.
The term “modern applications” almost immediately alludes to something new, something disruptive, something quite desirable in a world where “modern” is almost a synonym for “sells well”. However, beware of a first fallacy: that once you modernize, your job is done and you will never need to do it again. Absolutely untrue.
Today’s modern applications may not be considered modern tomorrow, and the test of time can prove it. Looking at technology through the years, we have seen RISC vs. CISC processors, mainframes vs. PCs/servers, virtualization, containers, serverless computing, service-oriented applications, distributed applications, and much more.
Event-based applications are modern compared to those running COBOL on mainframes. However, in one or two decades (or even in a few months, what d’IA know?), we will surely have new technologies that can turn the current ones into legacy.
![Modern Applications vs Business](PartI-Pictures/2_ModernApplications.png)
Time after time, we have been labeling the next generation of applications as “modern”, no matter how different their purposes and peculiarities. However, all these Modern Applications present the same key benefits that every company has been looking for:
* **Faster Time-to-Market and Innovation** – A modern application can help a business move faster than ever, adding more value in terms of innovation, which means quickly delivering a wider array of digital products.
* **Customer-Centric** – All interfaces must provide user-friendly environments, making users comfortable with navigation and providing feedback throughout the application’s lifecycle.
* **Better Resiliency** – An application must be able to handle and recover from unexpected failures while ensuring continuity of service. This includes self-recovery from crashes, scalability to handle spikes in traffic, and maintaining data integrity and availability even during infrastructure disruptions.
* **Flexibility, Extensibility and Ease of Change** – This is the ability to be agnostic in its solutions and to allow various interfaces, or other applications, to call it and receive adaptive responses without needing to change code for each scenario. Additionally, it should be easy to add, remove, or modify internal functionalities with little or zero impact.
* **Data-Driven Applications** – The application should provide internal data about its own health and functionalities, as well as insights to improve business outcomes.
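
To make two of these benefits a little more concrete, here is a minimal sketch in Python of what “Better Resiliency” and “Data-Driven Applications” can look like at the smallest possible scale: a call that retries with exponential backoff while recording data about its own health. Every name in it (`HealthStats`, `call_with_retry`, `max_attempts`, `base_delay`) is hypothetical and chosen only for illustration; real systems usually rely on mature libraries and platform features for this, so treat it as a sketch under those assumptions rather than a reference implementation.

```python
import time
from dataclasses import dataclass


@dataclass
class HealthStats:
    """Hypothetical container for the data an application keeps about itself."""
    calls: int = 0
    failures: int = 0
    retries: int = 0

    def snapshot(self) -> dict:
        # Share of calls that eventually succeeded (1.0 when nothing ran yet).
        success_rate = 1.0 if self.calls == 0 else 1 - self.failures / self.calls
        return {
            "calls": self.calls,
            "failures": self.failures,
            "retries": self.retries,
            "success_rate": success_rate,
        }


def call_with_retry(operation, stats, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run `operation`, retrying with exponential backoff when it raises.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    Once `max_attempts` is reached, the last exception is re-raised.
    """
    stats.calls += 1
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                stats.failures += 1  # the call failed for good
                raise
            stats.retries += 1
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap any fragile operation, for example `call_with_retry(fetch_prices, stats)` with a hypothetical `fetch_prices` function, and expose `stats.snapshot()` to a dashboard. The point is not the dozen lines of code but the habit: failures are expected, handled, and measured instead of discovered at 11:30 p.m. on a Friday.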
The main goal of this book is to share experiences and strategies for modern applications that have the power to remain useful over the years. The insights presented here will help you make decisions, prioritize, and understand modern applications better. And last but not least, my wish is to instill the notion that continuous adaptation and innovation are the true turning points for organizations that aspire to be successful and long-lasting. In a constantly changing technological world, only those willing to undergo continuous transformations can hope to achieve results similar to ExOs.
[^1]: https://www.appdynamics.com/newsroom/press-release/idc-releases-first-ever-devops-and-application-performance-survey
[^2]: Ismail, S., Malone, M. S., & van Geest, Y. (2014). *Exponential Organizations: Why new organizations are ten times better, faster, and cheaper than yours (and what to do about it)*. Diversion Books.