Cloud Infrastructure Part II: DevOps + Observability
A look into the DevOps and observability ecosystems today, and where they will go from here
DevOps
In the past decade, DevOps has emerged as an important enabler of software innovation. The simple premise of marrying software development (“dev”), i.e. the code behind applications, and IT operations (“ops”), where those applications are put into production, has improved collaboration and time-to-market in the software delivery process. Coupled with an explosion of software developers (~30M devs worldwide by 2025) [1], DevOps as a category has produced transformative businesses like Atlassian, GitLab, GitHub, and HashiCorp. These companies abstract away an important class of problems that costs companies money and development resources.
DevOps isn’t new — frameworks to systematize the software delivery process date back several decades. When DevOps started, each team brought their own tools and methods in isolation, leading to disconnected development environments. Companies tried to manually cobble together these point solutions, but this didn’t improve the fragmentation problem. To streamline efficiency and avoid painfully chaining together disparate solutions, businesses like GitLab emerged to automate the software delivery process and create a more end-to-end workflow from source code to deployment.
DevOps really accelerated in the past decade as cloud-based delivery models became the norm and open-source offerings became more common. On the latter, open-source was an incredibly important catalyst for DevOps. Developers are a notoriously fickle target customer segment, with perpetually-evolving tastes and a disposition towards building versus buying. Open-source offerings allow developers to easily try out software for free and deploy it directly without relying on centralized IT approval.
Although there are large incumbent businesses within DevOps already, there is still much opportunity for emerging startups. A recent Bain report showed that only ~10% of organizations have “relatively mature” DevOps capabilities. There are important problems that still need to be solved and an increasing number of new DevOps startups entering the market.
In my first post, I highlighted an emerging set of names with recent funding that are pushing this category forward. I take a broad view on the definition of DevOps: any software that sells into and / or involves an engineering persona in the evaluation and deployment process.
Key Trends Within DevOps
1) Adoption of microservices. Microservices allow a large application to be separated into smaller independent parts, with each part having its own role. Companies are embracing service-oriented architectures, modularizing their applications to build more scalable, resilient, and agile systems.
2) Emergence of the developer experience function. Almost every developer-facing startup today has an open role for a ‘Head of Developer Experience.’ Developer Relations is a critical function for any developer-centric startup to build mindshare and drive business value. This role is important as a good developer experience function frames developers as customers, and understands the journey of the developer from trial to purchase.
3) Ops moving even closer to Dev. As ops moves closer to dev, it’s less about siloed teams building and deploying software and more about platform engineering teams standing up guardrails and reusable components so that developers can own the build and deploy process from beginning to end. Startups like Pulumi enable organizations to use real programming languages to provision and decommission cloud resources and infrastructure. Fly.io lets developers run full stack apps (and databases) with no ops required. Just bring your code and let these vendors do the rest. It’s never been easier to deploy your code into production.
4) Ubiquity of JAMstack. The JAMstack describes a different way of building apps and websites. First coined by Mathias Biilmann, CEO of Netlify, the JAMstack represents a major trend in web development. It decouples frontend pages and UI from backend apps and databases. This allows for more modular development, personalization, and scale-out of websites.
5) JIRA isn’t going anywhere. As much as developers complain about JIRA, we’ve yet to see a startup displace Atlassian’s popular project management and issue tracker tool in the enterprise segment.
6) Machine Learning + DevOps. Recent DevOps startups have been injecting machine learning techniques into the developer workflow. For example, Harness, a CI/CD startup, uses a combination of supervised and unsupervised ML techniques to automate aspects of building, testing, and deploying applications.
The DevOps Budget
Historically, budgets for DevOps vendors started on the smaller side but have increased YoY at an aggressive clip. We surveyed 60 DevOps leaders in our network within organizations of 500+ FTEs, and ~25% of enterprises have DevOps-specific budgets greater than $1M. Another ~1/3 of the respondents fall within the $250K-$1M band. Looking ahead, DevOps budgets shouldn’t increase as quickly as other categories within cloud infrastructure, with the majority remaining constant or slightly increasing.
When looking at where these dollars are being spent, I took a similar approach to my last post when I surveyed data and ML leaders. To measure the priority of each subcategory relative to each other, I asked DevOps leaders to rank order solutions by total dollars spent. I gave each respondent an imaginary budget of $2M to go spend on various DevOps categories listed in the market map above. I asked everyone to do this now, and for the near future (5-10 years). It’s important to note that teams will use multiple of these vendors as each is an important element within a broader DevOps strategy. But not all categories are created equal, and spend is not distributed evenly.
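To make the ranking exercise concrete, here is a minimal sketch of how allocations like these could be tallied. It assumes each respondent splits the full imaginary $2M across categories and that categories are ranked by total dollars allocated. The three allocations below are made-up placeholders, not survey responses, and `rank_categories` is a hypothetical helper.

```python
from collections import defaultdict

BUDGET = 2_000_000  # imaginary budget handed to each respondent

# Hypothetical allocations for illustration only (NOT actual survey data).
responses = [
    {"CI/CD": 800_000, "Deploy": 700_000, "API tools": 500_000},
    {"CI/CD": 1_000_000, "Deploy": 400_000, "API tools": 600_000},
    {"CI/CD": 600_000, "Deploy": 900_000, "API tools": 500_000},
]


def rank_categories(responses, budget=BUDGET):
    """Sum dollars per category and rank categories by total spend."""
    totals = defaultdict(int)
    for allocation in responses:
        # Every respondent is expected to spend the whole imaginary budget.
        if sum(allocation.values()) != budget:
            raise ValueError("each respondent must allocate the full budget")
        for category, dollars in allocation.items():
            totals[category] += dollars
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    # Return (category, total dollars, share of all dollars) in rank order.
    return [(cat, dollars, dollars / grand_total) for cat, dollars in ranked]


ranking = rank_categories(responses)
for category, dollars, share in ranking:
    print(f"{category}: ${dollars:,} ({share:.0%})")
```

Running the same tally on the "now" answers and the "5-10 years" answers separately is one way to compare which categories gain or lose share over time.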
At the top, a strong CI / CD strategy has several benefits and plays an important role as it allows teams to ship software quickly and efficiently. GitLab is the incumbent in this category and unified many parts of the dev lifecycle with its single codebase and unified data model. We’re seeing a lot of innovation within this category with startups like Dagger.io and Harness. Following CI / CD, we have solutions that own the “Deploy” phase of application delivery. I like to call this category “Heroku 2.0,” as these providers abstract away the complexity of deploying applications and accessing compute. In third, we have API tools like Postman that simplify each step of the API lifecycle to build more resilient APIs.
Where is DevOps Going?
Largest categories ($ spend) → CI / CD, Deploy, and API-related tooling cement their positions at the top. Microservice orchestrators like Orkes and Temporal will see more use as they allow teams to write highly reliable and scalable applications on top of microservice architectures. Build management startups like BuildBuddy and EngFlow should see accelerated investment as they help developers compile and test code quickly.
Fastest-growing categories (% inc. in spend) → Apart from build management, deploy, and microservice orchestrators growing the quickest, frontend startups like Netlify and Vercel will see continued growth as they give dev shops AWS-like abilities for frontend development. Rounding out the top 5, we have “Environments,” or development environments for engineers to use during development, testing, and staging.
Observability
Observability is oftentimes grouped within DevOps, but I wanted it to have its own section to highlight how important a category it has become within cloud infrastructure. Over the past decade, enterprise infrastructure complexity dramatically increased, primarily driven by the rapid adoption of hybrid and multi-cloud architectures and the rise of distributed systems. As infrastructure complexity grew, the volume of log and machine data grew exponentially. You can think of log and machine data as “side data” of a transaction. For example, when a user purchases an item on Amazon, how many login attempts were made? Which server completed the request? There is a lot more log and machine data than transactional data: IDC estimates that enterprises will produce ~180 ZB of log data by 2025 [2]. Managing all those logs is a growing and expensive problem.
The observability market capitalized on this explosive growth of log and machine data, becoming a massive market with a TAM of $50B+. Existing vendors have already built enormous businesses; Splunk surpassed $3B in ARR last quarter. Datadog is a ~$1.3B ARR business growing 70% YoY [3]. The observability market is enormous, and these companies have substantial runway for continued growth as they introduce new products and offerings. At the same time, there are gaps in the market that new startups are addressing. Below is a snapshot of exciting early growth vendors to track.
Key Trends Within Observability
1) Application complexity → explosion in observability use cases. Companies are increasingly moving towards microservices-dominant architectures, and as a result there is an accelerating amount of log and machine data to collect and analyze. Observability use cases are abundant, ranging from application & infrastructure monitoring to security automation and log routing.
2) Splunkbundling (thanks Rak). Splunk is synonymous with observability and is the all-in-one platform for logging. Yet as machine data explodes, Splunk is unwieldy and very expensive to use. Splunk historically priced on data ingestion, which doesn’t work as companies scale their observability efforts. I ran a survey with observability leaders and nearly 20% of respondents want to get rid of Splunk entirely. There is a massive opportunity for companies like Cribl, Panther, and Tines that address various parts of the Splunk stack.
3) Fourth Pillar of Observability? Datadog rose to prominence as it unified the three pillars of observability: metrics, logs, and traces. Datadog broke down the silos between infrastructure monitoring, APM, and logs with its unified data model. But there will likely be other pillars to follow, and opportunity for startups. For example, Polar Signals is a continuous profiling project that offers another view into your observability footprint.
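To see why ingestion-based pricing (trend 2 above) becomes painful at scale, here is a rough arithmetic sketch. The price per GB, starting volume, and growth rate are placeholder assumptions chosen for easy math; they are not Splunk's actual rates or any vendor's real numbers. `annual_ingest_cost` is a hypothetical helper.

```python
def annual_ingest_cost(gb_per_day, price_per_gb, growth_rate, years):
    """Return the yearly bill for each year when pricing is per GB ingested.

    growth_rate is the year-over-year growth in daily volume (1.0 = doubling).
    """
    costs = []
    volume = gb_per_day
    for _ in range(years):
        costs.append(volume * 365 * price_per_gb)
        volume *= 1 + growth_rate  # log volume grows, and the bill grows with it
    return costs


# Placeholder inputs: 100 GB/day, $1 per GB ingested, volume doubling yearly.
costs = annual_ingest_cost(gb_per_day=100, price_per_gb=1.0, growth_rate=1.0, years=3)
for year, cost in enumerate(costs, start=1):
    print(f"year {year}: ${cost:,.0f}")
```

Under these assumptions the bill tracks data volume one-for-one, so any reduction in what gets ingested (filtering, sampling, routing to cheaper storage) shows up directly as savings.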
The Observability Budget
Observability budgets are usually bundled within DevOps. Yet, we can see that companies pay vendors like Datadog and Splunk handsomely. ~20% of our respondents pay more than $1M for observability solutions, and enterprises should increase their investment towards these vendors over the next decade.
In terms of where these dollars are being spent today, observability follows a similar pattern to data infrastructure. The collection and storage of data comes before categories “higher in the stack,” i.e., data exploration / analysis (visualization) and routing.
In first, Chronosphere’s M3 engine is a powerful monitoring solution that can ingest, store, and index observability data at scale. Within visualization, Grafana has built a multi-product business, first starting at the metrics layer with its core product, and now moving into storage with Prometheus and tracing with Tempo. It’s becoming a full-stack observability play. In the routing category, Cribl moves log data from any source to any destination while simultaneously processing that data in-flight, allowing enterprises to greatly optimize their observability posture.
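The routing idea, moving log data between sources and destinations while processing it in flight, can be sketched generically. The example below is not Cribl's API or implementation; it is a toy pipeline under my own assumptions, with hypothetical function names, destinations ("archive", "siem"), and rules (drop DEBUG events, redact email addresses, send errors to the SIEM).

```python
import json
import re

# Simple email pattern used for redaction in this sketch.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def process(event):
    """In-flight processing: drop DEBUG noise and redact email addresses."""
    if event.get("level") == "DEBUG":
        return None  # dropped events never reach any destination
    cleaned = dict(event)
    cleaned["message"] = EMAIL.sub("<redacted>", event.get("message", ""))
    return cleaned


def route(event):
    """Choose destinations: everything is archived, errors also go to the SIEM."""
    destinations = ["archive"]
    if event["level"] in ("ERROR", "CRITICAL"):
        destinations.append("siem")
    return destinations


def run_pipeline(events):
    """Process each event once, then fan it out to its destinations."""
    sinks = {"archive": [], "siem": []}
    for event in events:
        cleaned = process(event)
        if cleaned is None:
            continue
        for dest in route(cleaned):
            sinks[dest].append(cleaned)
    return sinks


events = [
    {"level": "DEBUG", "message": "cache warm"},
    {"level": "INFO", "message": "login ok for alice@example.com"},
    {"level": "ERROR", "message": "payment failed for bob@example.com"},
]
sinks = run_pipeline(events)
print(json.dumps(sinks, indent=2))
```

The design point is that filtering and redaction happen before fan-out, so expensive destinations only receive the events they actually need.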
Where is Observability Going?
When looking into the future, the relative spend of each subcategory will stay the same over the next 5-10 years. The current 4 largest categories (visualization, monitoring / storage, routing, and SIEM) cement their positions at the top and will see more investment relative to the bottom 4 categories.
Spend towards vendors within routing (Cribl, Edge Delta), visualization (Grafana), and SIEM (Anvilogic, Panther) will increase the most over the next 5-10 years. These startups are all attacking Splunk from different angles, and it will be interesting to see how many inroads they make.
Join Our Infrastructure Group
Similar to our data + ML group, we have an awesome advisory group of infrastructure and DevOps practitioners at companies like Atlassian, DoorDash, PayPal, Stripe, and Twilio. We host seminars, happy hours, and virtual events throughout the year. If you own infrastructure & software development at your current company and would like to discuss new trends, up-and-coming startups, and meet other like-minded folks, please drop my colleague Chase Holmes or me a note. We would love to have you!
[1] IDC
[2] IDC
[3] Company filings