Site Reliability Engineers (SREs) face complicated problems every day. Our team at Bionic has recently been speaking with many SRE leaders and learning about the ongoing challenges they face, such as:
- “What happens when a cloud region or a zone fails?”
- “Which applications and services are affected by this failure?”
A common response in these situations is that documentation needs to be more complete and up to date, which is not the best answer. Keeping documentation manually updated in a modern CI/CD pipeline does not scale.
One notable SRE leader told us directly that this documentation approach “makes me want to throw up in my mouth.”
Manually updating documentation is not the answer – automation for full architectural visibility is the answer.
SREs are responsible for keeping applications running and continuously updated in a modern, aggressive CI/CD pipeline.
Even a few seconds of downtime is a disaster.
SREs need coding knowledge to understand potential risks and misconfigurations in the production applications they manage. They also need to understand the cloud infrastructure in production, what applications are deployed, and what dependencies exist between the two. The cross-functional responsibilities of an SRE dictate that they work with many different teams to understand the full picture.
Sounds like herding cats to me.
What tools does an SRE use?
The list of technologies that an SRE uses to accomplish their job responsibilities is long.
Technologies like Application Performance Management (APM), Configuration Management Databases (CMDB), and telemetry tooling are very common but do not give you the complete picture of the infrastructure and application architecture.
The two technologies that should be in an SRE’s arsenal of tools for complete architectural visualization are Application Security Posture Management (ASPM) and Cloud Security Posture Management (CSPM).
These two technologies are so important because they provide an automated and complete visual blueprint of both the application and the cloud infrastructure that an SRE would otherwise not have. With incomplete information, how can an SRE know which areas to investigate when something goes wrong?
A full understanding of how the application and cloud infrastructure work together is vital. There are tremendous benefits that both ASPM and CSPM bring to the SRE.
How would you answer this question?
Going back to the example in the first paragraph: how do you investigate the blast radius when a cloud region or zone fails?
This is a much easier question to answer when you have an ASPM solution like Bionic at your disposal.
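To make the blast radius question concrete, here is a minimal Python sketch of the underlying idea. It is not how Bionic or any ASPM product works internally. It assumes you already have two maps that such a tool would discover automatically: where each service is deployed, and which services each one calls. The service names, zone names, and the `blast_radius` function are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical inventory: the zones each service runs in (illustrative names).
DEPLOYMENTS = {
    "checkout": {"us-east-1a", "us-east-1b"},
    "payments": {"us-east-1a"},
    "inventory": {"us-east-1b"},
    "orders-db": {"us-east-1a"},
    "search": {"us-east-1b", "us-east-1c"},
}

# Hypothetical dependency map: service -> services it calls.
DEPENDS_ON = {
    "checkout": {"payments", "inventory"},
    "payments": {"orders-db"},
    "inventory": set(),
    "orders-db": set(),
    "search": {"inventory"},
}


def blast_radius(failed_zone, deployments, depends_on):
    """Return (down, degraded) sets of services for a single zone failure.

    down:     services whose only deployments are in the failed zone.
    degraded: services still running that depend, directly or
              transitively, on a service that is down.
    """
    # A service is down if every zone it runs in is the failed zone.
    down = {
        svc for svc, zones in deployments.items()
        if zones and zones <= {failed_zone}
    }

    # Reverse the dependency edges so we can walk from a failed
    # service to everything that calls it.
    dependents = {}
    for svc, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(svc)

    # Breadth-first walk over callers of the failed services.
    affected = set(down)
    queue = deque(down)
    while queue:
        current = queue.popleft()
        for caller in dependents.get(current, ()):
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)

    return down, affected - down


if __name__ == "__main__":
    down, degraded = blast_radius("us-east-1a", DEPLOYMENTS, DEPENDS_ON)
    print("down:", sorted(down))
    print("degraded:", sorted(degraded))
```

With this sample data, losing `us-east-1a` takes down `orders-db` and `payments` and degrades `checkout`, which survives in another zone but calls `payments`. The hard part in practice is not the graph walk; it is keeping the two input maps accurate without manual documentation, which is the gap the rest of this post is about.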
How ASPM Helps an SRE
CSPM is a well-known technology, and there are a ton of fantastic tools on the market today. ASPM, on the other hand, is a brand-new technology concept that fills a huge gap in the ecosystem an SRE manages. That gap is represented by the green box in the graphic below.
Show Me How ASPM Helps an SRE
To quote J. Peterman (John O’Hurley) from Seinfeld: “Well, this certainly looks like a lot of words.”
Now, you’re probably thinking, “I know how CSPM helps an SRE, but I am still not sure how ASPM helps.” I get it. I am a visual learner as well. Reading a bunch of words describing a new concept is interesting, but I need to see it to believe it. Don’t worry, I got you. To see exactly how ASPM helps an SRE get the full picture of application architecture in terms of risk, we created this short video.
An SRE’s responsibilities are very important to your organization. SREs are the last line of defense to ensure your mission-critical or revenue-generating applications are consistently running correctly and securely. The two most important tools in an SRE’s toolbox are ASPM and CSPM. The combination of these two technologies gives SREs full visibility into their production ecosystem that they would otherwise not have.