The Cloudcast #301 - SRE and Infrastructure Operations

The Cloudcast - A podcast by Massive Studios

Brian talks with Rob Hirschfeld (@zehicle, Founder/CEO of @RackN) about the concepts of SRE (Site Reliability Engineering), the challenges of maintaining infrastructure software, emerging tools, and the next generation of operations.

Show Notes:
  • Topic 1 - Welcome back to the show. Let’s start by talking about the concept of SRE (Site Reliability Engineering). Give us the basics and maybe explain how it differs from how people define DevOps.
  • Topic 2 - Application development has been moving faster for quite a while (agile development, etc.). But now infrastructure/operations teams have to deal with faster software - especially around updates (e.g. Kubernetes releases every 3 months). How are companies managing this?
  • Topic 3 - Given that this pace of operations change may not slow down, how do you think about the challenge in terms of process/operations versus technology/tools?
  • Topic 4 - What are some of the steps that companies take to better prepare for this type of operational model? Tools, process, skills, etc.
  • Topic 5 - Do you see SRE as a progression for existing infrastructure/operations people, or is this more focused on sysadmins or developers who want to get away from building applications?