Future of Reliability Engineering (Part 2)

In early May, I gave a presentation on the ‘Future of Reliability Engineering’. I wanted to break down the five new trends that I see emerging in a blog-post series:

Evolution of the Network Engineer
Failure is the new Normal (move towards Chaos-Engineering)
Automation as a Service
Cloud is King
Observe & Measure

Failure is the new Normal (move towards Chaos-Engineering)

a) Breaking down Silos »

Future of Reliability Engineering (Part 1)

Last month at Interop ITX, I gave a presentation on the ‘Future of Reliability Engineering’. I wanted to break down the five new trends that I see emerging in a blog-post series:

Evolution of the Network Engineer (towards Network Reliability Engineers)

a) Breaking down Silos

The network is no longer a silo. Applications run over the network in a distributed fashion and require low latency and large data pipes. Network engineers must therefore understand these requirements, understand how applications are generally deployed so that they can troubleshoot them, and have models for capacity planning. »

Publication Updates (June 05 2018)

Hi all, I’ve recently updated my publications page with my latest presentations from:

Interop ITX 2018: The future of Reliability Engineering
Velocity New York 2018: How to Monitor Containers Correctly
SF Reliability Engineering - May Talks
Devops Exchange SF April 2018: How to Build Production-Ready Microservices
Information Week: 3 Myths about the Site Reliability Engineer, Debunked

You can also find me later in the year at:

PyBay 2018: Building Production-Ready Python Microservices
Velocity New York 2018: How to Monitor Containers Correctly »

Publication Updates (May 27 2017)

Hi all, I just updated my publications page with links to my SRECon17 Americas talks and my new LinkedIn engineering blog post. It was also announced this week that I will have the privilege of speaking at SRECon17 EMEA in Dublin later this year. You can find me talking about:

Networks for SREs: What do I need to know for troubleshooting applications
Reducing MTTR and false escalations: Event Correlation at LinkedIn »