Zelkova - 100 Million+ SMT Queries a Day
ABSTRACT
Zelkova is a distributed system that uses automated reasoning to answers logical questions about security policies, like "does principal A have access to resource B" or "is resource C accessible by unauthenticated principals." Zelkova does this by translating both the input policy and question into an SMT query and calling a portfolio of solvers. Launched five years ago, Zelkova has become a critical part of Amazon Web Services (AWS) "provable security" initiative, growing to process over a billion SMT queries a day.
This talk will focus on the lessons we've learned running Zelkova in production and at scale. In particular, we will cover the techniques that we use to measure and improve performance, availability, and correctness, and how these have evolved over time. We will talk about service level agreements (SLAs), how these differ from services that do not use automated reasoning, and how to communicate the differences and build trust with prospective customers. We will also talk about some of the unsolved challenges to-date and where we think Zelkova will go in the future.
This talk will be relevant to conference attendees who want to know more about building and operating an automated reasoning-based service in production and at scale.
Author
Jeremiah Dunham leads the Provable Permissions team at Amazon Web Services (AWS), and cares deeply about using math to make the world a better place. He has held a variety of roles at AWS and has received 8 patents during his 8+ years there. Prior to joining AWS, Jeremiah worked in a variety of engineering, consulting and leadership roles at companies including Microsoft and Alaska Airlines. Outside of work, Jeremiah loves running, reading, and craft beer.