ML models often exhibit unexpectedly poor behavior when they are deployed in
real-world domains. We identify underspecification as a key reason for these
failures. An ML pipeline is underspecified when it can return many predictors
with equivalently stron...
DBMSs need careful tuning for efficient performance on specific hardware and workloads. Yet, manual tuning by experienced admins is impractical for extensive...
In this series, I want to explore some introductory concepts from statistics that may prove helpful for those learning machine learning or refreshing their knowledge. Those topics lie at the heart of…
We're going to explore why the concept of vectors is so important in machine learning. We'll talk about how they are used to represent both data and models. ...
In this blog series, we’ll explain common topics in privacy-preserving data science, from a single sentence to code examples.
We hope these posts serve as a useful resource for you to figure out the best techniques for your organization.