Chad Scherrer
https://cscherrer.github.io/
Recent content on Chad Scherrer
chad.scherrer@gmail.com (Chad Scherrer)
Sat, 14 Sep 2019 16:21:50 -0700
Fast and Flexible Probabilistic Programming with Soss.jl
https://cscherrer.github.io/post/fast-flexible-probprog/
Sat, 14 Sep 2019 16:21:50 -0700
A few months ago, Colin Carroll posted A Tour of Probabilistic Programming Language APIs, where he compared the APIs of a variety of probabilistic programming languages (PPLs) using this model:
\[ \begin{aligned} p ( \mathbf { w } ) & \sim \mathcal { N } \left( \mathbf { 0 } , I _ { 5 } \right) \\ p ( \mathbf { y } | X , \mathbf { w } ) & \sim \mathcal { N } \left( X \mathbf { w } , 0.
Variational Importance Sampling
https://cscherrer.github.io/post/variational-importance-sampling/
Sat, 29 Jun 2019 18:53:45 -0700
Lots of distributions have a density that is easy to evaluate, but are hard to sample from. So when we need to sample from such a distribution, we need to use some tricks. We'll see connections between two of these, importance sampling and variational inference, and see a way to use them together for fast inference.
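As a rough illustration of the first of these tricks, here is a minimal sketch of an importance sampling estimate of an expected value under a target density p, using draws from an easier proposal q. The target (standard normal), the proposal (normal with standard deviation 2), the function x squared, and all names below are assumptions chosen for illustration, not taken from the post. The target is deliberately simple so that the exact answer (1) is known.

```python
import math
import random


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def importance_estimate(f, p_pdf, q_pdf, q_sample, n, rng):
    """Estimate E_p[f(x)] by averaging f(x) * p(x) / q(x) over draws x ~ q."""
    total = 0.0
    for _ in range(n):
        x = q_sample(rng)            # draw from the proposal, not the target
        weight = p_pdf(x) / q_pdf(x)  # importance weight corrects for that
        total += f(x) * weight
    return total / n


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = importance_estimate(
    f=lambda x: x * x,                         # hypothetical function of interest
    p_pdf=lambda x: normal_pdf(x, 0.0, 1.0),   # target density (assumed)
    q_pdf=lambda x: normal_pdf(x, 0.0, 2.0),   # proposal density (assumed)
    q_sample=lambda r: r.gauss(0.0, 2.0),      # sampler for the proposal
    n=20000,
    rng=rng,
)
print(estimate)  # should land close to the exact value, 1
```

The proposal is wider than the target on purpose: the weights p(x)/q(x) stay bounded, which keeps the variance of the estimate under control.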
Importance sampling
Importance sampling aims to make it easy to compute expected values. Say we have a distribution \(p\), and we'd like to compute the average of some function \(f\) of the distribution (or equivalently, the expected value of a "push-forward along \(f\)").
Confusion Confusion
https://cscherrer.github.io/post/confusion-confusion/
Sat, 23 Feb 2019 11:01:39 -0800
Harder Than it Needs to Be
Say you've just fit a (two-class) machine learning classifier, and you'd like to judge how it's doing. This starts out simple: Reality is yes or no, and you predict yes or no. Your model will make some mistakes, which you'd like to characterize.
So you go to Wikipedia, and see this:
There's a lot of "divide this sum by that sum", without much connection to why we're doing that, or how to interpret the result.
Soss.jl: Design Plans for Spring 2019
https://cscherrer.github.io/post/soss-update/
Sun, 27 Jan 2019 07:35:51 -0800
If you've followed my work recently, you've probably heard of my probabilistic programming system Soss.jl. I recently had the pleasure of presenting these ideas at PyData Miami:
[N.B. Above is supposed to be an embedded copy of my slides from PyData Miami. I can see it from Chrome, but not Firefox. Very weird.]
In April I'll begin another "passion quarter" (essentially a sabbatical) and hope to really push this work forward.
Julia for Probabilistic Metaprogramming
https://cscherrer.github.io/post/soss/
Tue, 11 Sep 2018 00:00:00 +0000
Since around 2010, I've been involved with using and developing probabilistic programming languages. So when I learn about a new language, one of my first questions is whether it's a good fit for this kind of development. In this post, I'll talk a bit about working in this area with Julia, to motivate my Soss project.
Domain-Specific Languages
At a high level, a probabilistic programming language is a kind of domain-specific language, or DSL.
A Prelude to Pyro
https://cscherrer.github.io/post/pyro/
Tue, 21 Aug 2018 00:00:00 +0000
Lately I've been exploring Pyro, a recent development in probabilistic programming from Uber AI Labs. It's an exciting development that has a huge potential for large-scale applications.
In any technical writing, it's common (at least for me) to realize I need to add some introductory material before moving on. In writing about Pyro, this happened quite a bit, to the point that it warranted this post as a kind of warm-up.
Bayesian Optimal Pricing, Part 2
https://cscherrer.github.io/post/max-profit-2/
Sun, 03 Jun 2018 12:05:48 -0700
This is Part 2 in a series on Bayesian optimal pricing. Part 1 is here.
Introduction
In Part 1 we used PyMC3 to build a Bayesian model for sales. By the end we had this result:
A common advantage of Bayesian analysis is the understanding it gives us of the distribution of a given result. For example, we can very easily analyze a sample from the posterior distribution of profit for a given price.
Bayesian Optimal Pricing, Part 1
https://cscherrer.github.io/post/max-profit/
Sun, 06 May 2018 07:04:24 -0700
Pricing is a common problem faced by businesses, and one that can be addressed effectively by Bayesian statistical methods. We'll step through a simple example and build the background necessary to get involved with this approach.
Let's start with some hypothetical data. A small company has tried a few different price points (say, one week each) and recorded the demand at each price. We'll abstract away some economic issues in order to focus on the statistical approach.
The Bias-Variance Decomposition
https://cscherrer.github.io/post/bias-variance/
Wed, 04 Apr 2018 13:43:57 -0700
Say there's some experiment that generates noisy data. You and I each go through the process independently, and model the results. Would the resulting models be exactly the same?
Well no, of course not. That's the whole problem with noise. Instead, we'll usually end up with something like this (for a quadratic fit):
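A minimal sketch of this situation: two independent runs of the same noisy experiment, each fit with a quadratic by least squares. The true function (a sine), the noise level, the sample size, and the names `true_f`, `fit_quadratic`, and `run_experiment` are all assumptions made for illustration and do not come from the post.

```python
import math
import random


def true_f(x):
    # Hypothetical "true" function; in practice it is never observed directly.
    return math.sin(x)


def fit_quadratic(xs, ys):
    """Least-squares fit of y ~ c0 + c1*x + c2*x**2 via the normal equations."""
    # Power sums give the 3x3 system (A^T A) c = A^T y for the basis [1, x, x^2].
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def run_experiment(rng, n=30, noise_sd=0.3):
    """Generate one noisy dataset from true_f and return its quadratic fit."""
    xs = [rng.uniform(0.0, 3.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return fit_quadratic(xs, ys)


rng = random.Random(1)
my_fit = run_experiment(rng)    # coefficients [c0, c1, c2] from "my" data
your_fit = run_experiment(rng)  # coefficients from "your" independent data
print(my_fit)
print(your_fit)
```

The two coefficient lists differ even though both experiments share the same underlying function; only the noise changed between runs.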
The idea is that we'd like to find an approximation to \(f(x)\), but we can never observe this function directly.
Bayesian Changepoint Detection with PyMC3
https://cscherrer.github.io/post/bayesian-changepoint/
Sun, 25 Feb 2018 00:00:00 -0800
A client comes to you with this problem:
The coal company I work for is trying to make mining safer. We made some change around 1900 that seemed to improve things, but the records are all archived. Tracking down such old records can be expensive, and it would help a lot if we could narrow the search. Can you tell us what year we should focus on?
Also, it would really help to know this is a real effect, and not just due to random variability - we don't want to waste resources digging up the records if there's not really anything there.
About me
https://cscherrer.github.io/page/about/
Senior Data Scientist at Metis
Under construction - more coming soon!