At Educative, we get to chat with developers from all over the world and learn their stories: who they are, and what inspired them to become developers and teach those around them.
Today, we sat down with Eric Lippert and got to learn more about his career and the exciting world of C#, the System.Random class, and probabilistic programming.
Here’s what we talked about:
Eric Lippert designs programming languages at Facebook. Other notable work includes designing C# analyzers at Coverity, and developing the Visual Basic, VBScript, JScript and C# compilers at Microsoft.
He also writes a blog about programming language design and other fabulous adventures in coding.
I was always fascinated by computers, even as a small child. I started programming when I was nine, by writing out on paper some little animation programs to make rocket ships fly around the screen. I had to use paper because I didn’t own a computer; I’d then type the programs in on the Commodore PET in the library after school to see if I got it right.
My elementary school librarian was a very kind and patient person, and we’re still occasionally in touch many decades later.
Pretty soon after that, my parents got me a Commodore 64, and I started programming in earnest. I worked at a compiler company as a summer job in high school, where I got my first understanding of how professionals program.
After that, I did a joint computer science/applied mathematics degree at Waterloo, and ended up working at Microsoft on the Visual Basic compiler as part of the co-op program.
It was then very easy for me to choose to come to Microsoft and continue to work on languages.
I left Microsoft in 2012 and went to work at Coverity on improving their C# static analysis product for a couple of years, and now I work on developer tools at Facebook.
Essentially I’ve been working on developer tools almost exclusively for some decades now; it’s a lot of fun working on the sorts of tools that I would like to use myself!
Recently, I have worked on a variety of developer tools at Facebook; two of them that we’ve published papers on are the “GetAFix” project and the HackPPL probabilistic programming language.
GetAFix is an experimental developer tool where we analyze a corpus of code changes which we believe are fixes to particular defects, and then try to deduce what the common fix patterns are.
When presented with a novel fragment of code that might contain a similar defect, we deduce which fix pattern seen in the corpus is most likely to resolve the problem. We then present the proposed solution to the developer, and most of the time they agree that it is a good fix.
HackPPL adds probabilistic programming to Hack, which is a statically-typed variant of PHP. I did some architecture work on the Hack compiler proper and helped build the first prototype of the PPL extensions; it has been interesting to see how it has evolved since.
First off, C# is by far the language that I know best; I spent many thousands of hours studying it carefully and thinking about its design and implementation.
When faced with a novel problem, my thought process usually begins with “how would I do this in C#?”.
But that’s not what I love about it. C# was designed by professional, pragmatic programmers for professional, pragmatic programmers. It’s firmly in the OOP family of languages, but the design is not dogmatically OOP; the designers look at what is working well in functional languages, declarative languages, research languages, and so on, and incorporate the best ideas from those languages without losing sight of what makes C# feel like C#.
It was a privilege to work with that design team for so many years.
I started programming in C# when I worked at Microsoft in the very early days of the language.
Just as the design process for C# 3 was wrapping up, I joined the C# compiler team, where I implemented a lot of the semantic analyzer. I was then invited to join the C# design committee; I spent about seven years at Microsoft working on the design of C# 4, 5, and 6, and implementing the “Roslyn” version of the compiler, again mostly concentrating on the semantic analysis engine.
I’m particularly pleased to have worked on the overload resolution and type inference engine in several versions of C#; there are some interesting problems to solve! And as I noted before, I worked at Coverity for a couple of years on a Roslyn-based static analyzer that looks for defects in real-world C# programs.
I took my manager’s advice, and every time there was a question that I knew the answer to, I posted a complete and correct answer.
If there was a question about my area of expertise that I did not have a correct, complete answer to, I researched it until I did. And in a very short amount of time, I was the de facto expert on JS language semantics at Microsoft. That sort of cross-organization recognition is a big help when growing a career.
In the early 2000s, Microsoft had a company initiative to improve the image of the company within the developer community; it was seen as faceless and aloof, which seemed bizarre to those of us on the inside who were very focused on improving customer productivity.
By posting a lot of solid content and having really good interactions with the developer community, much the same thing happened as had happened in the 1990s: my blog became known as the place to go for insider information about C#.
And from that, it was very natural to continue supporting users by answering questions on Stack Overflow, editing books about C#, and so on. It’s fun, and I learn a lot about where the pain points are in a language from all kinds of different perspectives.
The problems that experts have with a language are very different than those that beginners have, but they’re both important; by improving the learning experience for a lot of beginners we will grow the next generation of experts.
We face all kinds of problems in modern programming that involve statistical or probabilistic reasoning, but many modern, general-purpose programming languages do not present any kind of unified, consistent approach to helping developers solve these problems.
For example, cell phone sensors have some error associated with them, so even answering a simple question like, “is the phone moving or still?” involves some probabilistic reasoning, not to mention more complex problems like “is the phone moving on a route that will encounter a construction delay?”.
Almost any problem we face in modern programming has some sort of uncertainty. Think about a few of the probabilistic problems in travel management.
What is the probability that the user will need to make a change to their itinerary, or that any plane will be delayed causing a missed connection? What’s the probability that the recommendation that the user wants most is shown in the first three choices?
It’s safe to say that many problems involve making predictions of an unknowable future, and we can make better predictions if our tools support principled statistical reasoning right out of the box.
Just as object-oriented programming is programming with objects, and functional programming is programming with functions, probabilistic programming is, no surprise, programming with probabilities.
But you’d be right to point out that this tautological answer doesn’t tell us much about any of those programming paradigms.
The basic idea of a probabilistic programming language is that we build into the language itself the notion that a particular value may represent a distribution of possible values, and those values are used by the program to make choices.
Here are some examples:
An image search program may use probability to tag images. It may determine that a photo is 80% likely to be a labradoodle, 15% likely to be a pile of fried chicken, and 5% likely to be something else. How should it be tagged?
A road monitoring app may use probability to decide which notifications to send. It may determine that the cell phone is 90% likely to be moving north, but that 50% of the time the user has been on this route, they stop for lunch on the next block; should we inform the user of the construction delay five blocks north of them?
Based on the control flow, the program then infers new probabilities based on combinations of old ones.
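As a rough illustration of inferring a new probability from old ones, the road monitoring example might be sketched in C# like this. The sketch assumes that the direction estimate and the lunch stop are independent, which the example does not state, and the 0.40 notification threshold is purely hypothetical.

```csharp
using System;

// Figures taken from the road monitoring example.
double movingNorth = 0.90;    // the phone is 90% likely to be moving north
double stopsForLunch = 0.50;  // the user stops on the next block 50% of the time

// Assumption: the two events are independent, so the chance of reaching the
// construction zone is P(moving north) * P(does not stop for lunch).
double reachesConstruction = movingNorth * (1.0 - stopsForLunch);

// Hypothetical policy: notify only when the delay is likely enough to matter.
const double notifyThreshold = 0.40;
bool shouldNotify = reachesConstruction >= notifyThreshold;

Console.WriteLine($"P(reaches construction) = {reachesConstruction:F2}");
Console.WriteLine($"Notify user: {shouldNotify}");
```

Here the probabilities are plain numbers that the developer has to combine by hand; a probabilistic language would aim to make the combination itself a built-in operation.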
The question to the language designer is then: how do we represent those combinations? How do we represent “60% of all emails are not spam, but 99% of emails that mention Nigerian bank offices are spam”, and use that to make a good decision about whether to filter an incoming email? What specifically does such a program look like, and how can we make it natural and easy for the developer to write such a program?
In many modern languages we have created tools that provide a unified, consistent approach to solving problems involving sequences of data; think about LINQ in C#, or sequence comprehensions in Python. How did we do that?
We started by coming up with a unifying abstraction that all sequences have in common, and then we built language elements that allow developers to combine those abstractions in a powerful way.
There are some mathematical abstractions that are so much the “air we breathe” that we don’t even think of them as abstractions anymore, like addition or multiplication.
The genius of LINQ in C# was to say that, just as addition and multiplication are built into the language as operations on numbers, the operations of sort, filter, group, join, and project are built into the language as operations on sequences. Just as you say:
x = a + b * c;
and have a natural intuition about what that means, so too you can say:
results = from c in customers where c.City == "London" select c.LastName;
Even if you are not a C# programmer, it is pretty easy to see that we’ve got a collection of customers and we’re asking “what are the last names of the customers in London?”. These operations are baked into the language, just as addition is baked in.
A similar approach to statistically distributed data could be embedded into programming languages and their libraries. The connection between sequences and distributions is very strong; one way to think about a distribution is as an infinite sequence of values. A six-sided die can be modeled as an unbounded sequence of rolls where each number appears some fraction of the time.
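The die-as-sequence idea can be sketched with an iterator and a few LINQ operators. This is only an illustration: `DieRolls`, the fixed seed, and the sample size are assumptions made for the example, and the code uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// An endless stream of rolls of a fair six-sided die.
// The fixed seed is only there to make the sketch repeatable.
IEnumerable<int> DieRolls(int seed)
{
    var random = new Random(seed);
    while (true)
    {
        yield return random.Next(1, 7); // upper bound is exclusive, so 1..6
    }
}

// Ordinary sequence operators work on the stream once we bound it with Take.
const int sampleSize = 60_000;
Dictionary<int, double> frequencies = DieRolls(seed: 42)
    .Take(sampleSize)
    .GroupBy(roll => roll)
    .ToDictionary(group => group.Key, group => (double)group.Count() / sampleSize);

// Each face should show up roughly one sixth of the time.
foreach (var pair in frequencies.OrderBy(p => p.Key))
{
    Console.WriteLine($"{pair.Key}: {pair.Value:P1}");
}
```

Because the sequence never ends, any operator that needs the whole sequence, such as a sort or a count, has to come after `Take`.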
That said, the operations you typically perform on sequences can be very different than the operations you typically perform on distributions, so it is important to not go too far in treating two similar things as though they are the same thing.
Operations like, “sample from this distribution” or “compute a posterior from this prior and this observation”, could be similarly abstracted into the type system and then supported by new features in the language. But we are only just starting to see these sorts of features appear in line-of-business languages.
System.Random and how stochastic techniques in C# fix that?
In C# there is a class called System.Random that gives you two things: either a uniform distribution of fractions between 0.0 and 1.0, or a uniform distribution of integers between an upper and lower bound.
Historically, the implementations of this class have been pretty poor in that it is very easy to write a buggy program using it; we want the natural, easy way to use a library to also be the right way, and it is not.
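For reference, the two operations described above correspond to `NextDouble()` and `Next(minValue, maxValue)`. The sketch below also shows one commonly cited pitfall; the interview does not say which bugs are meant, so treat that choice as an assumption.

```csharp
using System;

// One shared instance, created once. On the older .NET Framework,
// "new Random()" was seeded from the system clock, so instances created in
// quick succession could all produce the same sequence of numbers.
var random = new Random();

// A uniformly distributed fraction: at least 0.0 and strictly less than 1.0.
double fraction = random.NextDouble();

// A uniformly distributed integer: lower bound inclusive, upper bound
// exclusive, so this models a six-sided die with faces 1 through 6.
int roll = random.Next(1, 7);

Console.WriteLine($"fraction = {fraction}, roll = {roll}");
```

A `Random` instance is also not safe to use from several threads at once without synchronization, which is another way the easy usage can go wrong.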
Fortunately some of these problems have been fixed in .NET Core, but the true deficiency is deeper than the poor implementation choices. The real problem is that we are well beyond merely needing a source of uniform randomness on an interval to sample from; the problems we have to solve that use probabilities are orders of magnitude more complex.
If the problem you have is, "the probability of a random person in a population having a disease is X%, and we have a diagnostic test that is correct Y% of the time. If a randomly chosen person tests positive, what is the probability that they have the disease?", then the answer is neither X% nor Y%, but a combination of the two which we can work out mathematically.
This kind of reasoning is hard for humans to do, even if they’re trained. But if we have elements in our programming languages that represent prior probabilities, observations, and posterior probabilities, then we can write very straightforward programs that answer these questions for us correctly, just as we can write a straightforward program that means “give me the last names of customers in London”.
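The combination in question is Bayes' theorem. Here is a minimal sketch, assuming that "correct Y% of the time" means the test is equally accurate for people with and without the disease (the question does not say so). The `Posterior` helper and the example numbers are hypothetical.

```csharp
using System;

// Bayes' theorem for a positive test result:
//   P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
// Assumption: the test is right with probability `accuracy` for both
// sick and healthy people.
double Posterior(double prior, double accuracy)
{
    double truePositive = accuracy * prior;                   // sick and flagged
    double falsePositive = (1.0 - accuracy) * (1.0 - prior);  // healthy but flagged
    return truePositive / (truePositive + falsePositive);
}

// Hypothetical numbers: X = 1% of people have the disease, Y = 99% accuracy.
double probability = Posterior(prior: 0.01, accuracy: 0.99);

// The result is about 0.5, which is neither 1% nor 99%.
Console.WriteLine($"P(disease | positive) = {probability:F3}");
```

With a rare disease, the few false positives among the many healthy people are about as numerous as the true positives, which is why the answer lands near one half.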
There has been a lot of progress in these areas in research languages in academia; it’s an exciting prospect to consider moving these ideas into general-purpose languages.
I was intrigued by this idea, so I wrote a long series of articles in my blog to explore some of the possibilities. My introduction into the problem space was complaining about the deficiencies of System.Random, so I called it “Fixing Random”, even though really it is about re-imagining how we treat probabilistic data in languages like C#.
I am very excited by where C# is going in C# 8; on the language design front, embracing non-nullable reference types in the language is an enormously bold move that will pay off in improved developer productivity and fewer user-impacting bugs.
But what is really exciting is how well Microsoft has embraced the open source ethos for the language, and how this encourages the spread of the language beyond the Windows ecosystem and into the broader software community. Making that transition was not easy, and I applaud my colleagues for embracing a new way of working.
I’ve seen no evidence at all that probabilistic programming in C# is on the design team’s radar; I hope it is now!
It’s apparent that probabilistic programming in C# can be tough, especially when using the System.Random class, as Eric has alluded to.
In his course, Fixing Random: Techniques in C#, Eric shows you different approaches to improve the
If you’re a fan of C# looking to explore new ways to use the language while furthering your understanding of probabilistic programming, you’ll find this course valuable.