We show how to automatically synthesize probabilistic programs from real-world datasets. Such synthesis is feasible due to a combination of two techniques. (1) We borrow the idea of “sketching” from the synthesis of deterministic programs and allow the programmer to write a skeleton program with “holes”. Sketches let the programmer communicate domain-specific intuition about the structure of the desired program and thereby prune the search space. (2) We design an efficient Markov Chain Monte Carlo (MCMC) based synthesis algorithm that instantiates the holes in the sketch with program fragments. Our algorithm efficiently synthesizes a probabilistic program that is most consistent with the data. A core difficulty in synthesizing probabilistic programs is computing the likelihood L(P | D) of a candidate program P generating data D. We propose an approximate method to compute likelihoods using mixtures of Gaussian distributions, thereby avoiding expensive computation of integrals. These approximations speed up evaluation of the likelihood of candidate programs by a factor of 1000 and make MCMC-based search feasible. We have implemented our algorithm in a tool called PSKETCH, and our results are encouraging: PSKETCH is able to automatically synthesize 16 non-trivial real-world probabilistic programs.
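To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not PSKETCH's actual algorithm. It assumes a hypothetical sketch `y = a * x + b + Gaussian(0, 1)` whose holes `a` and `b` are filled from a small candidate set, scores each filling with a single Gaussian likelihood (a simplification of the mixture-of-Gaussians approximation the abstract describes), and searches with a plain Metropolis acceptance rule. The names `CANDIDATES`, `make_data`, `log_likelihood`, `propose`, and `synthesize` are all invented for this example.

```python
import math
import random

# Hypothetical sketch "y = a * x + b + Gaussian(0, 1)" with two holes, a and b.
CANDIDATES = [-2, -1, 0, 1, 2, 3]  # assumed fragments allowed in each hole


def make_data():
    """Toy dataset from y = 2x + 1 with a small deterministic perturbation."""
    data = []
    for i in range(41):
        x = (i - 20) / 10
        wobble = 0.3 if i % 2 == 0 else -0.3
        data.append((x, 2 * x + 1 + wobble))
    return data


def log_likelihood(holes, data, sigma=1.0):
    """Gaussian log-density of the data around the candidate program's output."""
    a, b = holes
    norm = math.log(sigma * math.sqrt(2 * math.pi))
    return sum(-0.5 * ((y - (a * x + b)) / sigma) ** 2 - norm for x, y in data)


def propose(holes, rng):
    """Symmetric proposal: refill one randomly chosen hole."""
    new = list(holes)
    new[rng.randrange(len(new))] = rng.choice(CANDIDATES)
    return tuple(new)


def synthesize(data, steps=2000, seed=0):
    """Metropolis search over hole fillings; returns the best filling visited."""
    rng = random.Random(seed)
    current = (rng.choice(CANDIDATES), rng.choice(CANDIDATES))
    current_ll = log_likelihood(current, data)
    best, best_ll = current, current_ll
    for _ in range(steps):
        candidate = propose(current, rng)
        candidate_ll = log_likelihood(candidate, data)
        delta = candidate_ll - current_ll
        # Accept uphill moves always, downhill moves with probability exp(delta).
        if delta >= 0 or rng.random() < math.exp(delta):
            current, current_ll = candidate, candidate_ll
            if current_ll > best_ll:
                best, best_ll = current, current_ll
    return best
```

Calling `synthesize(make_data())` on this toy dataset should settle on the filling `(2, 1)`, the coefficients used to generate the data. The closed-form Gaussian score stands in for the integral-free likelihood evaluation the abstract motivates; a real sketch language would have far richer holes and program fragments than integer constants.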
Tue 16 Jun. Times are displayed in time zone: Tijuana, Baja California
09:15 - 10:55 | Synthesis I, Research Papers at PLDI Main RED (Portland 256) | Chair: Kathleen Fisher (Tufts University)
09:15 | 25m Talk | Efficient Synthesis of Network Updates (Research Papers) | Jedidiah McClurg (University of Colorado Boulder), Hossein Hojjat (Cornell University), Pavol Cerny (University of Colorado Boulder), Nate Foster (Cornell University)
09:40 | 25m Talk | Efficient Synthesis of Probabilistic Programs (Research Papers) | Aditya Nori (Microsoft Research, UK), Sherjil Ozair (IIT Delhi), Sriram Rajamani (Microsoft Research), Deepak Vijaykeerthy (Microsoft Research)
10:05 | 25m Talk | FlashRelate: Extracting Relational Data from Semi-Structured Spreadsheets Using Examples (Research Papers) | Dan Barowy (University of Massachusetts Amherst), Sumit Gulwani (Microsoft Research), Ted Hart (Microsoft Research), Benjamin Zorn (Microsoft Research)
10:30 | 25m Talk | Synthesizing Data Structure Transformations from Input-Output Examples (Research Papers)