FlashRelate: Extracting Relational Data from Semi-Structured Spreadsheets Using Examples
With hundreds of millions of users, spreadsheets are one of the most important end-user applications. Spreadsheets are easy to use and allow users great flexibility in storing data. This flexibility comes at a price: users often treat spreadsheets as a poor man’s database, leading to creative solutions for storing high-dimensional data in a two-dimensional grid. The trouble arises when users need to answer queries with their data. Data manipulation tools make strong assumptions about data layouts and cannot read these ad-hoc databases. Converting data into the appropriate layout requires programming skills or a major investment in manual reformatting. The effect is that vast amounts of real-world data are “locked in” to a proliferation of one-off formats.
We introduce FlashRelate, a synthesis engine that lets ordinary users extract structured data from spreadsheets without programming. Instead, users drive the extraction process by specifying output examples, which FlashRelate uses to synthesize a program in Flare. Flare is a novel extraction language that extends regular expressions with geometric constructs. We built an interactive user interface on top of FlashRelate that lets end-users generate Flare programs by point-and-click. We demonstrate that correct extraction programs can be synthesized in seconds from a small number of examples for 43 real-world scenarios. Finally, our case study shows that FlashRelate addresses the widespread problem of data trapped in corporate and government formats.
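To make the idea of combining regular expressions with geometric constraints concrete, here is a minimal Python sketch. It is not Flare syntax and not FlashRelate's synthesis algorithm; the grid contents, the `extract` function, and the three patterns are all hypothetical, chosen only to show how a cell-content constraint (a regex) and a spatial constraint (same row, same column) can together turn a cross-tab layout into relational tuples.

```python
import re

# Hypothetical cross-tab grid: years across the top row, names down the first column.
GRID = [
    ["",        "2013", "2014"],
    ["Albania", "1000", "1100"],
    ["Austria", "3139", "3177"],
]

def extract(grid):
    """Return (name, year, value) tuples from a cross-tab grid.

    Illustrative only: each tuple is anchored on a value cell, and the other
    two fields are found by fixed geometric relationships to that anchor.
    """
    is_value = re.compile(r"^\d+$")         # content constraint on the anchor cell
    is_year = re.compile(r"^\d{4}$")        # content constraint on the column header
    is_name = re.compile(r"^[A-Za-z ]+$")   # content constraint on the row header
    rows = []
    for r in range(1, len(grid)):
        for c in range(1, len(grid[r])):
            value = grid[r][c]
            year = grid[0][c]   # geometric constraint: same column, top row
            name = grid[r][0]   # geometric constraint: same row, first column
            if is_value.match(value) and is_year.match(year) and is_name.match(name):
                rows.append((name, year, value))
    return rows

RELATION = extract(GRID)
```

In this sketch the constraints are written by hand; the point of FlashRelate, as described above, is that the user supplies example output tuples and the engine synthesizes the extraction program instead.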
A video demonstration is available at: http://tinyurl.com/mh3bo3a
Tue 16 Jun (displayed time zone: Tijuana, Baja California)
09:15 - 10:55
|Efficient Synthesis of Network Updates|
Jedidiah McClurg (University of Colorado Boulder), Hossein Hojjat (Cornell University), Pavol Cerny (University of Colorado Boulder), Nate Foster (Cornell University)
|Efficient Synthesis of Probabilistic Programs|
Aditya Nori (Microsoft Research, UK), Sherjil Ozair (IIT Delhi), Sriram Rajamani (Microsoft Research), Deepak Vijaykeerthy (Microsoft Research)
|FlashRelate: Extracting Relational Data from Semi-Structured Spreadsheets Using Examples|
Dan Barowy (University of Massachusetts Amherst), Sumit Gulwani (Microsoft Research), Ted Hart (Microsoft Research), Benjamin Zorn (Microsoft Research)
|Synthesizing Data Structure Transformations from Input-Output Examples|
John Feser (Rice University), Swarat Chaudhuri (Rice University), Isil Dillig (University of Texas, Austin)