Computing Reviews
Types from data: making structured data first-class citizens in F#
Petricek T., Guerra G., Syme D. PLDI 2016 (Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, Santa Barbara, CA, Jun 13-17, 2016), 477-490, 2016. Type: Proceedings
Date Reviewed: Jul 13 2017

Imagine learning the data schema formats of an application just by analyzing JavaScript Object Notation (JSON), Extensible Markup Language (XML), or comma-separated values (CSV) examples. A shape inference algorithm infers the “shape” of these examples, which is then described in the F# type system. The authors introduce the F# Data library and their approach to analyzing sample documents and making structured data first-class “citizens” in F#. F# Data type providers process sample data, infer common preferred shapes, and generate F# types that programmers can use directly. The paper shows typical real-world JSON services exhibiting some of the more complicated shapes, such as numerical versus string literals, nested collections, and even optional fields.

Their approach inverts the usual process, in which types are known up front, into a new “types from data” approach. By exploring structured data, such as JSON nodes, the authors’ shape inference algorithm recursively finds the common shape of all the child nodes of the sample document. These shapes are then defined in F# and related through the preferred shape relation that covers JSON, XML, and CSV. The paper describes the rules of the common preferred shape function and maps the resulting shapes into an advanced type system. Numerous worked examples show how sample data map back to their common shapes.
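
The recursive idea described above can be illustrated with a small Python sketch. It is not the authors' algorithm or the F# Data implementation: the tuple encoding of shapes, the function names `infer`, `merge`, and `optional`, and the simplified merge rules (int widens to float, null makes a shape optional, a missing record field becomes optional, anything else collapses to "any") are all assumptions made for illustration.

```python
import json

# Hypothetical shape encoding (an assumption of this sketch):
# ("int",), ("float",), ("string",), ("bool",), ("null",), ("any",),
# ("bottom",) for "no information yet", ("optional", s), ("list", s),
# and ("record", {field_name: s}).

def optional(shape):
    """Wrap a shape as optional unless it already admits missing values."""
    if shape[0] in ("optional", "null", "any"):
        return shape
    return ("optional", shape)

def merge(a, b):
    """Return one shape that both a and b fit into (simplified rules)."""
    if a == b:
        return a
    if a[0] == "bottom":
        return b
    if b[0] == "bottom":
        return a
    if a[0] == "null":
        return optional(b)
    if b[0] == "null":
        return optional(a)
    if a[0] == "optional" or b[0] == "optional":
        inner_a = a[1] if a[0] == "optional" else a
        inner_b = b[1] if b[0] == "optional" else b
        return optional(merge(inner_a, inner_b))
    if {a[0], b[0]} == {"int", "float"}:
        return ("float",)  # integers can be read as floats
    if a[0] == "list" and b[0] == "list":
        return ("list", merge(a[1], b[1]))
    if a[0] == "record" and b[0] == "record":
        fields = {}
        names = list(a[1]) + [n for n in b[1] if n not in a[1]]
        for name in names:
            if name in a[1] and name in b[1]:
                fields[name] = merge(a[1][name], b[1][name])
            else:
                # A field seen in only one record becomes optional.
                present = a[1][name] if name in a[1] else b[1][name]
                fields[name] = optional(present)
        return ("record", fields)
    return ("any",)  # no more specific common shape in this sketch

def infer(value):
    """Infer a shape from one parsed JSON value, recursing into children."""
    if value is None:
        return ("null",)
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return ("bool",)
    if isinstance(value, int):
        return ("int",)
    if isinstance(value, float):
        return ("float",)
    if isinstance(value, str):
        return ("string",)
    if isinstance(value, list):
        element = ("bottom",)
        for item in value:
            element = merge(element, infer(item))
        return ("list", element)
    if isinstance(value, dict):
        return ("record", {k: infer(v) for k, v in value.items()})
    raise TypeError("not a JSON value: %r" % (value,))

sample = json.loads(
    '[{"name": "Jan", "age": 25}, {"name": "Tomas"},'
    ' {"name": "Alexander", "age": 3.5}]'
)
people_shape = infer(sample)
```

For this sample, `people_shape` is a list of records in which `name` is a string and `age` is an optional float, because one record omits `age` and the remaining two mix an integer with a floating-point number.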

A formal type model based on the flyweight object-oriented (FOO) calculus forms the basis for their F# data integration approach. The paper shows, in great detail, various challenges that arise with example data. With this approach, the type providers create a strong typing model that helps a program work with unknown real-world data. Without strong typing or known schemas, applications usually fall back on heavy dynamic typing and exception handling. Also, as schemas or examples change, the inference engine can be rerun to incorporate new knowledge.
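
To make the contrast with dynamic typing concrete, here is a hedged Python sketch of the kind of typed wrapper a provider could generate from a sample. The `Person` class and the `parse_person` and `parse_people` helpers are hypothetical names invented for this illustration; F# type providers generate such types at compile time, which Python cannot reproduce, so this only approximates the programming experience.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Person:
    # Hypothetical type derived from a sample in which "age" is sometimes
    # missing and sometimes an integer or a floating-point number.
    name: str
    age: Optional[float]

def parse_person(node: dict) -> Person:
    """Convert one untyped JSON record into the typed wrapper."""
    age = node.get("age")  # a missing field maps to None instead of raising
    return Person(name=str(node["name"]),
                  age=None if age is None else float(age))

def parse_people(text: str) -> List[Person]:
    """Parse a JSON array of person records."""
    return [parse_person(node) for node in json.loads(text)]

people = parse_people('[{"name": "Jan", "age": 25}, {"name": "Tomas"}]')
```

With the untyped value, `json.loads(text)[1]["age"]` raises a `KeyError` that every caller has to anticipate. With the wrapper, the optional field is visible in the type, and `people[1].age` is simply `None`.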

The authors have connected two lines of research: (1) extending programming language type systems to accommodate external data services, and (2) inferring types for these same real-world data sources. This paper is also unique in that it describes the programming language theory behind concrete type providers. The paper also provides examples showing how the approach works, and correctness proofs for the class of inputs it is defined to handle. A formal model helps prove relative type safety, that is, safety that holds as long as the runtime input is compatible with the samples used for inference.

Thanks to a rich F# user community, many examples have been processed, which adds practical evidence about the shape inference approach. Information on F# is available online, and libraries to incorporate this process into popular languages are available.

Reviewer:  Scott Moody Review #: CR145419 (1709-0616)
Language Constructs and Features (D.3.3)
Data Models (H.2.1 ...)
Data Types And Structures (D.3.3 ...)
Logical Design (H.2.1)
Data Storage Representations (E.2)

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®