In this article, we’ll look at type providers, a new feature in F# 3.0: what they are, why they are useful, and how to use them. F# 3.0 is available as part of Visual Studio 2012 Professional or higher. You can also download the F# Tools extension for Visual Studio 2012 Express for Web to give F# and type providers a try.
Motivation
One pain point that developers frequently experience in any language is the need to write repetitive boilerplate code. For example, when building a .NET application that accesses data from several tables in a database, the code that’s used to map columns of a table to fields of a class may need to be repeated.
One of the most popular approaches for dealing with this is code generation—using an application that takes configuration information as input and produces source code as output. Some languages, like C#, have even added specific features such as partial classes to make working with generated code easier.
There are several advantages to code generation when compared to writing boilerplate code by hand:
- Code generation allows developers to save significant amounts of time by writing less code.
- Code generation prevents copy and paste errors that can occur when code is manually duplicated.
- For common problems, such as accessing a database, a code generator can be written and maintained by a single implementer and used by individuals across many organizations. For instance, Microsoft has implemented several code generators for .NET, such as the sqlmetal command line tool for generating code for LINQ to SQL.
- Code generators can be built into IDE extensions that provide easy-to-use design-time experiences. For instance, Visual Studio contains designers for LINQ to Entities that make it easy to specify various mapping details.
These advantages have made code generators a well-understood tool for developing applications using languages such as C#. However, there are several drawbacks of code generators:
- The intrusiveness of code generation may outweigh the benefits for small schemas, where hand-written mappings may suffice.
- Code generation is explicitly a two-step process, in that the code must be generated before it can be used.
- Code generators are extra-linguistic—they require using a tool external to the programming language being used. This forces switching back and forth between running the code generator and developing your code.
- Code generators work well only when the schemas they represent are reasonably small. They do not scale to scenarios where thousands of types might be needed.
- Since code generators are usually manually invoked outside of the build process, generated code can easily get out of sync with the source from which it was generated. This could delay errors due to obvious mismatches until runtime even though they could have been identified at compile time.
These drawbacks apply regardless of language, but in F# they may feel particularly acute since they interfere with F#’s emphasis on an interactive development style. One common alternative to code generation is dynamic binding, which provides a streamlined approach at the cost of type safety and tooling such as IntelliSense. The design of F# 3.0’s Type Provider mechanism achieves the best of both worlds, avoiding the downsides listed above while keeping the productivity benefits of code generation.
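To make the trade-off with dynamic binding concrete, here is a minimal sketch that does not use type providers at all. The row contents and the names dynamicRow, PostRow, and typedRow are hypothetical, chosen only for illustration. A misspelled column name in the dynamic version compiles and silently yields nothing at run time, while the same mistake against a statically typed record would be rejected by the compiler.

```fsharp
// Dynamic style: a row is a bag of boxed values keyed by column name.
// (Hypothetical data, for illustration only.)
let dynamicRow =
    Map.ofList [ "Title", box "What is a type provider?"; "Score", box 12 ]

// The column name is misspelled on purpose. This compiles without complaint,
// and the mistake only shows up at run time as a missing value.
let dynamicScore =
    dynamicRow |> Map.tryFind "Scroe" |> Option.map unbox<int>

// Statically typed style: the shape of the row is known to the compiler.
type PostRow = { Title: string; Score: int }

let typedRow = { Title = "What is a type provider?"; Score = 12 }

// Writing typedRow.Scroe here would be a compile-time error,
// and IntelliSense can list the valid field names.
let typedScore = typedRow.Score

printfn "dynamic lookup: %A" dynamicScore
printfn "typed lookup: %d" typedScore
```

Type providers aim to give the second experience without requiring anyone to write or generate a type like PostRow by hand.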
Type Providers
So, what is a type provider? In short, it is a component that exposes metadata to the F# compiler so that it is visible as .NET types, methods, and properties. The type provider also provides information that allows the compiler to resolve calls to those provided types and members into corresponding calls to regular .NET members. The type provider is invoked during compilation, and the provider can be parameterized based on configuration information such as a database connection string, a web service endpoint, or a file location.
This approach eliminates the drawbacks from the previous list, while providing the benefits of a more dynamic approach:
- The type provider is invoked during compilation; there is no separate code generation phase.
- Type providers are integrated into the language. Assemblies containing type providers are simply referenced like any other library, and the provided types appear to the user to be normal .NET types. There is no need to step outside of the language to invoke the type provider, and type providers can be used for exploratory development from F# Interactive without any issues.
- Type providers don’t need to generate real .NET types for each provided type; the type provider can instead supply the compiler with the information needed to translate calls to members of provided types into calls to members of existing types. This allows type providers to scale up to situations where thousands or millions of logical types might be needed, which code generation would be unable to handle. For an example of this, see the Freebase type provider in Don Syme’s BUILD talk from last year.
- Since type providers are invoked during compilation, it is not possible for the provided types to become out-of-sync with the rest of the codebase.
Out of the box, F# 3.0 includes several type providers in its standard library. These type providers make it easy to access common data sources such as web services and relational databases. When combined with F# 3.0’s new extended support for LINQ queries (see MSDN for more information), these type providers make it easy to write succinct code for a variety of data access tasks. However, end users can also author custom type providers to make accessing other types of data just as easy. For more information on how to author a type provider, see the MSDN tutorial.
Using Type Providers
Using a type provider is extremely straightforward: just add a reference to the library that contains the type provider as well as any other libraries that it depends on. Then, provided types from the assembly will be available to the calling code just as normal .NET types would be. Tooling like IntelliSense works as expected, and the provided types can be used as part of a compiled F# application or interactively via an F# script.
As a concrete example, consider the ODataService type provider that is part of the F# 3.0 standard libraries. This provider presents a strongly typed interface to web services exposed as OData feeds (see http://www.odata.org/ecosystem for a list of many sites that provide OData access, including Netflix and eBay). This provider does for F# what the DataSvcUtil.exe code generator does for C#, only in a much more streamlined way.
For this example, we’ll use the OData feed provided by the Stack Overflow programming question and answer site to determine how many questions related to type providers have been asked on the site. First, we’ll add a reference to the standard F# type provider library and the System.Data.Services.Client assembly that it uses for OData access (this assumes that the code is being used from an F# script; if using a compiled F# application, then just add the references to the project):
#r "System.Data.Services.Client"
#r "FSharp.Data.TypeProviders"
Next, we open the type provider namespace:
open Microsoft.FSharp.Data.TypeProviders
Then, we’ll use the ODataService type provider, passing in the Stack Overflow OData feed URL as a static argument to it:
type stackOverflowData = ODataService<"http://data.stackexchange.com/stackoverflow/atom">
let context = stackOverflowData.GetDataContext()
By providing the URL as a static argument, we enable the ODataService provider to provide specific types and properties tailored to the schema of the Stack Overflow data. And just like that, without any explicit code generation, we have a strongly typed way to access the Stack Overflow data, including IntelliSense support when we access the data context, as the following screenshot demonstrates:
Here we can see that the Stack Overflow service exposes types for Badges, Comments, Posts, etc., and drilling into each of these types would show the set of properties that the Stack Overflow API exposes for each one. Even if we’ve never looked at any documentation for this service, we can jump right in and interactively explore the data. When combined with F# 3.0’s added support for language-integrated, strongly typed queries, this lets us easily query the Stack Overflow service to ask questions like “How many questions tagged as F# contain the text ‘type provider’?”. Note that in contrast to C#’s query syntax, which has no keyword for operators such as Count and requires a method call instead, F# 3.0’s query expressions can contain the full set of LINQ operators, including aggregation operations like count:
let tpQuestionCount =
    query {
        for post in context.Posts do
        where (post.Tags.Contains("f#") && post.Body.Contains("type provider"))
        count
    }
Evaluating this expression will run the query against the server and return the answer (30, as of the writing of this article). This doesn’t need to be put into a compiled F# application; it works just as well from an F# script run via F# Interactive, which makes it easy to iteratively refine the analysis being performed.
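For readers who want to try the query syntax without a network connection, the following sketch runs the same kind of query over an in-memory list. The Post record and its sample rows are hypothetical stand-ins for the provided types, and the Title and Score fields are assumptions made for illustration rather than a description of the actual feed schema. The second query shows the sort of refinement one might make while iterating in F# Interactive.

```fsharp
// Hypothetical stand-in for the provided Post type (illustration only;
// the real type and its properties come from the OData feed's schema).
type Post = { Title: string; Tags: string; Body: string; Score: int }

// Sample rows invented for this sketch.
let posts =
    [ { Title = "Writing a type provider"; Tags = "<f#><type-providers>"
        Body = "How do I write a type provider?"; Score = 7 }
      { Title = "OData from F#"; Tags = "<f#><odata>"
        Body = "Using the ODataService type provider"; Score = 3 }
      { Title = "LINQ in C#"; Tags = "<c#><linq>"
        Body = "Is there a type provider for C#?"; Score = 1 }
      { Title = "F# records"; Tags = "<f#>"
        Body = "How do record copies work?"; Score = 5 } ]

// Same shape as the query in the article, but over an in-memory list.
let tpQuestionCount =
    query {
        for post in posts do
        where (post.Tags.Contains("f#") && post.Body.Contains("type provider"))
        count
    }

// A refinement: instead of counting, list the highest-scoring matching title.
let topTitles =
    query {
        for post in posts do
        where (post.Tags.Contains("f#") && post.Body.Contains("type provider"))
        sortByDescending post.Score
        select post.Title
        take 1
    }
    |> Seq.toList

printfn "matching questions: %d" tpQuestionCount
printfn "top title: %A" topTitles
```

Because query expressions work over ordinary sequences as well as remote data sources, a query can be prototyped against sample data like this and then pointed at the data context.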
By contrast, consider how much less streamlined the corresponding approach in C# would be. First, the DataSvcUtil.exe code generator would need to be invoked (either explicitly, or via Visual Studio’s “Add Service Reference” feature) and the resulting C# files would need to be added to a C# project. Next, the code to perform the query would need to be written. However, C# doesn’t have any substitute for rapid iteration using F# Interactive; any modifications to the query would need to run through another edit-compile-run cycle.
Summary
This article introduced type providers, a new feature in F# 3.0. We discussed the ideas behind the feature, including the advantages over a straightforward code generation approach. We also showed a practical example of how to use a type provider to access an OData web service and perform strongly typed queries against it. Other built-in type providers provide easy access to other data sources like databases or web services exposed by WSDL. When combined with F#’s interactive mode of development, this results in a powerful way to explore a wide variety of data in a strongly typed, language-integrated fashion.