Pattern Matching in F# Part 1: Out of the Box

Of all the features in F#, pattern matching is the killer app. It’s powerful, accessible, and extensible. Patterns condense a lot of decision-making power into little space without sacrificing readability, as you’ll see in this two-part series.

A simple pattern operates a bit like a C# switch statement, but the F# pattern match provides more than flow control. Patterns are a clear syntax for data extraction, transformation, recursive list iteration, and more. Their power lies in the ability to:

  • return a value for the expression as a whole
  • match any type or types
  • capture the values matched
  • embed functions that transform, recognize, and categorize data.

This article explores the first three of these capabilities, along with the patterns built into F#. The sequel shows how to extend the language with custom patterns.

If you’re not familiar with F# syntax, read Intro to F#. The code samples included here were developed in the interactive console at tryfsharp.org; this is a great tool for following along and experimenting in the language. You can also use the F# Interactive window in Visual Studio if you prefer.

Match Expressions

At first glance a match expression in its simplest form appears to be a switch statement sporting a new syntax. However, it’s called a match expression because it returns a value. This makes pattern matching useful for data transformation.

Let’s take an example.

let stateName = 
  match "MO" with 
  | "MO" -> "Missouri"
  | "TN" -> "Tennessee"
  | _ -> "Unsupported state"

This is like a switch statement, except for the keywords, punctuation marks… okay, almost everything. Let’s break it down:

  1. The value to apply the patterns to, the test-expression, appears between the match and with keywords in line 2.
  2. The lines following match…with each start with whitespace, because in F# whitespace is significant.
    • The pipe character ’|’ that precedes the first pattern must be under the word "match" (or farther to the right). After that, the pipe ’|’ for each subsequent pattern should line up under the first.
    • Each result-expression may contain multiple expressions, separated by semicolons or newlines. Subsequent lines in a result-expression must be indented exactly underneath the start of the first line in the result-expression.
  3. On line 3, the pipe character | indicates the beginning of a pattern.
  4. After the pattern, the arrow -> points to the result of a match.
  5. Everything after the arrow is the result-expression, and it fires when the pattern matches.
  6. The last pattern here (line 5) is a catchall; underscore is a wildcard character. I read it as “whatever-the-heck.”

Exactly one result-expression is evaluated per execution; the first pattern that matches is selected and all others are ignored. In this example, the identifier stateName in line 1 gets bound to “Missouri.”

Everything from “match” to the end (lines 2-5) constitutes the match expression. The expression as a whole returns a value, and the compiler must determine the type of that value. Therefore each result-expression must return the same type as the first result-expression or the compiler will complain: "FS0001: This expression was expected to have type…."

For the match-expression to return a value, some pattern must match every time. If a test-expression doesn’t match any pattern, what is returned? Nothing — MatchFailureException is thrown instead. The friendly F# compiler tries to head this off, warning if you have not covered every possible input: "FS0025: Incomplete pattern matches on this expression…." This warning is a hint; the compiler can’t know about every feasible test-expression. To preclude MatchFailureException and get rid of the warning, add a wildcard pattern to the match expression like the one in line 5. To continue the C# analogy, this acts as the default option in a switch statement.
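To make the failure mode concrete, here is a minimal sketch. The function names stateNameIncomplete, stateNameTotal, and throwsMatchFailure are made up for this illustration; only MatchFailureException and the FS0025 warning come from F# itself. The first function leaves out the wildcard on purpose, so expect the compiler warning when you paste it in.

```fsharp
// Hypothetical helper: deliberately incomplete, so the compiler warns with FS0025.
let stateNameIncomplete abbr =
    match abbr with
    | "MO" -> "Missouri"
    | "TN" -> "Tennessee"

// The same match with a catchall, so every input produces a value.
let stateNameTotal abbr =
    match abbr with
    | "MO" -> "Missouri"
    | "TN" -> "Tennessee"
    | _ -> "Unsupported state"

// Hypothetical helper: true if calling f on the input throws MatchFailureException.
let throwsMatchFailure f abbr =
    try
        f abbr |> ignore
        false
    with
    | :? MatchFailureException -> true

printfn "%b" (throwsMatchFailure stateNameIncomplete "KS")  // true
printfn "%b" (throwsMatchFailure stateNameTotal "KS")       // false
printfn "%s" (stateNameTotal "KS")                          // Unsupported state
```

The wildcard version trades an exception for a default value. Whether a default or a loud failure is the better behavior depends on the caller.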

Returning a value turns pattern matching into a data extraction and transformation tool. It is especially powerful when combined with F#’s flexibility to match against any type and to capture matched values.

Match Any Type

The switch statement in C# has serious type limitations on the test-expressions it accepts. F# is much more flexible; we can test against int or string or int-and-string or list of int-string-float-SomeUserDefinedType – any type and any number of values at the same time. This power enables more flexible decision-making.

Here we match against two strings:

let state = "MO"                  // sample inputs for this example
let county = "St. Louis County"

let taxRate = 
   match state, county with
   | "TN", _ -> 0.10
   | "MO", "St. Louis County" -> 0.14
   | "MO", _ -> 0.13
   | _ -> 0.0

This matches against a tuple of string and string. Every pattern also has this type. The match expression can check the value of one or more parts of the tuple; the wildcard character is used for each part it doesn’t care about. Because F# is strongly typed, the quantity and combination of types in the test-expression and each pattern must coincide. The exception is one wildcard character by itself – that will still match whatever-the-heck.
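Because the first matching pattern wins, the order of tuple patterns matters. The sketch below wraps the same match in a function so we can try several inputs; taxRateFor is a name invented for this illustration, and the rates are the sample values from the example above.

```fsharp
// Hypothetical wrapper around the tax-rate match, so it can be called with different inputs.
let taxRateFor state county =
    match state, county with
    | "TN", _ -> 0.10
    | "MO", "St. Louis County" -> 0.14   // the specific case comes before the general "MO" case
    | "MO", _ -> 0.13
    | _ -> 0.0                           // a lone wildcard matches the whole tuple

printfn "%.2f" (taxRateFor "MO" "St. Louis County")  // 0.14
printfn "%.2f" (taxRateFor "MO" "Boone County")      // 0.13
printfn "%.2f" (taxRateFor "TN" "Shelby County")     // 0.10
printfn "%.2f" (taxRateFor "KS" "Johnson County")    // 0.00
```

If the two "MO" patterns were swapped, the St. Louis County pattern could never be reached, and the compiler would warn that the rule will never be matched.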

F# can also match against user-defined types. This works particularly well with discriminated unions. Simple discriminated unions can serve the same purpose as C# enums:

type Return =
 | Individual
 | Joint
 | MarriedFilingSeparately

let returnType = Individual

let taxPreparationFee = 
   match returnType with
   | Individual | MarriedFilingSeparately -> 30.00
   | Joint -> 45.00

Notice the "OR" pattern with the single pipe ’|’, where two distinct cases share the same result-expression.

More complicated discriminated unions add more matching options. Each case in the discriminated union can have values associated with it, and the pattern for that case matches against those values. In the following example, the State case has one value associated with it, and so its pattern has one value to match; County has two values, and its pattern matches two values.

type TaxJurisdiction = 
   | Federal
   | State of string
   | County of string * string

let tj = County ("MO", "St. Louis City")

let additionalFee =
   match tj with
   | State "TN" -> 3.00
   | County ("MO", _) -> 1.50
   | _ -> 0.0

Another user-defined type well-suited to pattern matching is a record. The match can specify some or all of the fields in the record:

type CharitableContribution = {taxExemptCategory:string; amount:float; receipt:bool}

let donation = {amount = 50.00; taxExemptCategory="501c3"; receipt=true}

let isDeductible =
   match donation with
   | { taxExemptCategory = "509a4" } -> true
   | { taxExemptCategory = "501c3"; receipt = true } -> true
   | _ -> false

Techniques for matching any other sort of user-defined type include guard clauses (described later in this article) and active patterns (described in a later article). Pattern matching is flexible enough to operate on any type, built-in or user-defined.

We can match on any type of test-expression, so can we match the type of the test-expression? Yes, as long as it’s distinguishing among subclasses with a common superclass. Since Object is a common superclass to almost everything, this isn’t much of a restriction. However, the type-match operator :? won’t work on primitives — the compiler complains “error FS0016: The type ’int’ does not have any proper subtypes and cannot be used as the source of a type test.” The trick is to box up the primitives:

let x = 3

match box x with 
 | :? System.Int32 -> printfn "this is an integer"
 | :? System.Double -> printfn "this is a float"
 | _ -> printfn "some other type"

Prints:

this is an integer

Since the checked-for types and the test-expression all extend Object, type safety is satisfied. If x happens to be an Object already, the box operator will not affect it.

There are more special pattern syntaxes for matching lists and arrays. These become far more useful when the matched values can be named for later use.

Keep What You Find

Binding all or part of the matched input to a variable makes pattern matching the elegant solution for data extraction and decomposition. One extremely common idiom in F# uses a special list matching syntax. The "cons pattern" matches the first item before the :: operator and the rest of the list afterward. It is great for recursive functions that operate on a list, handling one item at a time.

let rec doSomethingWithAList = function
    | [] -> printfn "that's all"
    | head :: tail -> printf "%s and " head ; doSomethingWithAList tail

doSomethingWithAList [ "MO" ; "TN" ]

prints

MO and TN and that's all

In this example, the first pattern matches an empty list. The second pattern matches a list with at least one item in it, and binds the first item to "head" and the rest of the list to "tail". Notice the pattern-matching function syntax: it omits the parameter declaration and replaces the “match x with” clause with function. Bonus — even less code.
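Since a match is an expression, the same recursive shape can build a value instead of printing. Here is a small sketch showing the two equivalent spellings side by side; countStates, countStatesShort, and joinWithAnd are names invented for this illustration.

```fsharp
// Explicit parameter with match...with.
let rec countStates states =
    match states with
    | [] -> 0
    | _ :: tail -> 1 + countStates tail

// The same logic using the pattern-matching function syntax.
let rec countStatesShort = function
    | [] -> 0
    | _ :: tail -> 1 + countStatesShort tail

// Builds a string instead of printing as it goes.
let rec joinWithAnd = function
    | [] -> ""
    | [last] -> last
    | head :: tail -> head + " and " + joinWithAnd tail

printfn "%d" (countStates [ "MO"; "TN"; "KS" ])       // 3
printfn "%d" (countStatesShort [ "MO"; "TN"; "KS" ])  // 3
printfn "%s" (joinWithAnd [ "MO"; "TN" ])             // MO and TN
```

Neither function is tail-recursive, which is fine for short lists like these but worth revisiting for very long ones.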

Let’s look at a second example containing the other special list-matching syntax. It distinguishes 0, 1, 2, or 3-item lists:

let listOfStates = ["MO"; "TN"]   // sample input for this example

let taxPreparationCharge2 =
   match listOfStates with
   | [] -> failwith "empty list!"
   | [state] -> "10.00 for " + state
   | [first; second] -> "15.00 for " + first + " and " + second
   | [first; _; _] -> "18.00 for " + first + " and two other states"
   | _ -> "20.00 for more than three states"

As we’re breaking apart the list, we can capture each value. When the pattern matcher sees a token starting with a lowercase letter, it matches whatever-the-heck just like a wildcard, but binds the value to that identifier for use in the result expression. (It will also bind into variables beginning with an uppercase letter, but this is a bad idea. An uppercase-beginning name might refer to a constant or an active pattern of some type. Lowercase-beginning names are always identifiers for value capture.)

A very similar syntax works for arrays:

let taxpayers = [|"Jess";"Eric"|]

let emailSubject = 
   match taxpayers with
   | [|only|] -> "Individual return for " + only
   | [|primary;spouse|] -> "Joint return for " + primary + " and " + spouse
   | x -> failwith ("invalid number of taxpayers: " + string x.Length)

What if you want to capture the value but also want to match against it? The "AND" pattern lets you do both. The single ampersand ’&’ requires both patterns to match the test-expression. In the TaxJurisdiction discriminated union example from the previous section, we can capture values and add a warning message:

let additionalFee =
   match tj with
   | State "TN" & State s -> printfn "Warning: extra fee for %s" s; 3.00
   | County ("MO", _) & County (state, county) -> printfn "Warning: extra fee for %s, %s" county state; 1.50
   | _ -> 0.0
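The OR pattern can capture values too, provided both sides bind the same names with the same types. A minimal sketch, repeating the TaxJurisdiction type so the block stands alone; the stateOf function and the "US" result for Federal are invented for this illustration.

```fsharp
type TaxJurisdiction =
    | Federal
    | State of string
    | County of string * string

// Both alternatives bind `s`, so the result-expression can use it either way.
let stateOf jurisdiction =
    match jurisdiction with
    | State s | County (s, _) -> s
    | Federal -> "US"

printfn "%s" (stateOf (State "TN"))                      // TN
printfn "%s" (stateOf (County ("MO", "St. Louis City"))) // MO
printfn "%s" (stateOf Federal)                           // US
```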

Easy value capture makes pattern matching the standard F# way to extract data from tuples. Say a function returns a tuple of string and string and float, and all we care about is the third value:

let getLocalTaxDistrict(city:string) = ("MO", "St. Louis County", 0.14)

let taxRate = 
   match getLocalTaxDistrict "Rock Hill" with
   | _,_,t -> t

The pattern matches any three-part tuple, discarding the first two parts and binding the last into a variable t. We can make this even nicer and shorter by putting the pattern directly in a let binding:

let _,_,taxRate = getLocalTaxDistrict "Rock Hill"

This is pattern matching without the "match" keyword. That’s some very succinct data extraction.
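Patterns show up in a few other places without the match keyword. The sketch below repeats getLocalTaxDistrict and the CharitableContribution record so it stands alone; rateOf, the second city name, and the other bound names are invented for this illustration.

```fsharp
type CharitableContribution = { taxExemptCategory: string; amount: float; receipt: bool }

let getLocalTaxDistrict (city: string) = ("MO", "St. Louis County", 0.14)

// Tuple pattern in a let binding: names all three parts at once.
let state, county, rate = getLocalTaxDistrict "Rock Hill"

// Tuple pattern in a function parameter.
let rateOf (_, _, r) = r

// Tuple pattern in a lambda ("Clayton" is just a second sample input).
let rates =
    [ "Rock Hill"; "Clayton" ]
    |> List.map getLocalTaxDistrict
    |> List.map (fun (_, _, r) -> r)

// Record pattern in a let binding: pulls out one field under a new name.
let { amount = donated } = { taxExemptCategory = "501c3"; amount = 50.00; receipt = true }

printfn "%s / %s / %.2f" county state rate                 // St. Louis County / MO / 0.14
printfn "%.2f" (rateOf (getLocalTaxDistrict "Rock Hill"))  // 0.14
printfn "%d rates, donated %.2f" (List.length rates) donated  // 2 rates, donated 50.00
```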

There is one more way to bind the matched value into a name: the as keyword. This method always binds the entire matched value. It comes in handy for returning parts of a record other than the ones matched against. It is also useful with the type test pattern, where it binds the value at the tested type:

let whatKindIsIt (o : obj) =
    match o with
        | :? System.Int32 as i -> printfn "Integer value: %i" i
        | :? System.String as s -> printfn "String value: %s" s
        | _ as x -> printfn "some other type: %A" x

In this example, the compiler knows the type of the match-scope variable more precisely than the method’s argument, so the printfn expression can be specific to the matched type. Value capture is essential when working with a type more complex than an array, list, or tuple.
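The record case mentioned above might look like the following sketch. It repeats the CharitableContribution type so it stands alone; deductibleAmount is a name invented for this illustration. The pattern tests two fields, and "as" hands back the whole record so a third field can be returned.

```fsharp
type CharitableContribution = { taxExemptCategory: string; amount: float; receipt: bool }

// Matches on category and receipt, then returns a field the pattern never mentioned.
let deductibleAmount donation =
    match donation with
    | { taxExemptCategory = "501c3"; receipt = true } as d -> d.amount
    | _ -> 0.0

let withReceipt = { taxExemptCategory = "501c3"; amount = 50.00; receipt = true }
let withoutReceipt = { withReceipt with receipt = false }

printfn "%.2f" (deductibleAmount withReceipt)     // 50.00
printfn "%.2f" (deductibleAmount withoutReceipt)  // 0.00
```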

Weed Out the Riff-raff

Let’s match against a class. Unlike discriminated unions and records, an ordinary class can’t be taken apart by a built-in pattern or compared against a constant, so F# provides something called a guard clause. A guard clause is a boolean expression that must evaluate to true for the match to be accepted. Attach a guard clause to a pattern with the when keyword, and use it to narrow the match.

The following example defines a custom type and then a matching function that uses guard clauses to select an opinion.

type StateTax(a:string,i:float,p:float) =
   member info.Abbr = a
   member info.IncomeTaxRate = i
   member info.PropertyTaxRate = p

let highIncomeTax = 0.10
let taxDescription (s : StateTax) =
   match s with
   | sti when sti.IncomeTaxRate >= highIncomeTax -> "Income tax is pretty high"
   | sti when sti.PropertyTaxRate > 0.08 -> "Income tax okay, property tax high"
   | sti -> "Taxes are low"

printfn "Missouri - %s" (taxDescription (StateTax ("MO", 0.08, 0.1)))

The last line outputs:

Missouri - Income tax okay, property tax high

Each pattern here binds the test-expression into “sti” and then performs some other checks on the value in a guard clause. The guard stops the match from proceeding if its condition is not met.

Notice that variables defined outside of the pattern match can be referenced in the guard expression. This doesn’t work within a pattern. Within a pattern, only literals and constants can be directly compared against the test-expression and its parts. To declare an identifier for use in a pattern expression, you must mark it as [<Literal>] and name it beginning with an uppercase letter.
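Here is a sketch of the difference. All of the names (HomeState, homeState, and the three functions) are invented for this illustration. The first function compares against a literal inside the pattern, the second uses a guard to compare against an ordinary value, and the third shows the trap: a lowercase name in a pattern is a fresh binding, not a comparison.

```fsharp
[<Literal>]
let HomeState = "MO"    // a literal: usable directly inside a pattern

let homeState = "MO"    // an ordinary value: usable only in a guard

let isHomeByLiteral state =
    match state with
    | HomeState -> true
    | _ -> false

let isHomeByGuard state =
    match state with
    | s when s = homeState -> true
    | _ -> false

// Pitfall: this `homeState` is a new identifier that shadows the outer one,
// so the pattern matches every input.
let isHomeShadowed state =
    match state with
    | homeState -> true

printfn "%b %b" (isHomeByLiteral "MO") (isHomeByLiteral "TN")  // true false
printfn "%b %b" (isHomeByGuard "MO") (isHomeByGuard "TN")      // true false
printfn "%b %b" (isHomeShadowed "MO") (isHomeShadowed "TN")    // true true
```

Adding a wildcard case after the shadowing pattern would draw a compiler warning that the rule will never be matched, which is a useful clue that a pattern is capturing rather than comparing.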

There is another method to match even more flexibly: we can create our own functions to do it. This method is worthy of its own article, and so it shall have one. Watch for Part 2 in this two-part series and learn how to create active patterns.

Conclusion

We’ve seen match expressions that compare, extract, and decompose. We have used wildcard characters, OR and AND patterns, and guard clauses. The syntax is concise but not cryptic. The types you can match are unlimited. Active patterns can do even more, and Part 2 in this series will fully empower your match-expressions.
