OCaml - Lesson 2

▸Goals

Distinguish curried and uncurried functions.
Discover parametric polymorphism.
Play carefully with recursive functions and higher-order functions..

Requirements

You have completed OCaml - Lesson 1.
You are familiar with recursive functions.
You know how lists are implemented in C (or other low-level languages).

▸Mission 2

Functions

▸OCaml-Top

To begin with, launch ocaml-top: $ ocaml-top(If there is a warning about pp_get_formatter_out_functions, get relaxed, that's ok).
You have 45 seconds to understand the meaning of all buttons.
To check that everything works, define a variable l1 of type (bool * int) list using a top-level let.

▸Uncurried functions

Write your first OCaml function:
(* div_pair (x,y) divides x by y or returns 0 if y is 0 *) let div_pair ( (x:int), (y:int) ) = if y = 0 then 0 else x / y ;;
Be careful, all the parentheses are necessary, for the moment.
Take some time to check all of the following:
- You have recognised a top-level let.
- It defines a value (a function), named div_pair.
- You must be able to understand the type of this function.
  We consider that this function expects only one argument which is a pair (a tuple). This function is uncurried, that is, its argument is a tuple.
Try these versions, which are equivalent to the first version above:
let div_pair ( (x,y) : int*int ) = if y = 0 then 0 else x / ylet div_pair = function ((x,y) : int*int) -> if y = 0 then 0 else x / y
Remove type information:let div_pair (x,y) = if y = 0 then 0 else x / ylet div_pair = function (x,y) -> if y = 0 then 0 else x / y
Ocaml benefits from type inference: it can guess the type of any typeable ML function.
For the curious:
Theorem: ML type inference is sound and complete.
- Robin Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences, 17(3):348–375, December 1978.
- A modern eye on ML type inference, F. Pottier, INRIA 2005.
Exercise : Uncurried functions
- Remember the type of (+). It expects two arguments of type int.
- Define a function addp of type (int*int) → int this is the uncurried version of + ; you have to use + to define it..
- Define an uncurried function choose of type (int*int*int) → bool which returns true iff the argument (a,b,c) is such that a < b + c.
  Test it on several examples.
A first glance at pattern-matching. Consider this function:
let check = function | (0,0) -> true | (x,y) -> x < y
- Guess its type.
- It is again an uncurried function - it takes two arguments given in one pair.
- Apply it to several examples to understand how it works.
  (We will explain pattern-matching in detail later.)
The following function confused1 does not have type bool → int. Why ?
let confused1 x = function | true -> 1 | false -> 2

▸Curried functions and closures

Write your first curried fonction:
let divp (x:int) (y:int) = if y = 0 then 0 else x / y
Checklist:
- Understand the difference between the type of divp (curried) and the type of div_pair (uncurried).
- Since ML type inference is complete (theorem), you have already tried to remove the type information: let divp x y = ... and of course it works.
- A curried function takes a first argument and returns another function.
  What is the type of let f = divp 10?
  Apply f to different arguments. Remember that f is called a closure, that is a function where some arguments already have a value.
Note that the OCaml compiler is optimized for curried functions. This is why the builtin functions are all curried, take for example the bitwise operations.
▸FUN vs FUNCTION.
- fun and function are used to define functions:
  let incr = fun x -> x + 1
  let incr = function x -> x + 1
- With fun one may define a (curried) function expecting several arguments:
  let add = fun x y -> x + y
  let padd = fun (a,b) (c,d) -> (a+c,b+d)
  Note that both functions expect two arguments: padd expects two pairs.
- With function one may define a function expecting only one argument, but it can be matched against several patterns:
  let snd = function | [] -> 0 | [a] -> 0 | a :: b :: _ -> b
  let tadd = function | (0,_,_) -> 0 | (a,b,c) -> a - (b + c)
  snd expects one argument (a list). tadd expects one argument (a triple of ints).
The example above can also be written in these equivalent forms:
- let divp = function x -> function y -> if y = 0 then 0 else x / y
- let divp = fun x -> fun y -> if y = 0 then 0 else x / y
- let divp = fun x y -> if y = 0 then 0 else x / y
Read the comparison between fun and function next.
A function can return several values at once, using a tuple:let g x y = (x > 0, 2 * y) the function g returns a pair of values.
Read the inferred type and test g with several arguments.

Exercise : Curried functions

Write the curried versions of choose (to be found in the previous exercise about uncurried functions)
You can try to guess the type of both functions before writing them.
Write the curried and uncurried versions of pmul which takes two arguments and returns a pair (a tuple) containing the sum and the product: pmul applied to 2 and 3 returns the pair (5, 6).
Invoke pmul with some arguments, and put the results in two variables u and v. Remember that a let can bind several variables at once.

▸Lambdas (anonymous functions)

Define two functions mul2 and mul3 which take one argument and multiply it by 2 (or 3).
You have used a top-level let, haven't you?
Put both functions in a list mul_list. You should obtain exactly:
val mul_list : (int -> int) list = [<fun>; <fun>] (We are not interested in applying these functions now).
Notice (again) that functions are first-class values: we can put functions in a list.
Anonymous functions, also called lambdas in reference to the λ calculus, can be used to define functions directly, without a name :
let lambda_list = [ (fun x -> x * 2) ; (fun x -> x * 3) ]
It also works with several arguments:
fun x y -> x * x + y
Anonymous functions are introduced with the keyword fun or function (the difference between fun and function is explained in the previous section).
OCaml is a functional language, so the construction (fun args -> body) is a standard value, like an int. It can be put in a list, a tuple, ... or bound to a variable with a top-level let:
let mul2 = (fun x -> x * 2)
A definition as let f x y z = x - y * z
is in fact automatically replaced by: let f = (fun x y z -> x - y * z)
This is called syntactic sugar (when the compiler automatically replaces a given expression by another expression).
Try it. Then, try to define f similarly (with fun), but in uncurried form.
Try to define mul2 and mul3 with a single let (and not using the and keyword).

▸Polymorphism

The ML type system relies heavily on parametric polymorphism (not to be confused with Java's ad-hoc polymorphism).

Write the following and understand its type:
let mk_pair (a:int) (b:int) = (a,b)
(yes, with the type annotations)
▸Polymorphic types
- Some types may mention letters such as 'a or 'b:
  Example (Ocaml)
  let f x y = (x,y) ;;
  val f : 'a -> 'b -> 'a * 'b = <fun>
- 'a should be read α. It is a type variable.
- Similarly, 'b, 'c, ... correspond to β, γ, ...
- When a type variable appears in a type, it is implicitly universally quantified (∀).
  Thus, the above type is actually ∀αβ. α → β → α×β. It is a polymorphic type.
- A polymorphic type can be instantiated by replacing type variables by other types.
  Thus, the type int → bool → int×bool is an instance of our polymorphic type (we have replaced α and β by int and bool).
- This implies that the function f admits an infinity of types, in particular int → bool → int×bool
  Do not get confused by type variables such as '_a or '_weak which are not quantified (these are not polymorphic variables).
  These variables '_a or '_weak happen with the value restriction, to be seen later.
Let OCaml type inference work. Remove the type annotations
let mk_pair a b = (a,b)
To understand the inferred type, read about polymorphic types.
Apply mk_pair to arguments of different types (float, bool, list, tuples...) and check that you understand what happens.
Study another function: let mk_list a b = [ a ; b ]
Explain why the inferred type is less polymorphic than mk_pair (there is only one universal variable: α).
Exercise : Polymorphic functions
- Guess the types of:
  let a x = mk_pair true x let b x = mk_list true x let c x = mk_pair (true, 4) x let d x = mk_list (true, 4) x let e x = mk_pair [ 5 ] x let f x = mk_list [ 5 ] x let g x = mk_list (+) x
- Explain the type of the (<) operator
- Try to guess what the function fst does, by just looking at its type.
- Guess the polymorphic type of let fun_if a b c = if a then b else c
What is the type of the empty list []? (it is polymorphic).
A new type: the type 'a option is used to represent a value that can be present (Some x) or absent (None). The type X option basically represents the set of values X ∪ {None}.
Look at the types of these values: let a = None let b = Some 10 let c = Some true let d = Some "ok"
Such a type is useful, for instance, when looking for an element in a table. You may have a function with type :
find_element: table → int → element option (where 'table' and 'element' are replaced by relevant types)
Here is a function which returns an option. It filters numbers by keeping only positive numbers and by replacing negative numbers by None :
let f x = if x < 0 then None else Some x
What is the type of f, of f (-5), and of f 10 ?
A few polymorphic functions:
- Find the type of functions (=)(==)(<>)(!=) (remember what an infix function is?)
- (=) is the structural equality. It recursively compares the structure of values. It cannot compare functions.
- (==) is the physical equality. It just compares pointers. Thus, it can compare function pointers. (but may return false when comparing equal infix functions).
- (<>) is the negation of (=)
- (!=) is the negation of (==)
- Try to compare two equal lists using both operators.
- Try to compare two functions.

▸Recursive functions

What is in the `list` type?

The list type is a recursive type. Indeed, a value of type list is either :

the empty list[], also used to mark the end of all lists (in C, we use the null pointer).
or, a cell containing an element (here 42) and the rest of the list, called the tail: 42 :: rest, where the rest is also of type list.
Thus, each cell in a list contains another list (the tail). The list type structure is recursive.

Vocabulary: the tail of a list [ 1 ; 2 ; 3 ; 4 ] is the list [ 2 ; 3 ; 4 ]. Its head is 1.

Recursive types require recursive functions

Most fonctions operating on lists need to iterate over all the cells in the list. You are used to do the job with a while loop (which exists in OCaml too), but functional programmers prefer a pure approach using a recursive function.

Note: a while loop can always be transformed into a recursive function, and conversely.

Here are two recursive functions: length (which computes the length of a list) and incr, which adds 1 to every element of a list:

let rec length = function
   | [] -> 0
   | x :: rest -> 1 + length rest

let rec incr = function
   | [] -> []
   | x :: rest -> x + 1 :: incr rest

Note the rec keyword, which is mandatory for recursive functions.
Why is the inferred type of length polymorphic?
The function keyword introduces a pattern matching. Here two cases are considered: the empty list [] and a cell x :: rest consisting of a value (x) and a tail (rest).
Try to apply both functions to a few lists of yours.
Apply it to [ 1, 2, 3 ], what should be the result? Check it, you may be surprised.

Check it

At this point, you should make the difference between the patterns | x :: rest -> ... and | [ x ; rest ] -> ....
One matches lists with at least one element, the other matches lists with exactly two elements (besides, the type of 'rest' is different).

Here is a variant, which uses an accumulator:

let rec length_bis acu = function
   | [] -> acu
   | x :: rest -> length_bis (acu + 1) rest

let rec incr_bis acu = function
   | [] -> acu
   | x :: rest -> incr_bis (x + 1 :: acu) rest

Both functions expect two arguments: an accumulator (int) and a list.
To apply length_bis, start with an accumulator equal to 0: let n = length_bis 0 [ 1 ; 2 ; 3 ; 4 ; 5 ]
To apply incr_bis, the starting accumulator is []. Note that the result is reversed.

Exercise : Recursive functions

If you struggle to write the next recursive functions, in some cases you may need to reset the interpreter, in order to erase old definitions that have been memorized. To do so, just use the left button in the toolbar.

Write a function count_ones which counts the number of 1 inside a list.
Write two versions: with/without an accumulator.
Here is what you should get for the accumulator version:

`count_ones 0 []`	0
`count_ones 0 [ 1 ]`	1
`count_ones 0 [ 8 ]`	0
`count_ones 0 [ 8 ; 9 ]`	0
`count_ones 0 [ 8 ; 1 ; 9 ; 1 ; 9 ]`	2
`count_ones 0 [ 1 ; 1 ; 1 ; 9 ; 1 ]`	4
`count_ones 10 [ 1 ; 1 ; 1 ; 9 ; 1 ]`	14	When starting with an accumulator x, the (sum) result is simply added to x

▸Termination of recursive functions

When writing a recursive function, you must make sure that the recursive calls will eventually terminate.

Check that you have considered the final case (here, the empty list).
At every recursive call, check that at least one of the arguments decreases in some way.
Here, each recursive call is made on a smaller list, with exactly one element removed.
Formally, there should exist a measure (here, the list length) with a well-founded order (meaning that we cannot decrease forever ; here the list length is necessarily ≥ 0) such that each recursive call decreases the measure.
When the recursion is guaranteed to terminate, we say that the recursion is well-founded.
Finally, in the very common case where the recursion only visits a finite, existing data structure with no cycle (such as a list), it is a structural recursion, which is necessarily well-founded.

Write a function sum which adds all the elements of a list (write two versions, with/without an accumulator).
sum 0 [] 0
sum 0 [ 1 ] 1
sum 0 [ 8 ] 8
sum 0 [ 8 ; 9 ] 17
sum 0 [ 8 ; 1 ; 9 ] 18
sum 0 [ 8 ; 1 ; 9 ; 1 ; 9 ] 28
sum 0 [ 1 ; 1 ; 1 ; 9 ; 1 ] 13
sum 10 [ 1 ; 1 ; 1 ; 9 ; 1 ] 23
Write a function perms which permutes all pairs of a list. Its type is (α × β) list → (β × α) list. The accumulator version also reverses the list.
perms [] [] (this is the version without accumulator)
perms [ (1, true) ] [ (true, 1) ]
perms [ ('a', 8) ; ('b', 7) ] [ (8, 'a') ; (7, 'b') ]

Exercise : Test tail-recursion (that is, recursion with an accumulator)

Using an accumulator in a recursive function has a benefit: it usually makes your function tail-recursive (to be explained precisely during a lecture). As a consequence functions with accumulators are expected to work with very long lists, whereas the equivalent function without an accumulator will fail with lists longer than the call stack.
Let's see.

Write a function mk_list such that mk_list 6 returns[6;5;4;3;2;1] and mk_list 0 returns the empty list (and so on, you get the idea).
Write the accumulator version mk_aculist. Notice that the accumulator is a list here. Indeed, the accumulator contains the current result, which is returned when the recursion ends. Notice also that the numbers are increasing.
mk_aculist [] 0 []
mk_aculist [] 5 [1;2;3;4;5]
mk_aculist [99] 5 [1;2;3;4;5;99]
Find the limit of mk_list: let l1 = mk_list 100000 will probably work, whereas let l1 = mk_list 1000000 will probably fail.
Try the same values with mk_aculist. It should work for longer lists (the limit being now the available memory and the time you have to wait).
let l2 = mk_aculist [] 1000000
Try your previous functions sum and count_ones with l2. Non-accumulator functions should fail, whereas accumulator versions should succeed.
Update : apparently recent versions of the ocaml compiler are now able to optimize the non-accumulator version of count_ones: it surprisingly operates with a constant stack size.
The current stack size can be printed with :
let st = Gc.quick_stat () in Printf.printf "Stack size = %d %!" Gc.(st.stack_size) ;(to be used at the deepest level of recursion of course, not once the recursion is finished.)
At home, you may watch Tail-call recursion, the musical.

▸Higher-order functions

In OCaml, functions are first-class citizens, meaning that functions can be used as any other value.
Try to guess the types of:let fun_list = [ count_ones ; sum ] let fun_pair = ( count_ones, sum )
This also implies that a function may expect another function as an argument:
Example
let fsum f = f 0 + f 1 + f 2 + f 3
val fsum : (int -> int) -> int = <fun>
- fsum expects one argument f and computes the sum f(0) + f(1) + ...
- It works with any function f on integers: fsum absfsum (fun x -> 2 * x)fsum (fun y -> if y > 2 then 1 else -1)
- Look carefully at the type. The argument, f, is of type (int → int)
  We say that fsum is a higher-order function since it expects an argument which is itself a function.
Try to guess the (polymorphic) type of:
let flist f = [ f 0 ; f 1 ; f 2 ; f 3 ]
Test it with functions of various types:
flist string_of_int
flist (λx.x+1) (translate this into OCaml)
flist (λx.x)
Exercise : Higher-order functions
- Write a function fsumlist such that fsumlist f [ a ; b ; c ; ... ] computes the sum f a + f b + f c + ...
  Of course, fsumlist f [] returns 0.
- You should write two versions of fsumlist: with/without an accumulator. Each version is only three lines of code.
- Have you noticed that fsumlist is polymorphic?
  Try to understand why and test it with values of different types. You may need these functions: int_of_float, int_of_string.
- fsumlist is a particular case of function fold_left
  - Check the type of fsumlist (the accumulator version).
  - In utop, write List.fold_left ;; so that you get its type.
  - Compare both types. (fold_left will be seen later, but you may google it)
Exercise : Classical higher-order, polymorphic functions
Brace yourself to touch now the very essence of functional programming: higher-order and polymorphic functions.
- Write a function map such that map f [ a ; b ; c ; ... ] returns a new list [ f a ; f b ; f c ; ... ]
  It is a recursive function, without accumulator. Again, it takes three lines.
  map (fun x -> x + 1) [] []
  map (fun x -> x + 1) [ 5 ] [ 6 ]
  map (fun x -> x + 1) [ 5 ; 10 ; 15 ] [ 6 ; 11 ; 16 ]
  map string_of_int [ 5 ; 10 ; 15 ] [ "5" ; "10" ; "15" ]
  Guess the type of map (fun x -> (x, string_of_int x)) [ 5 ; 10 ; 15 ]
- Look at the type of map, which is polymorphic in two variables α and β.
  Congratulations, you have written a polymorphic function without even noticing.
  Theorem: ML type inference finds the most polymorphic type of any expression.
  More precisely:
  ML has principal types which means that any typeable expression admits a type which is more polymorphic than all the admissible types for this expression.
  The ML type inference algorithm is able to find the principal type of any expression.
  Corollary: The ML type system works for you.
- Write a function find such that find f [ a ; b ; c ; ... ] returns the first element of the list, x, such that f x is true.
  Obviously, f is a predicate (a function returning a boolean).
  If no element in the list satisfies f, you can raise an exception like this: raise Not_found (the exception Not_found is already defined in OCaml).
  find (λx . true) [] Not_found
  find (λx . x > 10) [ 5 ; 12 ; 7 ; 8 ] 12
  find (λx . x > 10) [ 5 ; 6 ; 7 ; 8 ] Not_found
  find (λx . true) [ 5 ; 10 ; 15 ] 5
- You were expecting this: the type of find is polymorphic.
- Both functions map and find already exist in the OCaml standard library. Find them (in the List module).
  Check that you have the same types.
In the future, if you wish to use existing functions of the standard library (e.g. map or find), you have to write List.map or List.find.
Exercise : Map on options
Write a function omap of type (α → β) → α option → β option
- This is not a recursive function.
- You should write your pattern-matching like this: | None -> ... | Some x -> ...
omap (λx . true) None None
omap (λx . true) (Some 10) Some true
omap (λx . not x) (Some true) Some false
omap (λx . not x) None None
omap (λx . x + 1) None None
omap (λx . x + 1) (Some 100) Some 101

▸Outcomes

You now know two interpreter environments: utop and ocaml-top
At first glance, you identify if a function is curried or uncurried.
When you see'a in a type, you say α and a ∀ flashes in your mind.
You always try to prove that your recursive function is well-founded.
You know two classical higher-order and polymorphic functions: find and map, and you are eager to discover fold.

`sum 0 []`	0
`sum 0 [ 1 ]`	1
`sum 0 [ 8 ]`	8
`sum 0 [ 8 ; 9 ]`	17
`sum 0 [ 8 ; 1 ; 9 ]`	18
`sum 0 [ 8 ; 1 ; 9 ; 1 ; 9 ]`	28
`sum 0 [ 1 ; 1 ; 1 ; 9 ; 1 ]`	13
`sum 10 [ 1 ; 1 ; 1 ; 9 ; 1 ]`	23

`perms []`	[]	(this is the version without accumulator)
`perms [ (1, true) ]`	[ (true, 1) ]
`perms [ ('a', 8) ; ('b', 7) ]`	[ (8, 'a') ; (7, 'b') ]

`mk_aculist [] 0`	[]
`mk_aculist [] 5`	[1;2;3;4;5]
`mk_aculist [99] 5`	[1;2;3;4;5;99]

`map (fun x -> x + 1) []`	[]
`map (fun x -> x + 1) [ 5 ]`	[ 6 ]
`map (fun x -> x + 1) [ 5 ; 10 ; 15 ]`	[ 6 ; 11 ; 16 ]
`map string_of_int [ 5 ; 10 ; 15 ]`	[ "5" ; "10" ; "15" ]

`find (λx . true) []`	`Not_found`
`find (λx . x > 10) [ 5 ; 12 ; 7 ; 8 ]`	12
`find (λx . x > 10) [ 5 ; 6 ; 7 ; 8 ]`	`Not_found`
`find (λx . true) [ 5 ; 10 ; 15 ]`	5

`omap (λx . true) None`	None
`omap (λx . true) (Some 10)`	Some true
`omap (λx . not x) (Some true)`	Some false
`omap (λx . not x) None`	None
`omap (λx . x + 1) None`	None
`omap (λx . x + 1) (Some 100)`	Some 101