You know how lists are implemented in C (or other low-level languages).
▸Mission 2
Functions
▸OCaml-Top
To begin with, launch ocaml-top: $ ocaml-top(If there is a warning about pp_get_formatter_out_functions, get relaxed, that's ok).
You have 45 seconds to understand the meaning of all buttons.
To check that everything works, define a variable l1 of type (bool * int) list using a top-level let.
▸Uncurried functions
Write your first OCaml function: (* div_pair (x,y) divides x by y or returns 0 if y is 0 *)letdiv_pair ( (x:int), (y:int) ) =
if y = 0then0else x / y
;; Be careful, all the parentheses are necessary, for the moment. Take some time to check all of the following:
You have recognised a top-level let.
It defines a value (a function), named div_pair.
You must be able to understand the type of this function. We consider that this function expects only one argument which is a pair (a tuple). This function is uncurried, that is, its argument is a tuple.
Try these versions, which are equivalent to the first version above: letdiv_pair ( (x,y) : int*int ) =
if y = 0then0else x / y
letdiv_pair = function ((x,y) : int*int) ->
if y = 0then0else x / y
Remove type information:letdiv_pair (x,y) =
if y = 0then0else x / y
letdiv_pair = function (x,y) ->
if y = 0then0else x / y
Ocaml benefits from type inference: it can guess the type of any typeable ML function.
For the curious: Theorem: ML type inference is sound and complete.
Robin Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences, 17(3):348–375, December 1978.
Remember the type of (+). It expects two arguments of type int.
Define a function addp of type (int*int) → int this is the uncurried version of + ; you have to use + to define it..
Define an uncurried function choose of type (int*int*int) → bool which returns true iff the argument (a,b,c) is such that a < b + c. Test it on several examples.
A first glance at pattern-matching. Consider this function: letcheck = function
| (0,0) -> true
| (x,y) -> x < y
Guess its type.
It is again an uncurried function - it takes two arguments given in one pair.
Apply it to several examples to understand how it works. (We will explain pattern-matching in detail later.)
The following function confused1 does not have type bool → int. Why ? letconfused1 x = function
| true -> 1
| false -> 2
▸Curried functions and closures
Write your first curried fonction: letdivp (x:int) (y:int) =
if y = 0then0else x / y
Checklist:
Understand the difference between the type of divp (curried) and the type of div_pair (uncurried).
Since ML type inference is complete (theorem), you have already tried to remove the type information: letdivp x y = ...
and of course it works.
A curried function takes a first argument and returns another function. What is the type of letf = divp 10? Apply f to different arguments. Remember that f is called a closure, that is a function where some arguments already have a value.
Note that the OCaml compiler is optimized for curried functions. This is why the builtin functions are all curried, take for example the bitwise operations.
The example above can also be written in these equivalent forms:
letdivp = function x -> function y ->
if y = 0then0else x / y
letdivp = fun x -> fun y ->
if y = 0then0else x / y
letdivp = fun x y ->
if y = 0then0else x / y
Read the comparison between fun and function next.
A function can return several values at once, using a tuple:letg x y = (x > 0, 2 * y)
the function g returns a pair of values. Read the inferred type and test g with several arguments.
Exercise : Curried functions
Write the curried versions of choose (to be found in the previous exercise about uncurried functions) You can try to guess the type of both functions before writing them.
Write the curried and uncurried versions of pmul which takes two arguments and returns a pair (a tuple) containing the sum and the product: pmul applied to 2 and 3 returns the pair (5, 6).
Invoke pmul with some arguments, and put the results in two variables u and v. Remember that a let can bind several variables at once.
▸Lambdas (anonymous functions)
Define two functions mul2 and mul3 which take one argument and multiply it by 2 (or 3). You have used a top-level let, haven't you?
Put both functions in a list mul_list. You should obtain exactly: val mul_list : (int -> int) list = [<fun>; <fun>] (We are not interested in applying these functions now). Notice (again) that functions are first-class values: we can put functions in a list.
Anonymous functions, also called lambdas in reference to the λ calculus, can be used to define functions directly, without a name : letlambda_list = [ (fun x -> x * 2) ; (fun x -> x * 3) ]
It also works with several arguments: fun x y -> x * x + y
Anonymous functions are introduced with the keyword fun or function (the difference between fun and function is explained in the previous section).
OCaml is a functional language, so the construction (fun args -> body)
is a standard value, like an int. It can be put in a list, a tuple, ... or bound to a variable with a top-level let: letmul2 = (fun x -> x * 2)
A definition as letf x y z = x - y * z
is in fact automatically replaced by: letf = (fun x y z -> x - y * z)
This is called syntactic sugar (when the compiler automatically replaces a given expression by another expression).
Try it. Then, try to define f similarly (with fun), but in uncurried form.
Try to define mul2 and mul3 with a single let (and not using the and keyword).
▸Polymorphism
The ML type system relies heavily on parametric polymorphism (not to be confused with Java's ad-hoc polymorphism).
Write the following and understand its type: letmk_pair (a:int) (b:int) = (a,b)
(yes, with the type annotations)
Let OCaml type inference work. Remove the type annotations letmk_pair a b = (a,b)
To understand the inferred type, read about polymorphic types.
Apply mk_pair to arguments of different types (float, bool, list, tuples...) and check that you understand what happens.
Study another function: letmk_list a b = [ a ; b ]
Explain why the inferred type is less polymorphic than mk_pair (there is only one universal variable: α).
Exercise : Polymorphic functions
Guess the types of: leta x = mk_pair true x
letb x = mk_list true x
letc x = mk_pair (true, 4) x
letd x = mk_list (true, 4) x
lete x = mk_pair [ 5 ] x
letf x = mk_list [ 5 ] x
letg x = mk_list (+) x
Explain the type of the (<) operator
Try to guess what the function fst does, by just looking at its type.
Guess the polymorphic type of letfun_if a b c = if a then b else c
What is the type of the empty list []
? (it is polymorphic).
A new type: the type 'a option is used to represent a value that can be present (Some x) or absent (None). The type X option basically represents the set of values X ∪ {None}. Look at the types of these values: leta = Noneletb = Some10letc = Sometrueletd = Some"ok" Such a type is useful, for instance, when looking for an element in a table. You may have a function with type : find_element: table → int → element option (where 'table' and 'element' are replaced by relevant types) Here is a function which returns an option. It filters numbers by keeping only positive numbers and by replacing negative numbers by None : letf x = if x < 0thenNoneelseSome x
What is the type of f, of f (-5), and of f 10 ?
A few polymorphic functions:
Find the type of functions (=)
(==)
(<>)
(!=)
(remember what an infix function is?)
(=)
is the structural equality. It recursively compares the structure of values. It cannot compare functions.
(==)
is the physical equality. It just compares pointers. Thus, it can compare function pointers. (but may return false when comparing equal infix functions).
(<>)
is the negation of (=)
(!=)
is the negation of (==)
Try to compare two equal lists using both operators.
Try to compare two functions.
▸Recursive functions
What is in the list type?
The list type is a recursive type. Indeed, a value of type list is either :
the empty list[]
, also used to mark the end of all lists (in C, we use the null pointer).
or, a cell containing an element (here 42) and the rest of the list, called the tail: 42:: rest
, where the rest is also of type list. Thus, each cell in a list contains another list (the tail). The list type structure is recursive.
Vocabulary: the tail of a list [ 1 ; 2 ; 3 ; 4 ] is the list [ 2 ; 3 ; 4 ]. Its head is 1.
Recursive types require recursive functions
Most fonctions operating on lists need to iterate over all the cells in the list. You are used to do the job with a while loop (which exists in OCaml too), but functional programmers prefer a pure approach using a recursive function.
Note: a while loop can always be transformed into a recursive function, and conversely.
Here are two recursive functions: length (which computes the length of a list) and incr, which adds 1 to every element of a list:
letrec length = function
| [] -> 0
| x :: rest -> 1 + length rest
letrec incr = function
| [] -> []
| x :: rest -> x + 1:: incr rest
Note the rec keyword, which is mandatory for recursive functions.
Why is the inferred type of length polymorphic?
The function keyword introduces a pattern matching. Here two cases are considered: the empty list [] and a cell x :: rest consisting of a value (x) and a tail (rest).
Try to apply both functions to a few lists of yours.
Apply it to [ 1, 2, 3 ], what should be the result? Check it, you may be surprised.
Check it
At this point, you should make the difference between the patterns | x :: rest -> ... and | [ x ; rest ] -> .... One matches lists with at least one element, the other matches lists with exactly two elements (besides, the type of 'rest' is different).
Here is a variant, which uses an accumulator:
letrec length_bis acu = function
| [] -> acu
| x :: rest -> length_bis (acu + 1) rest
letrec incr_bis acu = function
| [] -> acu
| x :: rest -> incr_bis (x + 1:: acu) rest
Both functions expect two arguments: an accumulator (int) and a list.
To apply length_bis, start with an accumulator equal to 0: letn = length_bis 0 [ 1 ; 2 ; 3 ; 4 ; 5 ]
To apply incr_bis, the starting accumulator is []. Note that the result is reversed.
Exercise : Recursive functions
If you struggle to write the next recursive functions, in some cases you may need to reset the interpreter, in order to erase old definitions that have been memorized. To do so, just use the left button in the toolbar.
Write a function count_ones which counts the number of 1 inside a list. Write two versions: with/without an accumulator. Here is what you should get for the accumulator version:
count_ones 0 []
0
count_ones 0 [ 1 ]
1
count_ones 0 [ 8 ]
0
count_ones 0 [ 8 ; 9 ]
0
count_ones 0 [ 8 ; 1 ; 9 ; 1 ; 9 ]
2
count_ones 0 [ 1 ; 1 ; 1 ; 9 ; 1 ]
4
count_ones 10 [ 1 ; 1 ; 1 ; 9 ; 1 ]
14
When starting with an accumulator x, the (sum) result is simply added to x
Write a function sum which adds all the elements of a list (write two versions, with/without an accumulator).
sum 0 []
0
sum 0 [ 1 ]
1
sum 0 [ 8 ]
8
sum 0 [ 8 ; 9 ]
17
sum 0 [ 8 ; 1 ; 9 ]
18
sum 0 [ 8 ; 1 ; 9 ; 1 ; 9 ]
28
sum 0 [ 1 ; 1 ; 1 ; 9 ; 1 ]
13
sum 10 [ 1 ; 1 ; 1 ; 9 ; 1 ]
23
Write a function perms which permutes all pairs of a list. Its type is (α × β) list → (β × α) list. The accumulator version also reverses the list.
perms []
[]
(this is the version without accumulator)
perms [ (1, true) ]
[ (true, 1) ]
perms [ ('a', 8) ; ('b', 7) ]
[ (8, 'a') ; (7, 'b') ]
Exercise : Test tail-recursion (that is, recursion with an accumulator)
Using an accumulator in a recursive function has a benefit: it usually makes your function tail-recursive (to be explained precisely during a lecture). As a consequence functions with accumulators are expected to work with very long lists, whereas the equivalent function without an accumulator will fail with lists longer than the call stack. Let's see.
Write a function mk_list such that mk_list 6 returns[6;5;4;3;2;1] and mk_list 0 returns the empty list (and so on, you get the idea).
Write the accumulator version mk_aculist. Notice that the accumulator is a list here. Indeed, the accumulator contains the current result, which is returned when the recursion ends. Notice also that the numbers are increasing.
mk_aculist [] 0
[]
mk_aculist [] 5
[1;2;3;4;5]
mk_aculist [99] 5
[1;2;3;4;5;99]
Find the limit of mk_list: letl1 = mk_list 100000 will probably work, whereas letl1 = mk_list 1000000 will probably fail.
Try the same values with mk_aculist. It should work for longer lists (the limit being now the available memory and the time you have to wait). letl2 = mk_aculist [] 1000000
Try your previous functions sum and count_ones with l2. Non-accumulator functions should fail, whereas accumulator versions should succeed. Update : apparently recent versions of the ocaml compiler are now able to optimize the non-accumulator version of count_ones: it surprisingly operates with a constant stack size. The current stack size can be printed with : letst = Gc.quick_stat () inPrintf.printf "Stack size = %d
%!"Gc.(st.stack_size) ;
(to be used at the deepest level of recursion of course, not once the recursion is finished.)
In OCaml, functions are first-class citizens, meaning that functions can be used as any other value. Try to guess the types of:letfun_list = [ count_ones ; sum ]
letfun_pair = ( count_ones, sum )
This also implies that a function may expect another function as an argument:
Example
letfsum f = f 0 + f 1 + f 2 + f 3 val fsum : (int -> int) -> int = <fun>
fsum expects one argument f and computes the sum f(0) + f(1) + ...
It works with any function f on integers: fsum abs
fsum (fun x -> 2 * x)
fsum (fun y -> if y > 2then1else -1)
Look carefully at the type. The argument, f, is of type (int → int) We say that fsum is a higher-order function since it expects an argument which is itself a function.
Try to guess the (polymorphic) type of: letflist f = [ f 0 ; f 1 ; f 2 ; f 3 ]
Test it with functions of various types:
flist string_of_int
flist (λx.x+1) (translate this into OCaml)
flist (λx.x)
Exercise : Higher-order functions
Write a function fsumlist such that fsumlist f [ a ; b ; c ; ... ]
computes the sum f a + f b + f c + ... Of course, fsumlist f [] returns 0.
You should write two versions of fsumlist: with/without an accumulator. Each version is only three lines of code.
Have you noticed that fsumlist is polymorphic? Try to understand why and test it with values of different types. You may need these functions: int_of_float, int_of_string.
fsumlist is a particular case of function fold_left
Check the type of fsumlist (the accumulator version).
In utop, write List.fold_left ;; so that you get its type.
Compare both types. (fold_left will be seen later, but you may google it)
Brace yourself to touch now the very essence of functional programming: higher-order and polymorphic functions.
Write a function map such that map f [ a ; b ; c ; ... ]
returns a new list [ f a ; f b ; f c ; ... ]
It is a recursive function, without accumulator. Again, it takes three lines.
map (fun x -> x + 1) []
[]
map (fun x -> x + 1) [ 5 ]
[ 6 ]
map (fun x -> x + 1) [ 5 ; 10 ; 15 ]
[ 6 ; 11 ; 16 ]
map string_of_int [ 5 ; 10 ; 15 ]
[ "5" ; "10" ; "15" ]
Guess the type of map (fun x -> (x, string_of_int x)) [ 5 ; 10 ; 15 ]
Look at the type of map, which is polymorphic in two variables α and β. Congratulations, you have written a polymorphic function without even noticing.
Theorem: ML type inference finds the most polymorphic type of any expression. More precisely:
ML has principal types which means that any typeable expression admits a type which is more polymorphic than all the admissible types for this expression.
The ML type inference algorithm is able to find the principal type of any expression.
Corollary: The ML type system works for you.
Write a function find such that find f [ a ; b ; c ; ... ]
returns the first element of the list, x, such that f x
is true. Obviously, f is a predicate (a function returning a boolean). If no element in the list satisfies f, you can raise an exception like this: raiseNot_found (the exception Not_found is already defined in OCaml).
find (λx . true) []
Not_found
find (λx . x > 10) [ 5 ; 12 ; 7 ; 8 ]
12
find (λx . x > 10) [ 5 ; 6 ; 7 ; 8 ]
Not_found
find (λx . true) [ 5 ; 10 ; 15 ]
5
You were expecting this: the type of find is polymorphic.
Both functions map and find already exist in the OCaml standard library. Find them (in the List module). Check that you have the same types.
In the future, if you wish to use existing functions of the standard library (e.g. map or find), you have to write List.map
or List.find
.
Exercise : Map on options
Write a function omap of type (α → β) → α option → β option
This is not a recursive function.
You should write your pattern-matching like this: | None -> ...
| Some x -> ...
omap (λx . true) None
None
omap (λx . true) (Some 10)
Some true
omap (λx . not x) (Some true)
Some false
omap (λx . not x) None
None
omap (λx . x + 1) None
None
omap (λx . x + 1) (Some 100)
Some 101
▸Outcomes
You now know two interpreter environments: utop and ocaml-top
At first glance, you identify if a function is curried or uncurried.
When you see'a in a type, you say α and a ∀ flashes in your mind.
You always try to prove that your recursive function is well-founded.
You know two classical higher-order and polymorphic functions: find and map, and you are eager to discover fold.