A monad is a monoid in the category of endofunctors. So, what’s the problem?
(Jokingly attributed to P. Wadler, from A Brief, Incomplete, and Mostly Wrong History of Programming Languages by J. Iry)
As a mathematician, the joke is still lost on me. As a matter of fact I said the very same thing to myself when I first heard about the use of monads in computer science and then saw the torrent of posts and explanations on the subject on the interwebs. As I understand it, one reason why they are used is because functional languages such as Haskell mostly shun side effects and monads are seen as one possible method to manipulate/simulate/keep track of side effects. I am not going to comment on that. (Addendum: Here is another exposition I really like on the subject.)
Today, I will dig into three notions with examples from common lisp:
Anyone who has some experience with a functional programming language
would know that what separates this class of languages is the relative
ease of passing functions as arguments to other functions. Such
functions which accept other functions as arguments are called
higher order functions. The standard mapcar
is such an example from common lisp. Here is another example:
(defun compose (f g)
(lambda (x) (funcall f (funcall g x))))
COMPOSE
This function takes two lisp functions and returns a new function which is obtained by composing its arguments.
(defparameter new (compose (lambda (x) (* x x)) '1+))
NEW
This function new
is the mathematical function \(f(x) = (x + 1)^2\) written as a composition
of an anonymous function and the standard lisp function 1+
.
Let us test if it works as intended:
(loop for i from 1 to 10 collect (let ((x (random 9))) (cons x (funcall new x))))
((4 . 25) (3 . 16) (2 . 9) (5 . 36) (1 . 4) (6 . 49) (5 . 36) (1 . 4)
(1 . 4) (2 . 9))
A functor is a higher order function which is compatible with composition. Mathematically speaking, a higher order function \(F\) is a functor when for two composable ordinary functions \(g\colon X\to Y\) and \(f\colon Y\to Z\) one has \[ F(f\circ g) = F(f)\circ F(g) \] where \(\circ\) denotes the binary operation of composition. I will give a non-example, and then an example.
Consider the following higher order function
(defun twice (f) (lambda (x) (* 2 (funcall f x))))
TWICE
which takes a numerical function as an argument and returns another
numerical function whose return value is twice of the argument function.
This is not a functor because the functions
(twice (compose f g))
and
(compose (twice f) (twice g))
need not return the same
value when called with the same argument. For example take the identity
function which returns its argument as is
(defun id (x) x)
ID
and then compare these functions at some random point
(funcall (twice (compose 'id 'id)) 10)
20
and
(funcall (compose (twice 'id) (twice 'id)) 10)
40
These values are not the same. So, twice
is a higher
order function but not a functor.
Now, consider the following higher order function
vectorize
(defun vectorize (f) (lambda (x) (mapcar f x)))
VECTORIZE
which takes a function \(f:X\to Y\)
as an argument which operates some type of input \(X\) to create an output of type \(Y\), and promotes it to another function
(vectorize f)
which takes a list of objects of type \(X\) to create list of objects of type \(Y\). For example
(funcall (vectorize '1+) '(0 1 2 3))
(1 2 3 4)
This higher order function is compatible with composition of functions.
(defun LHS (f g) (vectorize (compose f g)))
LHS
and
(defun RHS (f g) (compose (vectorize f) (vectorize g)))
RHS
are going to produce the same effect on every pair of composable functions. For example consider the following random list of integers
(defparameter some-list (loop for i from 1 to 10 collect (random 10)))
SOME-LIST
Now we compare LHS
and RHS
on this list
(funcall (rhs (lambda (x) (* x x)) '1+) some-list)
(36 9 4 81 100 25 81 64 81 25)
and
(funcall (lhs (lambda (x) (* x x)) '1+) some-list)
(36 9 4 81 100 25 81 64 81 25)
Of course, seeing RHS
and LHS
producing the
same output on some random list does not guarantee that RHS
and LHS
will produce the same output for all inputs but
think for few seconds you will see that they should be the same. In
other words, you will convince yourself that vectorize
is a
functor.
If you think that functors are completely theoretical constructs and
devoid of any real-life applications in every day computing,
think again: imagine you have a file data.csv
consisting of
two columns of numbers and you need the sums of the first column and the
second column and written on the third column.
I would use a awk one liner
awk '{ print $1 "\t" $2 "\t" ($1+$2); }' data.csv
and voila! What you should realize now is that you used a functor
similar to vectorize
I defined above :)
There is another aspect of functors I did not get into above. Yes, a functor operates on a function and give you another function, and yes it is also compatible with the composition of function. But what about the type signatures of the functions? A functor \(T\) might transform the type signatures of the functions but it should do it without altering the composability of the functions. If \(f\colon Y\to Z\) and \(g\colon X\to Y\) is a composable pair of functions with their respective type signatures, then the resulting functions \(T(f)\) and \(T(g)\) should have compatible type signatures as well. We mathematicians write the resulting type signatures as \(T(f)\colon T(Y)\to T(Z)\) and \(T(g)\colon T(X)\to T(Y)\).
So, for the vectorize
example above if we were to given
a function of type g:: Int -> Char
then the resulting
function vectorize(g)
would have the type signature
vectorize(g):: [Int] -> [Char]
(Here I am using Haskell
notations and conventions.) And if have another function
f:: Char -> String
then vectorize(f)
will
produce a function with the type signature
vectorize(f):: [Char] -> [String]
without altering
composability of f
and g
.
OK, so you know that a monad is a monoid in the category of endofunctors. But what does that even mean?
First of all a monad \(T\) is a
functor. The “endo” part indicates that the type signature of \(T\) itself (it is a higher order function
afterall, and therefore has its own type signature) is special. It must
be of the form \(T\colon A\to A\). I
would have a hard time describing this in Haskell. In lisp lingo I would
say \(T\) sends a lambda expression to
another lambda expression. I suppose I can write
T: lambda -> lambda
. As such, one can compose \(T\) with itself many times. So, we can look
at the whole sequence of lambda expressions \(T^n(f)\). If \(T\) is a monoid (i.e. a monad) in this
context then we have an extra piece of information to tell us how to
flatten these iterated objects \(T^n(f)\) consistently to level of \(T(f)\).
Let me explain this on the vectorize
example I defined
above. If f
is a function of type signature
f:Int -> Char
then (vectorize f)
sends
lists of integers to lists of characters
[Int] -> [Char]
. Now imagine you applied
vectorize
twice. The new function you get
(vectorize (vectorize f))
will operate on a list of lists
of integers and will produce a list of lists of characters
[[Int]] --> [[Char]]
. In this case the consistency
requirement is as follows: Consider the generic function
flatten:[[a]]->[a]
which takes a nested list and
flattens it. Then we ask that the resulting list from
(flatten (funcall (vectorize (vectorize 'f)) my-list-of-lists))
and from
(funcall (vectorize 'f) (flatten my-list-of-lists))
should be the same list of characters for any list of lists of
integers my-list-of-lists
.
I will give just a simple example:
(defun flatten (x)
(cond ((null x) nil)
((atom x) (list x))
(t (mapcan 'flatten x))))
(defun f (x) (+ (* x x) x 1))
(defparameter my-list-of-lists
(mapcar (lambda (n) (loop for i from 0 to n collect (random 12)))
(loop for i from 1 to 5 collect (1+ (random 7)))))
my-list-of-lists
((4 1 9 3 3) (9 3 11 2) (11 1 6 4 6) (2 6 5 4 7 1) (1 1 7 8 5 11 1 10))
Now, we calculate
(flatten (funcall (vectorize (vectorize 'f)) my-list-of-lists))
(21 3 91 13 13 91 13 133 7 133 3 43 21 43 7 43 31 21 57 3 3 3 57 73 31 133 3 111)
and
(funcall (vectorize 'f) (flatten my-list-of-lists))
(21 3 91 13 13 91 13 133 7 133 3 43 21 43 7 43 31 21 57 3 3 3 57 73 31 133 3 111)
We call such pairs (vectorize,flatten)
as monads. You
need an endofunctor \(T\) and a generic
function \(\eta\) which
flattens \(T^2\) to \(T\) in such a way that any sort of
potentially different flattenings of higher order terms \(T^n\) to \(T\) via any use of consecutive \(\eta\)’s produce the same effect.