Monadification as a refactoring

Preface

We are interested in finding out from the Haskell community what sort of monadic programs people write, and in particular how they arrive at those programs. We have (Deling and Martin) developed a particular algorithm and tool for automatic monadification of Haskell programs, and (Huiqing, Claus and Simon) have built the HaRe tool for Haskell refactoring. We want to offer a monadification option within HaRe. There are various different, and apparently incompatible, styles of monadification, and we are therefore keen to find out what would be most effective for Haskell practitioners.

To illustrate some of the different styles of monadification, the body of this note is a small program together with five example monadifications; we have included these as examples, but we make no claim to have covered the whole space of possibilities. We are interested in finding out

We are happy to receive simple choices as well as more detailed responses. Could you please send your comments, thoughts, code examples and so on to refactor-fp@kent.ac.uk, including the word "monadification" in the subject line. We'll post a summary of this to the Haskell mailing list.

Example program: a simple interpreter

The interpreter works over a type of arithmetical expressions:
module Expr where

data Expr = Lit Int 
	      | BinOp Op Expr Expr

data Op = Add | Div
and the simple interpreter, without monads, is given by:
module Interp where

import Expr 

eval :: Expr -> Int

eval (Lit n) = n
eval (BinOp op e1 e2) 
    = evalOp op (eval e1) (eval e2)

evalOp :: Op -> Int -> Int -> Int

evalOp Add v1 v2 = v1 + v2
evalOp Div v1 v2 = v1 `div` v2
We have chosen to interpret the operators in a separate function, since this allows us to illustrate the way in which curried functions can be given a number of different monadifications. The first two monadifications reflect the continuation-passing transformations described in, for example, the paper by Hatcliff and Danvy, A Generic Account of Continuation-Passing Styles, POPL 1994.

Style 1: Full call-by-value monadification

This style encapsulates a call-by-value approach, under which arguments are fully evaluated before being passed to functions, and functions are evaluated before arguments are passed to them.

In the running example, the type of eval becomes

Monad m => m (Expr -> m Int)
The full code looks like:
evalM1 :: Monad m => m (Expr -> m Int)

evalM1 
  = return (\e ->
            case e of
              Lit n          -> return n
              BinOp op e1 e2 -> 
                      do
                        eo   <- evalOpM1
                        eop  <- eo op
                        em1  <- evalM1
                        v1   <- em1 e1
                        eop1 <- eop v1
                        em2  <- evalM1
                        v2   <- em2 e2
                        v    <- eop1 v2
                        return v
              )
              
evalOpM1 :: Monad m => m (Op -> m (Int -> m (Int -> m Int)))

evalOpM1
  = return (\op ->
            case op of
              Add   -> apply2 (+)
              Div   -> apply2 div
            )

apply2 :: Monad m => (a -> b -> c) -> m (a -> m (b -> m c))

apply2 f 
  = return (\a -> return (\b -> return (f a b)))

Style 2: Full call-by-name monadification

This again is a full monadification, but with arguments to functions passed unevaluated (i.e. wrapped in the monad). The type of eval becomes
Monad m => m (m Expr -> m Int)
and the full transformed program is
evalM2 :: Monad m => m (m Expr -> m Int)

evalM2 
  = return (\em ->
                do e <- em
                   res <- case e of
                     Lit n          -> return n
                     BinOp op e1 e2 -> 
                      do
                        eo   <- evalOpM2
                        eop  <- eo (return op)
                        em1  <- evalM2
                        eop1 <- eop (em1 (return e1))
                        em2  <- evalM2
                        v    <- eop1 (em2 (return e2))
                        return v
                   return res     
              )

evalOpM2 :: Monad m => m (m Op -> m (m Int -> m (m Int -> m Int)))

evalOpM2
  = return $
     liftM (\op ->
            case op of
              Add   -> apply2 (+)
              Div   -> apply2 div
            )

apply2 :: Monad m => (a -> b -> c) -> (m a -> m (m b -> m c))

apply2 f 
  = (liftM (\a -> (liftM (f a))))

Style 3: Restricted call-by-name monadification

This and the remaining two styles are all restrictions of full monadifications in that evaluation of functions is not effectful: the effect of monadification on a function is to produce a function, rather than a computation of a function. It has a call-by-name flavour since all function arguments are passed within the monad.

The evaluator type becomes

Monad m => m Expr -> m Int
and the full code is
evalM3 :: Monad m => m Expr -> m Int

evalM3 em 
  = do e <- em
       res <- (case e of
                Lit n          -> return n
                BinOp op e1 e2 -> 
                  evalOpM3 (return op) 
                          (evalM3 (return e1)) 
                          (evalM3 (return e2)))
       return res     

evalOpM3 :: Monad m => m Op -> m Int -> m Int -> m Int

evalOpM3 mop me1 me2
  = mop >>= \op -> 
       case op of
         Add   -> apply2 (+) me1 me2
         Div   -> apply2 div me1 me2

apply2 :: Monad m => (a -> b -> c) -> (m a -> m b -> m c)

apply2 f e1 e2
  = do v1 <- e1
       v2 <- e2
       return (f v1 v2)

Style 4: Data-directed monadification

Monadification in this style is data-directed: it is decided that a certain type is replaced by computations over that type, and this information is percolated through the program.

The types of the evaluation functions become:

evalM4   :: Monad m => Expr -> m Int
evalOpM4 :: Monad m => Op -> m Int -> m Int -> m Int
The rationale for this is that expressions are evaluated for their effects, transforming the Int type into m Int; this transformation is propogated through the functions called by eval, hence the new type for evalOp. The full transformation is given by:
evalM4 :: Monad m => Expr -> m Int

evalM4 e 
  = case e of
      Lit n          -> return n
      BinOp op e1 e2 -> 
        evalOpM4 op 
             (evalM4 e1) 
             (evalM4 e2)

evalOpM4 :: Monad m => Op -> m Int -> m Int -> m Int

evalOpM4 op me1 me2
  = case op of
         Add   -> apply2 (+) me1 me2
         Div   -> apply2 div me1 me2

apply2 :: Monad m => (a -> b -> c) -> (m a -> m b -> m c)

apply2 f e1 e2
  = do v1 <- e1
       v2 <- e2
       return (f v1 v2)
This can be seen as a variant of the third style: some arguments, namely those of the monadified type, are passed unevaluated, and others, of different type, are passed in non-monadic form. The fourth style therefore mixes call-by-name and call-by-value approaches.

Style 5: Restricted call-by-value monadification

The fifth style moves to a thoroughgoing call-by-value treatment.
evalM5 :: Monad m => Expr -> m Int

evalM5 (Lit n) = return n
evalM5 (BinOp op e1 e2)
  = do a <- evalM5 e1
       b <- evalM5 e2
       evalOpM5 op a b
 
evalOpM5 :: Monad m => Op -> Int -> Int -> m Int
evalOpM5 Add v1 v2 = return (v1 + v2)
evalOpM5 Div v1 v2 = return (v1 `div` v2)
This is the monadification that is produced by the algorithm of Erwig and Ren. A variant of this gives a monadification of evalOpM5 with the type
evalOpM5 :: Op -> Int -> Int -> Int
This function is not monadified, and so this variant will minimally monadify the program.

Code examples

All code examples can be downloaded from here.