Feval: F-Algebras for expression evaluation

Using f-algebras to produce a statically typed functional programming language

A few months ago, I stumbled upon the article Understanding F-Algebras by Bartosz Milewski. After reading the article I became interested in trying to implement a functional programming language using f-algebras. The result was the statically typed functional programming language Feval.

The following is a review of the ideas that went into the creation of the language.

Basic Background

So what is an f-algebra? Well, at it's most basic level, an f-algebra is something that reduces a structure with holes in it. For instance, you can have a grammar that represents addition

data Addition a =
      Num Int
    | Add a a

then the type a is the hole. Since we can have addition expressions like 4 + 3 + 2, we need to this grammar to be recursive. We can do this by defining the type

newtype Fix f = Fx (f (Fix f))

now we can represent 4 + 3 + 2 in our grammar as

expr = Fx $ Add (Fx $ Add (Fx $ Num 4) (Fx $ Num 3)) (Fx $ Num 2)

Great, but how does doing any of this help us? Well, now we can define a catamorphism that allows us to use an algebra

type Algebra f a = f a -> a

to evaluate an expression in our grammar

unFix :: Fix f -> f (Fix f)
unFix (Fx x) = x

cata :: Functor f => Algebra f a -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix

So in order for us to be able to use the catamorphism, our grammar must be an instance of Functor, which essentially means that the type of the grammar is an object with a hole that we can run functions on

instance Functor Addition where
    fmap f (Num i) = Num i
    fmap f (Add x y) = Add (f x) (f y)

One thing to note is that the catamorphism evaluates the grammar tree from the nodes up. This will become important later when we talk about functions and typechecking. But for now we just need to know that an algebra is a rule for combining the already evaluated children of a node. For instance our algebra for addition is

alg :: Algebra Addition Int
alg (Num i) = i
alg (Add x y) = x + y

and our evaluation function is

eval :: Fix Addition -> Int
eval e = cata alg e

Onto Eval

When thinking about how to create an evaluation scheme for a language there are two important things to consider:

1. What happens if the evaluation fails at some point in the expression? (Consider the expression 4 + True)
2. How can we delay evaluation until it is applicable? (We cannot evaluate Function x -> x + 8 until we know the value applied to x)

To deal with the first point we need to return Maybe a result instead of just a result. In this case Nothing will mean that some sort of type mismatch has occurred (Note: this cannot happen once we have our typechecker). But in order to utilize Maybe we need to somehow incorporate this into our catamorphism. Fortunately, Maybe is a monad so all we need to is redefine our catamorphism. First we introduce a new type of algebra which we will call a monadic algebra

type MAlgebra m f a = f (m a) -> m a

which threads monadic values through the structure of the grammar. Then we can define our monadic catamorphism as

mcata :: Functor f => MAlgebra m f a -> Fix f -> m a
mcata alg = alg . fmap (mcata alg) . unFix

These definitions allow us to define an algebra for a larger grammar (we are now deriving the functor instance, you will need to include {-# LANGUAGE DeriveFunctor #-} at the top of your file)

data DumbAddition a =
      Num Int
    | Buul Bool
    | Add a a
    deriving Functor

like so

alg :: MAlgebra Maybe DumbAddition Int
alg (Num i) = Just i
alg (Buul b) = Nothing
alg (Add x y) = do
    n <- x
    m <- y
    return $ n + m

eval :: Fix DumbAddition -> Maybe Int
eval e = mcata alg e

Now onto the second point. In order to delay evaluation we once again have to redefine our catamorphism. This time we use the grammar (you will need {-# LANGUAGE FlexibleInstances #-} at the top of your file)

data Iffy a b =
      Buul Bool
    | And b b
    | If b a a    -- If b Then a Else a
    deriving Functor

In this case we want to delay the evaluation of either side of the if expression until we know which we actually have to return. For this new style of grammar we define a new Fix

newtype LazyFix f = Fx' (f (LazyFix f) (LazyFix f))

lazyUnFix :: LazyFix f -> f (LazyFix f) (LazyFix f)
lazyUnFix (Fx' x) = x

for our new catamorphism which allows us to delay the evaluation of the a hole in our definition of Iffy. Our new lazy catamorphism is

lazyCata :: Functor (f (LazyFix f)) => Algebra (f (LazyFix f)) a -> LazyFix f -> a
lazyCata alg = alg . fmap (lazyCata alg) . lazyUnFix

Now that this lazyCata only propagates the evaluation through the type b in the definition of Iffy, we can define an algebra as follows

alg :: Algebra (Iffy (LazyFix Iffy)) Bool
alg (Buul b) = b
alg (And x y) = x && y
alg (If p x y) = if p then eval x else eval y

eval :: LazyFix Iffy -> Bool
eval e = lazyCata alg e

Note that our algebra only evaluates the side of the if expression that it needs to.

Putting both of these ideas together we define our overall catamorphism

lazyMCata :: Functor (f (LazyFix f)) => MAlgebra m (f (LazyFix f)) a -> LazyFix f -> m a
lazyMCata alg = alg . fmap (lazyMCata alg) . lazyUnFix

which incorporates both the Maybe and lazy evaluation schemes. To see this in action, consider the grammar

data FunBool a b =
      Buul Bool
    | Var String
    | And b b
    | Or b b
    | Function String a
    | Appl b a -- Function application
    deriving Functor

We can evaluate this using the following

data RVal = RBool Bool | RFunction String (LazyFix FunBool)

-- Substitute a value in for a variable
substitute :: String -> LazyFix FunBool -> LazyFix FunBool -> LazyFix FunBool
substitute _ _ (Fx' (Buul b)) = Fx' $ Buul b
substitute s v (Fx' (Var s')) = if s' == s then v else Fx' $ Var s'
substitute s v (Fx' (And x y)) = Fx' $ And (substitute s v x) (substitute s v y)
substitute s v (Fx' (Or x y)) = Fx' $ Or (substitute s v x) (substitute s v y)
substitute s v (Fx' (Function x p)) = if x == s
    then Fx' $ Function x p
    else Fx' $ Function x $ substitute s v p
substitute s v (Fx' (Appl x y)) = Fx' $ Appl (substitute s v x) (substitute s v y)

boolean_operation :: (Bool -> Bool -> Bool) -> RVal -> RVal -> Maybe RVal
boolean_operation f (RBool x) (RBool y) = Just . RBool $ f x y

convert :: RVal -> LazyFix FunBool
convert (RBool b) = Fx' $ Buul b
convert (RFunction x p) = Fx' $ Function x p

apply :: RVal -> LazyFix FunBool -> Maybe RVal
apply (RFunction x p) e = eval e >>= \v -> eval $ substitute x (convert v) p
apply _ _ = Nothing

alg :: MAlgebra Maybe (FunBool (LazyFix FunBool)) RVal
alg (Buul b) = Just $ RBool b
alg (Var s) = Nothing
alg (And x y) = x >>= \x' -> y >>= \y' -> boolean_operation (&&) x' y'
alg (Or x y) = x >>= \x' -> y >>= \y' -> boolean_operation (||) x' y'
alg (Function x p) = Just $ RFunction x p
alg (Appl f x) = f >>= \f' -> apply f' x

eval :: LazyFix FunBool -> Maybe RVal
eval e = lazyMCata alg e

Translation

Now that we have discussed evaluation, we will consider how to use f-algebras to translate a more expressive language into a smaller and more easily codable language. Suppose we have implemented an eval for

data SmallLanguage a =
      Num Int
    | Buul Bool
    | Not a
    | And a a
    | Or a a
    | Equal a a
    | LessThan a a
    | Function String a
    | Appl a a

we want to add more language features without having to rewrite our evaluator. We can easily expand our language using f-algebras! If we want to implement

data BiggerLanguage a =
      BNum Int
    | BBuul Bool
    | BNot a
    | BAnd a a
    | BOr a a
    | BEqual a a
    | BLessThan a a
    | BLessThanOrEqual a a
    | BGreaterThan a a
    | BGreaterThanOrEqual a a
    | BFunction String a
    | BAppl a a
    | BLet String a a    -- Let s = a In a
    deriving Functor

Our translation algebra is simply

alg :: Algebra BiggerLanguage (Fix SmallLanguage)
alg (BNum i) = Fx $ Num i
alg (BBuul b) = Fx $ Buul b
alg (BNot b) = Fx $ Not b
alg (BAnd x y) = Fx $ And x y
alg (BOr x y) = Fx $ Or x y
alg (BEqual x y) = Fx $ Equal x y
alg (BLessThan x y) = Fx $ LessThan x y
alg (BLessThanOrEqual x y) = Fx $ Or (Fx $ LessThan x y) (Fx $ Equal x y)
alg (BGreaterThan x y) = Fx . Not . Fx $ Or (Fx $ LessThan x y) (Fx $ Equal x y)
alg (BGreaterThanOrEqual x y) = Fx . Not . Fx $ LessThan x y
alg (BFunction s p) = Fx $ Function s p
alg (BAppl x y) = Fx $ Appl x y
alg (BLet s x y) = Fx $ Appl (Fx $ Function s y) x

with our translator defined as

translate :: Fix BiggerLanguage -> Fix SmallLanguage
translate e = cata alg e

Types Abound

For Feval I used an equational type system. Essentially what this means is that we run through the grammar of the language and for each instruction we add an equation to a set, which says what the type of the holes should be. For instance suppose that we have have encountered Add x y then we would add the equations x = Int and y = Int to our set to ensure that we will not encounter a type mismatch. Suppose we have the grammar

data AddOr a =
      Num Int
    | Buul Bool
    | Add a a
    | Or a a
    deriving Functor

Then we can write an f-algebra that returns our set like so

import qualified Data.Set as Set

data Type = TInt | TBool deriving (Eq, Ord)

type Equation = (Type, Type)

type Equations = Set.Set Equation

-- Add both the equation and its reflection
addEquation :: Equation -> Equations -> Equations
addEquation (t, t') e = Set.insert (t, t') $ Set.insert (t', t) e

doubleAdd :: Equation -> Equation -> Equations -> Equations -> Equations
doubleAdd (t, t') (s, s') e e' = addEquation (t, t') $ addEquation (s, s') $ Set.union e e'

type TypeAndEquations = (Type, Equations)

alg :: Algebra AddOr TypeAndEquations
alg (Num _) = (TInt, Set.empty)
alg (Buul _) = (TBool, Set.empty)
alg (Add (t, e) (t', e')) = (TInt, doubleAdd (t, TInt) (t', TInt) e e')
alg (Or (t, e) (t', e')) = (TBool, doubleAdd (t, TBool) (t', TBool) e e')

This algebra returns a TypeAndEquations, which is a pair of the type of the expression and a set of equations from within the expression. Then, to make sure that the expression is correctly typed we need to first find the closure of our set of equations, which means we need to add all other equations that our equation implies. We can do this as follows

-- The boolean value in this means that the equation was not already in the set
addNew :: Equation -> Equations -> (Bool, Equations)
addNew eq e = if Set.member eq e then (False, e) else (True, Set.insert eq e)

-- The boolean value in this means that something new has been added
addTransitives :: Equations -> (Bool, Equations)
addTransitives e = Set.foldr process (False, e) e
    where process (t, t') (b, e) = Set.foldr (addMatch t t') (b, e) e
          addMatch t t' (s, s') (b, e) = if s == t'
                then let (b', e') = addNew (t, s') e in (b || b', e')
                else (b, e)

close :: Equations -> Equations
close e = let (b, e') = addTransitives e in
    if b then close e' else e'

Now all we have to do is make sure that our set is not inconsistent, i.e. there are no equations like TBool = TInt, but this is easy

inconsistent :: Equations -> Bool
inconsistent e = Set.foldr check False e
    where check (TInt, TBool) _ = True -- Recall that we have added all reflections
          check _ b = b

Then our final typechecker is

typecheck :: Fix AddOr -> Maybe Type
typecheck a = let (t, e) = cata alg a in
    let e' = close e in
    if inconsistent e' then Nothing else Just t

Great, but how could we typecheck a more complicated grammar with functions? In this case we need to assign something called a type variable to the argument of the function before typing the function's expression. For instance, to type Function x -> x + 5 we would assign the type variable TVar 0 to x and then typecheck x + 5, which would yield the equations {TVar 0 = TInt, TInt = TInt} with the resulting type TVar 0 -> TInt. Finally we would substitute for the type variable using the equations and find the type of the expression to be TInt -> TInt. Note that we have to delay the typechecking of the Function until we have assigned x a type variable. Furthermore, this variable must be unique, otherwise a boolean variable and an integer variable might be assigned the same type variable and we would find our equations to be inconsistent even if they aren't. In order to implement these principles we will apply similar procedures to the ones we used while discussing evaluation. Consider the following grammar

data FunAdd a b =
      Num Int
    | Var String
    | Add b b
    | Function String a
    | Appl b b
    deriving Functor

First we redefine Type

data Type =
      TInt
    | TVar Int -- Our type variable
    | TArrow Type Type
    | NotClosed
    deriving (Eq, Ord)

The type NotClosed corresponds to the case that there is an expression that is not closed like the expression Function x -> y (note that y has no definition). Then we create a state monad that will produce unique integers for our type variables

import Control.Monad.State

type Counter = State Int

doNothing :: Counter Int
doNothing = state (\i -> (i, i))

newHandle :: Counter Int
newHandle = state (\i -> (i, i + 1))

Then we redefine our typechecker as follows

import Control.Applicative

-- The assignments of arguments to type variables
type Hypotheses = [(String, Type)]

lookupVar :: String -> Hypotheses -> Maybe Type
lookupVar v h = foldr check Nothing h
    where check (s, t) Nothing = if s == v then Just t else Nothing
          check _ r = r

alg :: Hypotheses -> MAlgebra Counter (FunAdd (LazyFix FunAdd)) TypeAndEquations
alg _ (Num _) = (\_ -> (TInt, Set.empty)) <$> doNothing
alg h (Var s) = (\_ -> let r = lookupVar s h in case r of
    Nothing -> (NotClosed, Set.insert (NotClosed, NotClosed) Set.empty)
    Just t -> (t, Set.empty)) <$> doNothing
alg _ (Add x y)
    = (\(t, e) (t', e') -> (TInt, doubleAdd (t, TInt) (t', TInt) e e')) <$> x <*> y
alg h (Function x p) = newHandle >>= \n -> let v = TVar n in
    typecheck' ((x, v) : h) p >>= \(t, e) -> return (TArrow v t, e)
alg _ (Appl x y) = (\n (t, e) (t', e') -> let v = TVar n in
    (v, addEquation (t, TArrow t' v) (Set.union e e'))) <$> newHandle <*> x <*> y

typecheck' :: Hypotheses -> LazyFix FunAdd -> Counter TypeAndEquations
typecheck' g e = lazyMCata (alg g) e

typecheck :: LazyFix FunAdd -> Maybe Type
typecheck e = let (t, e') = evalState (typecheck' [] e) 0
    in let e'' = close e'
    in if inconsistent e'' then Nothing else Just (substitute t e'')

where we have defined

doubleAddNew :: Equation -> Equation -> Equations -> (Bool, Equations)
doubleAddNew eq eq' e = let (b, e') = addNew eq e in
    let (b', e'') = addNew eq' e' in (b || b', e'')

addArrows :: Equations -> (Bool, Equations)
addArrows e = Set.foldr process (False, e) e
    where process (TArrow t t', TArrow s s') (b, e) =
            let (b', e') = doubleAddNew (t, s) (t', s') e in (b || b', e')

close' :: Equations -> (Bool, Equations)
close' e = let (b, e') = addTransitives e in
    let (b', e'') = addArrows e' in (b || b', e'')

close :: Equations -> Equations
close e = let (b, e') = close' e in
    if b then close e' else e'

inconsistent :: Equations -> Bool
inconsistent e = Set.foldr check False e
    where check (TInt, TArrow _ _) _ = True
          check (NotClosed, _) _ = True
          check _ b = b

and a substitution function for the resulting type of the expression

-- This function chooses the best possible substitution
choose :: Int -> Equations -> Equation -> Type -> Type
choose _ _ _ TInt = TInt
choose n e (TVar n', y) (TVar n'') = if n /= n' then TVar n'' else case y of
    TInt -> TInt
    TArrow x y -> TArrow (substitute x e) (substitute y e)
    TVar n' -> if n' < n
        then if n'' < n' then TVar n'' else TVar n'
        else if n'' < n then TVar n'' else TVar n
choose _ _ _ r = r

substitute :: Type -> Equations -> Type
substitute TInt _ = TInt
substitute (TVar n) e = Set.fold (choose n e) (TVar n) e
substitute (TArrow x y) e = TArrow (substitute x e) (substitute y e)

Afterword

Hopefully this has given you a good idea of how useful f-algebras can be in the realm of programming languages. Using these principles I have constructed a language that I call Feval, which includes such features as lists and recursive let definitions like Let f x = If f = 0 Then 0 Else x + f (x - 1). With these constructs we can evaluate things like the following

Let compare x y = If x > y Then -1 Else If x = y Then 0 Else 1 In
Let combine x y = Case x Of
      []       -> y
    | (z : zs) -> Case y Of
          []       -> x
        | (w : ws) -> Let r = compare z w In If r <= 0
            Then w : z : combine zs ws
            Else z : w : combine zs ws In
Let mergesort x = Case x Of
      [] -> []
    | (y : ys) -> Case ys Of
          [] -> [y]
        | (a : b) -> Let mergesort' x l r = Case x Of
              []       -> combine (mergesort l) (mergesort r)
            | (z : zs) -> Case zs Of
                  []       -> combine (mergesort (z : l)) (mergesort r)
                | (w : ws) -> mergesort' ws (z : l) (w : r)
            In mergesort' x [] []
In mergesort [4, 23, -34, 3]

Check it out for more complicated examples of f-algebras and evaluation, as well as a parser and a REPL.