Feval: F-Algebras for expression evaluation
A few months ago, I stumbled upon the article Understanding F-Algebras by Bartosz Milewski. After reading the article I became interested in trying to implement a functional programming language using f-algebras. The result was the statically typed functional programming language Feval.
The following is a review of the ideas that went into the creation of the language.
Basic Background
So what is an f-algebra? Well, at it's most basic level, an f-algebra is something that reduces a structure with holes in it. For instance, you can have a grammar that represents additiondata Addition a =
Num Int
| Add a a
then the type a
is the hole. Since we can have addition
expressions like 4 + 3 + 2
, we need to this grammar to
be recursive. We can do this by defining the type
newtype Fix f = Fx (f (Fix f))
now we can represent 4 + 3 + 2
in our grammar as
expr = Fx $ Add (Fx $ Add (Fx $ Num 4) (Fx $ Num 3)) (Fx $ Num 2)
Great, but how does doing any of this help us? Well, now we can define a
catamorphism that allows us to use an algebra
type Algebra f a = f a -> a
to evaluate an expression in our grammar
unFix :: Fix f -> f (Fix f)
unFix (Fx x) = x
cata :: Functor f => Algebra f a -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix
So in order for us to be able to use the catamorphism, our grammar must be
an instance of Functor, which essentially means that the type of the grammar
is an object with a hole that we can run functions on
instance Functor Addition where
fmap f (Num i) = Num i
fmap f (Add x y) = Add (f x) (f y)
One thing to note is that the catamorphism evaluates the grammar tree from the
nodes up. This will become important later when we talk about functions and
typechecking. But for now we just need to know that an algebra is a rule for
combining the already evaluated children of a node. For instance our
algebra for addition is
alg :: Algebra Addition Int
alg (Num i) = i
alg (Add x y) = x + y
and our evaluation function is
eval :: Fix Addition -> Int
eval e = cata alg e
Onto Eval
When thinking about how to create an evaluation scheme for a language there are two important things to consider:-
1. What happens if the evaluation fails at some point in the expression?
(Consider the expression
4 + True
) -
2. How can we delay evaluation until it is applicable?
(We cannot evaluate
Function x -> x + 8
until we know the value applied to x)
Maybe
a result
instead of just a result. In this case Nothing
will mean that
some sort of type mismatch has occurred (Note: this cannot happen once we have
our typechecker). But in order to utilize Maybe
we need to
somehow incorporate this into our catamorphism. Fortunately, Maybe
is a monad so all we need to is redefine our catamorphism. First we introduce
a new type of algebra which we will call a monadic algebra
type MAlgebra m f a = f (m a) -> m a
which threads monadic values through the structure of the grammar. Then we can
define our monadic catamorphism as
mcata :: Functor f => MAlgebra m f a -> Fix f -> m a
mcata alg = alg . fmap (mcata alg) . unFix
These definitions allow us to define an algebra for a larger grammar (we are now
deriving the functor instance, you will need to include
{-# LANGUAGE DeriveFunctor #-}
at the top of your file)
data DumbAddition a =
Num Int
| Buul Bool
| Add a a
deriving Functor
like so
alg :: MAlgebra Maybe DumbAddition Int
alg (Num i) = Just i
alg (Buul b) = Nothing
alg (Add x y) = do
n <- x
m <- y
return $ n + m
eval :: Fix DumbAddition -> Maybe Int
eval e = mcata alg e
Now onto the second point. In order to delay evaluation we once again have
to redefine our catamorphism. This time we use the grammar (you will need
{-# LANGUAGE FlexibleInstances #-}
at the top of your file)
data Iffy a b =
Buul Bool
| And b b
| If b a a -- If b Then a Else a
deriving Functor
In this case we want to delay the evaluation of either side of the if expression
until we know which we actually have to return. For this new style of grammar we
define a new Fix
newtype LazyFix f = Fx' (f (LazyFix f) (LazyFix f))
lazyUnFix :: LazyFix f -> f (LazyFix f) (LazyFix f)
lazyUnFix (Fx' x) = x
for our new catamorphism which allows us to delay the evaluation of the
a
hole in our definition of Iffy
. Our new lazy
catamorphism is
lazyCata :: Functor (f (LazyFix f)) => Algebra (f (LazyFix f)) a -> LazyFix f -> a
lazyCata alg = alg . fmap (lazyCata alg) . lazyUnFix
Now that this lazyCata only propagates the evaluation through the type
b
in the definition of Iffy
, we can define an algebra
as follows
alg :: Algebra (Iffy (LazyFix Iffy)) Bool
alg (Buul b) = b
alg (And x y) = x && y
alg (If p x y) = if p then eval x else eval y
eval :: LazyFix Iffy -> Bool
eval e = lazyCata alg e
Note that our algebra only evaluates the side of the if expression that it needs to.
Putting both of these ideas together we define our overall catamorphism
lazyMCata :: Functor (f (LazyFix f)) => MAlgebra m (f (LazyFix f)) a -> LazyFix f -> m a
lazyMCata alg = alg . fmap (lazyMCata alg) . lazyUnFix
which incorporates both the Maybe
and lazy evaluation schemes. To
see this in action, consider the grammar
data FunBool a b =
Buul Bool
| Var String
| And b b
| Or b b
| Function String a
| Appl b a -- Function application
deriving Functor
We can evaluate this using the following
data RVal = RBool Bool | RFunction String (LazyFix FunBool)
-- Substitute a value in for a variable
substitute :: String -> LazyFix FunBool -> LazyFix FunBool -> LazyFix FunBool
substitute _ _ (Fx' (Buul b)) = Fx' $ Buul b
substitute s v (Fx' (Var s')) = if s' == s then v else Fx' $ Var s'
substitute s v (Fx' (And x y)) = Fx' $ And (substitute s v x) (substitute s v y)
substitute s v (Fx' (Or x y)) = Fx' $ Or (substitute s v x) (substitute s v y)
substitute s v (Fx' (Function x p)) = if x == s
then Fx' $ Function x p
else Fx' $ Function x $ substitute s v p
substitute s v (Fx' (Appl x y)) = Fx' $ Appl (substitute s v x) (substitute s v y)
boolean_operation :: (Bool -> Bool -> Bool) -> RVal -> RVal -> Maybe RVal
boolean_operation f (RBool x) (RBool y) = Just . RBool $ f x y
convert :: RVal -> LazyFix FunBool
convert (RBool b) = Fx' $ Buul b
convert (RFunction x p) = Fx' $ Function x p
apply :: RVal -> LazyFix FunBool -> Maybe RVal
apply (RFunction x p) e = eval e >>= \v -> eval $ substitute x (convert v) p
apply _ _ = Nothing
alg :: MAlgebra Maybe (FunBool (LazyFix FunBool)) RVal
alg (Buul b) = Just $ RBool b
alg (Var s) = Nothing
alg (And x y) = x >>= \x' -> y >>= \y' -> boolean_operation (&&) x' y'
alg (Or x y) = x >>= \x' -> y >>= \y' -> boolean_operation (||) x' y'
alg (Function x p) = Just $ RFunction x p
alg (Appl f x) = f >>= \f' -> apply f' x
eval :: LazyFix FunBool -> Maybe RVal
eval e = lazyMCata alg e
Translation
Now that we have discussed evaluation, we will consider how to use f-algebras to translate a more expressive language into a smaller and more easily codable language. Suppose we have implemented aneval
for
data SmallLanguage a =
Num Int
| Buul Bool
| Not a
| And a a
| Or a a
| Equal a a
| LessThan a a
| Function String a
| Appl a a
we want to add more language features without having to rewrite our
evaluator. We can easily expand our language using f-algebras! If we want to
implement
data BiggerLanguage a =
BNum Int
| BBuul Bool
| BNot a
| BAnd a a
| BOr a a
| BEqual a a
| BLessThan a a
| BLessThanOrEqual a a
| BGreaterThan a a
| BGreaterThanOrEqual a a
| BFunction String a
| BAppl a a
| BLet String a a -- Let s = a In a
deriving Functor
Our translation algebra is simply
alg :: Algebra BiggerLanguage (Fix SmallLanguage)
alg (BNum i) = Fx $ Num i
alg (BBuul b) = Fx $ Buul b
alg (BNot b) = Fx $ Not b
alg (BAnd x y) = Fx $ And x y
alg (BOr x y) = Fx $ Or x y
alg (BEqual x y) = Fx $ Equal x y
alg (BLessThan x y) = Fx $ LessThan x y
alg (BLessThanOrEqual x y) = Fx $ Or (Fx $ LessThan x y) (Fx $ Equal x y)
alg (BGreaterThan x y) = Fx . Not . Fx $ Or (Fx $ LessThan x y) (Fx $ Equal x y)
alg (BGreaterThanOrEqual x y) = Fx . Not . Fx $ LessThan x y
alg (BFunction s p) = Fx $ Function s p
alg (BAppl x y) = Fx $ Appl x y
alg (BLet s x y) = Fx $ Appl (Fx $ Function s y) x
with our translator defined as
translate :: Fix BiggerLanguage -> Fix SmallLanguage
translate e = cata alg e
Types Abound
For Feval I used an equational type system. Essentially what this means is that we run through the grammar of the language and for each instruction we add an equation to a set, which says what the type of the holes should be. For instance suppose that we have have encounteredAdd x y
then
we would add the equations x = Int
and y = Int
to
our set to ensure that we will not encounter a type mismatch. Suppose we have
the grammar
data AddOr a =
Num Int
| Buul Bool
| Add a a
| Or a a
deriving Functor
Then we can write an f-algebra that returns our set like so
import qualified Data.Set as Set
data Type = TInt | TBool deriving (Eq, Ord)
type Equation = (Type, Type)
type Equations = Set.Set Equation
-- Add both the equation and its reflection
addEquation :: Equation -> Equations -> Equations
addEquation (t, t') e = Set.insert (t, t') $ Set.insert (t', t) e
doubleAdd :: Equation -> Equation -> Equations -> Equations -> Equations
doubleAdd (t, t') (s, s') e e' = addEquation (t, t') $ addEquation (s, s') $ Set.union e e'
type TypeAndEquations = (Type, Equations)
alg :: Algebra AddOr TypeAndEquations
alg (Num _) = (TInt, Set.empty)
alg (Buul _) = (TBool, Set.empty)
alg (Add (t, e) (t', e')) = (TInt, doubleAdd (t, TInt) (t', TInt) e e')
alg (Or (t, e) (t', e')) = (TBool, doubleAdd (t, TBool) (t', TBool) e e')
This algebra returns a TypeAndEquations
, which is a pair of the
type of the expression and a set of equations from within the expression.
Then, to make sure that the expression is correctly typed we need to first
find the closure of our set of equations, which means we need to add all
other equations that our equation implies. We can do this as follows
-- The boolean value in this means that the equation was not already in the set
addNew :: Equation -> Equations -> (Bool, Equations)
addNew eq e = if Set.member eq e then (False, e) else (True, Set.insert eq e)
-- The boolean value in this means that something new has been added
addTransitives :: Equations -> (Bool, Equations)
addTransitives e = Set.foldr process (False, e) e
where process (t, t') (b, e) = Set.foldr (addMatch t t') (b, e) e
addMatch t t' (s, s') (b, e) = if s == t'
then let (b', e') = addNew (t, s') e in (b || b', e')
else (b, e)
close :: Equations -> Equations
close e = let (b, e') = addTransitives e in
if b then close e' else e'
Now all we have to do is make sure that our set is not inconsistent, i.e.
there are no equations like TBool = TInt
, but this is easy
inconsistent :: Equations -> Bool
inconsistent e = Set.foldr check False e
where check (TInt, TBool) _ = True -- Recall that we have added all reflections
check _ b = b
Then our final typechecker is
typecheck :: Fix AddOr -> Maybe Type
typecheck a = let (t, e) = cata alg a in
let e' = close e in
if inconsistent e' then Nothing else Just t
Great, but how could we typecheck a more complicated grammar with functions? In
this case we need to assign something called a type variable to the argument of
the function before typing the function's expression. For instance, to type
Function x -> x + 5
we would assign the type variable
TVar 0
to x
and then typecheck x + 5
, which
would yield the equations {TVar 0 = TInt, TInt = TInt}
with the
resulting type TVar 0 -> TInt
. Finally we would substitute for the
type variable using the equations and find the type of the expression to be
TInt -> TInt
. Note that we have to delay the typechecking of the
Function
until we have assigned x
a type variable.
Furthermore, this variable must be unique, otherwise a boolean variable and an
integer variable might be assigned the same type variable and we would find our
equations to be inconsistent even if they aren't. In order to implement these
principles we will apply similar procedures to the ones we used while
discussing evaluation. Consider the following grammar
data FunAdd a b =
Num Int
| Var String
| Add b b
| Function String a
| Appl b b
deriving Functor
First we redefine Type
data Type =
TInt
| TVar Int -- Our type variable
| TArrow Type Type
| NotClosed
deriving (Eq, Ord)
The type NotClosed
corresponds to the case that there is an expression
that is not closed like the expression Function x -> y
(note that
y
has no definition). Then we create a state monad that will produce
unique integers for our type variables
import Control.Monad.State
type Counter = State Int
doNothing :: Counter Int
doNothing = state (\i -> (i, i))
newHandle :: Counter Int
newHandle = state (\i -> (i, i + 1))
Then we redefine our typechecker as follows
import Control.Applicative
-- The assignments of arguments to type variables
type Hypotheses = [(String, Type)]
lookupVar :: String -> Hypotheses -> Maybe Type
lookupVar v h = foldr check Nothing h
where check (s, t) Nothing = if s == v then Just t else Nothing
check _ r = r
alg :: Hypotheses -> MAlgebra Counter (FunAdd (LazyFix FunAdd)) TypeAndEquations
alg _ (Num _) = (\_ -> (TInt, Set.empty)) <$> doNothing
alg h (Var s) = (\_ -> let r = lookupVar s h in case r of
Nothing -> (NotClosed, Set.insert (NotClosed, NotClosed) Set.empty)
Just t -> (t, Set.empty)) <$> doNothing
alg _ (Add x y)
= (\(t, e) (t', e') -> (TInt, doubleAdd (t, TInt) (t', TInt) e e')) <$> x <*> y
alg h (Function x p) = newHandle >>= \n -> let v = TVar n in
typecheck' ((x, v) : h) p >>= \(t, e) -> return (TArrow v t, e)
alg _ (Appl x y) = (\n (t, e) (t', e') -> let v = TVar n in
(v, addEquation (t, TArrow t' v) (Set.union e e'))) <$> newHandle <*> x <*> y
typecheck' :: Hypotheses -> LazyFix FunAdd -> Counter TypeAndEquations
typecheck' g e = lazyMCata (alg g) e
typecheck :: LazyFix FunAdd -> Maybe Type
typecheck e = let (t, e') = evalState (typecheck' [] e) 0
in let e'' = close e'
in if inconsistent e'' then Nothing else Just (substitute t e'')
where we have defined
doubleAddNew :: Equation -> Equation -> Equations -> (Bool, Equations)
doubleAddNew eq eq' e = let (b, e') = addNew eq e in
let (b', e'') = addNew eq' e' in (b || b', e'')
addArrows :: Equations -> (Bool, Equations)
addArrows e = Set.foldr process (False, e) e
where process (TArrow t t', TArrow s s') (b, e) =
let (b', e') = doubleAddNew (t, s) (t', s') e in (b || b', e')
close' :: Equations -> (Bool, Equations)
close' e = let (b, e') = addTransitives e in
let (b', e'') = addArrows e' in (b || b', e'')
close :: Equations -> Equations
close e = let (b, e') = close' e in
if b then close e' else e'
inconsistent :: Equations -> Bool
inconsistent e = Set.foldr check False e
where check (TInt, TArrow _ _) _ = True
check (NotClosed, _) _ = True
check _ b = b
and a substitution function for the resulting type of the expression
-- This function chooses the best possible substitution
choose :: Int -> Equations -> Equation -> Type -> Type
choose _ _ _ TInt = TInt
choose n e (TVar n', y) (TVar n'') = if n /= n' then TVar n'' else case y of
TInt -> TInt
TArrow x y -> TArrow (substitute x e) (substitute y e)
TVar n' -> if n' < n
then if n'' < n' then TVar n'' else TVar n'
else if n'' < n then TVar n'' else TVar n
choose _ _ _ r = r
substitute :: Type -> Equations -> Type
substitute TInt _ = TInt
substitute (TVar n) e = Set.fold (choose n e) (TVar n) e
substitute (TArrow x y) e = TArrow (substitute x e) (substitute y e)
Afterword
Hopefully this has given you a good idea of how useful f-algebras can be in the realm of programming languages. Using these principles I have constructed a language that I call Feval, which includes such features as lists and recursive let definitions likeLet f x = If f = 0 Then 0 Else x + f (x - 1)
. With these constructs
we can evaluate things like the following
Let compare x y = If x > y Then -1 Else If x = y Then 0 Else 1 In
Let combine x y = Case x Of
[] -> y
| (z : zs) -> Case y Of
[] -> x
| (w : ws) -> Let r = compare z w In If r <= 0
Then w : z : combine zs ws
Else z : w : combine zs ws In
Let mergesort x = Case x Of
[] -> []
| (y : ys) -> Case ys Of
[] -> [y]
| (a : b) -> Let mergesort' x l r = Case x Of
[] -> combine (mergesort l) (mergesort r)
| (z : zs) -> Case zs Of
[] -> combine (mergesort (z : l)) (mergesort r)
| (w : ws) -> mergesort' ws (z : l) (w : r)
In mergesort' x [] []
In mergesort [4, 23, -34, 3]
Check it out for more complicated examples of f-algebras and evaluation, as well
as a parser and a REPL.
comments powered by Disqus