Purescript Effects map 1:1 to global state and process-wide coupling

I’ve been working on a little secret project which involves solving some pernicious bugs in something mostly written in C by applying the discipline of using Purescript’s Eff monad, purescript-native and some of the general late-night, against-the-current hacking I was so fond of when I was younger.

This code is heavily threaded, and since it lives in an unusual runtime environment, is subject to fairly complicated rules regarding invisible state internal to the threading system and the OS. My ultimate goal is to use Purescript’s type system as a proof that I’ve solved at least one use-after-free, at least one deadlock, and other junk along the way.

While considering how to write the FFI for this code, I was initially irked by the lack of an auxiliary variable one could pass to the extra invocation done to code under an effect in Purescript’s FFI. To illustrate, here’s part of some purescript-native FFI code I wrote for interfacing with a C library that has global state:

foreign import data Netconn :: *
foreign import data NetconnUnlocked :: *
foreign import data Netconn_evt :: *
foreign import data Netconn_type :: *
foreign import data LWIP :: !
foreign import netconn_tcp :: Netconn_type
foreign import netconn_type_compare :: Netconn_type -> Netconn_type -> Int
instance eqNetconnType :: Eq Netconn_type where
eq = \a b -> netconn_type_compare a b == 0

foreign import ps_netconn_new :: forall eff. Netconn_type -> Eff (lwip :: LWIP | eff) NetconnUnlocked
foreign import ps_netconn_type :: forall eff. Netconn -> Eff (lwip :: LWIP | eff) Netconn_type
foreign import ps_with_netconn :: forall a b eff. NetconnUnlocked -> a -> (Netconn -> a -> Eff (lwip :: LWIP | eff) b) -> Eff (lwip :: LWIP | eff) b

Despite what it looks like, there are 3 different kinds of values exported from the foreign code here:

  1. A data value: netconn_tcp
  2. A pure function: netconn_type_compare
  3. Functions with an effect: ps_netconn_new, ps_netconn_type, ps_with_netconn

Note that there’s nothing associated with the effect row, LWIP . That’s the first clue about the intent.

Secondly, when writing the C++ side of the FFI:

auto ps_with_netconn(const any &c, const any &a, const any &f) -> any {
// Locking here wouldn't do you any good.
return [=] () {
// ... an RAII mutex thingy should go here? This layer of
// invocation is done by purescript.
return f(c)(a);
};
}

auto netconn_type_compare(const any &a, const any &b) -> any {
// You could use a mutex here to lock and release properly, but
// it's hidden from purescript. It'd be invisible.
auto aty = static_cast<netconn_type>(cast<int>(a));
auto bty = static_cast<netconn_type>(cast<int>(b));
return bty — aty;
}

Interestingly, there’s an indirection in the case of the effectful function; a wrapper function that’s returned and executed as though an IIFE in javascript.

// Meanwhile, in the generated caller:
const any& v1 = LwIP::ps_with_netconn(v, Data_Unit::unit, [=](const any& conn) -> any {
return [=](const any& v1) -> any {
return LwIP::ps_netconn_type(conn);
};
})();

The way I think about it is that a side-effecting function is a function that yields a procedure that purescript must then eagerly evaluate (purescript is eager, but perhaps the haskell roots are showing here?).

Initially I was a bit angry that there wasn’t a datum associated with an effect row. My years of OO along with efforts to practice disciplined OO style have led me to avoid coupling by always using interfaces and always injecting references to what I’ll now coin as “celebrity state”. I wanted to have an object representing a held lock on this bit of less-tame C code, and consume it in the next stateful invocation like this (in my head anyway):

// An unordered collection of objects indexed by type using
// std::is_assignable_from or whatever to index.
template <typename First, typename... Rest> struct type_tuple
{ ... };
auto main() -> any {
auto lwip = LWIP(); // Declare the effect row under the Eff monad.
auto console = CONSOLE();
// ...
const any& v1 = LwIP::ps_with_netconn(v, Data_Unit::unit, [=](const any& conn) -> any {
return [=](const any& v1) -> any {
return LwIP::ps_netconn_type(conn);
};
})(make_type_tuple(lwip, console));

Then:

auto ps_netconn_new(const any &type) -> any { 
return [=] (const any &eff) {
auto lwip = (cast<type_tuple_reader>(eff).get<LWIP>(); // reference_wrapper<LWIP> or runtime error.
auto _locked = lwip.acquire();
auto ty = static_cast<netconn_type>(cast<int>(type));
auto res = netconn_new(ty);
return res;
};
}

What I realized was that there’s only one scenario where this has practical value in the generated code; when the library being accessed uses thread local state and we’re using the effect to sequence only legal transitions on a hidden state machine. Can be valuable, but it’s not the most common case.

That’s when it hit me that the meaning of an effect row is a coupling to the universe. Either via global memory state, the states of handles owned by the process, or procedure calls that alter the state of the universe beyond the computing device, the meaning ascribed here is precisely that a specific, identifiable entity outside the purescript code will be altered.

With that in mind, a few more things that I had trouble materializing made sense:

A global list that must be interacted with by external code and purescript should not use a generic “GLOBAL_LIST” effect, each list should have its own effect type if their mutexes can be acquired separately. I haven’t explored it, but I think higher kinded types might allow one to parameterize an effect type with an effect instance.

You should use the absence of open ended effect types to order access to resources that have a definite hierarchy but must exclude resources from other hierarchies (you can define higher kinded types that represent valid combinations of resources).

You should use open-ended effect types along with a definite mutex acquision order.

Each thread should be started by a procedure that indicates what effects that thread exists to enact. You can use queues, lists etc as well as system event objects to move data between effect silos, and of course control access to those via their own labeled effects.

It turns out that you can use wildcard types to help restrict systems exclusively to approved effect configurations, so it’s possible to use effects to disallow certain combinations of operations even when intervening code uses expandable effect types (basically, if you ever found a cheaty valid :: VALID effect row anywhere, that would be quite suspicious).

-- I am the very model of a modern major-general
type EffConAndLwip x = Eff (valid :: LwIP.VALID, console :: CONSOLE, lwip :: LwIP.LWIP) x

doit :: Unit -> EffConAndLwip Unit
doit _ = do
LwIP.memp_init unit
nc <- LwIP.ps_netconn_new LwIP.netconn_tcp
ty <- LwIP.ps_with_netconn nc unit (\conn _ -> LwIP.ps_netconn_type conn)
log (if ty == LwIP.netconn_tcp then “tcp” else “udp”)
-- Fail to compile if we lack the VALID effect.  Our code
-- contains type declarations for valid effect combinations
-- that contain valid. A programmer could add it on their
-- own but it's very easy to check for and be suspicious about.
main :: Eff (valid :: LwIP.VALID | _) Unit
main =
doit unit