Algebraic effects for testable I/O

Cadel Watson
Geora
Jul 15, 2022

At Geora we are a team of software engineers, agriculture pros, and sustainability devotees. This post comes from our engineering team and is pretty tech-heavy; if that's not for you, check out our other posts on sustainability and farming.

At Geora, we’re building a platform for agricultural asset traceability and finance. The more data we collect from users about their produce, the better we can help them demand a higher price from buyers and access new forms of capital. But manual data entry is time-consuming and error-prone, so one of our key principles is to prefer integrations with the platforms farmers and traders already use.

Our in-house integrations are deployed as separate microservices, which can only communicate with Geora through our public GraphQL API. This forces us to enhance and dog-food our product rather than resorting to hacks like direct database access, and gives us a clean separation of concerns. With more microservices, however, comes a greater testing burden. We found it more and more difficult to test the interactions between services without running everything at once on our local machines, and began to reconsider how we test these integrations.

Algebraic effects ‘r’ us

Enter algebraic effects: an approach to writing functional programs that perform I/O which, we believe, provides an enormous testability advantage. Rather than a traditional monad-transformer approach to I/O, algebraic effects allow us to write application logic in terms of pure data structures.

These data structures themselves do nothing; they require an interpreter to run them, and actually perform any interactions with the outside world. The great thing about interpreters is that they can be tailored for the use case — we can have a production-ready interpreter for some effect, like logging, which actually outputs lines to stdout, or we can have a test interpreter for the exact same effect that collects the logs purely instead, in a Writer monad. The application logic has no knowledge of the interpreter, and remains exactly the same for both cases.
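
To make this concrete, here is a minimal sketch of what a logging effect and its two interpreters might look like with purescript-run. The names here are illustrative rather than our production code, imports are omitted as in the other snippets, and the pure interpreter uses Run's WRITER effect rather than a bare Writer monad.

data LogF a = Log String a

derive instance Functor LogF

type LOG r = (log :: LogF | r)

_log = Proxy :: Proxy "log"

log :: forall r. String -> Run (LOG + r) Unit
log msg = Run.lift _log (Log msg unit)

-- Production interpreter: actually print each line, via Run's EFFECT effect.
runLogConsole :: forall r. Run (LOG + EFFECT + r) ~> Run (EFFECT + r)
runLogConsole = interpret (on _log handleLogConsole send)

handleLogConsole :: forall r. LogF ~> Run (EFFECT + r)
handleLogConsole (Log msg next) = next <$ Run.liftEffect (Console.log msg)

-- Test interpreter: collect the same lines purely, with the WRITER effect.
runLogWriter
  :: forall r
   . Run (LOG + WRITER (Array String) + r) ~> Run (WRITER (Array String) + r)
runLogWriter = interpret (on _log handleLogWriter send)

handleLogWriter :: forall r. LogF ~> Run (WRITER (Array String) + r)
handleLogWriter (Log msg next) = next <$ tell [ msg ]

Application code written against LOG compiles unchanged with either interpreter; only the final interpreter stack decides whether log lines hit stdout or a pure accumulator.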

A motivating example

To illustrate the beauty of effects, we’ll provide a small example from one of our integrations, written in Purescript, which takes data from a customer’s ERP system and creates and updates Geora asset records for their commodities. The integration needs access to the Geora API for three operations:

  • Create a new asset, with key-value data (“claims”) and an asset class (e.g. “WHEAT”)
  • Update an asset, adding new claims
  • Search for assets, by claim value

We can encode these three operations into a simple data type, AssetsF:

data AssetsF a
  = Create String Claims (Id -> a)
  | UpdatePutClaims Id Claims (Id -> a)
  | GetMatchingVersions String String (Array (Tuple Id Claims) -> a)

Our implementation of effects uses the excellent purescript-run library, which provides the Run type. We can implement any effect, in this case our ASSETS effect, within Run.

derive instance Functor AssetsF

type ASSETS r = (assets :: AssetsF | r)

_assets = Proxy :: Proxy "assets"

create :: forall r. String -> Claims -> Run (ASSETS + r) Id
create assetClass claims = Run.lift _assets (Create assetClass claims identity)

Crucially, Run uses row types to implement polymorphic variants. Practically, that means we can declare up front exactly which effects a function will use. In the following (simplified) example, we use the ASSETS, EXCEPT, and LOG effects to either create or update an asset record based on some existing state:

submitUpdate
  :: forall r
   . BatchesState
  -> Action
  -> Run (ASSETS + EXCEPT String + LOG + r) BatchesState
submitUpdate state = case _ of
  CreateAsset batchID barcodeQR -> do
    log "CREATE asset for batch"
    assetVersionID <- Assets.create "VISCOSE"
      ( Map.fromFoldable
          [ Tuple "batchID" batchID
          , Tuple "barcodeQR" barcodeQR
          , Tuple "currentState" "Recorded"
          ]
      )
    pure $ Map.insert batchID (Tuple assetVersionID Recorded) state
  UpdateAsset batchID claims
    | Just (Tuple id _) <- Map.lookup batchID state -> do
        log "UPDATE asset for batch"
        assetVersionID <- Assets.updatePutClaims id claims
        pure $
          Map.update (\(Tuple _ stage) -> Just (Tuple assetVersionID stage))
            batchID
            state
  UpdateAsset batchID _ ->
    throw $
      "No asset version ID in state map for batch " <> batchID

This function is completely pure, and performs no I/O — it expresses its effects in the quasi-DSL we declared earlier, AssetsF. To take a Run and actually perform its effects, we need to write an interpreter. For the ASSETS effect, our interpreter expresses the actions in terms of GEORA_API (a more fine-grained effect for accessing our API) and EXCEPT. Notice the type of runAssets, which eliminates the ASSETS effect completely, so that the number of effects left to interpret is reduced.

runAssets
  :: forall r
   . Run (ASSETS + GEORA_API + EXCEPT String + r) ~>
       Run (GEORA_API + EXCEPT String + r)
runAssets = interpret (on _assets handleAssets send)

handleAssets :: forall r. AssetsF ~> Run (GEORA_API + EXCEPT String + r)
handleAssets = case _ of
  Create assetClass claims next ->
    next <$> doCreate assetClass claims
  UpdatePutClaims id claims next ->
    next <$> doUpdatePutClaims id claims
  GetMatchingVersions targetLabel targetValue next ->
    next <$> doGetMatchingVersions targetLabel targetValue

doCreate :: forall r. String -> Claims -> Run (GEORA_API + EXCEPT String + r) Id
doCreate assetClass claims =
  waitForUpdate =<<
    ( GeoraAPI.mutate $
        Mutation.assetCreate
          { input:
              InputObject.AssetUpdateInput
                { class: Present assetClass
                , claims: Present (claimsToClaimInput claims)
                , quantity: Absent
                , actorRelationships: Absent
                , assetVersionRelationships: Absent
                , assetStandard: Absent
                , comment: Absent
                }
          }
          UpdateObject.id
          # nonNullOrFail
    )

If we continue this pattern, writing interpreters for all effects that gradually reduce them to a smaller set, we end up with an extremely satisfying base interpreter, which runs all effects in Aff, PureScript’s asynchronous effect monad:

runAllEffects
  :: forall a
   . Run (ALL_EFFECTS + ()) a
  -> Aff (Either String a)
runAllEffects =
  runBaseAff'
    <<< runExcept
    <<< runHoneycomb
    <<< runLogger
    <<< runConfig
    <<< runAuth
    <<< runGeoraAPI
    <<< runAssets
    <<< runKafka
    <<< runNode
    <<< runProtobuf
    <<< runAjax

So what’s the upside?

As the example shows, there’s a bit of boilerplate involved in using Run, and the benefits over a basic implementation in Aff might not be immediately apparent. But the pluggable interpreter model gives us excellent testability, almost for free!

Compare the above runAssets with the alternative interpreter runAssetsState. Instead of implementing the ASSETS effect in terms of GEORA_API, we implement it purely using the STATE effect.

runAssetsState
  :: forall r
   . Run (ASSETS + STATE AssetsState + EXCEPT String + r) ~>
       Run (STATE AssetsState + EXCEPT String + r)
runAssetsState = interpret (on _assets handleAssetsState send)

handleAssetsState
  :: forall r. AssetsF ~> Run (STATE AssetsState + EXCEPT String + r)
handleAssetsState = case _ of
  Create _assetClass claims next -> do
    newID <- gets nextID
    modify (Map.insert newID [ Tuple (newID + 1) claims ])
    pure $ next (Id (show (newID + 1)))

We can now write test cases which check the simulated backend state after our business logic runs, with no changes to the business logic itself:

it "calls Geora correctly for a partial product" do
  testHandle partialCsv
    `shouldReturn`
      ( Right $ M.fromFoldable
          [ Tuple 0
              [ Tuple 1
                  (M.fromFoldable [ Tuple "batchID" "211217" ])
              , Tuple 2
                  (M.fromFoldable [ Tuple "currentState" "InProduction" ])
              ]
          ]
      )

This is nice for writing specific test cases, like the above, but what we’re really interested in is testing entire properties of the integration. For example, our integrations tend to be triggered by an event queue with at-least-once delivery semantics; therefore, we would like them to be idempotent (i.e. passing the same input in again shouldn’t change state at all). We can have a function isIdempotent which uses the pure ASSETS interpreter to run twice with the same input, and return the state after the first and second run for comparison:
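
We won't show the full wiring here, but a minimal sketch of isIdempotent might look like this, assuming a hypothetical runPureHandler :: AssetsState -> Input -> Effect AssetsState helper that runs the integration's handler through runAssetsState and the other pure interpreters, starting from a given simulated state (the names are illustrative, not the real implementation):

-- Sketch only: runPureHandler and emptyAssetsState stand in for the real
-- pure interpreter stack and its initial simulated backend state.
isIdempotent :: Input -> Effect (Tuple AssetsState AssetsState)
isIdempotent input = do
  -- Run the handler once against an empty simulated backend...
  first <- runPureHandler emptyAssetsState input
  -- ...then run it again, starting from the state the first run produced.
  second <- runPureHandler first input
  pure (Tuple first second)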

it "is idempotent" do
  (Tuple first second) <- liftEffect $ isIdempotent (Csv multiBatchCsv)
  first `shouldEqual` second

It’s cool to be able to test the business logic of the integration without relying on a running Geora API, but this test does nothing an end-to-end test couldn’t do. What if we instead wanted to check all¹ possible inputs and ensure the integration is idempotent? Such a test might take hours against a real API. With our pure backend, it’s as easy as:

it "is REALLY idempotent" do
  quickCheck \csv -> do
    let (Tuple first second) = unsafePerformEffect (isIdempotent csv)
    first `shouldEqual` second

This test uses purescript-quickcheck to generate hundreds of possible CSV file inputs, and runs in milliseconds. We can write similar tests for other properties, and other effects like logging, tracing with Honeycomb, and reading configuration from the environment.
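
For example, with a pure log interpreter like the runLogWriter sketched earlier, a test can assert on exactly what was logged without ever touching stdout. The program below is a stand-in for a real handler run:

it "collects log lines purely" do
  let
    program :: forall r. Run (LOG + r) Unit
    program = do
      log "CREATE asset for batch"
      log "UPDATE asset for batch"

    -- Interpret LOG into WRITER, then run the now-pure program to completion.
    Tuple logs _ = Run.extract (runWriter (runLogWriter program))
  logs `shouldEqual` [ "CREATE asset for batch", "UPDATE asset for batch" ]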

Our use of effects has therefore made our business logic both more readable and more testable, and checking whole properties of the system becomes almost trivial.

About Geora

Geora provides simple and secure technology for farmer networks to track and finance agri-supply chains. Co-founders Bridie Ohlsson and Cadel Watson have been working at the junction of blockchain and agriculture since 2015. Sign up to the Geora platform today to make your data work harder for your agribusiness. For more information, get in contact with our team at hello@geora.io.

For more information, visit geora.io

  1. For practical values of all
