The purpose of this post-mortem is to shed light on ACH processing issues that occurred on 09/30 and 10/16. Our intention is to provide clear, thoughtful and transparent insight into what occurred, the scope of the issue, what we learnt from it and what the next steps are for us.
In my mind, the primary audience of this post is Synapse customers: developers building on top of us. So whenever I say “you”, I am referring to our customers.
On Friday, 10/04, we detected that a portion of our ACH file that was supposed to be processed on 09/30 had not been processed.
Then on Wednesday, 10/16, a portion of our same-day and next-day ACH files was double processed.
Here is what happened
On 09/30, we observed a step-function increase in our ACH processing volume, which caused automated batching to fail because the files were larger than what a single database write could handle. To work around this, we manually chunked the files into smaller batches and processed them.
This spike was unusual because it came from an unexpected increase in our micro-deposit transactions.
Since things reverted to normal afterwards, the team did not investigate this further. On 10/04, we started getting inbound queries about a few ACH debit transactions that should have hit the user accounts on 10/01 but had not yet. This caused us to look into all the batches that were processed on 09/30, and we discovered that a portion of the batch had been missed during manual processing.
At this point, we added some additional logging to the batch process so that we could detect whether this happened again.
On 10/16, we observed the odd micro-deposit spike again, except this time it was close to 100 times more severe. To react to this situation, we once again logged, chunked the files and processed them. As this was being done by humans, some of the batches were submitted twice because we thought they had not been processed.
The odd micro deposit behavior also prompted an additional investigation. During the investigation we identified a vulnerability in our micro-deposit origination system that was being exploited to originate a large volume of micro-deposits.
Scale of the Impact
Even though Synapse was the target of this exploit, the incident impacted three financial institutions in total, causing slowdowns in return propagation and transaction processing, and plenty of human fire drills at each. Two of the three are quite large financial institutions.
I say this only to give everyone a sense of the scale of the issue and how disruptive it was for the back offices of not just Synapse, but of quite reputable financial institutions as well.
Here is what we have done
For us, the opportunity for improvement was twofold:
Eliminating the micro deposit vulnerability
We have since closed the micro-deposit vulnerability that caused the issue in the first place.
Improving manual processes
To better deal with these situations in the future, we’ve added automated chunking of batches to our workflow. On top of that, we’ve also started indexing all batches and their individual entries, even at the NACHA file level, so that we can always query a batch or a transaction to see whether it has already been processed, further reducing the likelihood of double batching.
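To make the idea concrete, here is a minimal Python sketch of automated chunking combined with an index that refuses to submit a batch or entry twice. It is an illustration only, not Synapse's actual implementation: the Entry fields, the chunk size, the fingerprinting scheme and the in-memory BatchIndex are all assumptions made for this example, and a real system would keep the index in a durable database.

```python
import hashlib
from typing import Iterator, List, NamedTuple, Set


class Entry(NamedTuple):
    """One ACH entry. Field names are hypothetical, not a real schema."""
    trace_number: str
    account: str
    amount_cents: int


def chunk_entries(entries: List[Entry], max_size: int) -> Iterator[List[Entry]]:
    """Split entries into consecutive batches of at most max_size items."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    for start in range(0, len(entries), max_size):
        yield entries[start:start + max_size]


def batch_fingerprint(batch: List[Entry]) -> str:
    """Stable hash of a batch, independent of the order of its entries."""
    digest = hashlib.sha256()
    for e in sorted(batch):
        digest.update(f"{e.trace_number}|{e.account}|{e.amount_cents}\n".encode())
    return digest.hexdigest()


class BatchIndex:
    """In-memory stand-in for a persistent index of processed batches and entries."""

    def __init__(self) -> None:
        self._batches: Set[str] = set()
        self._entries: Set[str] = set()

    def submit(self, batch: List[Entry]) -> bool:
        """Record the batch and return True, or return False if it was seen before."""
        fingerprint = batch_fingerprint(batch)
        already_seen = fingerprint in self._batches or any(
            e.trace_number in self._entries for e in batch
        )
        if already_seen:
            return False  # skip: submitting would double process something
        self._batches.add(fingerprint)
        self._entries.update(e.trace_number for e in batch)
        return True
```

With something like this in place, an operator who is unsure whether a chunk went out can simply call submit again: a batch that was already recorded is rejected instead of being sent a second time.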
Here is what we’d still like to do
We take our mission very seriously. We want to ensure that everyone around the planet has access to high class financial products.
So we do not just feel responsible to you, but also to your customers.
It personally pains me to admit that this issue had a negative impact on folks who were double debited. If I could do anything to take that back, I would. To make up for that, here is what we’d like to do:
For users who were debited twice in the 10/16 batch, we will cover any overdraft fees that they incurred. All we ask is that you collect evidence of the overdraft fee from the user and send it to firstname.lastname@example.org.
I am also personally available to discuss this matter with you or your customers. If you would like to do that, here is a link to my calendar.
PS: It goes without saying, but any customers who were impacted by this have already been notified. So if you did not hear from us, you were not impacted by this.
Update: After doing a comprehensive audit of the issue, we also found an additional batch on Oct 21st that was not processed. We notified all the customers who were affected by this and submitted the missed ACH debits with an effective date of Nov 1st.