A feature I recently worked on for Actual was importing and reconciling transactions. This is one of the most important workflows so I want to get it right. I talked about some of the technical implementation in this post, and now I want to talk about what it actually does.
I’d really like feedback for this. What about importing transactions in other personal finance apps annoys you? What works really well? How does the method I describe below compare, and am I missing any use cases? Please tell me on twitter, facebook, send an email to email@example.com, or respond on whatever site linked to this post.
There are 3 ways transactions can be added to Actual:
- Manually adding one at a time
- Importing local files in various formats (OFX/QIF/CSV/etc)
- Automatically downloading from the internet
The most complex use cases are where transactions are entered in multiple ways — the user automatically download transactions but also manually enters a few of them. If you only ever download transactions, it’s a bit simpler but there’s still edge cases to get right like pending transactions.
Here is the list of use cases I had in mind while working on this, with deeper explanations below:
- Adding an online account should set the starting balance, and download some initial transactions.
- Subsequent imports should only add new transactions since the last import, and never add transactions before the date of the initial import.
- Pending transactions may be automatically shown for online accounts.
- The user may manually add transactions to an online account to see updates immediately and avoid waiting for bank to post it (only needed if downloading pending transactions is insufficient or unavailable).
- Because transactions may be manually entered, when importing transactions they should automatically reconcile and “match” the correct transactions that were already entered and avoid duplicates. This should be intuitive.
- Transferring money between accounts should happen by entering an outgoing transaction in one account (or importing it) and categorizing it with the account to move the money to. The system should automatically add a new incoming transaction in the target account with the same amount. If the target account is a connected account, the transfer should always resolve correctly when importing transactions.
If you are only ever manually adding transactions one by one, most of this doesn’t apply to you. This is all about how importing transactions (usually from the internet, but could also be from QIF/OFX files) should intuitively reconcile with the existing transactions in the account.
The first use case is pretty basic: adding an account and setting it up in Actual. The question is: which transactions should we initially pull down?
Let’s look at what a few other products do when you first connect an account. YNAB adds the account and doesn’t import any transactions; it just adds a transaction representing the starting balance.
Mint downloads transactions from as far back as 4 months ago and auto-categorizes them. If I scroll down there are 7 more pages of transactions!
The philosophical difference is stark. With YNAB, it doesn’t make sense to download previous transactions because you’re suppose to carefully track them and the budget month-to-month. If it imported 4 months worth of transactions the budget wouldn’t make any sense. In Mint, it’s budgeting system doesn’t rollover and it auto-categorizes transactions, so it’s happy to download previous transactions.
Personally I don’t like how Mint does it. Not only does it download several months of transactions, it automatically generates a budget for you too. The chance of any of this data being realistically correct is next to zero. The first transactions in the screenshot, “Urban Fare,” is categorized wrongly: it should be Restaurants, not Groceries. And if it’s not categorized correctly the budget is useless.
I’m much more in favor of YNAB’s approach with giving up on historical data and encouraging the user to be more hands-on. That means if it imported transactions, the user would need to go through all of them and categorize them, which is not exactly a fun task to do the first time you use the app.
However, it’s unfortunate that you see no transactions because most likely you’re just trying out the app! You want to use it and see how it feels, right? Without any transactions you can’t see how the workflow for categorizing and budgeting actually works.
You could enter some manually, but given you just connected your account certainly we can do something. In Actual, I chose to download any transactions for the current day and add a “Starting Balance” representing the balance before them. This will vary for each account; some may have many and some may have none. But I think it’s better on average to at least show a few transactions that the user can begin working with.
Example of an initial import on January 29, 2018
Now that we’ve created an account and performed an initial import, how do we update it? A simple method would be to download any transactions since that latest date in the current transactions. There are several reasons to avoid this as described in the use cases below, but regardless it’s not very fool-proof. What if you accidentally deleted a transaction and didn’t realize it?
Actual will find the latest date in the current transactions and download transactions since 10 days before it, adding any new transactions that don’t already exist. That way you’ll get all your new transactions, and a gut check that your current transactions are sane. If you accidentally deleted one in the last 10 days, it’ll just be added again.
An implementation note: remember how we don’t download transactions that existed before the account was added? We can’t always naively add transactions 10 days ago; we need to make sure we never import past the “Starting Balance” transaction.
This is the first use case of reconciliation. Reconciliation is the process of taking a bunch of transactions and matching them with existing ones. Here was have a bunch of duplicate transactions and some new ones, and we want to make sure to only add the new ones.
It’s easy in this case — imported transactions come with a unique “financial id” that we can easily use to compare current transactions and avoid adding duplicates.
Online accounts will automatically show pending transactions, if available. These transactions show up very quickly, often within a few hours, and represent transactions that are still processing. It usually takes a day or two for transactions to fully post to your bank account.
An early design of pending transactions
I’m still working through how the system treats pending transactions, but most likely they will be generally ignored by everything until you explicitly choose to add it as part of the budget (see the next section about manually adding transactions).
As part of the reconciliation process when importing transactions, Actual will match transactions that were previously pending but now reconciled and move them out of the pending state. As with detecting duplicates, this is trivial because the downloaded transaction comes with financial ids to easily match everything up.
Reconciling manually entered transactions
There are many cases where pending transactions may not be available: the user’s bank does not provide them as part of the online import, they are sufficiently delayed (takes longer than a few hours to show up), or the user is importing QIF/OFX files that do not have pending transactions. In this case, if the user wants immediately up-to-date information about their accounts, they may want to manually add transactions at the time they make purchases.
Note from the creator: it’s perfectly reasonable to avoid this extra work and simply wait for transactions to be posted. Personally I never add transactions manually. But some people highly cherish immediately up-to-date accounts and that’s OK too. It’s my goal for Actual to accommodate many different workflows.
You may wonder: what’s the point of even downloading transactions if you’re going to manually add them? The reason is to ensure that you’ve entered all the information correctly and make sure you’re always in sync with your real account.
Now we are faced with a more difficult problem: if you’ve manually entered transactions since the last import, how do we avoid duplicating transactions in the next import?
We need to reconcile the imported transactions with the manually entered ones, except now we don’t have simple financial ids to compare. We need to fallback to a more fuzzy matching algorithm. We’ll never be 100% sure that we’ve matched the correct ones, but can we be relatively confident about it?
A simple attempt might be to look for a transaction of the same amount on the same day. There are multiple problems with this:
- There could be multiple transactions of the same amount all with different payees. While you could avoid matching the same transaction twice which would allow them all to match, they will still match up with the wrong transactions which is asking for problems.
- Much more problematic: there’s no guarantee the user entered the transactions on the same day that it was posted. Most likely they entered it on the day they made a purchase. So the dates will almost never match up.
Clearly we need to try to match an imported transaction with a transaction in the past several days. Actual will check all transactions up to 5 days before the imported transaction’s posted date. If it finds a match, it will update the date of the manually entered transaction to be the posted date in order to stay in sync with your bank. This means the above idea would never work because matching the wrong transaction would cause out-of-order transactions which make it impossible to diagnose problems.
A quick note to explain that a bit more: it’s very important that transactions match the day they were posted in your bank so that you can go back and see the expected balance at the end of each day. Doing this allows you to go to your real account and verify that it’s the same. Checking each day’s balance is a way to diagnose problems if your account gets out-of-sync; you can make sure it was in sync a week ago and move forward each day. You should rarely have to do this but it’s important for this to be available.
Since transactions need to be reconciled across multiple days, we have to take into account the name of the payee. This is very difficult because the amount of variations it can have. Most imported payees are resolved to simple names like “Kroger” but some are left with the ugly posted description of something like “#0032 0118 Kroger 34st EMX”. On top of that, the user could have entered the payee name in all kinds of ways when they manually added the transaction.
We need a way to compare the name of the payee in two transactions and calculate a “similarity” value between 0 and 1. I researched a bunch of string metric functions and nothing stood out. The problem is that if the user entered something like “kroger” and the full imported payee is “#0032 0118 Kroger 34st EMX”, we want those to match. I ended up using a Damerau–Levenshtein distance function, and walking through the longer string to see if the shorter string matched anywhere inside of it. You can see the code here if you’re interested.
Here are some examples of strings that match and don’t match:
"Macsy wkout" == "macys weekout"
"Papa John's" == "papa john"
"Papa John's" == ''
"#001 fenn st Macy's is good 33333 EMX" == "macys good"
"#001 fenn st Macy's 33333 EMX" == "macrys"
Does not match:
"macys" != "lowes"
"#001 fenn st Macy's is good 33333 EMX" != "goo place"
Because this a fuzzy match, I set a certain threshold level to accept matches. Right now I guessed what it should be, but I expect to keep tweaking it once real users start using it. Maybe it too accepting and matches the wrong things? Or not tolerant enough?
Note that “Papa John’s” matched an empty payee name. If either strings are empty it will always match. See the next section about transferring money as to why.
Now that we have a string matching function, the reconciliation process looks like this: go through all transactions from 5 days before and up to the imported transactions date and check if any of them are the same amount, have a payee name that matches according to our fuzzy string metric, and haven’t been matched already. If a transaction is found, the imported transaction is reconciled to the existing one and the date of the transaction is updated.
This allows very flexible workflows where you can quickly enter transactions manually if you’d like, and most of the time Actual will match them up correctly.
Since we aren’t 100% sure if we matched transactions correctly, it’s important to expose what happened to the user. Currently I’m experimenting with simply showing a message like below to the user, and they can click it to get a list of all the matched transactions and un-match them if necessary.
What you’d see if you had manually entered 3 transactions
Actual will also likely decorate the transactions with more information inline so you’ll see how things were reconciled.
Transferring money between accounts
When you transfer money between accounts, it shows up as an expense in one account and a deposit in another. You could treat it as a normal expense and deposit, but you lose information about how money is flowing and it might make reports harder to read. Usually, you want to categorize it as a transfer and tell it which account the transfer came from/to.
When you specify the from/to account, Actual will automatically create the corresponding transaction in the other account. So if you transferred $500 from account A to B and categorized the transfer in account A, it’ll create a deposit transaction of $500 in account B automatically. It’s useful regardless if you enter transactions manually or not — even if you download transactions, you know the deposit transaction is coming so you might as well enter it now.
Reconciliation is very important for this use case. In Account A, if you manually enter a pending transfer and it’s downloaded later, it should reconcile correctly. Additionally, the deposit transaction in Account B that Actual created should later reconcile to the imported transaction.
When we are adding these transactions we want to be sure that Actual reconcile them later. Since we don’t really care about the Payee, transfer transactions will always leave the Payee blank which tells the system to ignore it when reconciling. When a payee is blank, it’ll match with any imported transaction of the same amount.
There’s one last edge case which is if you categorize a transfer, and in the receiving end the transaction already exists. In the above case, account B would already have the transfer. We need to check for this before automatically adding the receiving transaction, but the payee name may not be blank; in this case, Actual still ignores the payee name and will not add a transaction if one with the same amount exists. Essentially, transfer transactions reconcile with looser requirements.
That’s the guts of how Actual currently imports and reconciles transactions. I’d really love feedback on all of this! What features of reconciliation do you rely on daily? Is there anything missing here? Let me know at the email or social media sites below!