A Simple Reference Guide to IRS Data Quirks
With Visuals, Because That’s Way Easier
So after my recent posts on the IRS SOI data, I’ve gotten feedback from several people with more questions about some niche issues: some that I covered but that are worth explaining better than I did the first time, and some that I didn’t cover. So this post is purely an evergreen guide to some quirks of the IRS SOI data, with some tips for addressing them.
So, first quirk on the list: the weird temporality of IRS data.
Meet Your Example Taxpayers
The chart below is a visual representation of 7 example taxpayers, with some common generic names assigned to them for convenience. Each taxpayer has a different filing/migration profile. The green bars, for convenience, show when earnings occurred. This spreadsheet represents different taxpayers who might feed into the 2013–2014 IRS migration file. That means we want filings made in 2013 and 2014, which reflect 2012 and 2013 income, respectively.
So let me explain these scenarios. First comes Oceanus. He’s the simplest scenario. He files his 2012 taxes in March 2013, with his March 2013 address on the return. In April 2013, he moves (presumably got his refund quick?). He files his 2013 taxes in March 2014, from his new address. Simple.
Hyperion’s case is also simple. He’s identical to Oceanus, except that he moved in February instead of April, so before filing instead of after.
Next comes Cronus. Cronus takes his time. He files his TY2012 returns at the same time as most of his fellow example taxpayers, in March 2013, but he gets extensions for his apparently very complex TY2013 taxes. They aren’t filed until September 2014, instead of March. In the meantime, in July 2014, after everybody else has finished moving or filing, Cronus moves. His new address is on his September filings.
Then we’ve got Phoebe. Phoebe’s case is simple, exactly like Oceanus’, except instead of moving in April, she moves in December. No big deal, right?
Then we’ve got Rhea. Rhea is in a really dynamic time in her life. She files her taxes at normal times, but she moves in May 2013 and in December 2013. She’s a double migrant! Her movement, by the way, is from Olympus, to Athens, to Corinth; she doesn’t move Olympus → Athens → Olympus. The final destination is not the origin. But below, I’ll offer a discussion of what would happen if the final destination were the origin.
Themis is our next example taxpayer. She’s just like Phoebe, but moves even later, in February 2014. In many ways, Themis is in a very similar position to Hyperion.
Lastly, we’ve got Iapetus. Iapetus files his TY2012 returns very late, and his TY2013 returns early, with the result that there’s just 6 months between them. He also migrates in June 2013, before he files his first return.
So those are our example taxpayer-migrants. Let’s look at how the IRS classifies them.
Different Tax Filing and Migration Timing Can Radically Impact Results
And Can Totally Screw Up Your Interpretation
So now let’s explore what the IRS classification system really means. But first, let’s consider what it should show us. An ideal data source with AGI info would tell us the following things:
- Individuals who migrated at similar times should be classified in the same year
- All migrations should be classified as migrations
- Income should be intuitively assigned to the place where it was really earned
So how well does IRS SOI data measure up?
Not so well.
First of all, Hyperion and Iapetus aren’t even classified as 2013–2014 migrants, despite the fact that they totally migrated in 2013. Meanwhile, Cronus and Themis, who didn’t move until 2014, are classified in 2013–2014. Iapetus is classified in 2012–2013, despite migrating after Oceanus, Hyperion, or Rhea. Hyperion is also classified in 2012–2013 of course, but Oceanus and Rhea are both classified in 2013–2014.
Meanwhile, Rhea gets short shrift as well. She moves twice. But the IRS only tracks the initial tax filing address and the second tax filing address. Rhea’s middle destination vanishes completely. That’s if Rhea moves Olympus → Athens → Corinth as I said. But if she moves Olympus → Athens → Olympus, then she won’t show up as a migrant at all. This means that IRS SOI will tend to understate seasonal migrants.
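To make the classification rule concrete, here’s a minimal sketch in Python. It is not the IRS’s actual matching procedure, just the rule as I’ve described it: a filer counts as a migrant in a given file when the address on the return filed in the first calendar year differs from the address on the return filed in the second. The placeholder places `old` and `new`, the function name `migrant_files`, and the 2012 filing addresses (which my scenarios don’t spell out) are all assumptions for illustration.

```python
# Address on the return filed in each calendar year, per example taxpayer.
# "old"/"new" are placeholder places; the 2012 addresses are assumed.
FILING_ADDRESSES = {
    "Oceanus":  {2012: "old", 2013: "old", 2014: "new"},  # moves Apr 2013, after filing
    "Hyperion": {2012: "old", 2013: "new", 2014: "new"},  # moves Feb 2013, before filing
    "Cronus":   {2012: "old", 2013: "old", 2014: "new"},  # moves Jul 2014, files Sep 2014
    "Phoebe":   {2012: "old", 2013: "old", 2014: "new"},  # moves Dec 2013
    "Rhea":     {2012: "Olympus", 2013: "Olympus", 2014: "Corinth"},  # via Athens
    "Themis":   {2012: "old", 2013: "old", 2014: "new"},  # moves Feb 2014
    "Iapetus":  {2012: "old", 2013: "new", 2014: "new"},  # moves Jun 2013, files late
}


def migrant_files(addresses_by_year):
    """List (file label, origin, destination) for each pair of consecutive
    filing years whose return addresses differ."""
    years = sorted(addresses_by_year)
    moves = []
    for first, second in zip(years, years[1:]):
        origin = addresses_by_year[first]
        destination = addresses_by_year[second]
        if second == first + 1 and origin != destination:
            moves.append((f"{first}-{second}", origin, destination))
    return moves


for name, addresses in FILING_ADDRESSES.items():
    print(name, migrant_files(addresses))
```

Under these assumptions, Hyperion and Iapetus land in the 2012-2013 file and everyone else lands in 2013-2014. Rhea shows up as Olympus to Corinth with Athens nowhere in sight, and if you change her 2014 address back to Olympus, the function returns an empty list.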
But the 3rd criterion, about income, is maybe the most interesting. The IRS reports two years of AGI now (yay!). For 2013–2014, that’s AGI filed in 2013 and 2014, which means 2012 and 2013 AGI.
But how does that AGI relate to actual migration?
Let’s make a simple assumption: all taxpayers earn the same income each month regardless of migration or filing decisions. In other words, income isn’t impacted by migration or filing, and is evenly spaced throughout the year. This assumption is total bullcrap. But it lets me give instructive ballpark estimates of how much Year-2 (“post-migration”) AGI is really post-migration.
The third column of the table estimates what percentage of 2013 AGI was earned post-migration. As you can see, for Oceanus, it’s 75%. So for him, the Year-2 AGI is a pretty good proxy for post-migration income.
But look at every other migrant. For all of them, less than 10% of their Year-2 AGI occurred in the IRS-reported final destination. That’s because if you move late in 2013 or early in 2014, you show up in 2013–2014 migration data, but it’s too late for much of your 2013 AGI to have occurred in your new home. In other words, most or all of their Year-2 AGI is still pre-migration income. So remember, whether you’re using the data wrongly to talk about “money migration,” or more rightly to estimate demographics or migration’s connection to income changes, Year-2 AGI has a very mixed relationship with post-migration income. However, we can say that Year-1 AGI is definitely pre-migration income.
For Rhea, over half of her income drops off the face of the earth, in terms of its physical location. It’s reported with her final filing state, despite having been earned in some intermediary location.
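Here’s the arithmetic behind those percentages as a small Python sketch. It leans on the same even-monthly-income assumption, plus one more that I’m adding purely for illustration: a move counts as happening at the start of its month, which is the convention that gives Oceanus his 75%. The function name `share_after` is hypothetical.

```python
def share_after(move_year, move_month, income_year=2013):
    """Fraction of income_year's (evenly spread) income earned on or after
    a move, treating the move as happening at the start of move_month."""
    if move_year < income_year:
        return 1.0   # moved before the income year began
    if move_year > income_year:
        return 0.0   # moved after the income year ended
    return (12 - move_month + 1) / 12


# Final moves of the filers classified as 2013-2014 migrants: (year, month).
FINAL_MOVES = {
    "Oceanus": (2013, 4),
    "Cronus":  (2014, 7),
    "Phoebe":  (2013, 12),
    "Rhea":    (2013, 12),  # second move, Athens to Corinth
    "Themis":  (2014, 2),
}

for name, (year, month) in FINAL_MOVES.items():
    print(f"{name}: {share_after(year, month):.0%} of 2013 AGI is post-migration")

# Rhea's 2013 income earned in Athens: after the May move, before the December one.
athens_share = share_after(2013, 5) - share_after(2013, 12)
print(f"Rhea's Athens share: {athens_share:.0%}")
```

Shift the convention to mid-month or end-of-month moves and the exact numbers change a bit, but the pattern doesn’t: only Oceanus has a Year-2 AGI that is mostly post-migration.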
So, that’s your primer on the temporality of IRS migration.
Also, pro-tip: migration itself has seasonal trends.
So to get a really representative taxpayer migrant, I’d need to have someone filing in April-April, moving in June. About 50% of their income would be post-migration. Make of that what you will.
International Migration is Tricksy
And Puerto Rico Is Odd Too
The IRS data includes data on international migration, and migration to U.S. territories. This is nifty. It’s also woefully incomplete. The only returns that make it into the IRS SOI migration file are those that can be matched year-to-year. So inflows with no prior-year tax return are not included.
Let’s use our examples again. They’ll have similarities to the above examples, but won’t be exactly the same. And I’m only using 5. I drop Iapetus and Hyperion because they aren’t even classified as migrants in 2013–2014. But as you’ll see, if the example taxpayers migrated as shown below, several of these filers wouldn’t have shown up as migrants either.
Let’s say in this case, Oceanus is a nonfiler originally, in Puerto Rico. He pays no US income taxes. Then he moves to the US, and must pay income taxes.
Cronus was a US taxpayer (say, military), who moves to Puerto Rico, and continues to pay US income taxes.
Phoebe is a US taxpayer who moves to Australia, and ceases to pay taxes. This may be because she is evading taxes, or because she has no income to report living as an Antipodean bum, or because she is an Australian citizen who was working in the US. Doesn’t matter why. What matters is, she stops filing US income taxes after moving to Australia.
Rhea moves from Kentucky to France, then back to Virginia. Her first return is in Kentucky. She reports the French income in her next tax return, but that return is filed from Virginia.
Themis was a nonfiler in Japan, for any of the reasons mentioned for Phoebe. Then she moves to the United States, maybe to attend university. She gets no income or anything reportable as income. So she continues to file no US taxes.
Let’s look at how these scenarios impact IRS treatment of our example migrants.
Oceanus immigrated during the 2013–2014 season, from Puerto Rico. But he didn’t file when in Puerto Rico. So he’s not counted among IRS inflows from abroad. Most Puerto Ricans don’t pay US income taxes, so the IRS enormously undercounts migration to and from Puerto Rico.
Cronus was an emigrant to Puerto Rico in the 2013–2014 season. But he filed in both years because, as a US Government employee in Puerto Rico, he’s still subject to the income tax. So with two returns and an international address on the second, the IRS sees Cronus (correctly) as an emigrant.
Phoebe moved to Australia, but she stopped filing. So the IRS sees her as having just vanished off the face of the earth, not an emigrant.
Rhea, on the other hand, does report income earned abroad. However, she reports it on a tax return filed from Virginia, after she gets back to the United States. The result is that the IRS sees her as having moved from Kentucky to Virginia, not from Kentucky to France, or from France to Virginia. In theory, Rhea should be tracked as both an international outflow and an inflow. Instead, she’s counted as neither.
Themis moved to the United States. But she didn’t file taxes the year before, or after arrival, because she had no US-taxable income, or is a pretty aggressive tax-evader. In either case, she doesn’t show up as an international inflow.
The result is that of our 5 globetrotting taxpayers, the IRS only caught one of them. The take-away here is that the IRS’ international numbers are neat for seeing how many taxpaying, employed Americans move between the US and expat jobs, but not really useful for estimating wider immigration and emigration.
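The matching logic behind that result can be sketched in a few lines of Python. This is my own simplified rendering, not the IRS’s code: a filer only enters the migration file if a return exists in both years, and only counts as an international flow if exactly one of the two return addresses is outside the states. The set `OUTSIDE_US`, the placeholder `"US state"`, and the function name `irs_view` are assumptions for illustration.

```python
# Places treated as "outside the states" in this sketch (territories and abroad).
OUTSIDE_US = {"Puerto Rico", "Australia", "France", "Japan"}

# (address on Year-1 return, address on Year-2 return); None means no return filed.
RETURNS = {
    "Oceanus": (None, "US state"),           # nonfiler in Puerto Rico, then files
    "Cronus":  ("US state", "Puerto Rico"),  # files in both years
    "Phoebe":  ("US state", None),           # stops filing after moving to Australia
    "Rhea":    ("Kentucky", "Virginia"),     # France never appears on a return
    "Themis":  (None, None),                 # never files
}


def irs_view(first_address, second_address):
    """How a filer would appear in the migration file under the sketch's rules."""
    if first_address is None or second_address is None:
        return "unmatched (not in the migration file)"
    first_outside = first_address in OUTSIDE_US
    second_outside = second_address in OUTSIDE_US
    if first_outside and second_outside:
        return "outside the states in both years"
    if first_outside:
        return "international inflow"
    if second_outside:
        return "international outflow"
    if first_address != second_address:
        return "domestic migrant"
    return "non-migrant"


for name, (first, second) in RETURNS.items():
    print(f"{name}: {irs_view(first, second)}")

caught = [name for name, pair in RETURNS.items()
          if irs_view(*pair) in ("international inflow", "international outflow")]
print("Counted as international migrants:", caught)
```

Only Cronus comes out as an international flow. Oceanus, Phoebe, and Themis drop out at the matching step, and Rhea is matched but looks purely domestic.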
But Seriously, Where Was That Income Earned?
The IRS Uses Return Addresses, Not the Actual Tax Return Information
I’ve seen some people perplexed about the IRS SOI migration data, because they assume that the migration file uses the information you input about where income is earned, what states you file in, where you claim credit for state taxes paid, or W-2 geocoding, or something like that. Nope. The IRS tracks only the return address on the tax filing.
It would be super neat if the IRS used W-2 filings with addresses to estimate migration. I don’t know if that’s possible. But that wouldn’t be residential migration, it’d be employment migration, and some people change employment address without changing residential address, or vice versa. Some people commute very, very long distances. So it’d be interesting, but would have its own set of flaws.
Another cool method would be to track credits for state taxes paid, or simply state tax returns filed. The first one may be easier than the second, as I’m unsure if all state tax authorities report who pays their income taxes. But even if they did, there are still some problems. Some states don’t have income taxes. Some states do, but have filing thresholds that let some individuals not file. The difference between residence and workplace would crop up once again.
There are also tax deductions that could correlate with migration, like moving expenses or the mortgage interest deduction. These could be tracked to give some supplementary statistics. But they’re pretty loosely tied to migration and would cover only certain subsets of migrants, so they’d be very noisy. Plus, changes in the laws surrounding those deductions could alter migration stats (as is the case with international migration, where FATCA induced lots more Americans abroad to file taxes).
In other words, while the IRS theoretically could provide alternate measures of migration based on the detailed content of a tax filing, such a method would still have serious flaws, and would undoubtedly require tons of extra work. I’d love to have it! That’d be awesome! But it’s not an easy change like adding the second-year AGI was. It would be much harder.
Really Now. WHERE?
…Sometimes Return Addresses Aren’t Relevant
This is in the IRS user guide, but it’s worth repeating here. Some people’s return addresses aren’t their residential addresses.
Why might this be? Lots of reasons. Maybe they use a tax preparer’s address. Maybe they list a P.O. Box. Maybe they file from a business address, especially if they have pass-through income. There are other reasons. But in every case, it’s clear that the address could change without a migration, or a migration could occur without a change.
IRS SOI migration data continues to improve in quality, but users still need to understand what they’re working with. Whether it’s common misconceptions about money migration, or just uncertainty inherent to the data about when, how, and where people migrate, IRS SOI data is imperfect. Those imperfections don’t make it useless: to the contrary, IRS data is closely associated with other sources, and also has an unmatched longevity. This makes it a very useful additional benchmark, and also a good source in its own right when other sources are unavailable or cumbersome.
Keep using IRS SOI data. Just use it the right way.
See my previous post, reviewing winners and losers from new IRS data.
Check out my new Podcast about the history of American migration.
Follow me on Twitter to keep up with what I’m writing and reading. Follow my Medium Collection at In a State of Migration if you want updates when I write new posts. And if you’re writing about migration too, feel free to submit a post to the collection!
I’m a graduate of the George Washington University’s Elliott School with an MA in International Trade and Investment Policy, and an economist at USDA’s Foreign Agricultural Service. I like to learn about migration, the cotton industry, airplanes, trade policy, space, Africa, and faith. I’m married to a kickass Kentucky woman named Ruth.
My posts are not endorsed by and do not in any way represent the opinions of the United States government or any branch, department, agency, or division of it. My writing represents exclusively my own opinions. I did not receive any financial support or remuneration from any party for this research. More’s the pity.