Here’s how you can access your entire iMessage history on your Mac

Yorgos Askalidis
Mar 9, 2019

A guide on how to create a data-science-friendly file with your iMessage history.

If you use the Messages app on your Apple computer, you have probably connected your Apple ID to that computer in order to send and receive iMessages across all your Apple devices (iPhone, iPad, computer). The Messages app shows your whole message history on that device, and maybe (like me) you have wondered where exactly that data is stored. Can you access it in a format that is easy to analyze? Wouldn’t it be great to be able to see how many messages you have sent per day? Or read the first message you ever sent to your friends?

Well you’re in luck! Your Apple computer stores that message history right within your grasp in a hidden folder on your hard drive!

This is a somewhat technical guide for extracting all the iMessage data on your computer’s hard drive and putting it in an analysis-friendly file.

Note: this approach only works for Macs (laptops and desktops) and only gets your iMessage history. Green bubble messages unfortunately are not captured.

If you just want to see the good stuff you can find the notebook with the minimal code to extract and prepare the data here.

Where’s the data

The iMessage history that powers your Messages app is stored in a database file on your computer’s hard drive, in a hidden folder named Library which, in turn, is in your username folder. You can usually find your username folder in the sidebar of the Finder.

Hidden folders are folders that don’t appear to the user by default; they usually hold files that a casual user doesn’t have to interact with, such as system-related files. You can make hidden folders appear in the Finder by pressing the Command, Shift and dot keys simultaneously: “Command+Shift+.”. If for some reason that doesn’t work, you can also open the Terminal app and type the following, then relaunch the Finder (for example by running killall Finder) so that the change takes effect.

defaults write com.apple.finder AppleShowAllFiles YES

Once that works, you can find the Messages folder (which contains the chat.db database) within the Library folder, as shown in the image below.

Notice how some folders appear with a greyed icon; these are the normally hidden folders.
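If you prefer not to type your username by hand, the location described above can be built in Python. This is a minimal sketch, assuming the default layout (Library/Messages/chat.db under your home folder); the helper name default_chat_db_path is made up for this example.

```python
from pathlib import Path


def default_chat_db_path(home=None):
    """Return the expected chat.db location under a home folder.

    Hypothetical helper: it only builds the path and does not check
    that the file exists.
    """
    home = Path(home) if home is not None else Path.home()
    return home / "Library" / "Messages" / "chat.db"


db_path = default_chat_db_path()
# Report whether the file is present on this machine.
print(db_path, "(found)" if db_path.exists() else "(not found)")
```

Path.home() resolves to your username folder, so the same snippet should work for any account on the computer.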

What’s a database?

A very simple way to understand a database file is to think of it as a folder that contains a bunch of Excel-like tables. Much like with larger or enterprise-grade databases, you can connect to the database and access its data in a variety of ways.
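To make the "folder of tables" picture concrete, here is a small sketch that builds a throwaway database in memory with two tables and lists them. The table and column names are a reduced, illustrative stand-in, not the full layout of the real file.

```python
import sqlite3

# An in-memory database stands in for a real database file.
conn = sqlite3.connect(":memory:")

# Two tiny tables, loosely modeled on what a message store could hold.
conn.execute("create table handle (ROWID integer primary key, id text)")
conn.execute("create table message (ROWID integer primary key, text text, handle_id integer)")
conn.execute("insert into handle (id) values ('+15550100')")
conn.execute("insert into message (text, handle_id) values ('hello', 1)")

# sqlite_master is the built-in catalog that lists everything in the database.
table_names = [
    row[0]
    for row in conn.execute(
        "select name from sqlite_master where type = 'table' order by name"
    )
]
print(table_names)
```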

Using Python and pandas

In this tutorial, I’m using Python and the amazing pandas module to connect to the database, explore the tables and data it holds and then read that data from the appropriate tables.

import sqlite3
import pandas as pd
# substitute username with your username
conn = sqlite3.connect('/Users/username/Library/Messages/chat.db')
# create a cursor to run queries against the database
cur = conn.cursor()
# get the names of the tables in the database
cur.execute("select name from sqlite_master where type = 'table'")
for name in cur.fetchall():
    print(name)
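Since we only want to read the history, one cautious option is to open the database in read-only mode so that nothing can be modified by accident. The sketch below uses SQLite's URI filename syntax (mode=ro), which Python's sqlite3 module supports through uri=True; the helper name open_read_only is made up, and the demo runs against a temporary file rather than the real chat.db. On recent macOS versions you may also need to give your terminal permission to read the Messages folder, which is worth checking on your system.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_path):
    """Open an SQLite file read-only (hypothetical helper for this guide)."""
    # as_uri() produces a properly escaped file: URI; mode=ro forbids writes.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.db"

    # Create a small database file to read from.
    writer = sqlite3.connect(str(demo_path))
    writer.execute("create table message (text text)")
    writer.commit()
    writer.close()

    reader = open_read_only(demo_path)
    try:
        reader.execute("insert into message values ('should fail')")
        write_blocked = False
    except sqlite3.OperationalError:
        # SQLite refuses to write through a read-only connection.
        write_blocked = True
    finally:
        reader.close()

print(write_blocked)
```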

Above we connect to the database and explore what tables are in there. I found that there are a few tables in the database, including one called message and others named chat, handle and attachment. Let’s explore the message table because that’s the one that sounds most likely to hold our iMessages. I do that by reading the table into a pandas dataframe, a table-like data structure that is much easier to explore and manipulate for data analysis projects.

# get the first 10 entries of the message table using pandas
messages = pd.read_sql_query("select * from message limit 10", conn)
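Before pulling rows, it can also help to look at which columns a table has. One way to do that in SQLite is the table_info pragma. The sketch below runs it on an in-memory table whose columns are a hypothetical, heavily reduced version of the message table; on your own machine you would run the same pragma on the real connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, reduced stand-in for the message table.
conn.execute(
    "create table message ("
    "ROWID integer primary key, text text, handle_id integer, "
    "date integer, is_sent integer)"
)

# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("pragma table_info(message)")]
for name, declared_type in columns:
    print(name, declared_type)
```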

Getting the message text and phone number

We hit bingo! The message table indeed seems to hold all the saved iMessages. It has a text field with the actual sent or received message, a date field (more on that below) and a handle_id. After a little exploration I found that the handle_id is a code for each phone number or Apple ID that you have had a conversation with. In order to map the handle_id back to the phone number or Apple ID we can use a table in the database (appropriately) named handle and join on handle_id.

# get the handles to apple-id mapping table
handles = pd.read_sql_query("select * from handle", conn)
# and join to the messages, on handle_id
messages.rename(columns={'ROWID': 'message_id'}, inplace=True)
handles.rename(columns={'id': 'phone_number', 'ROWID': 'handle_id'}, inplace=True)
merge_level_1 = pd.merge(messages[['text', 'handle_id', 'date', 'is_sent', 'message_id']], handles[['handle_id', 'phone_number']], on='handle_id', how='left')
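The same mapping can also be done inside the database with a SQL left join instead of a pandas merge. This is an alternative sketch on made-up data, not the code from the notebook; it shows that a left join keeps every message, even one whose handle_id has no match in the handle table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy tables with invented rows; message 2 points to a handle that does not exist.
conn.executescript("""
    create table handle (ROWID integer primary key, id text);
    create table message (ROWID integer primary key, text text, handle_id integer);
    insert into handle (ROWID, id) values (1, '+15550100');
    insert into message (ROWID, text, handle_id) values (1, 'hi', 1), (2, 'orphan', 99);
""")

rows = conn.execute("""
    select message.ROWID as message_id, message.text, handle.id as phone_number
    from message
    left join handle on message.handle_id = handle.ROWID
    order by message.ROWID
""").fetchall()

for row in rows:
    print(row)
```

The unmatched message comes back with None as its phone number, which mirrors the missing values that how='left' produces in pandas.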

Adding a chat id

Similarly, each message maps back to a unique chat through a chat_id. This can be useful when analyzing chats with multiple people in them. We can get the chat_id of each message by joining the message table with the (again, appropriately named) chat_message_join table on message_id.

# get the chat to message mapping
chat_message_joins = pd.read_sql_query("select * from chat_message_join", conn)
# and join back to the merge_level_1 table
df_messages = pd.merge(merge_level_1, chat_message_joins[['chat_id', 'message_id']], on = 'message_id', how='left')
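Once every message carries a chat_id, a natural first check is how many messages each chat holds. The sketch below does this with a SQL join and group by on invented rows; the two toy tables only contain the columns needed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented data: chat 10 holds two messages, chat 20 holds one.
conn.executescript("""
    create table message (ROWID integer primary key, text text);
    create table chat_message_join (chat_id integer, message_id integer);
    insert into message (ROWID, text) values (1, 'a'), (2, 'b'), (3, 'c');
    insert into chat_message_join (chat_id, message_id) values (10, 1), (10, 2), (20, 3);
""")

messages_per_chat = conn.execute("""
    select chat_message_join.chat_id, count(*) as n_messages
    from message
    join chat_message_join on chat_message_join.message_id = message.ROWID
    group by chat_message_join.chat_id
    order by chat_message_join.chat_id
""").fetchall()

print(messages_per_chat)
```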

Getting the date

The message table also includes a date column, and this was a little tricky for me to decode since it isn’t in a format that is widely used in the industry. Moreover, the way this column is recorded is a little different in newer versions of Mac OS X compared to older ones.

Credit to this stackoverflow page that helped me figure this out.

In Mac OS X versions before High Sierra (version 10.13, released in September 2017), the date column is an epoch type but, unlike the standard of counting the seconds from 1970-01-01, it counts the seconds from 2001-01-01. To convert it into a date field we can actually comprehend, we can add an expression to the query on the message table that creates a new field based on the date field. We will call it date_utc, since the stored value is counted from a UTC epoch; note that the “localtime” modifier used in the expression converts the displayed result to your computer’s timezone, so leave it out if you want the result in UTC.

# convert 2001-01-01 epoch time into a timestamp
# Mac OS X versions before High Sierra
# SQL expression: datetime(message.date + strftime('%s', '2001-01-01'), 'unixepoch', 'localtime')
# note: 'localtime' shifts the result to your computer's timezone; omit it for UTC
# how to use that in the SQL query (single quotes inside the SQL keep the Python string valid)
messages = pd.read_sql_query("select *, datetime(message.date + strftime('%s', '2001-01-01'), 'unixepoch', 'localtime') as date_utc from message", conn)
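To see what the strftime part of that expression contributes, the sketch below asks SQLite for the number of seconds between 1970-01-01 and 2001-01-01 and compares it with the same value computed in Python. It then converts one made-up raw value by hand; raw_seconds is an invented example, not data from a real database.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")

# strftime('%s', ...) returns the Unix timestamp of the given date as text.
offset_from_sqlite = int(
    conn.execute("select strftime('%s', '2001-01-01')").fetchone()[0]
)

# The same offset computed in Python, treating 2001-01-01 as midnight UTC.
offset_from_python = int(datetime(2001, 1, 1, tzinfo=timezone.utc).timestamp())

# Convert a made-up raw value: exactly one day after the 2001 epoch.
raw_seconds = 86400
as_utc = datetime.fromtimestamp(raw_seconds + offset_from_python, tz=timezone.utc)

print(offset_from_sqlite, offset_from_python, as_utc.isoformat())
```

Both offsets come out as 978307200, which is the amount that gets added to the date column before it is read as a regular Unix timestamp.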

In Mac OS X High Sierra and above, it’s the same thing but the date format is now much more granular: it is at the nanosecond level. So now we need to divide by 1,000,000,000 before we apply the same code snippet we applied above.

# convert 2001-01-01 epoch time (in nanoseconds) into a timestamp
# Mac OS X High Sierra and later
# SQL expression: datetime(message.date/1000000000 + strftime('%s', '2001-01-01'), 'unixepoch', 'localtime')
# how to use that in the SQL query (single quotes inside the SQL keep the Python string valid)
messages = pd.read_sql_query("select *, datetime(message.date/1000000000 + strftime('%s', '2001-01-01'), 'unixepoch', 'localtime') as date_utc from message", conn)
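If you are not sure which of the two formats your database uses, or if it mixes both, the conversion can also be done in Python after reading the raw values. The sketch below is one possible approach, not the method from the notebook: the function name apple_date_to_utc and the size threshold used to guess the unit are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# The date column counts from midnight UTC on 2001-01-01.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

# Assumption: a raw value this large cannot be a count of seconds
# (10**12 seconds is tens of thousands of years), so treat it as nanoseconds.
NANOSECOND_THRESHOLD = 10**12


def apple_date_to_utc(raw):
    """Convert a raw date value (seconds or nanoseconds since 2001) to UTC."""
    if abs(raw) >= NANOSECOND_THRESHOLD:
        # Integer division mirrors the SQL snippet and drops sub-second detail.
        seconds = raw // 1_000_000_000
    else:
        seconds = raw
    return APPLE_EPOCH + timedelta(seconds=seconds)


# The same moment expressed in both formats (made-up values).
print(apple_date_to_utc(86400).isoformat())
print(apple_date_to_utc(86400 * 10**9).isoformat())
```

Both calls describe one day after the 2001 epoch, so they print the same timestamp.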

Putting it all together

You can find the notebook here with all the code in order for you to extract your iMessages from your laptop and start analyzing!

It should only take a few minutes, and by the end of it you should have a basic history of your iMessage data that includes the phone number (or email), the text, a unique chat id for each unique group of people you had a chat with and the timestamp of each message (in UTC, or in your computer’s timezone if you keep the “localtime” modifier).
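With the timestamp in place, the question from the introduction (how many messages per day?) becomes a short query. The sketch below runs it on three invented rows that use the older seconds-based date format; for the nanosecond format you would divide the date column by 1,000,000,000 first, as described earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented rows: two messages on 2001-01-01 and one on 2001-01-02 (UTC).
conn.executescript("""
    create table message (ROWID integer primary key, text text, date integer);
    insert into message (text, date) values ('a', 0), ('b', 3600), ('c', 86400);
""")

# date(..., 'unixepoch') keeps only the calendar day of each timestamp, in UTC.
messages_per_day = conn.execute("""
    select date(message.date + strftime('%s', '2001-01-01'), 'unixepoch') as day,
           count(*) as n_messages
    from message
    group by day
    order by day
""").fetchall()

print(messages_per_day)
```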

You can actually find more data in the database, such as whether a message was delivered and read, as well as attachments. I’m not touching on those attributes in this post.

This post also only instructs on how to get iMessage data from your Apple computer. If you have any pointers on how you can extract your iMessage history from Apple mobile devices (iPhone & iPad) let me know in the comments.

Happy reading your messages!

Was this helpful? Let me know if things were not clear.

Yorgos Askalidis

Data Scientist at Instagram NYC. Previously at Spotify. Ask me about data, soccer, or data about soccer (or anything else).