To read emails and download attachments in Python

Sanket Doshi
Oct 13, 2018


import imaplib
import base64
import os
import email

imaplib is the standard-library module that implements an IMAP4 client. IMAP is a standard email protocol that stores email messages on a mail server but lets the end user view and manipulate them as though they were stored locally on the end user’s device(s).

base64 is a module that encodes binary data as ASCII characters and decodes it back.
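As a small illustration of what base64 does, here is a round trip. The sample bytes are made up for the example; in a real mail they would be the content of an attachment.

```python
import base64

# Arbitrary sample bytes standing in for binary attachment data (illustrative only)
raw = b'\x00\x01binary data\xff'

encoded = base64.b64encode(raw)      # bytes that contain only ASCII characters
decoded = base64.b64decode(encoded)  # back to the original bytes

print(encoded.decode('ascii'))
print(decoded == raw)
```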

os is a module that lets us work with the files and directories on our local machine.
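For instance, os can create the download folder and build a safe path for each attachment. This is only a sketch: attachment_path is a hypothetical helper, not part of the original script, and it assumes you want to discard any directory components that a sender put into the file name.

```python
import os

def attachment_path(download_dir, file_name):
    """Return a path inside download_dir for file_name (hypothetical helper)."""
    # Create the target folder if it does not exist yet
    os.makedirs(download_dir, exist_ok=True)
    # basename() drops any directory components contained in the name
    safe_name = os.path.basename(file_name)
    return os.path.join(download_dir, safe_name)
```

A call such as attachment_path('attachments', 'report.csv') would then give a path inside the attachments folder.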

email is a package for parsing and constructing email messages in your Python script (actually sending them is the job of smtplib).
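To get a feel for the email package without touching a mail server, you can parse a raw message from a string. The message below is made up for illustration. Its subject is written in the encoded form that mail servers sometimes deliver, which decode_header and make_header turn back into readable text.

```python
import email
from email.header import decode_header, make_header

# A made-up raw message, used only for illustration
raw = (
    'From: Alice <alice@example.com>\r\n'
    'To: bob@example.com\r\n'
    'Subject: =?utf-8?b?SGVsbG8gV29ybGQ=?=\r\n'
    '\r\n'
    'Just a test body.\r\n'
)

msg = email.message_from_string(raw)

# Headers are read like dictionary keys, and the lookup is case-insensitive
print(msg['from'])

# Decode the encoded subject into plain text
subject = str(make_header(decode_header(msg['Subject'])))
print(subject)
```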

First, we need an email address and password to access the mailbox.

email_user = input('Email: ')
email_pass = input('Password: ')

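Enter your email address and password when prompted. Note that input() echoes the password to the terminal; the standard getpass module can be used to hide it.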

mail = imaplib.IMAP4_SSL('imap.gmail.com', 993)  # replace with your provider's host and port

We have now connected to the host over an SSL-encrypted socket. The standard port for IMAP4_SSL is 993. If your account is Gmail, then the host address is “imap.gmail.com”.

mail.login(email_user, email_pass)

We are now logged in and have full access to all the emails in the account.

mail.select()

This selects the folder or label you want to read mail from. Called without arguments, mail.select() selects 'INBOX' by default; writing mail.select('Inbox') chooses the inbox folder explicitly.
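The connect, login and select steps can be wrapped in one function. This is a minimal sketch rather than the one right way to do it: open_mailbox is a hypothetical helper, and the defaults simply reuse the Gmail host and port mentioned above.

```python
import imaplib

def open_mailbox(user, password, host='imap.gmail.com', port=993, folder='INBOX'):
    """Connect, log in and select a folder; returns the connection (hypothetical helper)."""
    mail = imaplib.IMAP4_SSL(host, port)
    mail.login(user, password)          # raises imaplib.IMAP4.error on bad credentials
    status, data = mail.select(folder)  # on success, data[0] holds the message count
    if status != 'OK':
        mail.logout()
        raise RuntimeError('Could not select folder {!r}'.format(folder))
    return mail
```

When you are done reading mail, calling mail.close() followed by mail.logout() ends the session cleanly.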

typ, data = mail.search(None, 'ALL')  # named typ so the built-in type() is not shadowed
mail_ids = data[0]
id_list = mail_ids.split()

.search looks for messages in the selected folder. Here you can provide filters such as the sender, recipient or subject of the mail to be found. The first argument, None, is the charset, and 'ALL' returns every message without any filter. The function returns two values. The first is the status of the request, which is 'OK' if the request was successful. The second, data, is a list whose first element holds the ids of all matching emails as a space-separated byte string.
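To make the filters concrete, here is a sketch of two small helpers. Both build_criteria and parse_ids are hypothetical names introduced for this example. FROM, SUBJECT and UNSEEN are standard IMAP search keys; the helper does not escape quotes inside the values, so it assumes simple inputs.

```python
def build_criteria(sender=None, subject=None, unseen_only=False):
    """Build an IMAP search criteria string (hypothetical helper)."""
    parts = []
    if sender:
        parts.append('FROM "{}"'.format(sender))
    if subject:
        parts.append('SUBJECT "{}"'.format(subject))
    if unseen_only:
        parts.append('UNSEEN')
    # With no filters, fall back to ALL, which matches every message
    return '({})'.format(' '.join(parts)) if parts else 'ALL'

def parse_ids(status, data):
    """Turn the (status, data) pair returned by mail.search into a list of ids."""
    if status != 'OK':
        raise RuntimeError('search failed with status {}'.format(status))
    return data[0].split()
```

With a logged-in connection this could be used as id_list = parse_ids(*mail.search(None, build_criteria(sender='boss@example.com'))), where the address is of course a placeholder.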

for num in id_list:
    typ, data = mail.fetch(num, '(RFC822)')
    raw_email = data[0][1]
    # parse the raw bytes directly; this avoids assuming the mail is valid UTF-8
    email_message = email.message_from_bytes(raw_email)
    # downloading attachments
    for part in email_message.walk():
        # multipart parts are only containers for other parts, so skip them
        if part.get_content_maintype() == 'multipart':
            continue
        # parts without a Content-Disposition header are not attachments
        if part.get('Content-Disposition') is None:
            continue
        fileName = part.get_filename()
        if bool(fileName):
            filePath = os.path.join('/Users/sanketdoshi/python/', fileName)
            if not os.path.isfile(filePath):
                with open(filePath, 'wb') as fp:
                    fp.write(part.get_payload(decode=True))
                # read the subject from the parsed headers instead of splitting the raw text
                subject = email_message['Subject']
                print('Downloaded "{file}" from email titled "{subject}" with ID {uid}.'.format(file=fileName, subject=subject, uid=num.decode('utf-8')))

Here we iterate over all the ids, fetch each mail and save its attachments to disk.

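.fetch fetches the mail for the given id. '(RFC822)' asks the server for the complete raw message (RFC 822 is the standard that defines the format of Internet text messages, not a protocol for accessing them); you can use RFC822.HEADER instead to get only the headers of the mail. The data returned by fetch is bytes, so it has to be parsed into a message object: either decode it to a string and pass that to email.message_from_string, or pass the bytes directly to email.message_from_bytes, which avoids assuming the mail is valid UTF-8. The result is a message object whose headers can be read like dictionary keys.

.walk iterates through every part in the tree of the mail. get_content_maintype() is 'multipart' for a container part that holds other parts, such as a mail with attachments; for a plain-text part the full content type is text/plain and the main type is 'text'.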
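You can try the attachment walk without a mail server by building a message locally and parsing it back. This is a sketch under that assumption: the message, addresses and file name are made up, and raw_email stands in for the bytes that fetch would return. Instead of writing files, it collects the attachments in a dictionary.

```python
import email
from email.message import EmailMessage

# Build a made-up message with one attachment (stands in for the bytes from fetch)
original = EmailMessage()
original['From'] = 'alice@example.com'
original['To'] = 'bob@example.com'
original['Subject'] = 'Report'
original.set_content('See the attached file.')
original.add_attachment(b'col1,col2\n1,2\n', maintype='application',
                        subtype='octet-stream', filename='report.csv')
raw_email = original.as_bytes()

email_message = email.message_from_bytes(raw_email)

attachments = {}
for part in email_message.walk():
    if part.get_content_maintype() == 'multipart':
        continue  # container part, holds no data itself
    if part.get('Content-Disposition') is None:
        continue  # no disposition header, so not an attachment
    file_name = part.get_filename()
    if file_name:
        # decode=True undoes the transfer encoding (base64 here) and returns bytes
        attachments[file_name] = part.get_payload(decode=True)

print(attachments)
```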

Now, if we just want to print the sender and subject of a mail, we can loop over the data returned by fetch:

for response_part in data:
    # fetch returns a list; the tuple items hold the raw message bytes
    if isinstance(response_part, tuple):
        msg = email.message_from_bytes(response_part[1])
        email_subject = msg['subject']
        email_from = msg['from']
        print('From : ' + email_from + '\n')
        print('Subject : ' + email_subject + '\n')
        print(msg.get_payload(decode=True))

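.get_payload() returns the body of the mail if the message is a single text/plain part. With decode=True the body comes back as bytes, and for a multipart message it returns None.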
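For multipart mails, one common approach is to walk the parts and take the first text/plain one as the body. The sketch below assumes that is what you want; get_text_body is a hypothetical helper, and the sample message is made up so the code runs without a mail server.

```python
import email
from email.message import EmailMessage

def get_text_body(msg):
    """Return the first text/plain part of msg as a string, or None (hypothetical helper)."""
    for part in msg.walk():
        if part.get_content_type() != 'text/plain':
            continue
        if part.get_filename():
            continue  # a text file sent as an attachment, not the body
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or 'utf-8'
        return payload.decode(charset, errors='replace')
    return None

# Made-up sample: a plain-text body plus one attachment
sample = EmailMessage()
sample['Subject'] = 'Hello'
sample.set_content('Hi Bob,\nthe body text.')
sample.add_attachment(b'\x00\x01', maintype='application',
                      subtype='octet-stream', filename='blob.bin')
parsed = email.message_from_bytes(sample.as_bytes())

print(parsed.get_payload(decode=True))  # None, because the message is multipart
print(get_text_body(parsed))
```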

There are also ways to send emails from Python; I’ll write about them in my next blog.

Thank you for reading. I hope you liked it.


Sanket Doshi

Currently working as a Backend Developer. Exploring how to make machines smarter than me.