Extract email signatures with .NET and C# from a Gmail account
We’ll walk through how to capture contact details from Gmail messages with .NET and C# in a simple console app. Full code is at the end of the post. This works with .NET Core or .NET Framework 4.6 and above.
Parsing emails is tough!
Email signatures contain all sorts of goodness like names, phone numbers, locations, titles, Twitter and LinkedIn URLs.
Capturing that data can be hard.
- Phone numbers can have lots of different formats
- Phone number type identifiers can precede or be appended to phone numbers
- Titles can be on their own line or mixed into the signature block
- Titles are almost infinite in variations
- Street addresses have no standard format
- Email headers can be different between clients and change over time
- Emails can get mangled as they are sent between clients
- Email signatures in reply chains can get complicated
- How about multiple languages?
We won’t be reinventing email parsing for this tutorial. Doing it well takes machine learning and a lot of regular expressions. Instead, we’ll use SigParser to do most of the heavy lifting.
(5 minutes) Connect to Gmail from .NET
For this example we’ll use Gmail. Follow the steps in Google’s .NET quickstart for the Gmail API (developers.google.com) to get a connection going.
We’ll be using the SigParser NuGet package. SigParser is a constantly updated API for parsing emails and email signatures. You’ll need to go to SigParser and set up an account so you can get an API key.
SigParser NuGet package (www.nuget.org): parse raw email chains and email signatures from either HTML or plain text emails. Multiple languages supported.
Run Install-Package SigParser in the Package Manager Console, or find SigParser in the NuGet Package Manager in Visual Studio.
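Add a function that decodes email bodies to the demo project you set up in the Gmail steps. The Gmail API returns body content base64url-encoded, so you’ll use this later when handing message text to SigParser.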
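A minimal sketch of such a helper, assuming the body arrives as the base64url string the Gmail API places in a message part’s Body.Data. The class and method names here (GmailBodyDecoder, DecodeBase64Url) are hypothetical, and the text is assumed to be UTF-8.

```csharp
using System;
using System.Text;

static class GmailBodyDecoder
{
    // Hypothetical helper: converts Gmail's base64url body data to text.
    public static string DecodeBase64Url(string data)
    {
        if (string.IsNullOrEmpty(data)) return string.Empty;

        // base64url swaps '+' and '/' for '-' and '_' and drops the padding.
        string s = data.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }

        // Assumption: the body is UTF-8 text; other charsets need extra handling.
        return Encoding.UTF8.GetString(Convert.FromBase64String(s));
    }
}
```

For example, GmailBodyDecoder.DecodeBase64Url("SGVsbG8sIHdvcmxkIQ") returns "Hello, world!".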
Pull Down Messages
Add the below code block right after the “var service =” line.
It replaces the label listing from the original demo with a call that pulls the first page of messages, processes each message with SigParser, and displays the parsed results. You can replace the display step with whatever processing you intend to do. The “emailResult” variable also holds other details about the parsed email that you may find useful.
You’ll need to fill in the API key value which you can get from SigParser.com.
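The flow described above can be sketched roughly as follows. The Gmail calls are from the Google.Apis.Gmail.v1 package used in the quickstart. The SigParser call is deliberately left as a hypothetical parse callback, because the exact client API is not shown here; wire it to the real SigParser client, created with your API key. MessageDump, ProcessFirstPage, Header, FindBody and Decode are illustrative names, not part of either library.

```csharp
using System;
using System.Linq;
using System.Text;
using Google.Apis.Gmail.v1;
using Google.Apis.Gmail.v1.Data;

static class MessageDump
{
    // parse is a hypothetical stand-in for the SigParser call:
    // (from, subject, body) -> parsed result.
    public static void ProcessFirstPage(GmailService service,
                                        Func<string, string, string, object> parse)
    {
        // "me" means the authenticated user. Only the first page is fetched.
        ListMessagesResponse page = service.Users.Messages.List("me").Execute();
        if (page.Messages == null) return; // no messages

        foreach (Message stub in page.Messages)
        {
            // The list call returns ids only, so fetch each full message.
            var get = service.Users.Messages.Get("me", stub.Id);
            get.Format = UsersResource.MessagesResource.GetRequest.FormatEnum.Full;
            Message message = get.Execute();

            string from = Header(message, "From");
            string subject = Header(message, "Subject");
            string body = FindBody(message.Payload);

            object emailResult = parse(from, subject, body);
            Console.WriteLine($"{from} | {subject}");
            Console.WriteLine(emailResult);
            Console.ReadLine(); // wait for Enter before the next message
        }
    }

    static string Header(Message m, string name) =>
        m.Payload?.Headers?
            .FirstOrDefault(h => string.Equals(h.Name, name, StringComparison.OrdinalIgnoreCase))
            ?.Value ?? "";

    // Returns the first text/* part found, searching nested parts depth-first.
    static string FindBody(MessagePart part)
    {
        if (part == null) return "";
        if (!string.IsNullOrEmpty(part.Body?.Data) &&
            part.MimeType != null && part.MimeType.StartsWith("text/"))
            return Decode(part.Body.Data);
        if (part.Parts != null)
            foreach (MessagePart child in part.Parts)
            {
                string text = FindBody(child);
                if (text.Length > 0) return text;
            }
        return "";
    }

    // Gmail body data is base64url; assumes UTF-8 text.
    static string Decode(string data)
    {
        string s = data.Replace('-', '+').Replace('_', '/');
        s = s.PadRight(s.Length + (4 - s.Length % 4) % 4, '=');
        return Encoding.UTF8.GetString(Convert.FromBase64String(s));
    }
}
```

Passing the parser in as a callback keeps the Gmail plumbing separate from the parsing step, so the display lines can be swapped for your own processing without touching the fetch loop.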
Run the code. It should start writing parsed results to the console. Each time you press Enter, it processes the next message.
Next Steps: Paging Message Results
Right now this example only pulls the first page of results from Gmail (approximately 100 messages). To get the following pages, wrap the message processing in a loop that passes each response’s next page token to the next list request.
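One way to sketch that paging loop, assuming the same GmailService instance from the quickstart. MessagePaging and AllMessageStubs are hypothetical names; the PageToken and NextPageToken properties are the ones the Gmail client library exposes on the list request and response.

```csharp
using System.Collections.Generic;
using Google.Apis.Gmail.v1;
using Google.Apis.Gmail.v1.Data;

static class MessagePaging
{
    // Lazily yields every message stub (id only) across all result pages.
    public static IEnumerable<Message> AllMessageStubs(GmailService service)
    {
        string pageToken = null;
        do
        {
            var request = service.Users.Messages.List("me");
            request.PageToken = pageToken; // null on the first request
            ListMessagesResponse page = request.Execute();

            if (page.Messages != null)
                foreach (Message stub in page.Messages)
                    yield return stub;

            // Gmail omits the token on the last page, which ends the loop.
            pageToken = page.NextPageToken;
        } while (!string.IsNullOrEmpty(pageToken));
    }
}
```

Because the method is an iterator, the next page is only requested once the caller has consumed the current one, so a foreach over AllMessageStubs(service) can stop early without fetching the whole mailbox.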