Parse email signatures with C#

Learn how to scrape email signatures with C# and helpful libraries that make it easy to pull emails.


Here is how to parse email signatures with C#.

We’ll cover the following:

  • Why parsing email signatures is hard
  • NuGet Package
  • Code Example for Sending An Email to SigParser
  • Libraries for pulling emails with .NET Framework and Core

Why parsing email signatures is hard

Parsing email signatures is exceptionally difficult. Many people assume a couple of regular expressions will do the job, but they won’t. Here are some of the major things you’d need to handle.

  • No two signatures are formatted the same way.
  • Phone numbers can have many different formats.
  • Each phone number needs to be attributed to a type (fax vs. mobile vs. work).
  • The phone type indicator has lots of variations. For example, Mobile vs. M: vs. Cell vs. C: and many others.
  • Titles can be incredibly difficult to capture without also capturing a lot of wrong information.
  • Locations are tough. Very few people put full addresses. Often they’ll only put the city and state but no country. Even street addresses vary widely by country.
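
To make the phone-label problem concrete, here is a minimal sketch of the naive regex approach. The GuessPhoneType helper and its label patterns are hypothetical, written only for illustration; this is not how SigParser works. It recognizes the handful of variants listed above and falls through to "Unknown" on any label nobody thought of in advance.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: guess the phone type from a single signature line.
// Returns "Unknown" for any label missing from the hand-written patterns.
static string GuessPhoneType(string line)
{
    // Each pattern covers only the label variants we listed ahead of time.
    if (Regex.IsMatch(line, @"\b(mobile|cell)\b|(^|\s)[mc]:", RegexOptions.IgnoreCase))
        return "Mobile";
    if (Regex.IsMatch(line, @"\bfax\b|(^|\s)f:", RegexOptions.IgnoreCase))
        return "Fax";
    if (Regex.IsMatch(line, @"\b(work|office|direct)\b|(^|\s)[wo]:", RegexOptions.IgnoreCase))
        return "Work";
    return "Unknown";
}

Console.WriteLine(GuessPhoneType("888-333-3323 Mobile"));  // Mobile
Console.WriteLine(GuessPhoneType("C: 888-333-3323"));      // Mobile
Console.WriteLine(GuessPhoneType("888-333-3323 (handy)")); // Unknown
```

The last line uses a label the patterns do not know, so the number loses its type. Every new label, language, or layout means another pattern to write and maintain.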

Then there is identifying where the signature sits within the email, which is really hard. We use a machine learning algorithm with tons of labeled validation emails. We couldn’t come up with a more reliable way than that. We’ve been labeling our test set for years.

Splitting Email Chains

If you want to split emails on the headers, that is also tough. No two email clients seem to produce the same header format, and over time the clients change the way they format headers.
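
As a rough illustration of why header splitting is brittle, here is a minimal sketch of a regex-based splitter. The SplitOnReplyHeaders helper and the sample messages are hypothetical and are not SigParser’s method. The pattern knows two header layouts: a single "On ... wrote:" line, and a "From:" line followed directly by a "Sent:" line.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: cut a plain-text chain wherever a line looks like the
// start of a quoted-reply header. Only two header layouts are recognized.
static string[] SplitOnReplyHeaders(string body)
{
    // Layout 1: a single line of the form "On <date> <sender> wrote:"
    // Layout 2: a "From:" line immediately followed by a "Sent:" line
    var headerStart = new Regex(
        @"^(?=On .+ wrote:\s*$|From: .+\r?\nSent: )",
        RegexOptions.Multiline);

    // The lookahead is zero-width, so each header stays with the part it begins.
    return headerStart.Split(body);
}

string chain =
    "Sounds good.\n" +
    "\n" +
    "On Mon, Jan 6, 2020 at 9:15 AM Steve Johnson <sjohnson@example.com> wrote:\n" +
    "> Lets get coffee tomorrow.\n" +
    "\n" +
    "From: John Smith\n" +
    "Sent: Sunday, January 5, 2020 4:02 PM\n" +
    "Subject: Coffee?\n";

// Same kind of reply, but the header is written in German.
string germanChain =
    "Klingt gut.\n" +
    "\n" +
    "Am Mo., 6. Jan. 2020 um 09:15 Uhr schrieb Steve Johnson <sjohnson@example.com>:\n" +
    "> Lets get coffee tomorrow.\n";

Console.WriteLine(SplitOnReplyHeaders(chain).Length);       // 3
Console.WriteLine(SplitOnReplyHeaders(germanChain).Length); // 1
```

The German header is not recognized, so the whole chain comes back as a single part. Covering every client, language, and format change this way turns into an ever-growing pattern list.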

This is why we expose a REST API

This is why we suggest using our email parsing service to parse emails. As we improve it, you’ll automatically get the fixes.

The SigParser Email Parsing API

The SigParser Email Parsing API is a serverless, stateless email parsing API. It can extract contacts and split emails into sections. It can find phone numbers, titles, and addresses and attribute them to the correct contact. It even takes care of deduping contacts for you if the same email address appears more than once in the email.

The API is stateless. We store nothing about the email; it is a processing-only service. We only store some high-level statistics.

NuGet API Wrapper (.NET Core and Framework Package)

Although you can absolutely call our REST API from .NET, we’ve published a helper NuGet package which has the data model set up and lets you easily pass in the API key. It is compatible with .NET Core and .NET Framework. At the bottom of this page are references on how to pull emails from different types of providers.

Install-Package SigParser

https://www.nuget.org/packages/SigParser/

Example: Parse an Email with C#

// Create the SigParser client with the API key you got from https://app.sigparser.com
// (ApiKey is a string you define yourself.)
var client = new SigParser.Client(ApiKey);

var email = new SigParser.EmailParseRequest
{
    plainbody = @"
Hi John,

Lets get coffee tomorrow.

Thanks
Steve Johnson
888-333-3323 Mobile
San Diego, CA
",
    from_name = "Steve Johnson",
    from_address = "sjohnson@example.com"
};

// Parse is asynchronous, so .Result blocks until the API responds.
// Inside an async method, prefer: var result = await client.Parse(email);
var result = client.Parse(email).Result;

// "result" has the email broken into pieces as well as contact information.

It really is that simple. The rest of this guide is about how to pull emails from various email clients.

Email Libraries to Pull Emails

Pulling emails can be hard but is totally doable in .NET Framework and .NET Core.

Gmail

NuGet Package

Install-Package Google.Apis.Gmail.v1

We made a handy article on Medium for how to do it exactly:

Blog: Pull Gmail messages with C#

Office 365

The Microsoft Graph API can be difficult to get your head around at first.

Install-Package Microsoft.Graph

Tips

  • The Graph API NuGet package just helps you compose OData-formatted requests.
  • You should familiarize yourself with the Microsoft docs for Graph API Email Messages.
  • If you need to sync emails every X hours or even on an initial pull, be sure to use the “delta” endpoint.
    • You can only sort the sync by “receivedDateTime desc”; without it, the order of the results is not guaranteed.
  • If you try to sync by just querying messages newer than some date using the “List messages” endpoint, you could encounter the following errors:
    • Timeouts if the mailbox is too large
    • Missing messages
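
To illustrate the delta flow from the tips above, here is a minimal sketch of the paging logic only. It assumes the documented Microsoft Graph delta response shape, where each page carries either an @odata.nextLink (more pages to fetch) or an @odata.deltaLink (save it and call it on the next sync). The ReadDeltaPage helper and the sample JSON are hypothetical; a real client would fetch each page over HTTPS with a bearer token, which is left out here.

```csharp
using System;
using System.Text.Json;

// First request of a sync round, using the only sort order the delta query accepts.
const string FirstPageUrl =
    "https://graph.microsoft.com/v1.0/me/mailFolders/inbox/messages/delta" +
    "?$orderby=receivedDateTime+desc";

// Hypothetical helper: read one delta page and report how many messages it
// holds, which URL to call next, and whether this sync round is finished.
static (int Count, string NextUrl, bool Done) ReadDeltaPage(string json)
{
    using JsonDocument doc = JsonDocument.Parse(json);
    JsonElement root = doc.RootElement;
    int count = root.GetProperty("value").GetArrayLength();

    // More pages remain: keep following nextLink.
    if (root.TryGetProperty("@odata.nextLink", out JsonElement next))
        return (count, next.GetString(), false);

    // Last page: store the deltaLink and use it to start the next sync.
    return (count, root.GetProperty("@odata.deltaLink").GetString(), true);
}

// Made-up sample pages standing in for two HTTP responses.
string page1 = @"{ ""value"": [ { ""id"": ""A1"" }, { ""id"": ""A2"" } ],
  ""@odata.nextLink"": ""https://graph.microsoft.com/v1.0/me/mailFolders/inbox/messages/delta?$skiptoken=abc"" }";
string page2 = @"{ ""value"": [ { ""id"": ""A3"" } ],
  ""@odata.deltaLink"": ""https://graph.microsoft.com/v1.0/me/mailFolders/inbox/messages/delta?$deltatoken=xyz"" }";

var first = ReadDeltaPage(page1);
var last = ReadDeltaPage(page2);

Console.WriteLine($"Start at: {FirstPageUrl}");
Console.WriteLine($"{first.Count} messages, done={first.Done}"); // 2 messages, done=False
Console.WriteLine($"{last.Count} messages, done={last.Done}");   // 1 messages, done=True
```

Persisting the final deltaLink between runs is what lets the next sync return only changes, rather than re-querying the mailbox by date.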

Exchange

We suggest using Office 365 if at all possible. If that isn’t an option, use one of the packages below.

For .NET Framework, use:

Install-Package Microsoft.Exchange.WebServices

This is the version officially supported by Microsoft, but it doesn’t work on .NET Core.

.NET Core Compatible (will also work with .NET Framework)

Install-Package Microsoft.Exchange.WebServices.NETStandard

The source code is here but it isn’t officially supported by Microsoft. At SigParser we keep our own build of it for security reasons, since we aren’t sure who the maintainer is in real life.

This version also lacks some of the DNS discovery features that the .NET Framework version has.

IMAP

This is an IMAP email .NET assembly from LimiLabs. You can try it for free, but it is a paid product. As of this writing it is $149 for a single developer and a year of support. It is probably one of the best products we’ve used for handling IMAP emails. It also works with .NET Framework and .NET Core.

Install-Package Mail.dll

Try SigParser for FREE

Try SigParser for FREE with no commitment. Schedule a 15-minute web conference to get an overview of SigParser and get set up with a free trial account.