Parse Email Bodies and More with SigParser

SigParser's API extracts email bodies and other content from raw emails. It separates header, signature, and reply chain content from each message, leaving a clean email message body for data analysis, display within an application, or any other use.
Test SigParser API for Free
No Commitment Required

How Our API Works

We make it easy to programmatically extract email message bodies from raw email content (MIME, .eml, .msg).

STEP 1: Choose How To Use SigParser

Developers can deploy SigParser in a variety of ways. We suggest the Stateless Cloud REST API, which is the easiest to get started with. Our second most used option is AWS Lambda. AWS Lambda instances are nice because each email parse can get its own 2GB of RAM to handle very large emails, and the deployment scales linearly.
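As a rough illustration of the stateless REST option, the sketch below builds (but does not send) an HTTP POST that carries one raw MIME email. The endpoint URL, the `x-api-key` header name, and the `message/rfc822` content type are placeholders assumed for this example, not SigParser's documented API; substitute the values supplied with your API key.

```python
import urllib.request

# Hypothetical values for illustration only.
PARSE_URL = "https://sigparser.example.invalid/parse-email"  # placeholder, not a real endpoint
API_KEY = "YOUR_API_KEY"  # placeholder


def build_parse_request(
    raw_mime: bytes, url: str = PARSE_URL, api_key: str = API_KEY
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one raw MIME email."""
    return urllib.request.Request(
        url,
        data=raw_mime,
        method="POST",
        headers={
            "Content-Type": "message/rfc822",  # assumed content type for raw MIME
            "x-api-key": api_key,  # assumed header name
        },
    )


request = build_parse_request(b"Subject: hello\r\n\r\nHi there.\r\n")
print(request.get_method(), request.full_url)
```

Because the API is stateless, each request carries everything needed to parse one email, so once the real endpoint is known, sending should be a single `urllib.request.urlopen(request)` call with no session setup.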

STEP 2: Post Raw Email Content (MIME, EML, MSG, JSON)

Developers can provide SigParser with a set of .eml files, .msg files, MIME data or even an email converted to a JSON format. Below is an example of a single email in MIME format.
MIME-Version: 1.0
References: <CABxEEohuqZBoVpsyY4pOFMYixhU2bzfxgs9tRLbUoV2NJMqCJw@mail.gmail.com> 
<CAL5Lp9Xyo0mEQ6-c1yAQ+SuKXrT4Xu5y-7BnvnGS4RMjZOBJ=g@mail.gmail.com>
In-Reply-To: <CAL5Lp9Xyo0mEQ6-c1yAQ+SuKXrT4Xu5y-7BnvnGS4RMjZOBJ=g@mail.gmail.com>
From: Chris <c@sigparser.com>
Date: Wed, 9 Jan 2019 08:36:15 -0800
Message-ID: <CABxEEoizOPyCLkq4+FBGNaw7KC2TJDfTZF5dp8xD9aFjDQoL+Q@mail.gmail.com>
Subject: Re: food for thought
To: Paul <p@sigparser.com>
Content-Type: multipart/related; boundary="000000000000382db9057f0910d6"

--000000000000382db9057f0910d6
Content-Type: multipart/alternative; boundary="000000000000382db0057f0910d5"

--000000000000382db0057f0910d5
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Ok.  Just a thought.  Got it.

--000000000000382db0057f0910d5
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div><div dir=3D"auto">Ok.=C2=A0 Just a thought.=C2=A0 Got it. =C2=A0</div>=
</div><div><br><div class=3D"gmail_quote"><div dir=3D"ltr">On Wed, Jan 9, 2=
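To see what raw MIME like the sample above contains before any cleaning, Python's standard `email` package can decode it. This sketch uses a shortened, complete message modeled on the sample (the boundary `b1` and the trimmed HTML are made up for the example). It only decodes the transport encoding; it does not remove signatures or reply chains.

```python
from email import message_from_bytes, policy

# A shortened stand-in for the sample above, with one plain and one HTML part.
RAW = (
    b"MIME-Version: 1.0\r\n"
    b"From: Chris <c@sigparser.com>\r\n"
    b"To: Paul <p@sigparser.com>\r\n"
    b"Subject: Re: food for thought\r\n"
    b'Content-Type: multipart/alternative; boundary="b1"\r\n'
    b"\r\n"
    b"--b1\r\n"
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Ok.=C2=A0 Just a thought.=C2=A0 Got it.\r\n"
    b"--b1\r\n"
    b'Content-Type: text/html; charset="UTF-8"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b'<div dir=3D"auto">Ok.=C2=A0 Just a thought.=C2=A0 Got it.</div>\r\n'
    b"--b1--\r\n"
)

message = message_from_bytes(RAW, policy=policy.default)

# get_body() picks the requested alternative; get_content() undoes
# quoted-printable (=C2=A0 becomes a non-breaking space, =3D becomes "=").
plain = message.get_body(preferencelist=("plain",)).get_content()
html = message.get_body(preferencelist=("html",)).get_content()

print(message["Subject"])
print(repr(plain))
print(html)
```

Decoding the MIME structure is the straightforward part. Telling the new message apart from signatures and quoted replies is where a dedicated parser comes in.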
    

STEP 3: Receive Clean Email Messages

The SigParser application strips the headers, signatures, and reply chains out of the posted raw email and returns a JSON payload with a clean email body as well as other email contents that you can use in your applications.
{
    "CleanedBodyPlain": "Another response in the chain.\r\n\r\n",
    "CleanedBodyHtml": "<div dir=\"ltr\"><div dir=\"ltr\"><div>Another response in the chain. </div><div><br clear=\"all\"></div></div></div>",
    "IsSpammyLookingEmailMessage": false,
    "IsSpammyLookingSender": false,
    "EmailTypes": [
        "NormalEmail"
    ],
    "Emails": [
        {
            "CleanedBodyPlain": "Another response in the chain.\r\n\r\n",
            "CleanedBodyHtml": "<div dir=\"ltr\"><div dir=\"ltr\"><div>Another response in the chain. </div><div><br clear=\"all\"></div></div></div>",
            "Subject": null,
            "Date": "2020-05-11T16:41:16+00:00",
            "FromEmailAddress": "paul@example.com",
            "FromName": "Paul Mendoza",
            "To": [
                {
                    "Name": "Outlook Tester",
                    "EmailAddress": "outlook.tester@salesforceemail.com"
                }
            ],
            "Cc": []
        },
        {
            "CleanedBodyPlain": "This is a reply from the test account.\r\n\r\n",
            "CleanedBodyHtml": null,
            "Subject": null,
            "Date": "2020-05-11T09:40:00",
            "FromEmailAddress": "outlook.tester@salesforceemail.com",
            "FromName": "Outlook Tester",
            "To": [],
            "Cc": []
        },
        {
            "CleanedBodyPlain": null,
            "CleanedBodyHtml": null,
            "Subject": "One more test email at 3:25 PM",
            "Date": "2020-04-12T15:25:00",
            "FromEmailAddress": "paul@example.com",
            "FromName": "Paul Mendoza",
            "To": [
                {
                    "Name": "Outlook Tester",
                    "EmailAddress": "outlook.tester@salesforceemail.com"
                }
            ],
            "Cc": []
        }
    ],
    "Subject": "Re: One more test email at 3:25 PM",
    "Date": "2020-05-11T16:41:16+00:00",
    "Headers": {
        "mime-version": "1.0",
        "date": "Mon, 11 May 2020 09:41:16 -0700",
        "references": "<CAL5Lp9VcCVNqeiw0Rry7BHQaTct46qv3BnUvR5-HNqWZO-Xxiw@mail.gmail.com>\r\n\t<BY5PR04MB6819EFA89CDABDFCB9D67D2F8AA10@BY5PR04MB6819.namprd04.prod.outlook.com>",
        "in-reply-to": "<BY5PR04MB6819EFA89CDABDFCB9D67D2F8AA10@BY5PR04MB6819.namprd04.prod.outlook.com>",
        "message-id": "<CAL5Lp9X0RjYNOo68Y_boL8OOw32gU-SWxLW3WjgYj93eTfUsyQ@mail.gmail.com>",
        "subject": "Re: One more test email at 3:25 PM",
        "from": "Paul Mendoza <paul@example.com>",
        "to": "Outlook Tester <outlook.tester@salesforceemail.com>",
        "content-type": "multipart/alternative; boundary=\"00000000000001bd4705a5620460\""
    },
    "FullPlainTextBody": "Another response in the chain.\n\n*Paul Mendoza*, Founder\nMobile 760-917-3753\nSigParser\npaul@example.com\nSchedule a meeting with me here <https://www.meetingbird.com/m/xxxxxx>\n\nListen to podcasts? I was recently on the *FutureTech Podcast*\n<https://www.futuretechpodcast.com/podcasts/digging-up-the-data-your-company-has-needs-and-cant-access-paul-mendoza-sigparser/>\ntalking about SigParser and use cases other customers are using it for.\n\n\nOn Mon, May 11, 2020 at 9:40 AM Outlook Tester <\noutlook.tester@salesforceemail.com> wrote:\n\n> This is a reply from the test account.\n>\n>\n>\n> *From:* Paul Mendoza <paul@example.com>\n> *Sent:* Sunday, April 12, 2020 3:25 PM\n> *To:* Outlook Tester <outlook.tester@salesforceemail.com>\n> *Subject:* One more test email at 3:25 PM\n>\n>\n>\n>\n> *Paul Mendoza, *Founder\n>\n> Mobile 760-917-3753\n>\n> SigParser\n>\n> paul@example.com\n>\n> Schedule a meeting with me here <https://www.meetingbird.com/m/xxxxxx>\n>\n> Listen to podcasts? I was recently on the *FutureTech Podcast*\n> <https://www.futuretechpodcast.com/podcasts/digging-up-the-data-your-company-has-needs-and-cant-access-paul-mendoza-sigparser/>\n> talking about SigParser and use cases other customers are using it for.\n>\n",
    "FullHtmlBody": "<div dir=\"ltr\"><div dir=\"ltr\"><div>Another response in the chain. </div><div><br clear=\"all\"><div><div dir=\"ltr\" class=\"gmail_signature\" data-smartmail=\"gmail_signature\"><div dir=\"ltr\"><div><div dir=\"ltr\"><div><div dir=\"ltr\"><div><div dir=\"ltr\"><div dir=\"ltr\"><div dir=\"ltr\"><div dir=\"ltr\"><font color=\"#3d85c6\" face=\"tahoma, sans-serif\" style=\"font-size:12.8px\"><b>Paul Mendoza</b></font><font color=\"#3d85c6\" face=\"tahoma, sans-serif\" style=\"font-size:12.8px;font-weight:bold\">, </font><span style=\"font-size:12.8px;color:rgb(61,133,198);font-family:tahoma,sans-serif\">Founder</span><div style=\"font-size:12.8px\"><div><font color=\"#666666\" size=\"2\" face=\"arial narrow, sans-serif\">Mobile 760-917-3753</font></div><div><font color=\"#666666\" size=\"2\" face=\"arial narrow, sans-serif\">SigParser</font></div><div><a href=\"mailto:paul@example.com\" style=\"font-family:tahoma,sans-serif;font-size:12.8px;color:rgb(17,85,204)\" target=\"_blank\">paul@example.com</a><br></div><div><a href=\"https://www.meetingbird.com/m/xxxxxx\" target=\"_blank\">Schedule a meeting with me here</a></div><div><img src=\"https://drive.google.com/a/sigparser.com/uc?id=1GUhMvrGnJMCfkge1HMqyKFQCLSJNXcw-&amp;export=download\" width=\"200\" height=\"90\"><br></div></div>Listen to podcasts? I was recently on the <a href=\"https://www.futuretechpodcast.com/podcasts/digging-up-the-data-your-company-has-needs-and-cant-access-paul-mendoza-sigparser/\" target=\"_blank\"><b>FutureTech Podcast</b></a> talking about SigParser and use cases other customers are using it for. 
</div></div></div></div></div></div></div></div></div></div></div></div><br></div></div><br><div class=\"gmail_quote\"><div dir=\"ltr\" class=\"gmail_attr\">On Mon, May 11, 2020 at 9:40 AM Outlook Tester &lt;<a href=\"mailto:outlook.tester@salesforceemail.com\">outlook.tester@salesforceemail.com</a>&gt; wrote:<br></div><blockquote class=\"gmail_quote\" style=\"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex\">\n\n\n\n\n\n<div lang=\"EN-US\">\n<div class=\"gmail-m_-2662285044572695259WordSection1\">\n<p class=\"MsoNormal\">This is a reply from the test account.<u></u><u></u></p>\n<p class=\"MsoNormal\"><u></u> <u></u></p>\n<div style=\"border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in\">\n<p class=\"MsoNormal\"><b>From:</b> Paul Mendoza &lt;<a href=\"mailto:paul@example.com\" target=\"_blank\">paul@example.com</a>&gt; <br>\n<b>Sent:</b> Sunday, April 12, 2020 3:25 PM<br>\n<b>To:</b> Outlook Tester &lt;<a href=\"mailto:outlook.tester@salesforceemail.com\" target=\"_blank\">outlook.tester@salesforceemail.com</a>&gt;<br>\n<b>Subject:</b> One more test email at 3:25 PM<u></u><u></u></p>\n</div>\n<p class=\"MsoNormal\"><u></u> <u></u></p>\n<div>\n<p class=\"MsoNormal\"><br clear=\"all\">\n<u></u><u></u></p>\n<div>\n<div>\n<div>\n<div>\n<div>\n<div>\n<div>\n<div>\n<div>\n<div>\n<div>\n<div>\n<p class=\"MsoNormal\"><b><span style=\"font-size:9.5pt;font-family:Tahoma,sans-serif;color:rgb(61,133,198)\">Paul Mendoza, </span></b><span style=\"font-size:9.5pt;font-family:Tahoma,sans-serif;color:rgb(61,133,198)\">Founder</span><u></u><u></u></p>\n<div>\n<div>\n<p class=\"MsoNormal\"><span style=\"font-size:10pt;font-family:&quot;Arial Narrow&quot;,sans-serif;color:rgb(102,102,102)\">Mobile 760-917-3753</span><span style=\"font-size:9.5pt\"><u></u><u></u></span></p>\n</div>\n<div>\n<p class=\"MsoNormal\"><span style=\"font-size:10pt;font-family:&quot;Arial 
Narrow&quot;,sans-serif;color:rgb(102,102,102)\">SigParser</span><span style=\"font-size:9.5pt\"><u></u><u></u></span></p>\n</div>\n<div>\n<p class=\"MsoNormal\"><span style=\"font-size:9.5pt\"><a href=\"mailto:paul@example.com\" target=\"_blank\"><span style=\"font-family:Tahoma,sans-serif;color:rgb(17,85,204)\">paul@example.com</span></a><u></u><u></u></span></p>\n</div>\n<div>\n<p class=\"MsoNormal\"><span style=\"font-size:9.5pt\"><a href=\"https://www.meetingbird.com/m/xxxxxx\" target=\"_blank\">Schedule a meeting with me here</a><u></u><u></u></span></p>\n</div>\n<div>\n<p class=\"MsoNormal\"><span style=\"font-size:9.5pt\"><img border=\"0\" width=\"200\" height=\"90\" style=\"width: 2.0833in; height: 0.9375in;\" id=\"gmail-m_-2662285044572695259_x0000_i1025\" src=\"https://ci6.googleusercontent.com/proxy/TTpjUlFcjmphqTPKcbTFGb7TsHUk5MzP3P1Wt2uZYLjMzlO0UPeF7MAgaUwFk4hqlFafCMhmzlmkc3FUbGH4ijNXkqx9DAsv-_3CFnCTmZaZhMlONJqrrR-oGfWMfwqGpDgk301HHsijRMhsymfOCkhNKg=s0-d-e1-ft#https://drive.google.com/a/sigparser.com/uc?id=1GUhMvrGnJMCfkge1HMqyKFQCLSJNXcw-&amp;export=download\"></span><span style=\"font-size:9.5pt\"><u></u><u></u></span></p>\n</div>\n</div>\n<p class=\"MsoNormal\">Listen to podcasts? I was recently on the <a href=\"https://www.futuretechpodcast.com/podcasts/digging-up-the-data-your-company-has-needs-and-cant-access-paul-mendoza-sigparser/\" target=\"_blank\">\n<b>FutureTech Podcast</b></a> talking about SigParser and use cases other customers are using it for.\n<u></u><u></u></p>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n</div>\n\n</blockquote></div></div>\n"
}
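A response like the one above can be consumed with any JSON library. The sketch below uses a trimmed copy of the payload and only field names that appear in it. Reading `Emails` as the messages of the reply chain, newest first, is an inference from this one sample rather than a documented guarantee, and `summarize_chain` is a helper name made up for the example.

```python
import json

# Trimmed copy of the response above (raw string so \r\n stays a JSON escape).
response_text = r"""
{
    "CleanedBodyPlain": "Another response in the chain.\r\n\r\n",
    "Subject": "Re: One more test email at 3:25 PM",
    "Emails": [
        {"CleanedBodyPlain": "Another response in the chain.\r\n\r\n",
         "FromName": "Paul Mendoza", "Date": "2020-05-11T16:41:16+00:00"},
        {"CleanedBodyPlain": "This is a reply from the test account.\r\n\r\n",
         "FromName": "Outlook Tester", "Date": "2020-05-11T09:40:00"},
        {"CleanedBodyPlain": null,
         "FromName": "Paul Mendoza", "Date": "2020-04-12T15:25:00"}
    ]
}
"""


def summarize_chain(parsed: dict) -> list[str]:
    """Return one line per message in the chain: date, sender, cleaned body."""
    lines = []
    for email in parsed["Emails"]:
        # CleanedBodyPlain can be null, as in the last entry of the sample.
        body = (email["CleanedBodyPlain"] or "").strip() or "(no body)"
        lines.append(f'{email["Date"]} {email["FromName"]}: {body}')
    return lines


parsed = json.loads(response_text)
summary = summarize_chain(parsed)
print(parsed["Subject"])
for line in summary:
    print(line)
```

Fields such as `CleanedBodyPlain`, `CleanedBodyHtml`, and `Subject` are null in parts of the sample, so calling code should treat them as optional.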

        

Why Email Parsing is Hard

There are a lot of issues that need to be solved when writing your own email parser. We have spent years developing and optimizing email parsers to make it easy for developers to programmatically extract email messages out of raw email content.

Here are some of the challenges that need to be addressed when writing an email parser:
  • Email signature identification
  • Various formats for headers
  • Reply chains indicated by > or multiple >>>
  • Some lines look like signatures but aren’t
  • Corrupted email headers
  • Reply headers split across lines, which is common in plain text emails
  • Multi-language support if required
  • Header formats change over time
Because of these challenges, we suggest not coding your own email parsing algorithm. The problem is non-trivial, and we've spent years working on it. There are a number of open source email parsing solutions, and we've tried most of them, but we've found that they do not account for all of the variations of content that exist in raw email files. Most of our users tried open source solutions before deciding to use SigParser.
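To make one of the challenges above concrete, here is a deliberately naive cleaner of the kind many people write first. It is not SigParser's algorithm; `naive_clean` and its single regular expression are made up for this illustration. It works when the reply header sits on one line and fails when a mail client wraps the header across two lines, as in the reply header of the sample response earlier on this page.

```python
import re

# Matches a one-line attribution such as
# "On Mon, May 11, 2020 at 9:40 AM Tester <t@example.com> wrote:"
ATTRIBUTION = re.compile(r"^On .+ wrote:$")


def naive_clean(body: str) -> str:
    """Stop at the first one-line 'On ... wrote:' header and drop '>' quoted lines."""
    kept = []
    for line in body.splitlines():
        if ATTRIBUTION.match(line.strip()):
            break  # everything after the reply header is treated as quoted text
        if line.lstrip().startswith(">"):
            continue  # quoted line from an earlier message
        kept.append(line)
    return "\n".join(kept).strip()


single_line = (
    "Sounds good.\n\n"
    "On Mon, May 11, 2020 at 9:40 AM Tester <t@example.com> wrote:\n"
    "> Original text\n"
)
wrapped = (
    "Sounds good.\n\n"
    "On Mon, May 11, 2020 at 9:40 AM Outlook Tester <\n"
    "outlook.tester@salesforceemail.com> wrote:\n"
    "> Original text\n"
)

print(repr(naive_clean(single_line)))  # only the new message survives
print(repr(naive_clean(wrapped)))      # the wrapped reply header leaks through
```

Each extra rule added to cover a case like this tends to create new false positives elsewhere, for example a real sentence that starts with "On" and ends with "wrote:".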

Want to Test the SigParser API?

You can paste a MIME-encoded email in the input area below. Enter your email address and click the "Get Parsed JSON" button to receive an email containing a JSON file with the parsed contents of your email.


Not sure how to get a MIME email?
In Gmail click the three dots on any email and click "Show Original".
Or open any .EML file and paste the contents below.


Enter Your Email Address
Enter your email address and click the button below so we can send you the parsed JSON content from your MIME email.

Multiple Options for Developers

SigParser has multiple solutions for developers looking to parse contents out of emails. Click on each option to learn more.

Usage Pricing

Parse a limited number of emails every month with an on-premise application or web API
$249
per month

Up to 25,000 emails parsed per month
$0.01 per additional email

Unlimited Pricing

Parse an unlimited number of emails every month with an on-premise application
$699
per month

Unlimited email parsing on up to 3 machines
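For choosing between the two plans, the monthly cost can be worked out from the prices listed above. This sketch assumes the listed prices are the only charges (no taxes or discounts) and works in whole cents to avoid rounding surprises. It also ignores the non-price differences: the Unlimited plan is described as an on-premise application on up to 3 machines.

```python
USAGE_BASE_CENTS = 249_00       # $249 per month
USAGE_INCLUDED_EMAILS = 25_000  # emails covered by the base price
USAGE_OVERAGE_CENTS = 1         # $0.01 per additional email
UNLIMITED_CENTS = 699_00        # $699 per month


def usage_plan_cents(emails_per_month: int) -> int:
    """Monthly cost of the Usage plan, in cents."""
    extra = max(0, emails_per_month - USAGE_INCLUDED_EMAILS)
    return USAGE_BASE_CENTS + extra * USAGE_OVERAGE_CENTS


def break_even_emails() -> int:
    """Monthly volume at which the Usage plan costs the same as Unlimited."""
    return USAGE_INCLUDED_EMAILS + (UNLIMITED_CENTS - USAGE_BASE_CENTS) // USAGE_OVERAGE_CENTS


for volume in (10_000, 40_000, break_even_emails()):
    print(f"{volume:>7,} emails: ${usage_plan_cents(volume) / 100:,.2f}")
```

At these prices the two plans cost the same at 70,000 emails per month: $249 plus 45,000 additional emails at $0.01 each comes to $699. Below that volume the Usage plan is cheaper.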

Ready to Try for FREE?

Try the SigParser API for FREE with no commitment. Schedule a 15-minute web conference to get an overview of SigParser and an API key with 500 free API requests. Upgrade or downgrade at any time. Our API is entirely serverless and stateless.