If you use Swift and need to parse emails, this guide is for you.
We’ll cover:
- Why parsing emails is hard
- Swift Code Example for parsing emails with SigParser
Why parsing emails is hard
Splitting Email Chains
Splitting an email chain on its reply headers in Swift is tough. No two email clients seem to produce the same header format, and the clients change the way they format headers over time.
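To make the difficulty concrete, here is a minimal sketch of the naive approach: one regular expression that looks for the English "On <date>, <name> wrote:" line that some clients place above a quoted reply. The helper name `splitOnReplyHeaders` and the pattern are illustrative assumptions, not how SigParser works internally.

```swift
import Foundation

/// Naively splits an email body at lines shaped like "On <date>, <name> wrote:".
/// Hypothetical helper, for illustration only.
func splitOnReplyHeaders(_ body: String) -> [String] {
    // "^" and "$" match at line boundaries, so the header must fit on one line.
    let pattern = "^On .+ wrote:[ \\t]*$"
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.anchorsMatchLines]) else {
        return [body]
    }
    let fullRange = NSRange(body.startIndex..<body.endIndex, in: body)
    var sections: [String] = []
    var sectionStart = body.startIndex
    for match in regex.matches(in: body, options: [], range: fullRange) {
        guard let headerRange = Range(match.range, in: body) else { continue }
        // Everything before this header belongs to the previous message.
        sections.append(String(body[sectionStart..<headerRange.lowerBound]))
        sectionStart = headerRange.upperBound
    }
    // Whatever follows the last header is the oldest message in the chain.
    sections.append(String(body[sectionStart...]))
    return sections
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }
}
```

A Gmail-style reply splits into two sections, but a chain that uses a "From: / Sent: / Subject:" block, a localized header, or a header wrapped across two lines comes back as a single undivided section. Every new format needs another pattern.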
Signature and Contact Detail Detection
Parsing email signatures with Swift is also difficult. Many people think a couple of regular expressions will do the job, but the longer you work on the problem, the harder it becomes. Here are some of the major things you'd need to handle:
- No two signatures are formatted the same way
- Phone numbers come in many different formats
- Each phone number needs a type (fax vs. mobile vs. work)
- The phone type indicator has lots of variations. For example, Mobile vs. M: vs. Cell vs. C: and many others
- Titles can be incredibly difficult to capture without also picking up a lot of wrong information
- Locations are tough. Very few people include a full address. Often they'll give only the city and state but no country. Even street address formats differ massively by country
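As a rough illustration of the phone type problem alone, the sketch below maps a label found in front of a number to a type. The `PhoneType` enum, the `phoneType(forLabel:)` helper and the label table are hypothetical; apart from the Mobile, M:, Cell and C: variants mentioned above, the labels are guesses at what signatures might contain.

```swift
import Foundation

/// Phone categories from the list above, plus a fallback.
enum PhoneType {
    case mobile, work, fax, unknown
}

/// Maps a label such as "M:" or "Cell" to a phone type.
/// The table is a small hypothetical sample, far from complete.
func phoneType(forLabel label: String) -> PhoneType {
    // Normalize: lowercase and keep letters only, so "M:" and "m." both become "m".
    let key = label.lowercased().filter { $0.isLetter }
    switch key {
    case "m", "mob", "mobile", "c", "cell":
        return .mobile
    case "w", "work", "o", "office", "d", "direct":
        return .work
    case "f", "fax":
        return .fax
    default:
        return .unknown
    }
}
```

Even this small table leaves gaps: labels in other languages, numbers with no label at all, and labels that follow the number instead of preceding it all fall through to `.unknown`.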
Then there is the problem of identifying where the signature sits within the email, which is really hard. We use a machine learning algorithm backed by a large set of labeled validation emails. We've been labeling our test set for years across many organizations.
This is why we expose a simple email parsing API for use with Swift
All of the above is why we suggest using our email parsing service to parse emails from Swift. As we improve the service, you'll automatically get the improvements without redeploying any of your code. If an email client like Gmail starts using a new reply header format, we'll have a fix deployed within days and you won't have to do anything.
The SigParser Email Parsing API
The SigParser Email Parsing API is a serverless, stateless email parsing API that is easy to call from Swift. It can extract contacts and split emails into sections. It can find phone numbers, titles and addresses and attribute them to the correct contact. It even takes care of deduping contacts for you if the same email address appears more than once in the email.
Stateless means we store none of the email contents. It is a processing-only service. We only store some high-level statistics, such as the length of the email or how long it took to process.
Example: Parse an Email with Swift
Here is how to call the SigParser API with Swift and convert the response into a nice model. The model definition is below.
func fetchData() {
    guard let url = URL(string: "https://ipaas.sigparser.com/api/Email") else { return }

    // Fields of the email to parse; replace the sample values with real data.
    let parameters: [String: String] = [
        "subject": "mr.John",
        "from_address": "jsmith@example.com",
        "from_name": "John Smith",
        "htmlbody": "description",
        "plainbody": "This is the body of the email",
        "date": "Current date"
    ]

    var urlRequest = URLRequest(url: url)
    urlRequest.httpMethod = "POST"
    urlRequest.setValue("no-cache", forHTTPHeaderField: "cache-control")
    urlRequest.setValue("p1rrIkuP6v6famS623AGs1hA8MpBJ13b6VeMNkmz", forHTTPHeaderField: "x-api-key")
    urlRequest.setValue("application/json", forHTTPHeaderField: "Content-Type")

    guard let httpBody = try? JSONSerialization.data(withJSONObject: parameters, options: []) else { return }
    urlRequest.httpBody = httpBody

    URLSession.shared.dataTask(with: urlRequest) { data, response, error in
        // Report transport errors instead of returning silently.
        if let error = error {
            print("Request failed: \(error)")
            return
        }
        if let response = response {
            print(response)
        }
        guard let model = self.getModelMap(data: data) else { return }
        // Got the object model; use it here.
        print("Parsed \(model.contacts?.count ?? 0) contacts")
    }.resume()
}

// Decodes the response body into the Model class defined below.
private func getModelMap(data: Data?) -> Model? {
    guard let data = data else { return nil }
    return try? JSONDecoder().decode(Model.self, from: data)
}
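If you would rather not build the body from a string dictionary, the same request can be sketched with an `Encodable` struct. The names `ParseEmailRequest` and `makeParseRequest` are hypothetical; the JSON keys, URL and header names are the ones used above. The function only builds the request, so you can inspect it in a unit test before handing it to `URLSession`.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives here on Linux
#endif

/// Request body; CodingKeys map the properties to the JSON keys used above.
struct ParseEmailRequest: Encodable {
    let subject: String
    let fromAddress: String
    let fromName: String
    let htmlBody: String
    let plainBody: String
    let date: String

    enum CodingKeys: String, CodingKey {
        case subject, date
        case fromAddress = "from_address"
        case fromName = "from_name"
        case htmlBody = "htmlbody"
        case plainBody = "plainbody"
    }
}

/// Builds the POST request without sending it.
/// `apiKey` is a placeholder parameter; pass in your own key.
func makeParseRequest(_ payload: ParseEmailRequest, apiKey: String) throws -> URLRequest {
    guard let url = URL(string: "https://ipaas.sigparser.com/api/Email") else {
        throw URLError(.badURL)
    }
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue(apiKey, forHTTPHeaderField: "x-api-key")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    // JSONEncoder throws on failure, so the caller sees why encoding failed.
    request.httpBody = try JSONEncoder().encode(payload)
    return request
}
```

For the `date` field you might pass something like `ISO8601DateFormatter().string(from: Date())`. Whether the API expects exactly that format is an assumption here, since the example above only shows a placeholder string.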
The Swift classes you can bind the response to look like this:
class Model: Decodable {
    var error: String?
    var contacts: [Contacts]?
    var isSpammyLookingEmailMessage: Bool?
    var isSpammyLookingSender: Bool?
    var isSpam: Bool?
    var fromLastName: String?
    var fromFirstName: String?
    var fromFax: String?
    var fromPhone: String?
    var fromAddress: String?
    var fromTitle: String?
    var fromMobilePhone: String?
    var fromOfficePhone: String?
    var fromLinkedInUrl: String?
    var fromTwitterUrl: String?
    var fromTwitterHandle: String?
    var fromEmailAddress: String?
    var emails: [Emails]?
    var fromLinkedInHandle: String?
    var duration: Float
    var cleanedemailbody: String?
    var cleanedeMailBodyIsHtml: Bool?
    var cleanedeMailBodyPlain: String?

    enum CodingKeys: String, CodingKey {
        case error, contacts, isSpammyLookingEmailMessage, isSpammyLookingSender, isSpam, emails, duration, cleanedemailbody
        case fromLastName = "from_LastName"
        case fromFirstName = "from_FirstName"
        case fromFax = "from_Fax"
        case fromPhone = "from_Phone"
        case fromAddress = "from_Address"
        case fromTitle = "from_Title"
        case fromMobilePhone = "from_MobilePhone"
        case fromOfficePhone = "from_OfficePhone"
        case fromLinkedInUrl = "from_LinkedInUrl"
        case fromTwitterUrl = "from_TwitterUrl"
        case fromTwitterHandle = "from_TwitterHandle"
        case fromEmailAddress = "from_EmailAddress"
        case fromLinkedInHandle = "from_LinkedInHandle"
        case cleanedeMailBodyIsHtml = "cleanedemailbody_ishtml"
        case cleanedeMailBodyPlain = "cleanedemailbody_plain"
    }

    required init(from decoder: Decoder) throws {
        let values = try decoder.container(keyedBy: CodingKeys.self)
        // try? leaves a property nil when its key is missing or has an unexpected type.
        self.error = try? values.decode(String.self, forKey: .error)
        self.isSpammyLookingEmailMessage = try? values.decode(Bool.self, forKey: .isSpammyLookingEmailMessage)
        self.isSpammyLookingSender = try? values.decode(Bool.self, forKey: .isSpammyLookingSender)
        self.isSpam = try? values.decode(Bool.self, forKey: .isSpam)
        self.fromLastName = try? values.decode(String.self, forKey: .fromLastName)
        self.fromFirstName = try? values.decode(String.self, forKey: .fromFirstName)
        self.fromFax = try? values.decode(String.self, forKey: .fromFax)
        self.fromPhone = try? values.decode(String.self, forKey: .fromPhone)
        self.fromAddress = try? values.decode(String.self, forKey: .fromAddress)
        self.fromTitle = try? values.decode(String.self, forKey: .fromTitle)
        self.fromMobilePhone = try? values.decode(String.self, forKey: .fromMobilePhone)
        self.fromOfficePhone = try? values.decode(String.self, forKey: .fromOfficePhone)
        self.fromLinkedInUrl = try? values.decode(String.self, forKey: .fromLinkedInUrl)
        self.fromTwitterUrl = try? values.decode(String.self, forKey: .fromTwitterUrl)
        self.fromTwitterHandle = try? values.decode(String.self, forKey: .fromTwitterHandle)
        self.fromEmailAddress = try? values.decode(String.self, forKey: .fromEmailAddress)
        self.emails = try? values.decode([Emails].self, forKey: .emails)
        self.fromLinkedInHandle = try? values.decode(String.self, forKey: .fromLinkedInHandle)
        // Fall back to 0 rather than crashing with try! if "duration" is missing or malformed.
        self.duration = (try? values.decode(Float.self, forKey: .duration)) ?? 0
        self.cleanedemailbody = try? values.decode(String.self, forKey: .cleanedemailbody)
        self.cleanedeMailBodyIsHtml = try? values.decode(Bool.self, forKey: .cleanedeMailBodyIsHtml)
        self.cleanedeMailBodyPlain = try? values.decode(String.self, forKey: .cleanedeMailBodyPlain)
        self.contacts = try? values.decode([Contacts].self, forKey: .contacts)
    }
}
class Contacts: Decodable {
    var firstName: String?
    var lastName: String?
    var emailAddress: String?
    var phoneNumber: String?
    var mobilePhone: String?
    var voipPhone: String?
    var officePhone: String?
    var fax: String?
    var address: String?
    var title: String?
    var twitterUrl: String?
    var twitterHandle: String?
    var linkedInUrl: String?
    var linkedInHandle: String?

    enum CodingKeys: String, CodingKey {
        case firstName, lastName, emailAddress, phoneNumber, mobilePhone, voipPhone, officePhone, fax, address, title,
             twitterUrl, twitterHandle, linkedInUrl, linkedInHandle
    }

    required init(from decoder: Decoder) throws {
        let values = try decoder.container(keyedBy: CodingKeys.self)
        self.firstName = try? values.decode(String.self, forKey: .firstName)
        self.lastName = try? values.decode(String.self, forKey: .lastName)
        self.emailAddress = try? values.decode(String.self, forKey: .emailAddress)
        self.phoneNumber = try? values.decode(String.self, forKey: .phoneNumber)
        self.mobilePhone = try? values.decode(String.self, forKey: .mobilePhone)
        self.voipPhone = try? values.decode(String.self, forKey: .voipPhone)
        self.officePhone = try? values.decode(String.self, forKey: .officePhone)
        self.fax = try? values.decode(String.self, forKey: .fax)
        self.address = try? values.decode(String.self, forKey: .address)
        self.title = try? values.decode(String.self, forKey: .title)
        self.twitterUrl = try? values.decode(String.self, forKey: .twitterUrl)
        self.twitterHandle = try? values.decode(String.self, forKey: .twitterHandle)
        self.linkedInUrl = try? values.decode(String.self, forKey: .linkedInUrl)
        self.linkedInHandle = try? values.decode(String.self, forKey: .linkedInHandle)
    }
}
class Emails: Decodable {
    var fromEmailAddress: String?
    var fromName: String?
    var textBody: String?
    var htmlLines: [String]?
    var date: String?
    var didParseCorrectly: Bool?
    var to: [User]?
    var cc: [User]?
    var htmlBody: String?
    var spammyLookingEmail: Bool?
    var subject: String?

    enum CodingKeys: String, CodingKey {
        case textBody, htmlLines, date, didParseCorrectly, to, cc, htmlBody, spammyLookingEmail, subject
        case fromEmailAddress = "from_EmailAddress"
        case fromName = "from_Name"
    }

    required init(from decoder: Decoder) throws {
        let values = try decoder.container(keyedBy: CodingKeys.self)
        self.fromEmailAddress = try? values.decode(String.self, forKey: .fromEmailAddress)
        self.fromName = try? values.decode(String.self, forKey: .fromName)
        self.textBody = try? values.decode(String.self, forKey: .textBody)
        // try? (not try) so a missing "htmlLines" or "to" does not discard the whole email.
        self.htmlLines = try? values.decode([String].self, forKey: .htmlLines)
        self.date = try? values.decode(String.self, forKey: .date)
        self.didParseCorrectly = try? values.decode(Bool.self, forKey: .didParseCorrectly)
        self.to = try? values.decode([User].self, forKey: .to)
        self.cc = try? values.decode([User].self, forKey: .cc)
        self.htmlBody = try? values.decode(String.self, forKey: .htmlBody)
        self.spammyLookingEmail = try? values.decode(Bool.self, forKey: .spammyLookingEmail)
        self.subject = try? values.decode(String.self, forKey: .subject)
    }
}
class User: Decodable {
    var name: String?
    var emailAddress: String?

    enum CodingKeys: String, CodingKey {
        case name, emailAddress
    }

    required init(from decoder: Decoder) throws {
        let values = try decoder.container(keyedBy: CodingKeys.self)
        self.name = try? values.decode(String.self, forKey: .name)
        self.emailAddress = try? values.decode(String.self, forKey: .emailAddress)
    }
}
The JSON looks like this.
{
  "error": "string",
  "contacts": [
    {
      "firstName": "string",
      "lastName": "string",
      "emailAddress": "string",
      "emailAddressDomain": "string",
      "emailAddressDomainWithoutTLD": "string",
      "phoneNumber": "string",
      "mobilePhone": "string",
      "voipPhone": "string",
      "officePhone": "string",
      "fax": "string",
      "address": "string",
      "title": "string",
      "twitterUrl": "string",
      "twitterHandle": "string",
      "linkedInUrl": "string",
      "linkedInHandle": "string",
      "companyName": "string",
      "website": "string"
    }
  ],
  "isSpammyLookingEmailMessage": true,
  "isSpammyLookingSender": true,
  "isSpam": true,
  "from_LastName": "string",
  "from_FirstName": "string",
  "from_Fax": "string",
  "from_Phone": "string",
  "from_Address": "string",
  "from_Title": "string",
  "from_MobilePhone": "string",
  "from_OfficePhone": "string",
  "from_LinkedInUrl": "string",
  "from_TwitterUrl": "string",
  "from_TwitterHandle": "string",
  "from_EmailAddress": "string",
  "emails": [
    {
      "from_EmailAddress": "string",
      "from_Name": "string",
      "textBody": "string",
      "htmlLines": [
        "string"
      ],
      "date": "2019-05-05T22:27:56.124Z",
      "didParseCorrectly": true,
      "to": [
        {
          "name": "string",
          "emailAddress": "string"
        }
      ],
      "cc": [
        {
          "name": "string",
          "emailAddress": "string"
        }
      ],
      "htmlBody": "string",
      "spammyLookingEmail": true,
      "subject": "string",
      "cleanedBodyHtml": "string",
      "cleanedBodyPlain": "string"
    }
  ],
  "from_LinkedInHandle": "string",
  "duration": 0,
  "cleanedemailbody": "string",
  "cleanedemailbody_ishtml": true,
  "cleanedemailbody_plain": "string",
  "from_CompanyName": "string",
  "from_Website": "string",
  "from_EmailAddressDomain": "string",
  "from_EmailAddressDomainWithoutTLD": "string"
}
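The `date` of each email arrives as a string such as "2019-05-05T22:27:56.124Z". If you want a `Date` instead, a sketch like the following may help. `ParsedSummary`, `decodeSummary` and `parseTimestamp` are hypothetical names, and the struct reads only three of the fields shown above; it assumes every timestamp carries fractional seconds, as the sample does.

```swift
import Foundation

/// Trimmed-down, hypothetical view of the response that reads only three fields.
struct ParsedSummary: Decodable {
    struct Message: Decodable {
        let date: String?
    }
    let fromFirstName: String?
    let duration: Double?
    let emails: [Message]?

    enum CodingKeys: String, CodingKey {
        case duration, emails
        case fromFirstName = "from_FirstName"
    }
}

/// Decodes the summary, returning nil if the data is not valid JSON of this shape.
func decodeSummary(from data: Data) -> ParsedSummary? {
    return try? JSONDecoder().decode(ParsedSummary.self, from: data)
}

/// Parses ISO 8601 timestamps that include fractional seconds.
func parseTimestamp(_ text: String) -> Date? {
    let formatter = ISO8601DateFormatter()
    formatter.formatOptions = [.withInternetDateTime, .withFractionalSeconds]
    return formatter.date(from: text)
}

/// Small sample shaped like the response JSON above.
let sampleJSON = """
{
  "from_FirstName": "John",
  "duration": 0,
  "emails": [ { "date": "2019-05-05T22:27:56.124Z" } ]
}
"""
```

Calling `decodeSummary(from: Data(sampleJSON.utf8))` and passing the first email's `date` to `parseTimestamp` yields a `Date`. With these format options, a timestamp without fractional seconds may fail to parse, so a fallback formatter without `.withFractionalSeconds` is worth considering.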