Data cleaning tools: Which are the best, what the results mean, and how to keep your data clean

Your email list is one of the most valuable resources you have. It allows you to directly reach out to your customers, attract new fans, and communicate with qualified leads.

If your mailouts are reported as spam, or routinely bounce, your reputation will be hit and your mailers blocked. What’s more, with the General Data Protection Regulation (GDPR) - the EU’s strict rules on data management - coming into force next year, managing your data has never been more important. Good hygiene is essential.

So, how? Below are some of the best tools on the market to clean up your data - followed by some tips on how to keep it clean, going forward.

NeverBounce

Cost: Sliding scale from £0.008 per email for 0-10k contacts, to £0.001 per email for 3m-10m contacts

Pros: Fast turnaround; wide range of integrations

Cons: Limited free trial

NeverBounce offers a guarantee of 97% delivery. Its system works by detecting emails containing bad syntax and removing duplicate addresses from your list, at no cost. Then, using its validation tool, NeverBounce follows a 12-step verification process, checking each address up to 75 times from locations around the world. Beyond this, the tool reviews both the domain and the mailbox of each item, pinging the individual server to determine whether the address is valid or not. NeverBounce then keeps on going until it has a result.
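The free first stage described above (dropping malformed addresses and duplicates) can be approximated locally before you pay for validation. The sketch below is a rough illustration only, not NeverBounce's actual implementation: the names SIMPLE_EMAIL and pre_clean are invented here, and the syntax check is deliberately loose.

```python
import re

# Deliberately loose pattern (an assumption for this sketch): something before
# the @, something after it, and at least one dot in the domain part.
SIMPLE_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def pre_clean(addresses):
    """Return (kept, rejected): plausible, de-duplicated addresses and the
    raw entries that failed the syntax check."""
    kept, rejected, seen = [], [], set()
    for raw in addresses:
        # Normalise so "Ann@Example.com " and "ann@example.com" compare equal.
        # Lowercasing the whole address is a simplification made for this sketch.
        email = raw.strip().lower()
        if not SIMPLE_EMAIL.match(email):
            rejected.append(raw)
            continue
        if email in seen:  # duplicate of an address we already kept
            continue
        seen.add(email)
        kept.append(email)
    return kept, rejected
```

A check like this only catches obvious formatting problems. It cannot tell you whether a mailbox exists, which is what the paid verification step is for.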

NeverBounce claims this process provides higher accuracy at a lower cost and with a faster turnaround than the competition (it reports that it can process 10,000 emails within 10 minutes and 100,000 in an average of 45 minutes). The platform also comes with a wide range of integrations, as well as the ability to plug into your own platform programmatically and check email addresses in real time.

BriteVerify

Cost: Sliding scale from £0.0075 per email for 0-250k to £0.0035 per email for 1m-2m

Pros: Easy drag-and-drop interface; 24/7 support

Cons: Doesn’t accept all formats for upload

BriteVerify sets out to make email validation as painless as possible, with a simple drag-and-drop interface that allows you to paste your list directly into its forms. You can also enter your data by uploading a file from your email service provider. Following the scan, you can download your list, split between good and bad addresses, or you can automatically and permanently unsubscribe invalid addresses. It also allows you to validate entries in real time to keep your data clean. What are the results? Well, BriteVerify claims that it can cut your bounce rate by up to 98%.

DataValidation

Cost: Sliding scale from $0.007 per email for 100-10k to $0.001 per email for 3m

Pros: Range of integrations and an API; 24-hour weekday support; free health report

Cons: Doesn’t guarantee accuracy as a percentage

DataValidation’s unique selling point is a free list quality report prior to purchase - so you can get a sense of the general health of your list before putting any money down. The service also comes with an application programming interface (API), as well as direct integrations into email service providers such as MailChimp, AWeber, MailGun and SendGrid. Depending on the size of your database, validation can take anywhere from minutes to several hours. While DataValidation doesn’t guarantee the accuracy of its results as a percentage, it grades the emails on your list, with A/A+ as the best, giving you immediately actionable data.
 

Kickbox

Cost: Blocks of 500 validations for $5, 10k for $70, 100k for $400, 2.5m for $2,500

Pros: Drag-and-drop interface; real-time validation

Cons: There are cheaper options if your list is small

Much like BriteVerify, Kickbox comes with an easy-to-use drag-and-drop interface that allows you to drop your list directly into its browser-based forms, or alternatively import your database directly from your email service provider. It provides an API for real-time validation, and guarantees 95% accuracy for the results that it delivers. While the service is relatively expensive at the bottom end of the scale, charging for set blocks of validations, it offers unlimited email form authentications for free. What’s more, if you are dissatisfied with your results, Kickbox will provide refunds for tracked discrepancies.
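To see why block pricing is relatively expensive for small lists, it helps to convert each block into a per-email price. The short sketch below uses only the Kickbox block prices quoted above. The names KICKBOX_BLOCKS and per_email_cost are illustrative.

```python
# Kickbox block prices as quoted in this article: (validations, price in USD).
KICKBOX_BLOCKS = [(500, 5), (10_000, 70), (100_000, 400), (2_500_000, 2_500)]


def per_email_cost(validations, price):
    """Price of a block divided by the number of validations it buys."""
    return price / validations


for validations, price in KICKBOX_BLOCKS:
    cost = per_email_cost(validations, price)
    print(f"{validations:>9,} validations: ${cost:.4f} per email")
```

On these figures the smallest block works out at $0.01 per email, against $0.001 per email for the largest.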

HuBuCo Email List Cleaning

Cost: 10k for $19, 1m for $690

Pros: 24/7 support; cheap for small lists; full refund on offer if the service doesn’t work for you

Cons: No direct email service provider integrations
 

HuBuCo is relatively cheap, even when validating smaller batches, and still promises 99%+ deliverability. The service comes with 24/7 support and can reportedly validate 200k emails in under 180 minutes. The HuBuCo tool also provides an API to allow for real-time verification, though it doesn’t provide pre-built integrations with other platforms. On the plus side, the service doesn’t charge for emails it can’t verify, saving you money on every batch - which could add up to a significant amount at scale. And if, after using the service, you report a hard bounce rate of over 4%, HuBuCo promises to provide a full refund.
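Checking whether a campaign crosses the 4% refund threshold mentioned above is simple arithmetic. This is a minimal sketch, assuming you can read the number of emails sent and the number of hard bounces from your email service provider's report. The function names are illustrative and are not part of any HuBuCo tooling.

```python
REFUND_THRESHOLD = 0.04  # the "over 4%" hard bounce rate quoted above


def hard_bounce_rate(sent, hard_bounces):
    """Fraction of sent emails that hard-bounced."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return hard_bounces / sent


def exceeds_refund_threshold(sent, hard_bounces):
    """True when the hard bounce rate is strictly above the threshold."""
    return hard_bounce_rate(sent, hard_bounces) > REFUND_THRESHOLD
```

For example, 450 hard bounces out of 10,000 sent emails is a rate of 4.5%, which is over the threshold, while 300 out of 10,000 (3%) is not.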

Reading the results

Whatever tool you use, you'll get a report on the status of the email addresses that you’ve processed. While concerning entries might be removed automatically, it’s useful to understand what the findings mean for your business, for your list and for the way that you’re collecting email addresses. Taking NeverBounce's results as an example, here are the categories you are likely to see, and what they mean:

Valid: Hooray! This address has been confirmed as being real and accepting mail.

Invalid: Boo! An email address that doesn’t exist or isn’t accepting mail - meaning that emails sent to it will bounce. This is one you should remove.

Catchall: Some domains will accept all mail sent to them, whether the address exists or not (hence catchall). They are used by businesses keen to make sure that messages don’t go astray due to typos in the address, as well as by governments and educational institutions. While the individual addresses may very well exist, they are impossible to verify.

Unknown: The email address appears to be OK, but because the domain or server is not responding, it’s impossible to conclusively determine its validity. If you are using a dedicated email server with your own IPs, these addresses are safe to include - if not, they should be removed.

Disposable: These are temporary email accounts that people use to get past your signup form without handing over a real address - if you send mail to them, it will bounce, so these addresses should be removed.
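The guidance above can be expressed as a simple triage rule. The sketch below rests on some assumptions: the status strings mirror the category names listed here rather than any tool's exact export format, and has_dedicated_ip is a hypothetical flag you would set yourself. Catchall addresses go into a "review" bucket because they cannot be verified either way.

```python
def triage(status, has_dedicated_ip=False):
    """Map a validation status to an action: 'keep', 'remove' or 'review'."""
    status = status.strip().lower()
    if status == "valid":
        return "keep"
    if status in ("invalid", "disposable"):
        return "remove"
    if status == "catchall":
        return "review"  # impossible to verify; decide case by case
    if status == "unknown":
        # Only treated as safe when sending from your own dedicated IPs.
        return "keep" if has_dedicated_ip else "remove"
    raise ValueError(f"unrecognised status: {status!r}")


def split_list(results, has_dedicated_ip=False):
    """Group (email, status) pairs into keep / remove / review buckets."""
    buckets = {"keep": [], "remove": [], "review": []}
    for email, status in results:
        buckets[triage(status, has_dedicated_ip)].append(email)
    return buckets
```

Raising an error on an unrecognised status is a deliberate choice in this sketch: it is safer to stop than to silently keep or drop addresses that the rule does not understand.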

How to keep your email list clean

Once you have a clean list, it will pay to keep it that way. Here are some tips:

  • Check for typos - for example, if addresses don’t have an @ symbol or a full stop in the domain name, they are definitely invalid.

  • Remove any addresses containing the word “spam”.

  • Remove system addresses, such as postmaster@company.com.

  • Review your bounce list - and if an address registers a hard bounce, remove it from your list as soon as possible.

  • Cut inactive users - if subscribers never click, they’re probably not interested, so it’s better to cut them from your list entirely than have them report your email as spam. If you’re (justifiably) worried about removing valid users then try strategies to re-engage them first. If they don’t reconnect, then move ahead with cutting them.

  • Perhaps most importantly, several of the tools listed above offer real-time validation. This ensures only valid addresses can be added to your list, and will automate several of the steps described here.
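The manual checks in this list can be combined into one routine. What follows is a sketch under stated assumptions, not a replacement for a validation service: the set of system addresses beyond postmaster, the 180-day inactivity window, and the names SYSTEM_LOCAL_PARTS and removal_reason are all choices made for this example.

```python
from datetime import date, timedelta

# "postmaster" comes from the tip above; the other entries are assumptions.
SYSTEM_LOCAL_PARTS = {"postmaster", "abuse", "noreply", "no-reply", "mailer-daemon"}


def removal_reason(email, hard_bounced, last_click, today=None, max_idle_days=180):
    """Return a reason to drop the address, or None to keep it.

    last_click is the date of the subscriber's most recent click,
    or None if they have never clicked.
    """
    email = email.strip().lower()
    local, at, domain = email.partition("@")
    if not at or not local or "." not in domain:
        return "typo"  # no @ symbol, or no full stop in the domain
    if "spam" in email:
        return "contains 'spam'"
    if local in SYSTEM_LOCAL_PARTS:
        return "system address"
    if hard_bounced:
        return "hard bounce"
    today = today or date.today()
    if last_click is None or today - last_click > timedelta(days=max_idle_days):
        return "inactive"  # consider a re-engagement attempt before removing
    return None
```

Returning a reason rather than a plain yes or no makes it easy to treat the "inactive" group differently, for instance by sending a re-engagement email before cutting those subscribers.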

Conclusion

Getting to grips with your data is an essential part of your business - ensuring that you know who your customers are, and that you’re able to reach them effectively. With GDPR just around the corner, clean data has never been more important.