I have a client with a problem. He came to me for an answer, and I don't even know which direction to point him in.
He has a large number of people's names and their cities. He also has a database of e-mail addresses.
He's not looking to spam them, but his business deals with thousands of signups a day, and approximately 20% of the time he gets e-mail addresses that are simply mistyped. Because of the software he's running, he can't check the e-mails when they're entered. He'd obviously like the right e-mails without dedicating his admin to calling people up to confirm, etc.
He has multiple other products / services which most of his customers also buy. So let's assume that the correct versions of those 20% of bad e-mail addresses exist in these other data sources.
Mind you, this is a "nice to have". He's just trying to save some manpower.
I've looked into data enrichment before, but frankly it's not my forte. What I really need is a system where I can feed in a CSV full of names and addresses and have the software heuristically match it up against the other data sources. In theory this could be done with something like a VLOOKUP, but given the amount of data I'm not confident that will scale to the level he's looking to expand to.
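A minimal sketch of that kind of heuristic matching, using only the Python standard library. Everything here is an assumption for illustration: the column names (`name`, `city`, `email`), the sample rows, the helper names, and the 0.85 similarity threshold are all made up, and the trusted rows stand in for the other data sources. Grouping candidates by city first keeps each signup from being compared against every record, which is what makes a plain lookup approach slow at volume.

```python
import csv
import difflib
import io
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())


def build_index(trusted_rows):
    """Group trusted records by normalized city, so only same-city names are compared."""
    index = defaultdict(list)
    for row in trusted_rows:
        index[normalize(row["city"])].append(row)
    return index


def best_match(signup, index, threshold=0.85):
    """Return (trusted_row, score) for the most similar name in the same city.

    Returns (None, 0.0) when no candidate reaches the (hypothetical) threshold.
    """
    best, best_score = None, 0.0
    name = normalize(signup["name"])
    for candidate in index.get(normalize(signup["city"]), []):
        score = difflib.SequenceMatcher(None, name, normalize(candidate["name"])).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None, 0.0


def read_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample data; in practice these would be read from the real CSV exports.
signups = read_csv(
    "name,city,email\n"
    "Jon Smith,Austin,jon.smith@exmaple.com\n"
    "Ana Lopez,Denver,ana@example.com\n"
)
trusted = read_csv(
    "name,city,email\n"
    "John Smith,Austin,john.smith@example.com\n"
    "Maria Diaz,Denver,maria@example.com\n"
)

index = build_index(trusted)
for row in signups:
    match, score = best_match(row, index)
    suggestion = match["email"] if match else "(no confident match)"
    print(row["name"], "->", suggestion, round(score, 2))
```

Matches below the threshold are left alone rather than guessed at, so anything the script can't resolve confidently could still go to a person or a paid service. The threshold would need tuning against real data.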
I'm playing around with Open Semantic right now. It seems powerful, but I really don't know how to get from Point A to Point Z.
Thanks!
PS: There are companies out there that can do this for something like .06/record, so that's always an option.