Correlated Data

by Vitaly, Tuesday, June 21st, 2016

Opt-In List Manager 1.2.78 has been released.

What’s New

A new analysis feature called “Correlated Data”. It lets you keep or remove records where the data in one or more specified columns partially or fully correlates with the data in another column.

Example:

email,first_name,last_name
drewpwtm@yahoo.com,Andrew,Smallhouse

Suppose you need to find the records where column 2 (first name) or column 3 (last name) correlates with column 1 (email). Two values are considered correlated if 3 characters in a row match, meaning that the same 3 or more characters, in the same order, appear anywhere in the email address and also in either the first name column or the last name column.

The software will take “Andrew”, split it into “and”, “ndr”, “dre”, “rew” and check for each of those in the email address. It will find a match on “dre”, so the row is a KEEP row.
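
The splitting step can be sketched in a few lines of Python. This is only an illustration of the idea, not the code Opt-In List Manager runs: the function names `trigrams` and `first_match` are made up for this example, and lowercasing both values before comparing is an assumption based on “Andrew” being split into lowercase pieces above.

```python
def trigrams(text, n=3):
    """Return every run of n consecutive characters in text, lowercased."""
    text = text.lower()  # assumption: comparison ignores letter case
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def first_match(value, target, n=3):
    """Return the first n-character piece of value found in target, or None."""
    target = target.lower()
    for chunk in trigrams(value, n):
        if chunk in target:
            return chunk
    return None


print(trigrams("Andrew"))                            # ['and', 'ndr', 'dre', 'rew']
print(first_match("Andrew", "drewpwtm@yahoo.com"))   # dre
```

Stopping at the first hit is enough here, because one matching piece already makes the row a KEEP row.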

Now pretend the email was khouz12345@yahoo.com instead, so that none of the pieces of “Andrew” are found (with the original address, both “dre” and “rew” would be).
The software would then take “Smallhouse”, split it into “sma”, “mal”, “all”, “llh”, “lho”, “hou”, “ous”, “use” and check for each of those in the email address.
In the case of khouz12345@yahoo.com it would find “hou” and KEEP the record.

If neither the first name nor the last name correlates with the email address, the record is removed.
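
Put together, the keep-or-remove decision for a whole list might look roughly like the Python sketch below. It is a guess at the logic described above, not the product's actual implementation: `correlates`, `filter_rows` and their parameters are hypothetical names, case-insensitive matching is an assumption, and the third sample row (qqq987@yahoo.com) is invented so that there is a record that gets removed.

```python
import csv
import io


def correlates(value, target, min_run=3):
    """True if any min_run consecutive characters of value occur in target."""
    value, target = value.lower(), target.lower()  # assumption: case is ignored
    return any(value[i:i + min_run] in target
               for i in range(len(value) - min_run + 1))


def filter_rows(rows, target_col, source_cols, min_run=3, keep=True):
    """Keep (keep=True) or remove (keep=False) rows where any source column
    correlates with the target column."""
    result = []
    for row in rows:
        matched = any(correlates(row[col], row[target_col], min_run)
                      for col in source_cols)
        if matched == keep:
            result.append(row)
    return result


# Sample data: the first two emails come from the example above,
# the third is made up so that one record fails both checks.
data = """email,first_name,last_name
drewpwtm@yahoo.com,Andrew,Smallhouse
khouz12345@yahoo.com,Andrew,Smallhouse
qqq987@yahoo.com,Andrew,Smallhouse
"""

rows = list(csv.DictReader(io.StringIO(data)))
kept = filter_rows(rows, "email", ["first_name", "last_name"])
print([row["email"] for row in kept])
# ['drewpwtm@yahoo.com', 'khouz12345@yahoo.com']
```

Calling `filter_rows` with `keep=False` returns the opposite set, which corresponds to removing the correlated records instead of keeping them.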
