In most cases you already have the data.
But it’s scattered across your servers, the cloud, multiple databases, and Excel spreadsheets. And then there’s the third-party demographic, psychographic, and technographic information that’s crucial for turning incomplete data points into comprehensive customer and prospect profiles.
So how do you handle all of these sources in a way that lets you run analyses quickly and efficiently?
There are three main challenges.
- Processing billions of records
- Normalizing data from multiple sources
- Keeping costs under control
The Alteryx platform
Kristalytics uses the Alteryx platform for very-large-database management, which gives quick and meaningful access to big data, speeds development of cloud-based databases, and efficiently handles all forms of data from multiple sources.
Kristalytics holds licenses with many data providers and combines that data with the client’s proprietary data to develop predictive models that can be applied across a wide variety of marketing channels.
Our Chief Analytics Officer is an Alteryx ACE, a title held by only 24 people worldwide who are certified as the best of the best. We combine his technical expertise with strategic insights to turn big data into actionable and profitable advertising campaigns.
Types of Data
With access to so many different types of data, we can figure out exactly what attributes of a prospect indicate that they will be a high-value customer. Let’s look at an example:
If you’re targeting a 30-something adult who makes $75,000 a year and has children, that sounds specific enough, right? Well, we like to dig deeper: age, marital status, home ownership, number of children, age of children, and technical savviness, just to name a few. But those barely scratch the surface, because the not-so-obvious data points can be very enlightening: NASCAR or golf? Hip-hop or opera? CNN or Fox? Buick or BMW?
First, we run a large-scale analysis of your customer database and attach as many of these attributes as possible to each customer. Then we determine which attributes actually predict who will buy. Next, we attach the same attributes to your prospect database. Now we can see, based on your best customers, which prospects are most likely to buy. We take it one step further: because we have access to dozens of databases, once we know which attributes matter most, we can choose the lowest-cost source for that information.
With access to so much data, we can maximize its predictive value while minimizing usage costs.
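To make the idea concrete, here is a minimal Python sketch of the two steps described above: ranking attributes by how strongly they separate buyers from non-buyers, then picking the cheapest source for an attribute. It is an illustration only, not the actual Alteryx workflow. The customer records, the attribute names, the provider names, and the prices are all hypothetical, and the purchase-rate gap is a deliberately simple stand-in for a real predictive model.

```python
# Hypothetical customer records: attributes already attached, plus a purchase flag.
customers = [
    {"homeowner": True,  "has_children": True,  "bought": True},
    {"homeowner": True,  "has_children": False, "bought": True},
    {"homeowner": False, "has_children": True,  "bought": False},
    {"homeowner": False, "has_children": False, "bought": False},
    {"homeowner": True,  "has_children": True,  "bought": False},
    {"homeowner": False, "has_children": True,  "bought": True},
]

# Hypothetical per-record prices for each attribute from each data source.
source_prices = {
    "provider_a": {"homeowner": 0.05, "has_children": 0.02},
    "provider_b": {"homeowner": 0.03},
}


def purchase_rate_gap(records, attribute):
    """Purchase rate of records with the attribute minus the rate of those without."""
    with_attr = [r["bought"] for r in records if r[attribute]]
    without_attr = [r["bought"] for r in records if not r[attribute]]
    if not with_attr or not without_attr:
        return 0.0  # cannot compare when one group is empty
    return sum(with_attr) / len(with_attr) - sum(without_attr) / len(without_attr)


def cheapest_source(attribute, prices):
    """Name of the lowest-priced source that offers the attribute, or None."""
    offers = {src: p[attribute] for src, p in prices.items() if attribute in p}
    return min(offers, key=offers.get) if offers else None


# Rank attributes from most to least predictive (largest absolute gap first).
attributes = ["homeowner", "has_children"]
ranked = sorted(attributes, key=lambda a: abs(purchase_rate_gap(customers, a)), reverse=True)

for attribute in ranked:
    print(attribute, cheapest_source(attribute, source_prices))
```

In this toy data set, home ownership separates buyers from non-buyers while having children does not, so only the first attribute would be worth paying for.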
Data Cleansing
Our first step in working with data is to cleanse it. That means pulling both structured and unstructured data from disparate sources into a form that lets us work with all of it. Then we resolve conflicts between name and identity, which ensures that our data lines up with your customer data. Next, we validate household addresses. This is a great way to catch typos and other inconsistencies that prevent you from getting in touch with your customers and prospects.
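As a rough illustration of what cleansing involves, the Python sketch below normalizes names and street addresses so that records from different sources line up. It is a simplified example, not our production process: the function names and the abbreviation table are hypothetical, and real address validation would check each address against postal reference data rather than a hand-written map.

```python
import re

# Hypothetical abbreviation map, for illustration only.
STREET_ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "drive": "dr"}


def normalize_name(name):
    """Lowercase the name, drop punctuation, and collapse repeated whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def normalize_address(address):
    """Lowercase, drop punctuation, and standardize common street suffixes."""
    cleaned = re.sub(r"[^\w\s]", "", address.lower())
    words = [STREET_ABBREVIATIONS.get(word, word) for word in cleaned.split()]
    return " ".join(words)


# Two spellings of the same address normalize to the same string.
print(normalize_address("123 Main Street."))
print(normalize_address("123  MAIN St"))
```

Once both sides are normalized the same way, differences that remain are more likely to be genuine conflicts than formatting noise.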
Once the cleansing is done, it’s time for data matching.
Data Matching
Data matching ensures the data for each customer in your database is consistent.
We start by removing records that are clearly duplicates. Then we run several more advanced processes:
- De-duping: We reconcile “matching” records that include inconsistent information.
- Householding: We link customer records based on matches such as name, address, phone number, and email address.
- Merge: We consolidate data from multiple records into a single, best-of-breed survivor record.
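The three processes above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the matching logic we run in practice: the sample records are made up, duplicates are found by exact match on an already-normalized name and address, the survivor record simply keeps the first non-empty value for each field, and householding here links on address alone rather than on the full set of name, address, phone, and email matches.

```python
from collections import defaultdict

# Hypothetical, already-cleansed records.
records = [
    {"name": "jane doe", "address": "123 main st", "email": "jane@example.com", "phone": ""},
    {"name": "jane doe", "address": "123 main st", "email": "", "phone": "555-0100"},
    {"name": "john doe", "address": "123 main st", "email": "john@example.com", "phone": ""},
]


def merge_records(group):
    """Merge: build one survivor record from the first non-empty value per field."""
    survivor = {field: "" for field in group[0]}
    for record in group:
        for field, value in record.items():
            if value and not survivor[field]:
                survivor[field] = value
    return survivor


def dedupe(rows):
    """De-duping: group rows by (name, address) and merge each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["name"], row["address"])].append(row)
    return [merge_records(group) for group in groups.values()]


def household(rows):
    """Householding: link records that share an address (simplified rule)."""
    households = defaultdict(list)
    for row in rows:
        households[row["address"]].append(row["name"])
    return dict(households)


people = dedupe(records)
print(people)
print(household(people))
```

Here the two Jane Doe records collapse into one survivor that carries both the email address and the phone number, and Jane and John end up in the same household.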
Data Appending & Enrichment
Now we use our in-house database to enhance the customer records in your database.
- Demographics and psychographics: cluster assignment, economic stability, urban vs. rural measurement, and many others
- Geographic: state, metro, county, ZIP code, carrier route, block group, and household
- Property data: home value, number of rooms, date built, and ownership
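In code terms, appending is essentially a lookup that adds reference attributes to each customer record. The Python sketch below shows the idea under stated assumptions: the `household_id` key, the field names, and all of the values are hypothetical, and a real append would match on the cleansed and householded identifiers rather than a ready-made key.

```python
# Hypothetical reference data keyed by a household identifier.
reference_data = {
    "H001": {"home_value": 250000, "ownership": "owner", "urban_rural": "urban"},
    "H002": {"home_value": 180000, "ownership": "renter", "urban_rural": "rural"},
}

# Hypothetical customer records.
customers = [
    {"name": "jane doe", "household_id": "H001"},
    {"name": "sam roe", "household_id": "H999"},  # no match in the reference data
]


def enrich(customer, reference):
    """Return a copy of the customer record with reference attributes appended."""
    appended = reference.get(customer["household_id"], {})
    # The customer's own fields take priority if a key appears in both.
    return {**appended, **customer}


enriched = [enrich(customer, reference_data) for customer in customers]
for record in enriched:
    print(record)
```

Records with no match are passed through unchanged, so nothing in the original customer database is lost.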