Data Pt.2: What Companies Get Wrong About Data


Hello Automation Enthusiasts!

In the previous drops of the Principles of Automation project, I have defined:

Now that you have a base understanding of these concepts, I can discuss some of the challenges companies have in utilizing their data for automation.

How Companies Collect Digital Data

First, I will explain how companies collect the digital data that is eventually used for their automations:

Forms (Digital or Paper)

It is hard to do business with a company without filling out a form. Whether you are at a dentist’s office, buying shoes online, registering for a newsletter, creating a social media account, applying for a job, opening a bank account, or enrolling in health insurance, you must fill out a form. When you fill out a form, you are providing data to a company, and companies have the ability to use this data in perpetuity.

What is a Database?

Form data, like most digital data, is stored in databases. You can think of databases as electronic homes for data. These databases typically live on servers. A server, in its most basic form, is a computer that is connected to other computers. Just like you may spend most of your time at home but leave to visit a friend, form data can leave its database to go visit other computers when it is “called” by software.

When you fill out a paper form, companies digitize your data so it can live in a database. Either a human manually inputs the data into a computer, known as data entry, or a specialty scanner “reads” the data from a piece of paper and digitizes it automatically. Some examples of scanned data are:

  • Scantron machines for standardized tests
  • Ballot machines (for submitting votes)
  • Mobile check deposit (via smartphone)


Once companies have your form data, every time you come back to do business with them, they can associate the new data you provide with your digital record. For example, if you filled out an account creation form to sign up for a loyalty card at your favorite coffee shop, every time you make a purchase using this card, the company can associate that purchase with your account.
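To make the loyalty card example concrete, here is a minimal sketch of how purchases could be tied back to an account. The card ID, field names, and the `purchase_history` function are all made up for illustration; a real coffee shop system would do this inside its database.

```python
from collections import defaultdict

# Hypothetical account record created by the sign-up form, keyed by card ID.
accounts = {"CARD-001": {"name": "Jane Doe"}}

# Hypothetical purchases made later with the loyalty card.
purchases = [
    {"card_id": "CARD-001", "item": "latte", "price": 4.50},
    {"card_id": "CARD-001", "item": "muffin", "price": 3.25},
]


def purchase_history(accounts, purchases):
    """Group purchased items under the account whose card was used."""
    history = defaultdict(list)
    for purchase in purchases:
        # Only attach purchases that match a known account.
        if purchase["card_id"] in accounts:
            history[purchase["card_id"]].append(purchase["item"])
    return dict(history)


history = purchase_history(accounts, purchases)
```

The card ID does the work here: it is the one piece of data that both the form and every later purchase share.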

If you filled out an online form on a company’s website, the company can collect additional data from you whether or not you make a purchase. A company can track every action you take on its website, such as viewing the Summer Sale page, liking or commenting on a photo, or adding items to your online shopping cart.

Sometimes, you don’t even have to go back to the company’s website, because the company placed a small file called a cookie in your browser, so you leave crumbs of information for the company to collect as you browse. Or, in the case of an app, you may have allowed the company to track and collect your data as you use your device.

Companies Have Tons of Data

As you may have guessed, companies can accumulate LOTS of data. Typically, it’s after a company has amassed a giant fortress of data that someone thinks, “Hey, I heard about this automation thing, let’s use this data for automation! I mean why not, it’s just hanging out in a database, waiting to be called by some software to go to work, right?” In theory, yes. But since the data may not have been collected with the Principles of Automation in mind, it likely needs to be standardized. The following are questions to ask yourself before using your data for automation:

1. Where Does the Data Live?

You may be thinking, “I thought you said data lives in a database.” I did, but which one? Companies can have multiple databases, and typically do. Let’s imagine I have a banana stand business. I start by selling bananas at a booth in a farmers market. Later, I decide to expand the business and rent a restaurant space in a building. Eager to grow, I also start selling my bananas online. At the farmers market, I probably kept my customers’ information on my mobile phone. Then, at the restaurant location, I purchased a point-of-sale (POS) system. For the online store, I purchased an out-of-the-box eCommerce platform. Each of these systems has its own database.

Now I have three different databases housing customer information. Yes, I theoretically could have used a single software platform for all three sales channels and, as a result, had a single database, but hindsight is 20/20, right? I didn’t know my humble banana stand was going to turn into a multi-channel empire!

Before I can use my data for automation, I must either select an automation platform that can integrate with all three databases, or perform a software migration to consolidate all my databases.

2. Do I Have Duplicates?

In theory, it would be relatively easy to consolidate all of my data if all of my customer names and addresses lived on my phone, their names and purchase histories lived on the POS, and their names and preferred payment methods lived on the eCommerce platform. I could use the customer name as a unique identifier and pair all of the data together. Except:

  1. What if two (or more) customers have the same name?
  2. The scenario I described is highly unlikely. My customer information probably overlaps across at least two out of the three databases.

So what do I do? I have to pick one database as the Source of Truth and deduplicate (dedupe) all the data. This may seem scary, but it is necessary for standardization.
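Here is a minimal sketch of what deduping against a Source of Truth could look like for my banana stand. Everything in it is an assumption for illustration: the sample records, the field names, the `dedupe` function, and especially the choice of email as the unique identifier (I picked it because, unlike a name, two customers are unlikely to share one). I also assume the eCommerce platform is the Source of Truth, so its values win whenever two systems disagree.

```python
# Hypothetical customer records from the three systems.
sources = {
    "phone": [
        {"email": "jane@example.com", "name": "Jane Doe", "address": "1 Main St"},
    ],
    "pos": [
        {"email": "Jane@Example.com", "name": "Jane K. Doe", "last_purchase": "bananas"},
    ],
    "ecommerce": [
        {"email": "jane@example.com", "name": "Jane Doe", "payment": "card"},
        {"email": "john@example.com", "name": "John Roe", "payment": "paypal"},
    ],
}

# Assumed ranking: the first system listed is the Source of Truth.
priority = ["ecommerce", "pos", "phone"]


def dedupe(sources, priority, key="email"):
    """Merge records that share a key, keeping the highest-priority value per field."""
    merged = {}
    for source_name in priority:
        for record in sources[source_name]:
            # Normalize the key so "Jane@Example.com" matches "jane@example.com".
            normalized = record[key].strip().lower()
            target = merged.setdefault(normalized, {})
            for field, value in record.items():
                # setdefault only fills a field that is still missing, so
                # values from higher-priority systems are never overwritten.
                target.setdefault(field, value)
            target[key] = normalized
    return list(merged.values())


customers = dedupe(sources, priority)
```

In this sketch Jane ends up as one record that carries her address, last purchase, and payment method, with her name taken from the eCommerce platform. Real deduplication is messier than this, because many records won't share a clean identifier at all.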

3. Do My Inputs Follow a Standard Pattern?

Is Jane Doe…

  • First Name: Jane, Last Name: Doe, or
  • Full Name: Jane Doe, or
  • First Name: Jane K., Last Name: Doe

If you have an Automation Mindset, you are likely pulling your hair out right now. As you can see, data must be standardized in order for it to be useful. It must live in the same database, be free of duplicates, and follow identical patterns. Without standardization, data becomes irrelevant and useless. That is why standardization is a Principle of Automation.
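To show what following an identical pattern can mean in practice, here is a minimal sketch that converts the three versions of Jane Doe into one shape. The field names and the `standardize_name` function are hypothetical, and the rule it uses (the last word of a full name is the last name) is a simplification that will not hold for every real name.

```python
def standardize_name(record):
    """Return a record with first_name, middle, and last_name fields.

    Assumes the record has either a non-empty full_name,
    or both first_name and last_name.
    """
    if "full_name" in record:
        parts = record["full_name"].split()
        # Simplifying assumption: the last word is the last name.
        first_parts, last = parts[:-1], parts[-1]
    else:
        first_parts = record["first_name"].split()
        last = record["last_name"].strip()
    first = first_parts[0] if first_parts else ""
    # Anything after the first word (such as "K.") is treated as a middle name.
    middle = " ".join(first_parts[1:])
    return {"first_name": first, "middle": middle, "last_name": last}


# The three ways Jane Doe showed up in my data.
variants = [
    {"first_name": "Jane", "last_name": "Doe"},
    {"full_name": "Jane Doe"},
    {"first_name": "Jane K.", "last_name": "Doe"},
]

standardized = [standardize_name(v) for v in variants]
```

After this step, all three records have the same first and last name fields, so software can compare them without guessing which format it is looking at.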

In case you forgot, here are the seven Principles of Automation:

The Seven Principles of Automation

  • Data
  • Standardization
  • Segmentation
  • Modularization
  • Time
  • Testing
  • Reporting

Congratulations on finishing this step of your automation journey. Hopefully, you now understand how companies end up with bad data and how you can avoid or resolve your data issues with standardization. Stay tuned by following Principles of Automation on Callin, Twitter, Instagram or my professional website, linked in all of my bios.
