The huge leaps in Big Data and analytics over the past few years have meant that the average business user is now grappling with a whole new lexicon of tech terminology. This can breed confusion, as people aren’t sure how the various terms and approaches differ. In my experience, ‘data mining’ and ‘machine learning’ are a prime example of this.
In this article, I define both data mining and machine learning, and set out how the two approaches differ. So if you’ve never quite grasped the difference, this article is for you.
What is data mining?
Data mining is a subset of business analytics and refers to exploring an existing large dataset to unearth previously unknown patterns, relationships and anomalies that are present in the data. It gives us the ability to find completely new insights that we weren’t necessarily looking for – unknown unknowns, if you like.
For example, if a business has a lot of data on customer churn, it could apply a data mining algorithm to find unknown patterns in the data and identify new associations that could indicate customer churn in the future. In this way, data mining is frequently used in retail to spot patterns and trends.
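To make the churn example a little more concrete, here is a minimal sketch of one simple data mining idea: counting which attributes tend to appear together. The customer records, the attribute names and the `min_support` cut-off are all made up for illustration, and real data mining tools use far more sophisticated algorithms than this.

```python
from collections import Counter
from itertools import combinations

# Hypothetical customer records: the set of attributes observed for each customer.
customers = [
    {"monthly_contract", "support_calls_high", "churned"},
    {"monthly_contract", "support_calls_high", "churned"},
    {"annual_contract", "support_calls_low"},
    {"monthly_contract", "support_calls_low"},
    {"annual_contract", "support_calls_high", "churned"},
    {"annual_contract", "support_calls_low"},
]


def mine_pairs(records, min_support=0.3):
    """Return attribute pairs that co-occur in at least `min_support` of the records."""
    pair_counts = Counter()
    for record in records:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(record), 2):
            pair_counts[pair] += 1
    total = len(records)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


frequent_pairs = mine_pairs(customers)
for pair, support in sorted(frequent_pairs.items(), key=lambda item: -item[1]):
    print(pair, round(support, 2))
```

Nobody told the code to look at support calls. The pairing of a high number of support calls with churn simply surfaces as the most frequent association in this toy data, which is the kind of unlooked-for pattern data mining is meant to find.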
What is machine learning?
Machine learning is a subset of artificial intelligence (AI). With machine learning, computers analyse large data sets and then ‘learn’ patterns that will help them make predictions about new data sets. Apart from the initial programming and maybe some fine-tuning, the computer doesn’t need human interaction to learn from the data.
Put simply, machine learning is about teaching computers to learn a bit like humans do, by interpreting information and learning from successes and failures. As an analytic process, it’s particularly useful for predicting outcomes. So, Netflix predicting you may want to watch Ozark next, based on the viewing preferences of other users with similar profiles, is an example of machine learning in action. Real-time fraud detection on credit card transactions is another example.
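The idea of recommending a title based on users with similar profiles can be sketched in a few lines. This is only an illustration of a basic similarity-based recommender, not a description of how Netflix actually does it. The user names, the viewing histories and the `jaccard` and `recommend` helpers are all hypothetical.

```python
# Hypothetical viewing histories for existing users.
histories = {
    "ana": {"Ozark", "Breaking Bad", "Narcos"},
    "ben": {"Breaking Bad", "Narcos", "Mindhunter"},
    "cara": {"The Crown", "Bridgerton"},
    "dev": {"Ozark", "Breaking Bad"},
}


def jaccard(a, b):
    """Similarity of two sets: size of the overlap divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(watched, histories):
    """Suggest the unseen title favoured by the most similar users."""
    scores = {}
    for other in histories.values():
        similarity = jaccard(watched, other)
        # Each unseen title earns credit in proportion to how similar its viewer is.
        for title in other - watched:
            scores[title] = scores.get(title, 0.0) + similarity
    return max(scores, key=scores.get) if scores else None


new_user = {"Breaking Bad", "Narcos"}
print(recommend(new_user, histories))  # prints "Ozark"
```

The prediction is about a viewer the system has not seen before, and it would change automatically as more viewing histories are added, without anyone rewriting the rules.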
Why do people confuse the two?
As you can see, there are some similarities between the two concepts:
- Both are analytics processes
- Both are good at pattern recognition
- Both are about learning from data so that we can improve decision making
- Both require large amounts of data to be accurate
In fact, machine learning may use some data mining techniques to build models and find patterns, so that it can make better predictions. And data mining can sometimes use machine learning techniques to produce more accurate analysis.
What are the key differences?
Data mining and machine learning may, at heart, both be about learning from data and making better decisions. But the way they go about this is different. Here are some of the key differences between the two:
- While data mining is simply looking for patterns that already exist in the data, machine learning goes beyond what’s happened in the past to predict future outcomes based on the pre-existing data.
- In data mining, the ‘rules’ or patterns are unknown at the start of the process. With machine learning, by contrast, the machine is usually given some rules or variables to understand the data and learn.
- Data mining is a more manual process that relies on human intervention and decision making. But, with machine learning, once the initial rules are in place, the process of extracting information and ‘learning’ and refining is automatic, and takes place without human intervention. In other words, the machine becomes more intelligent by itself.
- Data mining is used on an existing dataset (like a data warehouse) to find patterns. Machine learning, on the other hand, is trained on a ‘training’ data set, which teaches the computer how to make sense of data, and then to make predictions about new data sets.
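The last difference, training on one data set and then predicting on new data, can be sketched as follows. This is a deliberately tiny example under made-up assumptions: the numbers of support calls, the churn labels and the `fit_threshold` helper are all hypothetical, and a real model would use many more variables.

```python
def fit_threshold(values, labels):
    """Learn the cut-off that misclassifies the fewest training examples."""
    best_threshold, best_errors = None, len(values) + 1
    for candidate in sorted(set(values)):
        # Count training examples where "value >= candidate" disagrees with the label.
        errors = sum((v >= candidate) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


# Training data: support calls per customer, and whether that customer churned.
train_calls = [0, 1, 1, 2, 5, 6, 7, 9]
train_churned = [False, False, False, False, True, True, True, True]

# The 'learning' step: the cut-off comes from the data, not from a person.
threshold = fit_threshold(train_calls, train_churned)

# The prediction step: apply what was learned to customers not in the training set.
new_customers = [3, 8]
predictions = [calls >= threshold for calls in new_customers]
print(threshold, predictions)  # prints "5 [False, True]"
```

A person chose which variable to look at, but the cut-off itself was learned from the training set and then applied to customers the model had never seen.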
Clearly, there are some distinct differences between the two. Yet, as businesses look to become more and more predictive, we may see more overlap between machine learning and data mining in future. For example, more businesses may seek to improve their data mining analytics with machine learning algorithms.
Where to go from here
If you would like to know more about Machine Learning, AI and Big Data, check out my articles on:
Or browse other related articles.