Text Analytics: How To Analyse And Mine Words And Natural Language In Businesses
2 July 2021
Most businesses have a huge amount of text-based data, such as memos, company documents, emails, reports, media releases, customer records and communication, websites, blogs and social media posts. Until recently, this data wasn’t always that useful, at least in terms of easily extracting business-critical insights. But that has all changed thanks to text analytics.
Understanding text analytics
Text analytics, also known as text mining, is a process of extracting value from large quantities of unstructured text data. While the text itself is structured to make sense to a human being (e.g. a company report split into sensible sections), it is unstructured from an analytics perspective because it doesn’t fit neatly into a relational database or the rows and columns of a spreadsheet. Traditionally, the only structured part of text was the name of the document, the date it was created and who created it.
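To make the idea of "rows and columns" concrete, here is a minimal sketch of one common way to give text a structure: counting how often each word appears in each document. The document names and contents are invented for illustration, and real tools do far more than this (handling word forms, phrases, context and so on).

```python
import re
from collections import Counter

# Hypothetical documents; the names and text are invented for illustration.
documents = {
    "memo_01": "Delivery was late and the delivery team did not reply.",
    "email_02": "Great product, fast delivery, friendly team.",
}

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

# Count how often each word appears in each document.
counts = {name: Counter(tokenize(text)) for name, text in documents.items()}

# The shared vocabulary becomes the column headings of the table.
vocabulary = sorted(set().union(*counts.values()))

# One row per document, one column per word: text as rows and columns.
table = [[counts[name][word] for word in vocabulary] for name in documents]

for name, row in zip(documents, table):
    print(name, dict(zip(vocabulary, row)))
```

Once text is in a table like this, it can be sorted, filtered and compared in the same way as any other structured data.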
Access to huge text data sets and improved technical capability mean text can be analysed to extract high-quality information above and beyond what the document actually says. For example, text can be assessed for commercially relevant patterns, such as an increase or decrease in positive feedback from customers, or for new insights that could lead to product tweaks. As such, text analytics is now capable of telling us things we didn’t already know and, perhaps more importantly, had no way of knowing before. And these insights can be incredibly useful in business.
Text analytics is particularly useful for information retrieval, pattern recognition, tagging and annotation, information extraction, sentiment assessment and predictive analytics. It could, for example, shed light on what your customers think of your product or service, or highlight the most common issues that your customers complain about.
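As a rough illustration of how the most common customer issues might be surfaced, the sketch below tags each complaint with any issue category whose keywords it contains, then ranks the categories by frequency. The categories, keywords and complaints are all hypothetical, and simple keyword matching is only a stand-in for the more sophisticated methods commercial tools use.

```python
from collections import Counter

# Hypothetical issue categories and trigger words, invented for illustration.
ISSUE_KEYWORDS = {
    "delivery": {"late", "delayed", "shipping", "delivery"},
    "billing": {"invoice", "charged", "refund", "billing"},
    "support": {"rude", "unhelpful", "waiting", "hold"},
}

# Hypothetical customer complaints.
complaints = [
    "My order was delayed and nobody told me.",
    "I was charged twice and still have no refund.",
    "Shipping took three weeks.",
    "Support left me on hold for an hour.",
]

def issues_in(text):
    """Return the set of issue categories whose keywords appear in the text."""
    words = set(text.lower().replace(".", " ").replace(",", " ").split())
    return {issue for issue, keywords in ISSUE_KEYWORDS.items() if words & keywords}

# Count each category once per complaint, then rank by frequency.
issue_counts = Counter(issue for text in complaints for issue in issues_in(text))

for issue, n in issue_counts.most_common():
    print(f"{issue}: {n} complaint(s)")
```

Even a crude count like this can show where to look first; in this made-up sample, delivery problems come up more often than anything else.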
Making sure your text is analysis-ready
It’s not enough for the text to be in a digital format; it also needs to be datafied. If you copied a page from a book as a JPEG file, you would technically have a digital copy of the text, but it would be no good for running text analytics. What you need is datafied text, like the text in many e-readers, which lets you interact with it (by highlighting sections, adding notes, searching the text, etc.). So, any old paper files that you want to analyse will need to be rendered in a format that is not only digital but also datafied.
Once the text is ready, there are a number of commercially available text analytics tools that can help you. Which one you use will depend on your objective.
Text analytics in action
Unsure how you would use text analytics in practice? Say, for instance, you are concerned about the level of employee engagement in your company and decide to conduct an employee engagement survey. You could read through hundreds of questionnaire responses and that might give you some good ideas, or a sense of who is happy and who is not, but it wouldn’t really give you any indication of trends or what the collective was really feeling.
Text analytics would allow you to assess all that free-flowing unstructured text and establish trends or clusters of opinion across the business, its divisions and specific teams.
Text analytics is also having a big impact beyond the world of business. In healthcare, for example, companies are using text analytics to extract large amounts of information from patient medical records – information that can then be used to understand the overall health of the population and improve treatment methods. One such company, Apixio, analyses the information found in electronic healthcare records, such as GP notes, consultant notes, radiology notes, pathology results, etc. To analyse this information, which comes in a wide variety of formats and may even be handwritten, they first have to turn it into something that computers can analyse. They do this using OCR (optical character recognition) technology to create a textual representation of the information that computers can read and understand. The data can then be analysed at an individual patient level, or it can be aggregated across the population in order to derive big-picture insights around disease prevalence, treatment patterns, etc. Apixio hopes that by mining such practice-based clinical data for information – who has what condition, what treatments are working, etc. – we can learn a lot about the way we care for individuals and make improvements based on actual knowledge of what works and what doesn’t.
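Returning to the employee engagement survey, here is a minimal sketch of how free-text responses could be scored and then aggregated by team, moving from the individual level to the bigger picture. The word lists, team names and comments are all hypothetical, and counting positive and negative words is a deliberately simple stand-in for the sentiment models that commercial tools provide.

```python
import re
from collections import defaultdict

# Tiny hypothetical sentiment lexicon; real tools use far larger ones.
POSITIVE = {"enjoy", "supportive", "great", "proud", "valued"}
NEGATIVE = {"stressed", "ignored", "overworked", "unclear", "frustrated"}

# Hypothetical free-text survey responses as (team, comment) pairs.
responses = [
    ("Sales", "I enjoy my work and feel valued by my manager."),
    ("Sales", "Targets are unclear but the team is supportive and great."),
    ("Support", "We are overworked and I feel ignored."),
    ("Support", "I am proud of what we do but often stressed and frustrated."),
]

def sentiment_score(text):
    """Positive word count minus negative word count for one comment."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Group the scores by team so that trends show up above the individual level.
scores_by_team = defaultdict(list)
for team, comment in responses:
    scores_by_team[team].append(sentiment_score(comment))

average_by_team = {team: sum(s) / len(s) for team, s in scores_by_team.items()}

for team, average in average_by_team.items():
    print(f"{team}: average sentiment {average:+.1f}")
```

The same pattern of scoring each item and then grouping applies whether the grouping is by team, by division or by time period.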
A word of warning
Converting older, paper-based text documents into something that can be used for analysis can be very time-consuming and expensive, so it’s best to be selective rather than attempting to analyse everything you have lying around in your archives. Also keep in mind that most data has a shelf life. Rather than converting old text into an analysis-ready format, it is often better to focus on the new text data you already have access to, such as emails and social media posts.
Where to go from here
If you would like to know more about strategy, KPIs and performance management, check out my articles on:
Or browse the Big Data & Analytics and AI & Machine Learning sections of this site to learn more about these topics.