Written by

Bernard Marr

Big Data: R Explained in less than two minutes, to absolutely anyone

2 July 2021

If you’re looking at ways to harness the power of Big Data analytics in your business but are not a techie yourself, the field can be confusing at first.

For this reason I’m publishing a series of short posts explaining some of the key concepts and technologies behind Big Data and data analytics, written for an audience that is not primarily composed of IT specialists or data scientists.

I firmly believe that any business can benefit from the new wave of analytics applications and services which can crunch through as much data as you can throw at them, in order to come out with surprising and valuable insights to drive growth.

These projects usually require a mix of skills, and communication between people with different skillsets (e.g. data science and marketing) is essential. So in this post I’ll give an overview of R, the programming language favored by many statisticians.

R is a computer programming language which is particularly well suited to handling and sorting the large datasets associated with Big Data projects.

Like Python, which I covered previously, the software environment used to create code in R is open source, meaning it is free to download, anyone can use it, and there is a plethora of guidance and advice available on how to use it most effectively. However, commercial distributions are also available, which often offer additional proprietary functionality or support packages.

Named after the shared first initial of the two men who first developed the language at the University of Auckland, Robert Gentleman and Ross Ihaka, R has become very popular in recent years and is continuing to become more so, due to the explosion in analytic activities being carried out by businesses.

R’s strengths as a statistical programming language draw from the fact that it was designed from the ground up to facilitate matrix arithmetic: carrying out complex, often automated calculations on data held in a grid of rows and columns. R is very good for creating programs that carry out calculations on these datasets, even when the datasets are growing at an ever-increasing rate, and that produce real-time visualizations based on this data.
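To make "calculations on a grid of rows and columns" a little more concrete, here is a minimal sketch. It is written in plain Python (the language this post compares R with) purely for illustration, and the sales figures, column names and helper functions are all invented for the example. In R, calculations of this kind on whole rows and columns are built into the language rather than written by hand.

```python
# Illustrative only: a tiny "grid" of data, one row per store, one column per month.
# All names and numbers here are made up for the example.
columns = ["jan", "feb", "mar"]
sales = [
    [120, 135, 150],  # store A
    [80, 95, 90],     # store B
    [200, 210, 230],  # store C
]


def column_totals(grid):
    """Add up each column (here, total sales per month across all stores)."""
    return [sum(column) for column in zip(*grid)]


def row_means(grid):
    """Average each row (here, average monthly sales per store)."""
    return [sum(row) / len(row) for row in grid]


def scale(grid, factor):
    """Multiply every cell by the same number, e.g. to convert currency."""
    return [[cell * factor for cell in row] for row in grid]


monthly_totals = dict(zip(columns, column_totals(sales)))
store_averages = row_means(sales)

print(monthly_totals)
print(store_averages)
```

The point is not the code itself but the shape of the work: the same calculation is applied to every row, column or cell at once, however many there are.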

Its capability at producing these visualizations is another core strength of R. Its designers realised that visualization was key to understanding the complex datasets being explored, so they built functionality for translating data into charts, graphs and complex multi-dimensional matrices, as well as many user-defined methods of visualization, into its core.
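As a rough illustration of what "translating data into charts" means, the sketch below turns a handful of numbers into a text bar chart. It is plain Python used only for illustration, with made-up figures and a hypothetical bar_chart helper; R’s own plotting functions produce proper graphics rather than rows of characters.

```python
# Illustrative only: turn a column of numbers into a simple text bar chart.
# The figures and the bar_chart helper are invented for this example.
monthly_totals = {"jan": 400, "feb": 440, "mar": 470}


def bar_chart(values, width=40):
    """Return one line per label, with bars scaled so the largest value fills `width`."""
    largest = max(values.values())
    lines = []
    for label, value in values.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:>5} | {bar} {value}")
    return lines


for line in bar_chart(monthly_totals):
    print(line)
```

Even in this crude form, the longest bar is easier to spot than the largest number in a table, which is the idea behind building visualization into the core of the language.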

Online, R is everywhere, although you won’t see it, as its work is hidden behind pretty graphical interfaces. Companies such as Google, Facebook and Twitter have used R behind the scenes to analyze and visualize their data, rather than to run the websites themselves. R is often cited as one of the most widely used programming languages for data science. APIs exist for almost all of these services, allowing applications written in R to access data from these outside sources and include it in their own analytics routines.

Thanks to this huge user base, just about every function that you might need for data analysis is available, often through open source extensions (known as packages) made available by the community. R is also capable of calling code written in other languages such as C++ or Java, so resources coded in those languages can be made available. Because the R environment itself is available for every major operating system, R code can easily be ported between Unix, Windows or Mac environments.

Python is probably R’s biggest rival, but as both are non-commercial entities (as are most languages, computer or otherwise!) it’s not necessarily a rivalry in the traditional sense. However, coders will often argue vociferously for their favorite of the two. Python, having more in common with traditional, longer-established programming languages, is often cited as being easier to learn, particularly for someone with prior experience of other high-level programming languages. The R environment, on the other hand, is likely to be more familiar to someone with an academic background in statistics. It’s worth noting that Python tends to have a wider range of uses outside the world of statistics and analytics, whereas R is used almost exclusively for those purposes.

With a reported two million users worldwide, and thousands of deployed applications created using it, R is undoubtedly one of the backbone technologies of the Big Data revolution. If you are thinking of getting involved with the techie end of data analysis, then a thorough grounding in the language should be considered an essential element of your toolbox. If you want to learn more, or have a go at creating your own code in R to see what it can do, there are plenty of great resources online, such as those at Coursera, Code School and RStudio.
