Big Data-As-A-Service: How To Choose The Best Provider?
2 July 2021
The market for Big Data and analytics technology is in a state of fast change and rapid growth. A recent development is the emergence of a class of platforms and managed toolsets which can be termed Big Data-as-a-Service (BDaaS).
It’s easy to see the appeal. Instead of building a data centre, developing an analytics toolset stack, and investing in a team of trained data scientists – a costly and time-consuming project for any enterprise – why not simply pay as you go?
I predict that the size of this market will grow phenomenally in the near future, driven by the increasing adoption of Big Data analytics across the mid-tier enterprise. Minimising the need for large financial outlays before any results can be seen makes these services a very attractive proposition for those organisations where a clear business case is a necessity, rather than a luxury.
The rapid growth of the market does, however, mean that it is currently unsettled and constantly changing. The current standard model is a managed, cloud-hosted Hadoop distribution alongside an ecosystem of open source or proprietary analytics, data management and security technology. I can see a future in which the growing number of organisations with a need for analytics gives rise to even more transparent, highly managed service offerings. Though all of the services I cover at the end of this article are based on Hadoop, the extent to which this is mentioned in their individual marketing literature differs.
Some of the providers are household names for reasons other than their cloud business services, and their status as digital giants is intended to inspire confidence that they know what they are doing when it comes to security and compliance. Others exist purely to provide BDaaS.
Key questions anyone should ask when choosing a particular BDaaS provider for their business include:
Does it offer low or zero start-up costs?
Many of the providers here offer a free trial, so in theory you could see results before you spend a penny.
Is the solution scalable?
Big Data projects have a tendency to grow in size beyond the initial vision – can you easily and affordably buy more storage and processing resources as you need them?
Is it already in use in my industry?
If you are paying for consultancy and project planning support alongside your data hosting and analytics, does your provider have experience of supporting your business cases and customers?
Does it fit my organisation’s needs?
BDaaS is particularly suited to strategies which involve analysis of very large, messy, unstructured datasets. Bear in mind that you will need to move large amounts of data to a third-party provider, which is likely to raise security and compliance issues.
Does it offer real-time analysis and feedback?
Today’s most exciting and rewarding Big Data projects provide insights based on what is happening now, not just what was happening last week, meaning action can be taken when it is needed rather than simply learning from the past.
Is it managed or self-service?
Most offer a mixture of both approaches, with technical staff working behind the scenes to provide you with services in as transparent a way as possible. However, the level of support and consultancy included in your package will vary.
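One way to put these questions to work is to turn them into a simple weighted scorecard. The Python sketch below is purely illustrative: the criterion names, the weights, the provider names and the ratings are all assumptions invented for the example, not assessments of any real service. You would replace them with your own priorities and findings.

```python
# The six questions above, each given a weight reflecting how much it matters
# to a hypothetical organisation. The weights are illustrative assumptions.
CRITERIA = {
    "low_startup_cost": 2,
    "scalable": 3,
    "used_in_my_industry": 1,
    "fits_my_needs": 3,
    "real_time": 2,
    "right_support_level": 1,
}


def score_provider(answers):
    """Return a weighted score between 0.0 and 1.0.

    `answers` maps a criterion name to a rating from 0.0 (no) to 1.0 (yes).
    Criteria that are missing from `answers` count as 0.0.
    """
    total_weight = sum(CRITERIA.values())
    weighted = sum(
        weight * answers.get(name, 0.0) for name, weight in CRITERIA.items()
    )
    return weighted / total_weight


def rank_providers(candidates):
    """Return (provider, score) pairs sorted from best to worst score."""
    scored = [(name, score_provider(answers)) for name, answers in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical providers with made-up ratings.
candidates = {
    "Provider A": {"low_startup_cost": 1.0, "scalable": 1.0, "fits_my_needs": 1.0},
    "Provider B": {"used_in_my_industry": 1.0, "real_time": 1.0, "right_support_level": 1.0},
}

for provider, score in rank_providers(candidates):
    print(f"{provider}: {score:.2f}")
```

A scorecard like this will not make the decision for you, but it does force you to state which of the questions matter most before the sales conversations begin.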
Here’s a quick introduction to some of the most prominent BDaaS offerings available today.
Google Cloud Dataproc
Google’s managed Big Data service has enjoyed fast growth since it went into general release earlier this year. Google has clearly been able to leverage its presence and reputation for cloud innovation into a package which industry has found attractive. It runs Hadoop and Spark on Google’s Cloud Platform and integrates with the Bigtable storage and BigQuery analytics frameworks.
Amazon Web Services
AWS is the collective name for Amazon’s cloud-based business tools and services. Its managed Hadoop service is called Amazon Elastic MapReduce (EMR); it runs on Amazon’s EC2 compute instances and works with its S3 storage infrastructure. AWS is the market leader in providing business cloud computing services, and customers benefit from its world-class data security infrastructure.
Microsoft Azure HDInsight
Microsoft’s strong presence in the business software market made it a sure bet that it would play for a slice of the BDaaS pie. It has built on its Azure cloud framework by steadily adding functionality and compatibility with open source technology such as Spark and Storm. For many organisations, a big plus will be User Interface (UI) features that are immediately familiar to millions, even if they have never been near an analytics dashboard before.
Salesforce Wave Analytics
Salesforce built up partnerships with companies including Google and Cloudera to bring Hadoop-based Big Data analytics to its cloud data services. Wave Analytics uses a UI which will be familiar to any users of its market-leading CRM software to enable dynamic visualisations. It is also optimised for use on smartphones and watches, making it a strong contender if you want to put real-time analytics in the hands of a mobile or shop-floor workforce.
Qubole Data Service
Less of a household name than the others mentioned above, Qubole was founded by former Facebook data scientists who saw a need for a self-service Big Data platform for the enterprise. It is designed to be operated from a UI which assumes no prior experience of using Hadoop. As Qubole is not a cloud storage provider, its solution can be configured to run on Amazon, Google or Microsoft cloud infrastructure.
IBM BigInsights on Cloud
IBM’s data management systems already have high penetration, so again it was natural that the company would look to move into business cloud computing. By integrating advanced analytics tools, it has put together a suite of services aimed at lowering the entry barrier to Big Data analytics. IBM has also forged partnerships with social media companies such as Twitter, making it easier to gain insights, and developed its own cognitive, natural language processing engine, Watson, allowing data to be queried and analysed using natural human language.
I hope this has given you an overview of the current BDaaS market. As this market is evolving very fast, I will be keeping a close eye on developments.
Where to go from here
If you would like to know more about big data and its tools, check out my articles on:
- How is Big Data used in practice? 10 use cases everyone must read
- What is Hadoop: An Easy Explanation For Absolutely Anyone
- The 6 Best Hadoop Vendors For Your Big Data Project
- Big Data: Too many questions, not enough answers
Or browse the Big Data section of this site to find more articles and many practical examples.
Bernard Marr is a world-renowned futurist, influencer and thought leader in the fields of business and technology, with a passion for using technology for the good of humanity.
He is a best-selling author of over 20 books, writes a regular column for Forbes and advises and coaches many of the world’s best-known organisations.
He has a combined following of 4 million people across his social media channels and newsletters and was ranked by LinkedIn as one of the top 5 business influencers in the world.
Bernard’s latest book is ‘Generative AI in Practice’.