Written by

Bernard Marr

Bernard Marr is a world-renowned futurist, influencer and thought leader in the fields of business and technology, with a passion for using technology for the good of humanity. He is a best-selling author of over 20 books, writes a regular column for Forbes and advises and coaches many of the world’s best-known organisations. He has a combined following of 4 million people across his social media channels and newsletters and was ranked by LinkedIn as one of the top 5 business influencers in the world.

Bernard’s latest books are ‘Future Skills’, ‘The Future Internet’, ‘Business Trends in Practice’ and ‘Generative AI in Practice’.



The Amazing Ways Snowflake Uses Generative AI For Synthetic Data And Natural Language Queries

24 September 2023

You probably know that the new generation of generative AI tools that have exploded onto the scene can generate words, pictures and even videos that closely resemble those created by humans. But did you know that they can also be used to generate data itself?


Modern artificial intelligence (AI) works by recognizing patterns in data and using those patterns to answer questions or predict what comes next. Generative AI, such as OpenAI’s ChatGPT, uses those patterns to create new data that follows the rules of the data it was trained on.

But real data comes with complications – it can be difficult and expensive to collect and brings security and privacy obligations.

Think about a dataset comprising thousands of human faces, for example – as used to train facial recognition algorithms. You have to find and photograph thousands of people and then get their permission to store and use their data. Then, myriad checks and balances must be followed to ensure your data isn't harmfully biased.

One solution is synthetic data. This is data that is created by machines, closely resembles real-world data and can be used for many of the same purposes.

Snowflake is one of the world's biggest "data-as-a-service" companies. In addition to its analytics services, it offers a data marketplace covering thousands of topics, including healthcare, finance and retail.

Now, it’s augmenting these offerings with synthetic, AI-generated datasets and putting generative AI to use in several other interesting applications. Let's take a look!

First, What Is Synthetic Data?

Synthetic data is information that has been artificially generated in order to have the same characteristics as a real-world dataset but without including any real-world data.

Generative AI is particularly suited to this task as it can easily analyze any dataset and then create synthetic data that closely matches it. It means businesses can train AI algorithms and perform tests and simulations without exposing private or sensitive information that might be contained in real-world data.
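
The idea of learning the characteristics of a real dataset and then sampling new records from them can be illustrated with a deliberately simple sketch. It is not generative AI and not Snowflake's method: it assumes each column is independent and roughly normally distributed, and the `real_rows` values are made-up placeholders. A generative model would learn far richer structure, including relationships between columns.

```python
import random
import statistics

def fit_columns(rows):
    """Estimate the mean and standard deviation of each numeric column."""
    columns = list(zip(*rows))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

def sample_synthetic(params, n, seed=0):
    """Draw n synthetic rows from independent normal distributions.

    Assumption: columns are treated as independent, so correlations
    present in the real data are not reproduced.
    """
    rng = random.Random(seed)
    return [tuple(rng.gauss(mu, sigma) for mu, sigma in params) for _ in range(n)]

# Hypothetical "real" records: (age, annual_income). Illustrative values only.
real_rows = [(34, 52000), (45, 61000), (29, 48000), (52, 75000), (41, 58000)]

params = fit_columns(real_rows)
synthetic_rows = sample_synthetic(params, n=1000)
```

The synthetic rows share the averages and spread of the originals, but none of them is a copy of a real record.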

It’s used in finance to train fraud detection algorithms to spot deliberately falsified transactions, in healthcare to avoid using sensitive patient data, and in retail and marketing to create synthetic customers and analyze their buying behavior.

According to Gartner research, business leaders are most likely to turn to synthetic data because of difficulties with accessibility, complexity and availability of real-world data. It also found that partially synthetic datasets – where real-world data is augmented with synthetic data – are more commonly used than fully synthetic datasets.
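
A partially synthetic dataset can be pictured as real records with generated ones appended and labeled. The sketch below is only an illustration under stated assumptions: the `augment` function, the `source` tag and the example values are hypothetical, and random jitter stands in for a generative model. Jittering real records would not be enough to protect privacy in practice.

```python
import random

def augment(real_rows, n_synthetic, noise=0.05, seed=0):
    """Return the real rows plus n_synthetic perturbed copies, each tagged by source."""
    rng = random.Random(seed)
    tagged = [{"values": row, "source": "real"} for row in real_rows]
    for _ in range(n_synthetic):
        base = rng.choice(real_rows)
        # Stand-in for a generative model: nudge each value by up to +/- noise.
        jittered = tuple(v * (1 + rng.uniform(-noise, noise)) for v in base)
        tagged.append({"values": jittered, "source": "synthetic"})
    return tagged

# Hypothetical real records: (age, annual_income). Illustrative values only.
real_rows = [(34, 52000), (45, 61000), (29, 48000)]
dataset = augment(real_rows, n_synthetic=20)
```

Keeping a provenance tag on every record makes it possible to check later how much of any result depends on generated data.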

By generating synthetic data, companies can create any information they need to plug gaps in existing records or create entirely new datasets. It doesn’t negate the need for real-world data, which is needed to create synthetic data in the first place. But when used effectively, it can reduce costs, speed up the training of machine learning models, and help businesses automate and make better decisions.

Generative Synthetic Data At Snowflake

Snowflake sells data to businesses via its Snowflake marketplace, which is one of the largest B2B data brokerages in the world.

Alongside its thousands of real-world datasets, Snowflake now offers access to synthetic datasets created by generative AI algorithms. One example is San Francisco-based Synthesis AI’s synthetic human face dataset, comprising 5,000 individual images of diverse human faces.

In the past, facial recognition algorithms have been criticized and even banned due to concerns over biases in the datasets used to train them. This has led to differences in their ability to identify people of different ethnic backgrounds and accusations that they could be unfair or prejudiced.

Using synthetic data in this way can help to tackle those problems (note – I will not say it solves them entirely) as datasets can be created in line with whatever level of representation or inclusiveness is needed.
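
To make the point about representation concrete, here is a small sketch of how one might work out how many synthetic records each group needs in order to reach a chosen share of the final dataset. The group labels, counts and the `synthetic_needed` helper are all hypothetical and are not taken from Snowflake or Synthesis AI.

```python
import math
from fractions import Fraction

def synthetic_needed(current_counts, target_shares):
    """Return how many synthetic records each group needs to reach its target share.

    Real records are never dropped, so the final dataset size is set by the
    group that is most over-represented relative to its target.
    """
    total = max(
        math.ceil(current_counts[g] / target_shares[g]) for g in current_counts
    )
    return {
        g: math.ceil(total * target_shares[g]) - current_counts[g]
        for g in current_counts
    }

# Hypothetical group counts in a real dataset; labels and numbers are illustrative.
counts = {"group_a": 700, "group_b": 200, "group_c": 100}
shares = {g: Fraction(1, 3) for g in counts}  # aim for equal representation

needed = synthetic_needed(counts, shares)
```

Exact fractions are used for the shares so that rounding errors do not inflate the totals.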

While synthetic data existed before the emergence of generative AI, the new class of generative algorithms means that datasets can quickly be scaled to any size that's needed. Datasets created in this way can also be easily customized to fit the needs of different customers around the world.

Snowflake also offers synthetic financial data from Clearbox AI, consisting of simulated mortgage applications designed to mimic both legitimate and fraudulent applications. The data in these sets has been augmented with data created by generative AI.

Snowflake has made it clear that it expects synthetic data generated by AI to play an important role in its business going forward. As generative models such as large language models (LLMs) become more sophisticated, we will see them becoming capable of creating synthetic data that more and more accurately reflects the real world, leading to cheaper and more efficient insights for businesses.

How Else is Generative AI Used at Snowflake?

As well as offering access to AI-generated synthetic data, Snowflake has created a number of tools based on generative AI for its customers to use.

Thanks to its acquisition this year of Neeva, a search startup founded by former Google employees, Snowflake is implementing natural language querying of its datasets. Effectively, this will let users talk to their data, getting insights by asking straightforward questions rather than running traditional data science analysis. CEO Frank Slootman told VentureBeat, “Engaging with data through natural language is becoming popular … this will increase our opportunity to allow non-technical users to extract value from their data.”

It has also launched a partnership with Nvidia, using the chip maker’s NeMo LLM to create a platform that lets Snowflake users build generative AI applications, such as chatbots and search engines, that can access Snowflake data.

Another LLM initiative is its Document AI tool, which allows users to query documents – legal contracts or invoices, for example – and extract meaning from them. This was developed with technology that Snowflake acquired when it bought the Polish natural language platform Applica in 2022.

Altogether, it's clear that Snowflake has big hopes for generative AI to create synthetic data and build tools to help us analyze and extract value from it.
