March 29, 2022 by Phu Nguyen

Confluent Brings SQL Querying to Kafka Streaming Data

With ever-increasing volumes of data comes an ever-increasing need to process that data. Confluent has made a business out of helping enterprises handle never-ending streams of data with its commercial packaging of Apache Kafka. Now, at Kafka Summit in San Francisco this week, Confluent introduced a new open source project, called KSQL, that it says will allow users to apply SQL queries against streaming data.

With this move, Confluent joins a growing number of companies, such as SQLStream, attempting to bring the rigors of SQL to the world of real-time data analysis.

Neha Narkhede, CTO and co-founder of Confluent, said that KSQL offers a number of potential use cases to enterprises, from processing data as it comes into an organization to handling extract, transform and load (ETL)-like work on data warehouses and data transfers between systems.

Said Narkhede, “KSQL is a completely interactive distributed SQL engine for Apache Kafka. It lets you do all sorts of continuous stream processing and transformations against infinite streams that flow through Kafka.”

Traditionally, processing stream data through Kafka required a developer to write Java or Python code, said Narkhede. KSQL brings data developers and SQL experts into the stream processing fold. KSQL will become an independent product offering down the line, said Narkhede, and is already available on GitHub. The project is currently in developer preview and should be generally available in a few months.

Narkhede said KSQL is similar to, but not entirely compliant with, ANSI SQL. It is a modified version of SQL customized specifically for querying streams of data. In a database, SQL is used to query past transactions.

“This is turning the database inside out in the sense that instead of querying the past, you start querying the future,” Narkhede said. “At this moment the grammar is pretty good, but there are more features we plan to add, such as to insert statements into Kafka topics down the line.”
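The contrast between querying the past and querying the future can be illustrated with a minimal Python sketch. It is not KSQL and not how KSQL is implemented; the function names `query_past` and `query_future` and the event fields are hypothetical, chosen only for illustration. The first function runs once over rows that already exist, while the second is a standing query that emits an updated result each time a new matching event arrives.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Dict[str, str]


def query_past(stored: List[Event], action: str) -> int:
    """One-shot query: count matching rows that are already stored, then finish."""
    return sum(1 for e in stored if e["action"] == action)


def query_future(events: Iterable[Event], action: str) -> Iterator[Tuple[str, int]]:
    """Standing query: emit an updated count whenever a matching event arrives."""
    count = 0
    for e in events:  # in a real system this loop would never end
        if e["action"] == action:
            count += 1
            yield e["userid"], count


def demo_events() -> Iterator[Event]:
    """Hypothetical events standing in for records arriving on a topic."""
    yield {"userid": "u1", "action": "click"}
    yield {"userid": "u2", "action": "view"}
    yield {"userid": "u3", "action": "click"}
```

Because `query_future` is a generator, a caller receives each result as soon as the event that produced it has been seen, rather than waiting for the input to end.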

While KSQL brings SQL to Confluent’s product line for the first time, it is not the first such SQL-on-streams system out there. Companies like Striim, Kinetica, and SQLStream, for example, have offered similar functionality for almost a decade. SQLStream, in fact, already offers SQL on Kafka.

So what makes KSQL different? Narkhede says it’s the distributed processing model. “KSQL builds on top of the Kafka partitioning model. You can easily distribute queries on the cluster so you can actually get away with normal sized boxes, coupling them together like Kafka. It’s integrated very closely with the fundamental building blocks of Kafka, and has the ability to run several queries in parallel. Kafka takes care of the load balancing if one machine goes down, and how queries shift over time.”
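To make the partitioning idea concrete, here is a rough Python sketch of the general pattern Narkhede describes: records are mapped to partitions by key, partitions are spread across worker machines, and the work is redistributed when a machine goes down. Everything here is an assumption for illustration. The CRC32 hash is a stand-in and is not Kafka's partitioner, the round-robin assignment is not Kafka's rebalancing protocol, and the function names are hypothetical.

```python
import zlib
from typing import Dict, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition (CRC32 is an illustrative stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, workers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions round-robin over the workers that are currently alive."""
    if not workers:
        raise ValueError("at least one worker is required")
    assignment: Dict[str, List[int]] = {w: [] for w in workers}
    for p in range(num_partitions):
        assignment[workers[p % len(workers)]].append(p)
    return assignment
```

Calling `assign_partitions(6, ["a", "b", "c"])` gives each worker two partitions; calling it again with `["a", "c"]` after worker `b` fails hands that worker's partitions to the survivors. Since every record with the same key lands in the same partition, each worker can process its share of a query independently.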

As an example, here is a KSQL query for ETL work:

CREATE STREAM vip_actions AS
SELECT userid, page, action
FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
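For readers less familiar with SQL joins on streams, the following Python sketch approximates what the query above expresses: each click is looked up against a table of users, and only clicks from Platinum-level users are passed on. It is a simplification under stated assumptions, not KSQL's execution model. The function name `vip_actions` mirrors the stream name, while the in-memory `users` dictionary (keyed by `user_id`) and the sample field values are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Optional

Record = Dict[str, str]


def vip_actions(clickstream: Iterable[Record],
                users: Dict[str, Record]) -> Iterator[Record]:
    """Emit userid, page and action for clicks made by Platinum users."""
    for click in clickstream:
        # LEFT JOIN: look up the user row; it may be missing (None).
        user: Optional[Record] = users.get(click["userid"])
        # WHERE u.level = 'Platinum': a missing user has no level,
        # so unmatched clicks are filtered out here as well.
        if user is not None and user.get("level") == "Platinum":
            yield {
                "userid": click["userid"],
                "page": click["page"],
                "action": click["action"],
            }
```

Because the function is a generator, it produces each output record as soon as the corresponding click is read, which is the record-at-a-time behavior a continuous query is meant to provide.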

Damian Black, CEO and founder of SQLStream, said that Narkhede and her team came to visit a year ago, and were obviously taking notes. Currently, his company’s biggest source of users is Amazon Kinesis, which was built using Amazon’s Kafka-like streaming system, and SQLStream’s SQL processing system. He said the reason SQLStream is popular with Amazon is its speed.

“The reality is we are so much faster, that you need a fraction of the number of servers. One of our customers had a job take 180 servers [running MapR] three hours. It takes 12 of our servers running at 40 percent to process the data in real time,” said Black.

Black said that one of the biggest issues with building SQL processing inside Kafka is that Kafka is written in Java. SQLStream actually runs inside the JVM, but is built in C++ and highly optimized to the point where it generates no garbage in the JVM. That, he said, means SQLStream runs its queries at true real-time speeds.

Black said it’s too early to comment on KSQL’s capabilities, but mentioned that another stream SQL processing engine, that of Apache Spark, is batch-based and cannot handle queries in real time.

SQLStream also runs on Kafka, so Black said his team is familiar with Confluent’s work there. “Confluent is a great messaging system. That’s why we use it. It’s free and it performs well considering it’s written in Java. We also work with Amazon Kinesis and AMQP. We provide a much richer experience. It’s not SQL-like, it’s true standards-based SQL. It’s written in C++ and it’s lock-free. A 32-bit integer is a 32-bit integer,” said Black.

SQLStream has been offering similar capabilities since it launched eight years ago. Black said that he feels like a pioneer of the market and that many enterprises are still just waking up to the potential of stream processing with SQL.

One aspect of the decision process for most enterprises, however, is the cloud in which such a system will be based. Narkhede said that she sees a lot of growth currently coming from Google’s Cloud. That’s being driven by BigQuery adoption, and that, she said, is a reason the team is working to integrate Kafka with BigQuery.

That could mean big changes to the way data is processed, as teams forgo an on-site, IT-managed approach in favor of simply using Google’s own on-demand services. In that sense, KSQL and SQLStream may only be the preferred solution until Google wins over the market. That may sound bad for Confluent, but Narkhede said she’s seeing a large internal push at Google to bring Kafka into the system to work with BigQuery.

“BigQuery is making people go to Google,” said Narkhede. “[Google] wants Kafka to work with BigQuery. There are a lot of people out there asking them for Kafka and BigQuery is the draw. It’s one of the most amazing data systems out there in the world.”

Source: InApps.net
