Hazelcast Launches an Open Source In-Memory Stream Processing Engine
Key Summary
This article from InApps.net details Hazelcast’s launch of Hazelcast Jet, an open-source, Apache 2-licensed in-memory stream processing engine, alongside updates to its In-Memory Data Grid (IMDG). Key points include:
- Hazelcast Jet Overview:
- Designed for near real-time processing of data-intensive applications (e.g., smart home sensors, e-commerce, social media, fraud detection).
- Ingests high-velocity data via socket, file, HDFS, or Kafka, processing it with low latency using a one-record-at-a-time architecture (unlike Spark’s micro-batching).
- Built for Java 8 developers, leveraging the java.util.stream API for parallel processing; the company's largest compute grid at launch ran on 600 servers, and support for other languages is planned.
- Architectural Features:
- Combines computation and storage in memory, co-locating Jet with Hazelcast IMDG for efficiency.
- Uses a Directed Acyclic Graph (DAG) to model computations, with optimizations like data locality, partition mapping affinity, single-producer/single-consumer queues, and green threads for high throughput and low latency.
- Employs cooperative multithreading and wait-free algorithms to maximize core usage and minimize context-switching costs.
- Performance: Internal benchmarks show Jet is 20x faster than Hadoop, 2-3x faster than Spark, and 50% faster than Flink for word count tasks, though past benchmarking disputes with GridGain are noted.
- Hazelcast IMDG 3.8:
- Enhances persistence and multi-data center deployment capabilities.
- Used in hundreds of thousands of clusters, with over 17 million monthly server starts, by clients like American Express and AT&T.
- Industry Shift: CEO Greg Luck highlights Jet as part of a move to embed stream processing within applications, reducing reliance on separate Big Data infrastructure like Hadoop or Spark, making it a developer-controlled concern.
- Partnerships and Integrations:
- Partnered with Striim for real-time data synchronization with Hazelcast Striim Hot Cache.
- Integrated as a tile on Pivotal Cloud Foundry for broader accessibility.
- Company Background: Originally launched in Turkey, Hazelcast is now headquartered in Palo Alto, with offices in Istanbul and London, serving major enterprises.
Hazelcast, known chiefly for its open source in-memory data grid (IMDG), has launched an open source lightweight, distributed data-processing engine called Hazelcast Jet.
This new Apache 2-licensed open source project is designed to enable processing in near real time for data-intensive applications such as smart home sensors, in-store e-commerce systems, social media platforms, log analysis, monitoring and fraud detection.
The company has also released version 3.8 of Hazelcast IMDG, which includes advanced capabilities for managing persistence and multi-data center deployments. The Hazelcast IMDG is used with hundreds of thousands of installed clusters and over 17 million server starts per month, according to the company.
Big Data processing with Hadoop and Spark has been really complicated, according to Hazelcast CEO Greg Luck, making it similar to the old mainframe days when there would be a dedicated staff to load and run your jobs on expensive dedicated technology then provide you with the results.
“We think we’re part of a big shift in taking Big Data back to a stream-processing problem and just running it within the application that you’ve written — embedding it within the application rather than needing to run on separate infrastructure. We think that’s a profound shift,” he said.
“It makes the choice of Big Data platform and how to solve that problem an application concern within the control of developers and architects. Not even Ops needs to necessarily be involved in Hazelcast. Ops only needs to be involved if it’s a separate stand-alone cluster.”
Jet ingests data at high velocity via socket, file, HDFS or Kafka interfaces, and processes the business logic or complex computation on incoming data.
“We do think we can bring something new to this space. We didn’t want to do this unless we could be faster than everybody else. … What it brings to a relatively crowded market at this stage, Hazelcast is famously simple and easy to get started with,” he said.
Though Hazelcast IMDG supports six languages, Jet was launched with a focus on developers using Java 8. The company plans to support other languages later on.
Hazelcast IMDG builds on java.util.concurrent, a package that was added in Java 5. Jet takes a similar approach with java.util.stream, an API for parallel processing of data that was added in Java 8.
The java.util.stream API was designed for use within a single Java virtual machine, though; with Jet, the same API runs distributed across a cluster.
“You can use that same API to express what you want to do, but it will execute on our Jet grid. And our largest compute grid at the moment is 600 servers, so it can scale very, very high. But still, you can program it with this very simple API, and if you’re a Java developer, probably one that you already know,” he said.
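As a point of reference, here is a minimal word count written against the standard java.util.stream API, running inside one JVM. It uses only JDK classes; the class name `StreamWordCount` is made up for this illustration, and the sketch does not show Jet's distributed execution, only the programming style Luck is describing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical example class; plain JDK code, not Hazelcast Jet's API.
class StreamWordCount {
    private static final Pattern NON_WORD = Pattern.compile("\\W+");

    // Counts how often each word occurs across all lines, using a parallel stream.
    static Map<String, Long> count(List<String> lines) {
        return lines.parallelStream()
                .flatMap(line -> NON_WORD.splitAsStream(line.toLowerCase()))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be", "that is the question");
        // The map's iteration order is not guaranteed.
        System.out.println(count(lines));
    }
}
```

In this form the parallelism is limited to the cores of one machine; the article's claim is that Jet executes the same style of pipeline across a cluster.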
Jet is built for speed and low latency. It uses a one-record-at-a-time architecture, processing each incoming record as soon as possible rather than accumulating records into micro-batches as Spark does. And though this approach sounds similar to Apache Flink, Jet doesn’t borrow anything from Flink, Spark or Hadoop, according to Luck.
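The difference between the two styles can be sketched in a few lines of plain Java. This is a simplified illustration, not code from Jet or Spark: the class and method names are hypothetical, and the batch here is defined by record count only to keep the example deterministic, whereas Spark's micro-batches are defined by a time interval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; neither method reflects a real engine's API.
class RecordVsMicroBatch {

    // One record at a time: each result is handed to the sink immediately.
    static void perRecord(List<Integer> input, Consumer<Integer> sink) {
        for (int record : input) {
            sink.accept(record * 2);
        }
    }

    // Micro-batching: results are held back until the batch is full
    // (or the input ends), so early records wait for later ones.
    static void microBatch(List<Integer> input, int batchSize, Consumer<List<Integer>> sink) {
        List<Integer> batch = new ArrayList<>();
        for (int record : input) {
            batch.add(record * 2);
            if (batch.size() == batchSize) {
                sink.accept(batch);
                batch = new ArrayList<>();
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
    }
}
```

With five input records and a batch size of two, the first method calls the sink five times and the second calls it three times; the latency cost of batching is the time a record spends waiting in the buffer.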
Like other Big Data frameworks, Hazelcast Jet uses the Directed Acyclic Graph (DAG) abstraction to model computations but takes some novel approaches to boost speed at low latency, including data locality; partition mapping affinity; single-producer, single-consumer (SP/SC) queues; and green threads.
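To make the DAG abstraction concrete, the sketch below models a word count job as named vertices connected by directed edges and checks that the graph has no cycles. The `MiniDag` class and the vertex names are invented for this illustration and are not Jet's API; the ordering it computes only demonstrates that the graph is acyclic, since a streaming engine would run all vertices concurrently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of a computation DAG; not Hazelcast Jet's API.
class MiniDag {
    // Maps each vertex to the vertices its outgoing edges point at.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    MiniDag vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    MiniDag edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: returns the vertices in dependency order,
    // or throws if the graph contains a cycle.
    List<String> executionOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : edges.keySet()) {
            inDegree.put(v, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String t : targets) {
                inDegree.merge(t, 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : edges.get(v)) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) {
                    ready.add(t);
                }
            }
        }
        if (order.size() != edges.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return order;
    }
}
```

A word count job could then be declared as `new MiniDag().edge("source", "tokenize").edge("tokenize", "accumulate").edge("accumulate", "sink")`.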
He explained Jet’s architectural approach this way:
- It keeps computation and data storage in memory by combining Hazelcast Jet with the Hazelcast data grid on the same servers. Depending on the use case, some or all of the data that Jet will process will be already in RAM on the same machine as the computation.
- Jet allows you to define an arbitrary object-to-partition mapping scheme on each edge. This allows reading in parallel with many threads from each partition and member and thus server. It can use this to harmonize and optimize throughput from other distributed data systems whether it be HDFS, Spark or Hazelcast. Thus when performing DAG processing, local edges can be read and written locally without incurring a network call and without waiting.
- Local edges are implemented with the most efficient kind of concurrent queue: the SP/SC bounded queue. It employs wait-free algorithms on both sides and avoids volatile writes by using the lazySet method.
- Vertices are implemented by one or more instances of “Processor” on each member. Each vertex can specify how many of its processors will run per cluster member using the “localParallelism” property, so it can use all the cores even on the largest machines. With many cores and execution threads, the key to Hazelcast Jet’s performance is to coordinate them smoothly through cooperative multithreading, which offers much lower context-switching cost and precise knowledge of the status of a processor’s input and output buffers, which in turn determines its ability to make progress.
- And Hazelcast Jet uses “green threads” to allow very high throughput where cooperative processors run in a loop serviced by the same native thread.
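Two of the ideas in this list, the SP/SC bounded queue and cooperative processors serviced by one native thread, can be sketched with JDK classes alone. The code below is an illustration under stated assumptions, not Jet's implementation: `SpscQueue`, `Tasklet` and `CooperativeDemo` are hypothetical names, the queue is a textbook ring buffer that publishes its indices with `lazySet`, and the scheduler is a bare loop with no partitioning or `localParallelism` handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical bounded queue that is safe for exactly one producer thread
// and one consumer thread. Neither side ever blocks or retries.
class SpscQueue<T> {
    private final AtomicReferenceArray<T> buffer;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // written only by the consumer
    private final AtomicLong tail = new AtomicLong(); // written only by the producer

    SpscQueue(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        buffer = new AtomicReferenceArray<>(capacity);
        mask = capacity - 1;
    }

    // Producer side: returns false instead of waiting when the queue is full.
    boolean offer(T item) {
        long t = tail.get();
        if (t - head.get() == buffer.length()) {
            return false;
        }
        buffer.lazySet((int) (t & mask), item);
        tail.lazySet(t + 1); // ordered store, cheaper than a full volatile write
        return true;
    }

    // Consumer side: returns null instead of waiting when the queue is empty.
    T poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null;
        }
        int index = (int) (h & mask);
        T item = buffer.get(index);
        buffer.lazySet(index, null);
        head.lazySet(h + 1);
        return item;
    }
}

class CooperativeDemo {
    // A cooperative task does a small unit of work and returns; true means it is finished.
    interface Tasklet {
        boolean step();
    }

    // Runs a producer and a consumer tasklet on the calling thread until both finish.
    static List<Integer> run(int items, int queueCapacity) {
        SpscQueue<Integer> queue = new SpscQueue<>(queueCapacity);
        List<Integer> received = new ArrayList<>();
        int[] next = {0};

        Tasklet producer = () -> {
            if (next[0] < items && queue.offer(next[0])) {
                next[0]++;
            }
            return next[0] == items;
        };
        Tasklet consumer = () -> {
            Integer item = queue.poll();
            if (item != null) {
                received.add(item);
            }
            return received.size() == items;
        };

        List<Tasklet> active = new ArrayList<>(Arrays.asList(producer, consumer));
        while (!active.isEmpty()) {
            for (Iterator<Tasklet> it = active.iterator(); it.hasNext(); ) {
                if (it.next().step()) {
                    it.remove();
                }
            }
        }
        return received;
    }
}
```

Because a tasklet returns as soon as its queue is full or empty, the loop never parks a thread, which is the property the article attributes to cooperative multithreading. In a multi-threaded setup the same queue could connect a producer tasklet on one thread to a consumer tasklet on another.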
In its internal Word Count benchmarks, reading initial words from the file system, Jet was 20 times faster than Hadoop, 2 to 3 times faster than Spark and 50 percent faster than Flink, Luck said. It should be noted, however, that Luck, prior to Jet’s release, got into a row with rival GridGain last year over benchmarking.
Launched in Turkey, Hazelcast is now headquartered in Palo Alto, California, and maintains offices in Istanbul and London. Its data grid customers include American Express, AT&T, General Dynamics, Ericsson and Domino’s Pizza.
In November, it announced a partnership with Striim, which offers a real-time data integration and streaming analytics platform, to launch the Hazelcast Striim Hot Cache to ensure continuous real-time synchronization between the cache and its underlying database.
It announced in January that the grid is now available as a tile on the Pivotal Cloud Foundry network.
Feature Image: “street painting” by pHotosHo0x, licensed under CC BY-SA 2.0.
Source: InApps.net