Update Apache Geode Spawns ‘All Sorts of In-Memory Things’

Main Contents:

Apache Geode Spawns ‘All Sorts of In-Memory Things’ is an article under the topic Data Science Many of you are most interested in today !! Today, let’s InApps.net learn Apache Geode Spawns ‘All Sorts of In-Memory Things’ in today’s post !

Peer-to-Peer System

At its heart, Geode is a distributed hashmap for the JVM, Swapnil Bawaskar, distributed systems software engineer for Pivotal, explains in an interview with Software Engineering Daily podcast.

Its abstractions are the cache and region. The cache is a unit of data management, basically a collection of regions.

A region is “a key/value store on steroids,” according to Lamba. They’re distributed and highly concurrent. You can store data in memory, but also push it to disk. A region can be replicated — each member in the cluster gets the same data — or you can choose to partition the data across a number of nodes, which is a more scalable solution.

Because it highly values consistency, if you make two copies of the data, it ensures the primary data and the two redundant copies are consistent before returning an acknowledgment back to the user saying the put, for example, has completed.

For a replicated region, you can choose to do asynchronous sends to other members; that’s not allowed for partitioned regions.

A member is a role — a client, a server. It’s a layer where you can store data. It’s embeddable in the application. It can be defined as a client cache. You can have different configurations or topologies that you can configure it to, Lamba explained.

For data that’s looked up frequently, you can create a local copy of that data — a client cache — within your application, that’s synchronized with the main copy.

It also has a discovery service called Locator that helps Geode servers find each other and Geode clients to find the servers.

It’s a peer-to-peer system. If the locator goes down, the other servers still know what the others are doing and will be able to resolve issues among themselves. You can add servers to scale, and they discover and know each other.

Its interface is Java, REST or CLI, though interfaces with Redis, Lucene and Spark are experimental, Lamba said.

Geode can be used anytime you’re having trouble scaling your relational database to meet your application needs, according to Bawaskar.

“We’ve also seen it used to implement the CQRS (Command Query Responsibility Segregation) pattern. The idea is the ability to scale your reads and writes independently. Typically would use a system like Kafka that deals with all the events coming into your system, then because Geode has the query semantics as well as the continuous query functionality, you use Geode on the CQRS pattern.

“Also for applications that need to be deployed across geographies in an active-active manner, Geode can do that, too,” he said, such as a trading firm with operations in London, Tokyo and New York that wants to use its local cluster, but also replicate that data to the other sites to reconcile and do business processing.

He pointed to CAP theorem, which states that on a partitioned system, you have to choose between consistency or availability because you cannot give up partition tolerance in the cloud. Geode allows users of partitioned systems to choose between greater focus on consistency or availability.

Seeking Diversity

The Geode codebase was originally developed under the name of GemFire by Gemstone Systems in 2002, which was later acquired by VMware. The GemFire technology went to Pivotal when that company was spun out in 2013. Pivotal submitted the GemFire technology to the Apache Incubator in April 2015 under the project name Geode. It became a top-level project last month.

It originally captured interest as the transactional data engine used in Wall Street trading platforms. Now more than 600 enterprises use the technology for high-scale business applications that require low latency and 24×7 availability.

Pivotal announced in 2015 plans to open source its entire big data suite, including GemFire, analytical massive parallel processing (MPP) data warehouse Greenplum Database and HAWQ, an SQL-on Hadoop analytic engine.

Geode and Apache Ignite started out from a similar purpose, according to Shaposhnik, but he sees Ignite moving more toward a SAP HANA replacement.

Pivotal is still developing its commercial GemFire offerings, focusing on Spring integration that involves decoupling the application logic from the database logic, making it easier to switch databases in microservices architectures, according to Shaposhnik.

Since the project’s inception, the bulk of the work has been around refactoring the core of Geode to run on the latest version of jgroups; WAN replication and continuous query functionality; and native client functionality.

“I would love to see the project getting more diverse just by getting more companies participating,” Shaposhnik said. “We have different companies, but let’s be honest, it’s still dominated by Pivotal by a number of contributors.”

He said he believes the governance model will allow the project to achieve to the level of diversity that Spark or Hadoop have, in which 30 percent or less of its contributors come from one company.

Integrations with Spring, Spark and a Redis-to-Geode adaptor are in the works.

Feature image via Pixabay.

Source: InApps.net

Rate this post

Phu Nguyen

As a Senior Tech Enthusiast, I bring a decade of experience to the realm of tech writing, blending deep industry knowledge with a passion for storytelling. With expertise in software development to emerging tech trends like AI and IoT—my articles not only inform but also inspire. My journey in tech writing has been marked by a commitment to accuracy, clarity, and engaging storytelling, making me a trusted voice in the tech community.