TigerGraph Builds a Bigger Graph Database


Faster Graph Traversal

Early-generation native graph technologies cannot store a graph across multiple machines. Xu considers Neo4j, a native graph database that stores data as nodes and expresses their connectedness through edges, an example of “Graph 1.0.” However, Neo4j has worked with IBM to iron out scalability problems.

Most early graph databases, he says, are storage-focused and offer limited analytics, rather than the computational power needed for the workloads companies want to run against them. Most time out at two hops, while TigerGraph's Native Parallel Graph (NPG) is built to traverse 10 or more.
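
To make the idea of "hops" concrete, here is a minimal sketch of a hop-limited breadth-first traversal in Python. It is an illustration of the general technique only, not TigerGraph's or Neo4j's engine; the function name `k_hop_neighbors` and the toy graph are assumptions made for this example.

```python
from collections import deque


def k_hop_neighbors(adjacency, start, max_hops):
    """Return {vertex: hop_count} for vertices reachable within max_hops edges."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if hops[vertex] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in hops:  # first visit gives the shortest hop count
                hops[neighbor] = hops[vertex] + 1
                queue.append(neighbor)
    return hops


# Hypothetical toy graph: a chain a -> b -> c -> d with a branch b -> e.
toy_graph = {"a": ["b"], "b": ["c", "e"], "c": ["d"], "d": [], "e": []}
two_hops = k_hop_neighbors(toy_graph, "a", 2)
three_hops = k_hop_neighbors(toy_graph, "a", 3)
```

With a limit of two hops the traversal never reaches `d`; raising the limit to three does. On a real graph the number of vertices touched can grow very quickly with each extra hop, which is why deep traversals are computationally demanding.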

He calls Apache Giraph and other parallel graph databases that sit atop NoSQL “Graph 2.0” — they lack the ability to make updates in real time.

The company bills TigerGraph, which Xu calls “Graph 3.0,” as a “complete, distributed, graph analytics platform supporting web-scaled data analytics in real time.” It says the platform works as well for limited, fast queries that use only a small part of the graph as for complex analysis that touches every vertex in the graph.

Despite those who rail against the weaknesses of Hadoop and MapReduce, Xu based TigerGraph on MapReduce and says it’s all in how it’s implemented.
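
As a rough illustration of what a MapReduce-style computation over a graph can look like, the sketch below runs one map phase over edges and one reduce phase over vertices. It shows the general pattern only and makes no claim about how TigerGraph implements it; `map_phase`, `reduce_phase` and the sample data are hypothetical names chosen for this example.

```python
from collections import defaultdict


def map_phase(edges, vertex_values):
    """Map: each edge (source, target) emits the source's value to the target."""
    for source, target in edges:
        yield target, vertex_values[source]


def reduce_phase(messages):
    """Reduce: sum all values that arrived at each vertex."""
    totals = defaultdict(int)
    for vertex, value in messages:
        totals[vertex] += value
    return dict(totals)


# Hypothetical sample data: three edges, every vertex starts with value 1.
edges = [("a", "b"), ("a", "c"), ("b", "c")]
values = {"a": 1, "b": 1, "c": 1}
incoming = reduce_phase(map_phase(edges, values))
```

Because every vertex starts with the value 1, the result here is simply each vertex's in-degree. Repeating the two phases with updated values is the basis of iterative algorithms such as PageRank.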

“Built everything from the ground up, using C++, so we could control the whole stack. We built our own storage engine, our own cross-communication engine,” he said.

The graph is stored both on disk and in memory, allowing the system to take advantage of data locality on disk, in memory and in the CPU cache.

It also allows the user to add data to the database continually without needing to re-run extract, transform and load (ETL) processes.

In its own benchmarks, the company claims 4- to 100-times faster graph traversal and query response times compared to Neo4j and Titan, another graph database.

It also claims its parallel computational capability boosts loading speed by 10x — 50 to 150 GB of data per hour, per machine.
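
To put the claimed loading rate in perspective, the following back-of-the-envelope sketch estimates load time from the quoted 50 to 150 GB per hour, per machine. It assumes throughput scales linearly with the number of machines, which the article does not state; the function name, dataset size and cluster size are hypothetical.

```python
def estimated_load_hours(dataset_gb, machines, gb_per_hour_per_machine):
    """Estimate load time, assuming throughput scales linearly with machines."""
    if machines <= 0 or gb_per_hour_per_machine <= 0:
        raise ValueError("machines and rate must be positive")
    return dataset_gb / (machines * gb_per_hour_per_machine)


# Hypothetical scenario: a 1,500 GB dataset loaded on a 5-machine cluster.
slowest = estimated_load_hours(1500, 5, 50)   # low end of the claimed rate
fastest = estimated_load_hours(1500, 5, 150)  # high end of the claimed rate
```

Under those assumptions the load would take between two and six hours, depending on where in the claimed range the actual rate falls.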

It offers multiple ways to load data, including RESTful application programming interfaces (APIs), high-level mapping of comma-separated value (CSV) and JavaScript Object Notation (JSON) files to graph vertices, and connectors to popular data sources.
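
The idea of mapping a CSV file onto a graph can be sketched in a few lines of Python using only the standard library. This is not TigerGraph's loader or its mapping syntax; the function `load_csv_as_graph`, the column names and the sample rows are all assumptions made for illustration.

```python
import csv
import io


def load_csv_as_graph(csv_text, source_column, target_column):
    """Map each CSV row to two vertices and one edge between them."""
    vertices = set()
    edges = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        source, target = row[source_column], row[target_column]
        vertices.update((source, target))  # each distinct value becomes a vertex
        edges.append((source, target))     # each row becomes an edge
    return vertices, edges


# Hypothetical sample file: which customer bought which product.
sample = "customer,product\nalice,laptop\nbob,laptop\nalice,phone\n"
vertices, edges = load_csv_as_graph(sample, "customer", "product")
```

Here three rows produce four vertices and three edges, since `alice` and `laptop` each appear in two rows but are stored as a single vertex.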

The company also announced the availability of a hosted version of TigerGraph on Amazon EC2 and GraphStudio, its visual software development kit (SDK).

Source: InApps.net


Phu Nguyen
