Etcd is a distributed key-value store for shared configuration and service discovery. Google’s Kubernetes, Cloud Foundry and Red Hat all use etcd.

As Blake Mizerany of CoreOS wrote for InApps Technology, “Etcd serves as the backbone of distributed systems by providing a canonical hub for cluster coordination and state management — the system’s source of truth. While etcd was built specifically for clusters running CoreOS, etcd works on a variety of operating systems including OS X, Linux and BSD.”

The etcd service is used in Tectonic, a new platform from CoreOS that combines its portfolio with Kubernetes — Google’s open source project for managing containerized applications. At CoreOS Fest last week, Intel announced it was developing a Tectonic stack, which it will make commercially available through Redapt and Supermicro once CoreOS makes Tectonic generally available.

Understanding the Problem

To make the Tectonic stack a reality, Intel first had to understand a scaling problem the Kubernetes community had been discussing: Kubernetes had scale limits.

How Intel solved the problem shows that new technologies are not always the answer to the challenges posed by distributed architectures. Sometimes it’s the technologies developed decades ago that make the difference.

Some Background

Google developed Kubernetes, an orchestration system, basing it on Borg — its own container system that manages just about everything at Google. Tectonic is a new service developed by CoreOS that combines its OS and components with Kubernetes. The combination is meant to make Google’s form of infrastructure available to any enterprise customer that needs to scale its operations.

CoreOS, which last month received $12 million from Google Ventures, has made its mark with a container-centric Linux distribution for large-scale server deployments and its secure, distributed platform for auto-updating servers. Much as the Chrome browser updates automatically, so does CoreOS for Linux server deployments. It’s used by companies such as MemSQL, Rackspace and Atlassian.

Etcd uses the Raft consensus algorithm. The Raft web page describes consensus as a fundamental problem in fault-tolerant distributed systems: multiple servers must agree on values, and once a decision about a value is made, it is final:

Typical consensus algorithms make progress when any majority of their servers are available; for example, a cluster of 5 servers can continue to operate even if 2 servers fail. If more servers fail, they stop making progress (but will never return an incorrect result).
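
The arithmetic behind that guarantee is simple. As a quick illustration (mine, not the Raft authors’), a few lines of Go show the quorum size and failure tolerance for common cluster sizes:

```go
package main

import "fmt"

func main() {
	// For a Raft cluster of n peers, a write needs acknowledgement
	// from a majority (quorum), and the cluster keeps making progress
	// as long as no more than (n-1)/2 peers have failed.
	for _, n := range []int{3, 5, 7, 9} {
		quorum := n/2 + 1
		tolerated := (n - 1) / 2
		fmt.Printf("peers=%d  quorum=%d  failures tolerated=%d\n", n, quorum, tolerated)
	}
}
```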

A paper written by the Raft creators provides a more detailed analysis of the algorithm.

In etcd, the Raft consensus algorithm is most efficient in small clusters — between three and nine peers. For clusters larger than nine peers, etcd selects a subset of instances to participate in the algorithm in order to keep it efficient.
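
A toy Go sketch of that selection (splitPeers is my own illustrative helper; etcd’s real peer-selection logic is more involved):

```go
package main

import "fmt"

// splitPeers is a hypothetical helper, not etcd code: it caps the
// number of peers participating in Raft at maxActive and leaves the
// rest out of the consensus algorithm.
func splitPeers(peers []string, maxActive int) (active, inactive []string) {
	if len(peers) <= maxActive {
		return peers, nil
	}
	return peers[:maxActive], peers[maxActive:]
}

func main() {
	peers := []string{"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9", "p10", "p11"}
	active, inactive := splitPeers(peers, 9)
	fmt.Println("in consensus:", active)
	fmt.Println("sitting out: ", inactive)
}
```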

According to CoreOS, when writing to etcd, peers redirect the write to the leader of the cluster, which then replicates it out to the peers immediately:

A write is only considered successful when a majority of the peers acknowledge the write.

As described by CoreOS, that means in a cluster of five peers, a write operation is only as fast as the third-fastest machine. A leader must also be elected by a majority of the active peers before cluster operations can continue. Because peers must acknowledge the leader before an operation completes, write performance becomes an issue at scale, particularly in high-latency environments such as a cluster spanning multiple data centers.
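
To see why the third-fastest machine sets the pace, consider a toy Go model (the acknowledgement times below are made up for illustration): a write commits once a majority of peers, leader included, have acknowledged it, so commit latency is the quorum-th fastest acknowledgement.

```go
package main

import (
	"fmt"
	"sort"
)

// quorumLatency models when a Raft leader can consider a write
// committed: the moment the quorum-th fastest peer (leader included)
// has acknowledged it.
func quorumLatency(ackMillis []int) int {
	sorted := append([]int(nil), ackMillis...)
	sort.Ints(sorted)
	quorum := len(sorted)/2 + 1
	return sorted[quorum-1] // the quorum-th fastest ack
}

func main() {
	// Hypothetical acknowledgement times for a five-peer cluster.
	acks := []int{2, 5, 9, 40, 120}
	fmt.Printf("write commits after %dms, the 3rd-fastest ack\n", quorumLatency(acks))
}
```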

Intel’s team under Nicholas Weaver, director of emerging technologies, looked to the Raft protocol as the first place they might find a bottleneck.

They discovered in the etcd code that every inbound entry to a follower was written to disk before the follower acknowledged it to the leader. Weaver, who has a background in storage, looked at the volume of entries and recognized how expensive that could get at scale. The likely reason for the design: a requirement for what looked like “stable storage.”
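
In outline, the follower behavior they found looks something like the following sketch. This is a simplified stand-in, not etcd’s actual code: each entry is fsync’d to disk before it is acknowledged, and that per-entry sync is what gets expensive at scale.

```go
package main

import (
	"fmt"
	"os"
)

// appendEntry models the follower behavior Weaver's team found:
// write the entry, then fsync, and only then acknowledge the leader.
// The per-entry Sync call is the expensive part at scale.
func appendEntry(log *os.File, entry []byte) error {
	if _, err := log.Write(entry); err != nil {
		return err
	}
	return log.Sync() // force the entry onto stable storage before acking
}

func main() {
	f, err := os.OpenFile("raft.log", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	for i := 0; i < 3; i++ {
		if err := appendEntry(f, []byte(fmt.Sprintf("entry %d\n", i))); err != nil {
			panic(err)
		}
		// ...acknowledge the leader here...
	}
}
```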

The concept of stable storage runs deep in tech history. According to Wikipedia, “stable storage is a classification of computer data storage technology that guarantees atomicity for any given write operation and allows software to be written that is robust against some hardware and power failures. To be considered atomic, upon reading back a just written-to portion of the disk, the storage subsystem must return either the write data or the data that was on that portion of the disk before the write operation.”

Stable storage offers a view into the late 1980s, when the demand for mirroring data became more pressing, driven largely by the costs that came with using mainframes, the massive computers of the day. Again from Wikipedia: the RAID controller served as a way to implement the disk-writing algorithms, which allowed the disks to act as a means of stable storage.

But in all of this lies a problem: the requirement to acknowledge every write that comes with the Raft protocol. That’s where DRAM enters the picture, and more specifically Asynchronous DRAM Refresh (ADR), a feature that, according to Intel, automatically flushes memory controller buffers into system memory and places the DDR into self-refresh mode in the event of a power failure.

To reduce the latency impact of storing to disk, Weaver’s team looked to buffering as a means to absorb the writes and sync them to disk periodically, rather than for each entry.
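
A minimal sketch of that idea, assuming a simple append-only log (again, illustrative rather than etcd’s implementation): entries accumulate in a memory buffer, and a background loop writes and fsyncs them as one batch on a fixed interval.

```go
package main

import (
	"os"
	"sync"
	"time"
)

// batchedLog buffers entries in memory and syncs them to disk on a
// fixed interval, so one fsync covers many entries instead of one each.
type batchedLog struct {
	mu   sync.Mutex
	buf  []byte
	file *os.File
}

func (b *batchedLog) Append(entry []byte) {
	b.mu.Lock()
	b.buf = append(b.buf, entry...)
	b.mu.Unlock()
}

func (b *batchedLog) flushLoop(interval time.Duration) {
	for range time.Tick(interval) {
		b.mu.Lock()
		pending := b.buf
		b.buf = nil
		b.mu.Unlock()
		if len(pending) == 0 {
			continue
		}
		b.file.Write(pending) // errors ignored for brevity
		b.file.Sync()         // one fsync for the whole batch
	}
}

func main() {
	f, err := os.OpenFile("raft.log", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	log := &batchedLog{file: f}
	go log.flushLoop(10 * time.Millisecond)

	for i := 0; i < 1000; i++ {
		log.Append([]byte("entry\n"))
	}
	time.Sleep(50 * time.Millisecond) // let the final batch flush
}
```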

Tradeoffs? They knew memory buffers would help, but violating the stable storage requirement could create problems, particularly for smaller clusters.

Instead, they turned to Intel’s silicon architects about features available in the Xeon line. After describing the core problem, they found that it had already been solved in other areas with ADR. After some work to prove out a Linux-supported use for the feature, they were confident they had a best-of-both-worlds answer.

And it worked. As Weaver detailed in his CoreOS Fest talk, the response time proved stable. ADR can grab a section of memory, persist it to disk and power it back up; it can return entries to disk and restore them to the buffer. ADR makes small (<100MB) segments of memory “stable” enough for Raft log entries, with no need for battery-backed memory. It can be orchestrated using Linux or Windows OS libraries, lets the application define the target memory and determine where to recover, and can be exposed directly to libraries for runtimes such as Golang. And it uses silicon features that are available on current Intel servers.
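
How might a runtime like Go actually use such a segment? The sketch below is speculative: it memory-maps an ordinary file (“adr-segment”) standing in for an ADR-protected region, since on real hardware it is the platform, not the application, that guarantees persistence across power failure. The mmap mechanics, though, are the same ones an OS library would expose.

```go
//go:build linux

package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	// "adr-segment" is a placeholder file standing in for an
	// ADR-protected memory region.
	const segmentSize = 64 << 20 // well under the ~100MB budget cited above

	f, err := os.OpenFile("adr-segment", os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if err := f.Truncate(segmentSize); err != nil {
		panic(err)
	}

	// Map the segment into the process so Raft log entries can be
	// placed in it directly, with no per-entry fsync.
	mem, err := syscall.Mmap(int(f.Fd()), 0, segmentSize,
		syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
	if err != nil {
		panic(err)
	}
	defer syscall.Munmap(mem)

	copy(mem, []byte("raft log entry"))
	fmt.Println("entry staged in mapped segment")
}
```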

Using the new capability, Intel did its work with Raft and etcd in the lab. The team tested a five-node etcd cluster and found that, without ADR, it maxed out at about 4,000 to 5,000 writes per second:

[Benchmark chart: etcd 2.0.8 write throughput without ADR]

With ADR, etcd could handle about 10,000 writes per second, essentially doubling throughput:

[Benchmark chart: patched etcd write throughput with ADR]

What This All Means

Weaver and his team have demonstrated how much the state of a single machine has in common with distributed systems. The difference comes in how that state gets managed. In some respects it comes down to what gets monitored. Until recently, the most valuable tools monitored the machines of client/server environments.

I talked about this topic last week with SignalFx founder and CEO Karthik Rau. In our conversation, we discussed how the behavior of containers requires people to collect and analyze data so that applications work properly across clusters.

That, more than anything, means a change in how people communicate. Apps are at the center of the universe. Increasingly, compute will swarm to the data, automated and orchestrated through microservices architectures. The services will consist of disposable containers that are portable, connected to git environments and the rest. The way containers are programmed and behave on these architectures means people will need different communication patterns themselves. That’s a business issue that cannot be solved with org charts. The answers to questions about speed and latency will surface in the data.

CoreOS, Rackspace, Red Hat and SignalFx are sponsors of InApps Technology.

Feature image via Flickr Creative Commons.