Update Forget the File System: The Future of Scalable Cloud Storage Will Be Objects

Main Contents:

Forget the File System: The Future of Scalable Cloud Storage Will Be Objects is an article under the topic Data Science Many of you are most interested in today !! Today, let’s InApps.net learn Forget the File System: The Future of Scalable Cloud Storage Will Be Objects in today’s post !

Storage Futures

Andrew Boag: Why mess with file systems? The future of storage is object-based.

Over time, you want to move to object storage, advised Andrew Boag, managing director of New Zealand open source systems provider Catalyst IT Limited, during a presentation on clustered file systems at the most recent OpenStack Summit, held in Barcelona in October.

“We are seeing the future of our cloud-native applications as using objects, and we are already starting to do that,” Boag said. “There are just so many advantages to objects.”

Catalyst IT has used a variety of clustered file systems to provide a scalable storage solution for its clients with extreme amounts of data and found them to be lacking. For instance, the company used GlusterFS to run an Australian MOOC (Massive Open Online Course).

One potential troublesome area of GlusterFS is that the files may only be accessible if the file system itself is mounted, which can be problematic if you are having performance issues with a mounted GlusterFS share.

Overall, GlusterFS is best for large, immutable files and less suitable for oft-accessed smaller-sized files. Problematic GlusterFS implementations result in time-outs, performance degradation, file operation failures and even cluster failures.

Boag is bullish on object storage mainly because traditional file-based storage requires increasingly more management the larger systems get larger. Object storage abstracts away the low-lying work such as worrying about individual disks, capacity management, backup.

Open Clustered File Systems in Your Application Stack on YouTube.

The Future of Storage is the Future of Storage

The definition of object storage can vary greatly, depending on who is offering the definition, noted Jacob Farmer, chief technology officer for storage consultancy Cambridge Computer, speaking at the Data Storage Day mini-conference within the USENIX LISA 2016 conference, held December in Boston.

At its most basic, object storage simply refers to chunks of data that can be individually addressed. The data can have metadata associated with them and accessed through an API.

Some object addresses are not bound to physical hardware locations. Sometimes objects are mutable (that is to say, they can be modified) and sometimes they are not. You can reconcile an object store to any level of a traditional storage stack, or if you could make up weird and different storage stacks, Farmer noted.

Most modern storage arrays are based on an object store, where they got an internal logic that chunks and allocates and addresses storage. Most of NoSQL databases are, in fact, object stores. MongoDB, for instance, stores everything as a JSON document. Many de-duplication devices and content management systems — Sharepoint, Documentum, Filenet — store content as objects and fetch it using metadata.

By now most organizations have petabytes worth of storage. And it’s not a trivial task managing this much data, which could be upwards of a billion files, Farmer noted. “It’s not really the storage that is the problem, it’s the metadata and the file system,” he said. “There are only a handful of vendors who can do it reliably and they always charge a premium for it.

Most often, storage systems this large become too brittle. Backing up and storing these data become too difficult. Moving a billion files to a second location and then moving them back “is too much data to move,” Farmer said. Switching out hardware becomes more difficult, as does managing namespace issues.

The beauty of object storage is that you are not managing large blocks of data — such as all the data on a single drive — but rather, smaller chunks, which can be more easily distributed, and replicated, across multiple disks. “You’re applying your data protection logic not to hard drives but to the data itself,” Farmer said.

The approach that many object storage vendors seem to be taking in an offering, such as EMC’s Isilon, is a RAID 6 approach using erasure coding. In effect, you can RAID-stripe your data set. For resiliency, organizations can even stripe their data across a wide local area network (WLAN), though for this to happen it would need both the bandwidth and a certain tolerance for latency, Farmer pointed out.

That said, the move to object-based storage will come with challenges, Boag said. The biggest change may be that your applications will have to access its objects through an API. Developers will have to go through their applications and replace calls to file systems with calls for objects. Objects accessibility may not offer the same consistency as file-based storage.

“That is fine if your application understands that,” Boag said. Access latency can still be an issue as well.

Calling object storage the future of storage is like saying ‘storage is the future of storage’ –Jacob Farmer @CambridgeComp #LISA16 #USENIX pic.twitter.com/y0ph4Wsb01
— InApps Technology (@thenewstack) December 7, 2016

Open on Twitter.

Let Kubernetes Sort it Out

Of course, many cloud providers offer object storage. Amazon Web Services’ Simple Storage Service is one of AWS’ most popular services, serving as the base of storage for the consumption of other AWS services. Azure offers Blob storage, which like S3 offers for 500TB, though by way of a single blob container, rather than the 100 5TB objects that AWS carves out for each account. Unified Object Storage is an offering from the Google Cloud. DigitalOcean has told us it plans to offer object storage, though hasn’t offered a timeline as of yet.

But what if you want to keep your data in-house? One company working on a scalable object storage is Minio. Minio co-founder and CEO Anand Babu Periasamy, was one of the co-founders of Gluster, the company behind GlusterFS. Gluster was purchased in 2011 by Red Hat. Periasamy said he wanted to avoid the mistakes that GlusterFS made.

Open-sourced under an Apache 2.0 license, the Minio object storage server provides an AWS S3-compatible storage service. So as far as the developer is concerned, the REST API provides the gateway to store blobs of immutable data. Objects can have a maximum size of 5TB. The server can be bundled, so quoteth the GitHub page, into an application stack, much like is done for Node.js, Redis and MySQL.

All well and good. But here is the genius part: Unlike, say, GlusterFS, Minio does not try to build a clustered file system, which would require an increasing amount of coordination between all the nodes. It leaves the work of scaling to the Kubernetes container orchestrator, which regular TNS readers well know, is just nifty at managing multiple workloads on distributed systems.

The Minio Server comes with an embedded web based object browser.

To make this magic happen, Minio runs on Docker, with each container holding a separate account. Then, Kubernetes balances the use accounts across multiple servers — just like you would deploy multiple copies of Node.js you would do the same for Minio.

This approach makes sense, especially for web-scale applications.

“If you build a [service] like Dropbox, you don’t want to put all the customers into a single large monolithic distributed application,” Periasamy said.

In a multitenant environment, Periasamy argued, it makes more sense to get give each tenant its own virtual storage server, rather than try to have multiple accounts vie for the same set of storage management resources. Managing a clustered file system requires admin attention whenever new nodes, or new users, are added.

InApps Technology is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Docker.

Red Hat is a sponsor of InApps Technology.

Feature image: The mascot for Starfish Virtual Global File System, at the USENIX LISA conference, 2016.

Source: InApps.net

Rate this post

Phu Nguyen

As a Senior Tech Enthusiast, I bring a decade of experience to the realm of tech writing, blending deep industry knowledge with a passion for storytelling. With expertise in software development to emerging tech trends like AI and IoT—my articles not only inform but also inspire. My journey in tech writing has been marked by a commitment to accuracy, clarity, and engaging storytelling, making me a trusted voice in the tech community.