In the wake of a historic data breach affecting 70 million consumers, Target Corporation announced the ouster of CEO Gregg Steinhafel, a 35-year employee of the company. You can bet that any CEO who wasn’t already paying attention to data security is now. And enterprises, especially large, publicly traded ones, will be paying even more attention to the security of the data under their care.
Because of their role in consolidating data from many disparate data sources into a single location for storage or analysis, big data systems are a particularly compelling target for attackers and, as a result, present amplified risks to enterprises that deploy them.
In spite of the potential for increased attacks, enterprise-grade security has long been an afterthought for many big data projects, including the elephant in the room, Hadoop. HDFS, Hadoop’s file system, introduced a basic security model in spring 2008. However, according to the Yahoo-hosted Hadoop Tutorial, the initial HDFS permission system was never intended to provide strong security:
“Starting with Hadoop 0.16.1, HDFS has included a rudimentary file permissions system. This permission system is based on the POSIX model, but does not provide strong security for HDFS files…
The HDFS permissions system is designed to prevent accidental corruption of data or casual misuse of information within a group of users who share access to a cluster. It is not a strong security model that guarantees denial of access to unauthorized parties.”
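One way to see why a permission system like this cannot guarantee denial of access: a check is only as trustworthy as the identity fed into it. The sketch below is a hypothetical model, not Hadoop code. `FileMeta` and `can_access` are made-up names, and it assumes the system simply accepts whatever user name the client reports, with no authentication step in front of it.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 4, 2, 1  # the usual POSIX permission bits


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical stand-in for a file's owner, group, and mode bits."""
    owner: str
    group: str
    mode: int  # e.g. 0o640


def can_access(meta: FileMeta, user: str, groups: set[str], want: int) -> bool:
    """POSIX-style check: pick exactly one permission class, then test its bits."""
    if user == meta.owner:
        granted = (meta.mode >> 6) & 0o7   # owner bits
    elif meta.group in groups:
        granted = (meta.mode >> 3) & 0o7   # group bits
    else:
        granted = meta.mode & 0o7          # "other" bits
    return (granted & want) == want


payroll = FileMeta(owner="alice", group="finance", mode=0o640)

# An honest outsider is turned away...
print(can_access(payroll, "bob", {"engineering"}, READ))  # False
# ...but the check cannot tell the real alice from a client that merely claims to be her.
print(can_access(payroll, "alice", set(), READ))          # True
```

Under that assumption, the mode bits stop accidents and casual misuse, which is exactly the scope the tutorial describes, but they do nothing against a caller willing to lie about who they are.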
Fortunately, big data projects are starting to bring additional focus to the area of security. The recently released Hadoop 2.4.0 added ACLs to HDFS, allowing users to specify fine-grained file permissions for named users or groups.
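To make "fine-grained permissions for named users or groups" concrete, here is a minimal sketch of POSIX-style ACL evaluation. It is an illustrative model under our own assumptions, not Hadoop's implementation: the `Acl` class and `permits` function are hypothetical, and the evaluation order (owner, then named users, then groups, then other, with a mask capping named and group entries) follows the general POSIX ACL convention.

```python
from dataclasses import dataclass, field

READ, WRITE, EXECUTE = 4, 2, 1


@dataclass
class Acl:
    """Hypothetical ACL: classic owner/group/other plus named entries and a mask."""
    owner: str
    group: str
    owner_perms: int
    group_perms: int
    other_perms: int
    named_users: dict[str, int] = field(default_factory=dict)
    named_groups: dict[str, int] = field(default_factory=dict)
    mask: int = 0o7  # upper bound applied to named and group entries


def permits(acl: Acl, user: str, groups: set[str], want: int) -> bool:
    # 1. The owner is judged by the owner entry alone (the mask does not apply).
    if user == acl.owner:
        return (acl.owner_perms & want) == want
    # 2. A named-user entry comes next, capped by the mask.
    if user in acl.named_users:
        return (acl.named_users[user] & acl.mask & want) == want
    # 3. Any matching group entry (owning group or named group) may grant access.
    candidates = []
    if acl.group in groups:
        candidates.append(acl.group_perms)
    candidates.extend(p for g, p in acl.named_groups.items() if g in groups)
    if candidates:
        return any((p & acl.mask & want) == want for p in candidates)
    # 4. Everyone else falls through to "other".
    return (acl.other_perms & want) == want


sales = Acl(
    owner="alice", group="finance",
    owner_perms=READ | WRITE, group_perms=READ, other_perms=0,
    named_users={"bob": READ | WRITE},
    named_groups={"audit": READ},
    mask=READ,
)

print(permits(sales, "bob", set(), READ))          # True: named-user entry
print(permits(sales, "bob", set(), WRITE))         # False: the mask caps bob at read
print(permits(sales, "dave", {"audit"}, READ))     # True: named-group entry
print(permits(sales, "eve", {"marketing"}, READ))  # False: falls through to "other"
```

On a real cluster, entries like these are managed with the `hdfs dfs -setfacl` and `hdfs dfs -getfacl` commands rather than in application code; the model above only illustrates how named entries extend the owner/group/other scheme.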
A recent survey of security-focused product announcements by Sqrrl’s Joe Travaglini also highlighted the addition of support for perimeter security provided by Apache Knox in Hortonworks’ HDP 2.1, Cloudera’s inclusion of index-level security in Cloudera Search 1.2.0 and a “slew of security improvements” in MongoDB 2.6, including field-level redaction.
With the business increasingly focused on data security and privacy, enterprise IT will demand more security-focused features from the big data products and projects they use. Expect the pace of security-related announcements in this space to increase.