Hadoop Operational Security Architecture:
By default, Hadoop runs in non-secure mode, with no authentication in place, so we
need to configure the Hadoop security model ourselves. Considering a typical Hadoop
cluster running ecosystem tools in day-to-day operations, how can we secure the
cluster and its data? Let's discuss this in detail.
Hadoop Cluster Operational Security Architecture
Cluster Security:
Hadoop cluster security, like any cluster security, is achieved through
authentication, authorization, encryption, key management, and logging. When a
Hadoop cluster is configured in secure mode, each user and service must be
authenticated by Kerberos.
Authentication:
Kerberos: Kerberos is a network authentication protocol whose server, the Key
Distribution Center (KDC), verifies a client's identity. The client's password is
never sent over the network; instead, the client proves its identity with
encrypted, time-limited tickets issued by the KDC. Existing LDAP/AD
authentication systems can be integrated with Kerberos. Kerberos is an essential
step for user authentication, but it is not sufficient in itself, as it lacks the
ability to hide cluster entry points and block access at the perimeter. Apache
Knox is built to bridge these gaps with perimeter-level security.
Knox: The Apache Knox Gateway is a REST API gateway for interacting with Apache
Hadoop clusters. Knox provides authentication, authorization, SSL, and SSO
capabilities to enable a single access point for Hadoop.
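As a rough illustration of the single access point, a WebHDFS call routed through Knox might look like the sketch below. The host knox.example.com, port 8443, topology name default, and user guest are placeholders and not values from this cluster; the /gateway/<topology>/webhdfs/v1 path follows Knox's usual URL layout.

```shell
#!/bin/sh
# Build the Knox URL for a WebHDFS operation.
# Arguments: gateway host:port, topology name, HDFS path, WebHDFS operation.
knox_webhdfs_url() {
  printf 'https://%s/gateway/%s/webhdfs/v1%s?op=%s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values; replace them with your own gateway and topology.
URL=$(knox_webhdfs_url "knox.example.com:8443" "default" "/tmp" "LISTSTATUS")
echo "$URL"

# To actually call the gateway (curl prompts for the password of the
# hypothetical user "guest"):
# curl -i -u guest "$URL"
```

Clients only ever see the gateway address, so the NameNode and other cluster hosts can stay behind the firewall.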
Authorization & Audits:
Ranger: Apache Ranger provides a framework for central administration of
security policies and for monitoring user access. It covers audit logging, key
management, and fine-grained data access policies across HDFS, Hive, YARN, Solr,
Kafka, and other modules in the Hadoop cluster.
Data Protection:
HDFS Encryption: Hadoop offers encryption for data both in transit and at rest.
Network traffic can be encrypted as data flows into and through the cluster over
RPC, HTTP, the Data Transfer Protocol (DTP), and JDBC, which keeps data private
while it moves. Data at rest is encrypted with keys held in the Hadoop Key
Management Server (KMS), which can also be integrated with a third-party key
management system.
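For data at rest, HDFS transparent encryption works on encryption zones backed by a KMS key. The sketch below shows the usual command sequence; the key name kpi_key and the path /kpi_secure are made up for illustration, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print commands by default; execute them only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create a key in the KMS, create the (empty) directory,
# then mark that directory as an encryption zone.
create_encryption_zone() {
  key_name=$1
  zone_path=$2
  run hadoop key create "$key_name"
  run hdfs dfs -mkdir -p "$zone_path"
  run hdfs crypto -createZone -keyName "$key_name" -path "$zone_path"
}

# Hypothetical key name and path.
create_encryption_zone kpi_key /kpi_secure
```

Files written under the zone are then encrypted and decrypted transparently for authorized clients.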
Here are the high-level steps to install Kerberos and integrate it into a Hadoop
cluster using Apache Ambari:
1. Install KDC Server:
yum install krb5-server krb5-libs krb5-workstation
2. Edit /etc/krb5.conf, change the realm, and copy the file to
/var/lib/ambari-server/resources/scripts/krb5.conf
3. Create KDC Database:
kdb5_util create -s
4. Start KDC services:
/etc/rc.d/init.d/krb5kdc start
/etc/rc.d/init.d/kadmin start
5. Create admin user:
kadmin.local -q "addprinc admin/admin"
6. Edit the KDC ACL.
7. Restart KDC admin service:
/etc/rc.d/init.d/kadmin restart
8. Now connect to Ambari admin and enable Kerberos.
9. Select 'Existing MIT KDC' since we have already configured Kerberos.
10. Enter your KDC details.
11. Download the CSV to verify the services for which Kerberos is enabled.
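Step 6 usually amounts to a single line in the KDC ACL file that grants admin principals full rights. The sketch below assumes the default RHEL/CentOS location /var/kerberos/krb5kdc/kadm5.acl and a placeholder realm EXAMPLE.COM; substitute the realm you set in /etc/krb5.conf. It writes to a scratch file, so nothing on the KDC is touched until you copy the file into place.

```shell
#!/bin/sh
# Placeholder realm; use the realm configured in /etc/krb5.conf.
REALM="EXAMPLE.COM"

# Emit a kadm5.acl entry that gives every */admin principal
# in the realm all privileges.
kdc_acl_line() {
  printf '*/admin@%s\t*\n' "$1"
}

# Write the entry to a scratch file for review.
ACL_DRAFT=$(mktemp)
kdc_acl_line "$REALM" > "$ACL_DRAFT"
cat "$ACL_DRAFT"

# After review, on the KDC host (path is an assumed default):
# cp "$ACL_DRAFT" /var/kerberos/krb5kdc/kadm5.acl
# /etc/rc.d/init.d/kadmin restart
```

The admin/admin principal created in step 5 matches the */admin pattern, which is what lets Ambari use it to create service principals.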
HDFS Directory/File Security through Access Control List (ACL):
To set up ACLs, we first need to enable them in hdfs-site.xml by setting the
dfs.namenode.acls.enabled property to 'true'.
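The hdfs-site.xml entry would look roughly like the output of the sketch below, which simply prints the property block. On an Ambari-managed cluster the same property is normally set through the HDFS configuration screen rather than by editing the file by hand, and the NameNode typically has to be restarted before the change takes effect.

```shell
#!/bin/sh
# Print an hdfs-site.xml <property> element for the given name and value.
hdfs_site_property() {
  cat <<EOF
<property>
  <name>$1</name>
  <value>$2</value>
</property>
EOF
}

hdfs_site_property dfs.namenode.acls.enabled true

# On a running cluster, the effective value can be checked with:
# hdfs getconf -confKey dfs.namenode.acls.enabled
```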
1. Granting access to another user:
hdfs dfs -setfacl -m user:kpi:rwx /kpi_dw
Check the access control by running:
hdfs dfs -getfacl /kpi_dw
2. Granting access to another group:
hdfs dfs -setfacl -m group:kpi_group:r-- /kpi_dw
Check the access control by running:
hdfs dfs -getfacl /kpi_dw
3. Setting a default ACL that newly created children inherit automatically:
hdfs dfs -setfacl -m default:group:kpi_group:r-x /kpi_dw
Check the access control by running:
a. hdfs dfs -mkdir /kpi_dw/abc_project
b. hdfs dfs -getfacl /kpi_dw/abc_project
4. Removing/Blocking access:
hdfs dfs -setfacl -m user:kpi:--- /kpi_dw
Check the access control by running:
hdfs dfs -getfacl /kpi_dw
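To verify these results in a script rather than by eye, the getfacl output can be parsed. The sketch below reads text in the format printed by hdfs dfs -getfacl and extracts the permission string for one named user. The sample listing is illustrative and was not captured from a real cluster, and the helper name acl_perm_for_user is made up for this example.

```shell
#!/bin/sh
# Print the three-character permission string (e.g. rwx) of a
# named-user ACL entry. Reads getfacl-style output on stdin and
# prints nothing if the user has no entry. Only the first three
# characters are kept, so a trailing "#effective:..." note is ignored.
acl_perm_for_user() {
  awk -F: -v u="$1" '$1 == "user" && $2 == u { print substr($3, 1, 3) }'
}

# Illustrative listing in the format of `hdfs dfs -getfacl /kpi_dw`.
SAMPLE='# file: /kpi_dw
# owner: hdfs
# group: hdfs
user::rwx
user:kpi:rwx
group::r-x
group:kpi_group:r--
mask::rwx
other::r-x'

echo "$SAMPLE" | acl_perm_for_user kpi    # prints rwx

# Against a live cluster:
# hdfs dfs -getfacl /kpi_dw | acl_perm_for_user kpi
```

After step 4 the same check should print --- for user kpi, which confirms that access is blocked.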
The rest of the security implementation will be available soon.