# Hadoop
Distributed storage and computing
* [Hadoop Attack Libs](https://github.com/wavestone-cdt/hadoop-attack-library.git)
## Terminology
* __Cluster__, set of nodes that forms the data lake
* __Node__, single host inside the cluster
* __NameNode__, node that keeps the directory tree of the Hadoop file system
* __DataNode__, slave node that stores files and is instructed by the NameNode
* __Primary NameNode__, current active node responsible for keeping the directory structure
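* __Secondary NameNode__, helper that periodically checkpoints the Primary NameNode's metadata. Despite the name it is not a hot standby; in HA setups that role belongs to a Standby NameNode, and there may be multiple on standby inside the cluster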
* __Master Node__, node running a Hadoop management service like the HDFS NameNode or the YARN ResourceManager
* __Slave Node__, node running a Hadoop worker service like HDFS or MapReduce. A node can be master and slave at the same time
* __Edge Node__, node hosting Hadoop user apps like Zeppelin or Hue
* __Kerberised__, cluster with security enabled through Kerberos
* __HDFS__, Hadoop Distributed File System, storage layer for unstructured data
* __Hive__, primary DB for structured data
* __YARN__, scheduling jobs and resource management
* __MapReduce__, distributed filtering, sorting and reducing
* __Hue__, GUI for HDFS and Hive
* __ZooKeeper__, cluster coordination and management
* __Kafka__, message broker
* __Ranger__, centralized access control (ACL policies) for the cluster
* __Zeppelin__, data analytics inside a web UI
## Zeppelin
* Try [default logins](https://zeppelin.apache.org/docs/0.8.2/setup/security/shiro_authentication.html#4-login)
* Try code execution inside notebooks
## Keytabs
* Find keytab files (for example ones generated with `ktpass`) to authenticate against the Kerberos KDC
* List the principals stored in a keytab and use one of them to obtain a ticket
```sh
klist -k <keytabfile>
kinit -V -k -t <keytabfile> <principal name>
```
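The two steps above can be sketched as a pair of small helpers. This is a minimal sketch, not a definitive tool: the function names `find_keytabs`, `parse_principals` and `keytab_principals` are hypothetical, the search assumes keytabs carry a `.keytab` suffix (they do not have to), and the parser assumes the usual `klist -k` layout of three header lines followed by `KVNO Principal` rows.

```sh
#!/bin/sh
# Hypothetical helpers: locate keytab files and list the principals they hold.

# Print every regular file below the given directory whose name ends in .keytab.
# Assumption: keytabs follow the common *.keytab naming convention.
find_keytabs() {
    find "$1" -type f -name '*.keytab' 2>/dev/null
}

# Read `klist -k` output on stdin and print the unique principal names.
# Assumption: three header lines, then rows of "<kvno> <principal>".
parse_principals() {
    awk 'NR > 3 && NF >= 2 { print $2 }' | sort -u
}

# List the principals of one keytab (needs klist from the Kerberos client tools).
keytab_principals() {
    klist -k "$1" | parse_principals
}
```

A possible use is `find_keytabs /etc | while read -r kt; do echo "$kt"; keytab_principals "$kt"; done`, which prints each readable keytab followed by its principals.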
## HDFS
* Use the `hdfs` utility to enumerate the distributed network storage
```sh
hdfs dfs -ls /
```
* The local user and the owner recorded on the storage do not have to correspond
* Files touched on the storage may end up owned by root
```sh
hdfs dfs -touchz /tmp/testfile
hdfs dfs -ls /tmp
```
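To check who actually owns the touched file, the listing can be filtered instead of read by eye. The following is a rough sketch under stated assumptions: `ls_owners` and `world_writable` are hypothetical helper names, the input is the usual eight-column `hdfs dfs -ls` output (permissions, replication, owner, group, size, date, time, path), and paths contain no spaces.

```sh
#!/bin/sh
# Hypothetical helpers that read `hdfs dfs -ls` output on stdin.

# Print "<owner> <path>" for every entry, skipping the "Found N items" line.
ls_owners() {
    awk 'NF >= 8 { print $3, $8 }'
}

# Print the paths that anyone may write to (the "other" write bit is set).
world_writable() {
    awk 'NF >= 8 && substr($1, 9, 1) == "w" { print $8 }'
}
```

For example, `hdfs dfs -ls /tmp | ls_owners` shows the owner next to each path, and `hdfs dfs -ls / | world_writable` points at directories worth testing with `-touchz`.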
* Impersonate a user by authenticating with that user's keytab file. __NodeManager__ is the user with the highest permissions
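
Impersonation through a keytab could look roughly like the sketch below. It is an illustration under assumptions, not a definitive procedure: `run_as_principal` is a hypothetical helper name, and it assumes the `hdfs` client picks up the ticket cache named in the `KRB5CCNAME` environment variable. A private cache file keeps the current user's own tickets untouched.

```sh
#!/bin/sh
# Hypothetical helper: run a command with a ticket obtained from a keytab.
# Usage: run_as_principal <keytabfile> <principal name> <command> [args...]
run_as_principal() {
    keytab=$1
    principal=$2
    shift 2
    # Private credential cache so existing tickets are not overwritten.
    cache=$(mktemp) || return 1
    # KRB5CCNAME tells kinit where to store the ticket and the client where to find it.
    if KRB5CCNAME="FILE:$cache" kinit -k -t "$keytab" "$principal"; then
        KRB5CCNAME="FILE:$cache" "$@"
        status=$?
    else
        status=1
    fi
    rm -f "$cache"
    return "$status"
}
```

A call such as `run_as_principal <keytabfile> <principal name> hdfs dfs -ls /` then lists the storage root with the permissions of the chosen principal.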