# Deobfuscation
## Principles of Obfuscation
* Software obfuscation can be described as a set of layers, following the taxonomy by [Hui Xu et al.](https://cybersecurity.springeropen.com/track/pdf/10.1186/s42400-020-00049-3.pdf)
* These layers and what is obfuscated in each are:
  * __Code Element__
    * Layout
    * Controls
    * Data
    * Classes
    * Methods
  * __Software Component__
  * __Inter Component__
    * Library calls
    * Used Resources
  * __Application__
    * DRM System
    * Neural Networks
## Evade Static Rules
* Critical data is obfuscated at the __Code Element__ layer, which includes the following obfuscation methods
* __Array Transformation__
* __Data Encoding__
* __Data Procedurization__
* __Data Splitting & Merging__
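
As a minimal sketch of data encoding combined with splitting and merging: a string is stored XOR-encoded in two pieces and only rejoined and decoded at runtime. The single-byte key and the helper names `encode` and `decode` are assumptions made for illustration, not part of any specific obfuscator.

```python
# Hypothetical helpers for illustration; key and names are assumptions.
KEY = 0x5A


def encode(plain: str, key: int = KEY) -> bytes:
    """Data encoding: XOR every byte with a single-byte key."""
    return bytes(b ^ key for b in plain.encode("utf-8"))


def decode(blob: bytes, key: int = KEY) -> str:
    """Reverse the XOR encoding at runtime."""
    return bytes(b ^ key for b in blob).decode("utf-8")


encoded = encode("CAFFEEBABE")

# Data splitting: keep the encoded value in two separate pieces.
part_a, part_b = encoded[:4], encoded[4:]

# Data merging: rejoin the pieces only when the value is needed.
recovered = decode(part_a + part_b)
```

A static rule looking for the literal string no longer matches, because neither piece contains it in clear text.
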
### Splitting & Merging of Strings
* Break signatures by changing how data is distributed inside the code
* This may be done by modifying strings and functions through the following measures
* __Joining__
```python
"CAFFEE" + "BABE"
```
* __Reordering__
```python
a = "BABE"
b = "CAFFEE"
f"{b}{a}"
```
* __Whitespace that is not interpreted__
```c
#include <stdio.h>

int main ( void ) {
    printf ( "The answer is %d", 42 ) ;
}
```
* __Adding ticks that are not interpreted__, e.g. backticks inside PowerShell command names
* __Change `uPpER aNd loWeRcAsE oF cHaRaCtErS iN tHe StRinG`__
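
From the analyst's side, several of the measures above can be undone by normalizing a token before matching it. The following is a rough sketch under stated assumptions: the function names and the `SIGNATURE` value are hypothetical, and removing every backtick is a simplification that ignores real escape sequences.

```python
import re

SIGNATURE = "downloadstring"  # hypothetical signature, for illustration only


def normalize(token: str) -> str:
    """Undo tick insertion, padding whitespace and case mangling."""
    token = token.replace("`", "")       # drop ticks
    token = re.sub(r"\s+", " ", token)   # collapse runs of whitespace
    return token.strip().lower()         # compare case-insensitively


def matches(sample: str) -> bool:
    """Return True if the normalized sample contains the signature."""
    return SIGNATURE in normalize(sample)
```

For example, `matches("  Do`wnLoad`StRiNg (")` is true although the raw text does not contain the signature.
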
### Adding Unnecessary Instructions
* Obfuscation of layout and controls inside the code
* __Junk Stubs__
* __Separation of Related Code__
* __Stripping Redundant Symbols__
* __Meaningless Identifiers__
* __Converting Explicit to Implicit Instructions__
* __Dispatcher Based Controls Executed During Runtime__
* __Probabilistic Control Flows__
* __Bogus Control Flows__
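
A dispatcher-based control flow with a junk stub can be sketched as follows. Both functions are hypothetical examples: `greet_flattened` does the same as `greet_plain`, but the order of the steps is only decided at runtime by a state variable, and one branch is never reached.

```python
def greet_plain(name: str) -> str:
    """Straight-line version of the logic."""
    msg = "Hello, "
    msg += name
    return msg + "!"


def greet_flattened(name: str) -> str:
    """Same logic, but every step is a case picked by a dispatcher loop."""
    state, msg = 2, ""
    while state != 0:
        if state == 1:
            msg += name
            state = 3
        elif state == 2:
            msg = "Hello, "
            state = 1
        elif state == 3:
            msg += "!"
            state = 0
        elif state == 4:
            # Junk stub: no other case ever sets state to 4.
            msg = msg[::-1]
            state = 2
    return msg
```

Reading the flattened version statically requires tracking the values of `state`, which is what makes this kind of layout harder to follow.
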
### Control Flow
* The flow of the code is changed or extended by changing or adding conditions
* __Opaque Predicates__ can be used to attach such conditions to arbitrary code segments
* An __Opaque Predicate__ is a condition whose outcome is known to the obfuscator but hard for the reverse engineer to determine
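
A small sketch of an opaque predicate: the product of two consecutive integers is always even, so the condition below is always true, yet this is not obvious from the code alone. The function names and key values are made up for illustration.

```python
def opaquely_true(x: int) -> bool:
    """x * (x + 1) is a product of consecutive integers, hence always even."""
    return (x * (x + 1)) % 2 == 0


def check_key(key: str, x: int) -> bool:
    """Hypothetical check guarded by an opaque predicate."""
    if opaquely_true(x):
        return key == "CAFFEEBABE"   # real path, always taken
    return key == "DEADBEEF"         # bogus path, never executed
```

The bogus branch exists only to give the analyst an extra path to examine.
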
### Protecting Data
* Strip or protect information that helps the analyst, such as
* __Code Structure__
* __Object names__
* __File & Compilation Properties__
* To strip symbols
```sh
strip --strip-all <binary>
```
* Check via
```sh
nm <binary>
```
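
The same check can be sketched without `nm` by looking for a section of type `SHT_SYMTAB` in the section header table. This assumes a little-endian 64-bit ELF file and ignores corner cases such as extended section numbering. `has_symtab` is a hypothetical helper name.

```python
import struct

SHT_SYMTAB = 2  # section type of the full symbol table (.symtab)


def has_symtab(data: bytes) -> bool:
    """Return True if a little-endian ELF64 image still has a symbol table."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    # e_shoff, e_shentsize and e_shnum from the ELF64 file header
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 0x3A)
    for i in range(e_shnum):
        # sh_type is the second 32-bit field of each section header
        (sh_type,) = struct.unpack_from("<I", data, e_shoff + i * e_shentsize + 4)
        if sh_type == SHT_SYMTAB:
            return True
    return False
```

It can be called as `has_symtab(open(path, "rb").read())`. A stripped binary is expected to return `False`, while the dynamic symbol table, which has a different section type, is not counted.
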
## Usage
* Find a matching deobfuscator, e.g. [de4dot](https://github.com/de4dot/de4dot.git) for binaries obfuscated with Dotfuscator
* For .NET binaries: __Do not rely on Ghidra alone for reversing, use [ILSpy](https://github.com/icsharpcode/ILSpy.git) as well__
* Another alternative is [dnSpy](https://github.com/0xd4d/dnSpy)
* Use [FLOSS](https://github.com/mandiant/flare-floss/) to recover obfuscated strings. The flag name may differ between releases, so check `floss -h` if it is rejected
```sh
floss --no-static-strings $BINARY_FILE
```
## Tools
### Packers
* UPX is a common packer. Check whether the binary may be packed with UPX, then use the `upx` command line tool to unpack it
```sh
upx -d <binary>
```
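
A quick heuristic for the "is it packed with UPX" check is to search the file for the markers UPX usually leaves behind. This is only a sketch: the markers can be removed or altered by whoever packed the file, and `looks_upx_packed` is a hypothetical helper name.

```python
UPX_MARKERS = (b"UPX!", b"UPX0", b"UPX1")  # magic value and typical PE section names


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic: True if any common UPX marker occurs in the file content."""
    return any(marker in data for marker in UPX_MARKERS)
```

It can be called as `looks_upx_packed(open(path, "rb").read())`. A negative result does not prove the binary is unpacked.
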
### Demangler
Symbol names in the binary may be mangled and need to be demangled for better readability. For C++ symbols, use `c++filt` to recover the original function names and data types
```sh
c++filt _ZNSt7__cxx1114collate_bynameIcEC2ERKNS_12basic_stringIcSt11char_traitsIcESaIcEEEm
std::__cxx11::collate_byname<char>::collate_byname(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long)
```