# Deobfuscation

## Principles of Obfuscation

* Software obfuscation can be divided into theoretical layers, a taxonomy proposed by [Hui Xu et al.](https://cybersecurity.springeropen.com/track/pdf/10.1186/s42400-020-00049-3.pdf)

* The layers and what is obfuscated in each are:
  * __Code Element__
    * Layout
    * Controls
    * Data
    * Classes
    * Methods
  * __Software Component__
  * __Inter Component__
    * Library calls
    * Used Resources
  * __Application__
    * DRM System
    * Neural Networks

## Evade Static Rules

* Critical data is obfuscated at the __Code Element__ layer, which provides the following obfuscation methods
  * __Array Transformation__
  * __Data Encoding__
  * __Data Procedurization__
  * __Data Splitting & Merging__

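Two of the methods listed above, __Data Encoding__ and __Data Procedurization__, can be sketched in a few lines of Python. This is only an illustration under assumptions: the single-byte XOR key and the helper names `encode`, `decode`, and `secret` are hypothetical and not taken from the referenced paper.

```python
# Hypothetical helpers and key; a sketch of the idea, not a hardened scheme.
KEY = 0x5A  # arbitrary single-byte XOR key chosen for illustration


def encode(plain: str) -> bytes:
    """Data encoding: XOR every byte with KEY so the literal is not stored as-is."""
    return bytes(b ^ KEY for b in plain.encode("utf-8"))


def decode(blob: bytes) -> str:
    """Reverse the encoding at runtime (XOR with the same key is its own inverse)."""
    return bytes(b ^ KEY for b in blob).decode("utf-8")


def secret() -> str:
    """Data procedurization: the value is computed by a function, not kept as a constant."""
    return "".join(chr(c) for c in (67, 65, 70, 70, 69, 69))


encoded = encode("CAFFEEBABE")
```

A static rule that searches for the literal `CAFFEE` would not match `encoded`, while `decode(encoded)` still yields the original value at runtime.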
### Splitting & Merging of Strings

* Breaks signatures by changing how data is distributed inside the code
* This may be done by modifying strings and functions through the following measures

* __Joining__
```python
"CAFFEE" + "BABE"
```

* __Reordering__
```python
a = "BABE"
b = "CAFFEE"
f"{b}{a}"
```

* __Adding whitespace that is not interpreted__
```c
#include <stdio.h>

int main ( void ) {
    printf ( "The answer is %d", 42 ) ;
    return 0 ;
}
```

* __Adding ticks that are not interpreted__

* __Changing `uPpER aNd loWeRcAsE oF cHaRaCtErS iN tHe StRinG`__

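The effect of these measures on a signature-based rule can be sketched as follows. The example assumes that "ticks" refers to PowerShell-style backticks; the function names and the sample command are hypothetical. The normalizer removes every backtick, which would be too aggressive for real escape sequences, so treat it as a rough illustration rather than a working detection rule.

```python
def naive_rule(script: str, signature: str) -> bool:
    """A simple static rule: exact, case-sensitive substring match."""
    return signature in script


def normalized_rule(script: str, signature: str) -> bool:
    """The same rule after undoing ticks, whitespace, joins, and case changes."""
    cleaned = script
    # Order matters: whitespace must go before the '"+"' join can be collapsed.
    for noise in ("`", " ", "\t", '"+"', "'+'"):
        cleaned = cleaned.replace(noise, "")
    return signature.lower() in cleaned.lower()


# Hypothetical obfuscated command: mixed case, one backtick, and a split string.
sample = 'iNv`OkE-eXpReSsIoN ("CAFFEE" + "BABE")'

print(naive_rule(sample, "Invoke-Expression"))       # exact match on the raw text
print(normalized_rule(sample, "Invoke-Expression"))  # match after normalization
```

The naive rule misses both `Invoke-Expression` and `CAFFEEBABE` in `sample`, whereas the normalized rule finds both.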
### Adding Unnecessary Instructions

* Obfuscation of layout and controls inside the code
  * __Junk Stubs__
  * __Separation of Related Code__
  * __Stripping Redundant Symbols__
  * __Meaningless Identifiers__
  * __Converting Explicit to Implicit Instructions__
  * __Dispatcher Based Controls Executed During Runtime__
  * __Probabilistic Control Flows__
  * __Bogus Control Flows__

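To make two of the listed techniques more concrete, here is a minimal sketch of a __Junk Stub__ combined with __Dispatcher Based Controls__. The function names and the checksum logic are made up for illustration; real obfuscators apply the same idea at the compiler or bytecode level.

```python
def checksum_plain(values):
    """Readable version: sum the values, then reduce modulo 256."""
    total = 0
    for v in values:
        total += v
    return total % 256


def checksum_flattened(values):
    """Same logic, flattened into states that a dispatcher loop selects at runtime."""
    state, i, total, junk = 0, 0, 0, 7
    while True:
        if state == 0:      # loop condition
            state = 1 if i < len(values) else 3
        elif state == 1:    # loop body
            total += values[i]
            i += 1
            state = 2
        elif state == 2:    # junk stub: work that never affects the result
            junk = (junk * 31 + 1) % 1000
            state = 0
        elif state == 3:    # exit block
            return total % 256
```

Both functions return the same result, but in the flattened version the order of the blocks is only visible by following the `state` variable.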
### Control Flow

* Changing or extending the flow of the code by changing its conditions
* Arbitrary code segments can be guarded by __Opaque Predicates__
* An __Opaque Predicate__ is a condition whose value, and therefore the control path taken, is known to the obfuscator but hard for the reverse engineer to determine

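A minimal sketch of an opaque predicate, assuming a simple arithmetic invariant: the product of two consecutive integers is always even, so the condition is always true although this is not obvious at first glance. The names `opaque_true` and `guarded_add` are hypothetical.

```python
def opaque_true(x: int) -> bool:
    """Always True: the product of two consecutive integers is even."""
    return (x * (x + 1)) % 2 == 0


def guarded_add(a: int, b: int) -> int:
    """Adds two numbers; the second branch is a bogus path that never runs."""
    if opaque_true(a):
        return a + b        # real path, always taken
    return a * b - 1337     # bogus control flow, never executed
```

A reverse engineer has to prove that the predicate is constant before the bogus branch can be discarded.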
### Protecting Data

* Stripping and protecting
  * __Code Structure__
  * __Object names__
  * __File & Compilation Properties__

* To strip symbols
```sh
strip --strip-all <binary>
```

* Check that the symbols are gone via
```sh
nm <binary>
```

## Usage

* Find a matching deobfuscator, e.g. [de4dot](https://github.com/de4dot/de4dot.git) for binaries protected with Dotfuscator
* For .NET binaries: __Do not rely on Ghidra alone for reversing, use [ILSpy](https://github.com/icsharpcode/ILSpy.git) as well__
* Another alternative is [dnSpy](https://github.com/0xd4d/dnSpy)

* Use [FLOSS](https://github.com/mandiant/flare-floss/) for string deobfuscation via
```sh
floss --no-static-strings "$BINARY_FILE"
```

## Tools

### Packers

* UPX is a common packer. Inspect the binary for signs that it was packed with UPX; if so, use the `upx` CLI to unpack it
```sh
upx -d <binary>
```

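One quick way to look for signs of UPX is to search the file for the markers UPX usually leaves behind: the `UPX!` magic and, in PE files, the section names `UPX0` and `UPX1`. The following is a heuristic sketch with hypothetical function names; a modified UPX build can remove or rename these markers, so a negative result proves nothing.

```python
from pathlib import Path

# Byte sequences that an unmodified UPX typically leaves in a packed file.
UPX_MARKERS = (b"UPX!", b"UPX0", b"UPX1")


def upx_markers(data: bytes) -> list:
    """Return the names of all UPX markers found in the raw bytes."""
    return [m.decode("ascii") for m in UPX_MARKERS if m in data]


def looks_upx_packed(path: str) -> bool:
    """Heuristic check: True if the file contains at least one UPX marker."""
    return bool(upx_markers(Path(path).read_bytes()))
```

If `looks_upx_packed` reports a hit, `upx -d` is worth a try; if unpacking fails, the packer header may have been tampered with.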
### Demangler

Symbol names in the binary may be mangled and need to be demangled for better readability. For C++, use `c++filt` to demangle the symbol names
```sh
c++filt _ZNSt7__cxx1114collate_bynameIcEC2ERKNS_12basic_stringIcSt11char_traitsIcESaIcEEEm
std::__cxx11::collate_byname<char>::collate_byname(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long)
```
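To illustrate what a demangler does, the following sketch decodes only the simplest part of the Itanium C++ ABI scheme used by GCC and Clang: the `_Z` prefix, an optional `N ... E` nested name, and length-prefixed identifiers. The function `demangle_simple` is hypothetical and deliberately limited. It ignores parameter types and returns `None` for templates, substitutions such as `St`, and qualifiers, so it is no replacement for `c++filt`.

```python
def demangle_simple(symbol: str):
    """Return the qualified name for plain (nested) source names, else None."""
    if not symbol.startswith("_Z"):
        return None
    pos = 2
    nested = symbol.startswith("N", pos)
    if nested:
        pos += 1
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])       # decimal length prefix
        name = symbol[end:end + length]     # identifier of that length
        if len(name) != length:
            return None                     # truncated identifier
        parts.append(name)
        pos = end + length
        if not nested:
            break                           # a non-nested name has one component
    if not parts:
        return None
    if nested and not symbol.startswith("E", pos):
        return None                         # templates etc. are not handled
    return "::".join(parts)


print(demangle_simple("_ZN3foo3barEv"))
```

For `_ZN3foo3barEv` the sketch yields `foo::bar`, while the symbol from the `c++filt` example above is rejected with `None` because it starts with the `St` substitution.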