In this blog post series about analysis and obfuscation, I decided to start with one of the most common techniques: control-flow flattening.
Control-flow flattening is one of the many obfuscation and anti-analysis techniques used by both legitimate software and malware. It is usually applied automatically: there are plenty of obfuscators available that can apply it to the source code before compilation, which saves the developer a considerable amount of time.
The method for flattening a function can be described like this. The main goal is to break the body into very small basic blocks, the more the better, and to place them all at the same nesting level, regardless of what came before or after each block in the original code. This can be achieved, for example, with a jump table inside a loop.
The flow has to be driven by a variable that selects which block runs next, so the original logic ends up encoded in the jump table and the function still achieves its goal, e.g. building a string. Whether we use “if” statements or “switch case” statements changes the assembly that is generated.
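To make the description above more concrete, here is a minimal sketch of a flattened function that builds a string. It is not the code from my PoC: the name `build_flattened`, the state numbering and the way the string is split are assumptions made only for illustration. Each `case` is one basic block, and the `state` variable decides which block runs next.

```cpp
#include <string>

// Hypothetical illustration of control-flow flattening.
// The three string fragments are appended in the right order,
// but the blocks are deliberately listed out of order in the source.
std::string build_flattened() {
    std::string out;
    int state = 0;        // dispatcher variable: selects the next block
    bool running = true;  // cleared by the exit block

    while (running) {
        switch (state) {  // compilers often lower a dense switch to a jump table
        case 0:           // entry block
            out += "power";
            state = 3;
            break;
        case 1:           // third block to run
            out += ".exe";
            state = 2;
            break;
        case 2:           // exit block
            running = false;
            break;
        case 3:           // second block to run
            out += "shell";
            state = 1;
            break;
        default:          // unknown state: stop instead of looping forever
            running = false;
            break;
        }
    }
    return out;
}
```

Calling `build_flattened()` gives the same result as a plain concatenation of the three fragments. The only thing that changes is that the order of execution is no longer visible from the layout of the code; it has to be recovered by following the values written to `state`.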
For this PoC I created two code snippets in C++ that use control-flow flattening, and in one of them I combined it with some simple obfuscation, to review how each is represented by the disassembler.
- The 1st is a very simple function that prints the string ‘powershell.exe’ and then calls CreateProcessA to create that process.
- The 2nd adds obfuscation, but the functionality is exactly the same.
The code has been uploaded and is available on my personal GitHub page.
Example 1: Standard flow vs Flattening
The first example contains two functions that do the exact same thing, but one of them uses obfuscation.
The first function (let’s call it “standard”) looks like this:
As can be observed in the graph, the flow is straightforward. On the right side of the image, the string ‘powershell.exe’ is used along with another string, ‘Executed command: ’. These strings are printed out via std::cout, then ‘powershell.exe’ is passed as an argument to another function, which I renamed to “create_process” because creating the process is all it does.
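For reference, the behaviour described above corresponds roughly to the sketch below. This is a reconstruction from the disassembly description, not the original source: `standard_flow` is a made-up name, and `create_process` is replaced by a portable stub that only checks its argument, whereas the real routine calls CreateProcessA.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-in for the routine renamed "create_process".
// The real one calls CreateProcessA; this stub only validates the
// argument so that the sketch builds on any platform.
bool create_process(const std::string& command) {
    return !command.empty();
}

// Straight-line version: the literal is stored in plain form, which is
// why the disassembler can show it right next to the call.
std::string standard_flow() {
    const std::string command = "powershell.exe";
    std::cout << "Executed command: " << command << std::endl;
    create_process(command);
    return command;
}
```

Both string literals appear as-is in the binary, so a strings listing or the cross-references of `create_process` are enough to understand what the function does.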
Let’s jump in and check the second function (“obfuscated”).
In the image above we can see the call to the “create_process” subroutine. In the image below we look for the same functionality, but the flow is more confusing and the string for that process is no longer there for us to read.
If we look at the rest of the subroutine, the flattened code that builds the string is not as pleasant to read as before.
Now the flow is split into many smaller pieces to make the analysis harder. Before “jmp def_401630” (the break statement) we can observe the mov and jmp instructions responsible for directing the control flow.
Looking at the graph for this subroutine, the flattening creates the following view.
Example 2: Standard flow vs Flattening+obfuscation
For the sake of the demonstration, I decided to keep the flattening and add other common techniques used by malware developers, such as:
- Useless instructions and conditions that will never be executed
- Unclear and bigger variables
- Arithmetic operations split into several steps to decide where the flow goes next
- Just more junk code. Why not?
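The techniques in the list above can be layered on top of a flattened dispatcher as in the sketch below. Again, this is only an illustration under my own assumptions and not the PoC code: the name `build_obfuscated`, the state constants and the junk computations are all invented for the example.

```cpp
#include <cstdint>
#include <string>

// Hypothetical illustration: flattening plus junk code, a condition that
// is never true, oversized state values and split arithmetic.
std::string build_obfuscated() {
    std::string out;
    std::uint64_t state = 0x1000;  // bigger, less readable state values
    std::uint64_t junk = 7;        // junk accumulator, never affects the result
    bool running = true;

    while (running) {
        switch (state) {
        case 0x1000:
            out += "power";
            state += 0x2000;       // next state computed in two steps:
            state += 0x00F0;       // 0x1000 + 0x2000 + 0x00F0 = 0x30F0
            break;
        case 0x30F0:
            junk = junk * 3 + 1;   // junk arithmetic
            // A square is always 0 or 1 modulo 4, so this branch is dead.
            if ((junk * junk) % 4 == 2) {
                out += "cmd";
                state = 0xDEAD;
                break;
            }
            out += "shell";
            state = (state ^ 0x10F0) - 0x0FFF;  // 0x2000 - 0x0FFF = 0x1001
            break;
        case 0x1001:
            out += ".exe";
            state = (state << 4) | 0x2;         // 0x10010 | 0x2 = 0x10012
            break;
        case 0x10012:              // exit block
            running = false;
            break;
        default:                   // unknown state: stop
            running = false;
            break;
        }
    }
    return out;
}
```

The result is still the same string, but every block now carries extra instructions and the next state has to be computed rather than read. Keep in mind that an optimizing compiler may fold the split arithmetic or remove a dead branch this simple, so what survives in the binary depends on the build settings.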
As in the previous example, the standard function looks exactly the same because it wasn’t touched, so let’s jump right into the obfuscated subroutine.
The image looks pretty similar, but there are more jumps and the flow is not as flat as before. This is caused by the conditions and junk data added inside the flattened flow of the jump table, which make it harder to spot useful data, especially from a static analysis point of view.
The graph mode shows that the flow is no longer as flat as it was in the previous example. Adding this very simple obfuscation changes the overall shape from a flat view to something closer to a nested tree, while keeping the complexity added by the flattening.