PEB Obfuscation: Malware Dev 0x4

13 Feb 2024 by @synawk

The PEB (Process Environment Block) is one of the most frequently used structures in malware development for gathering information about the process without being detected. However, there are only a few ways to obtain it, so the access itself is quite easy to detect. This post explores some ways to avoid detection by obfuscating how the PEB is obtained.

There is plenty of information about how to obtain the PEB structure and retrieve details such as ImageBaseAddress, or determine whether the process is being debugged. Almost all of it focuses on the result of the technique, but the truth is that detecting the retrieval itself is quite easy. Most techniques use the following lines of code to get the PEB (or the TEB in some cases):

//x64
mov rax, gs:[0x60]
ret

//x86
mov eax, fs:[0x30]

These methods are commonly used in malware development to avoid importing Windows API functions (WinAPI) by obtaining the Kernel32 base address. For example, the Meterpreter code uses this method.

YARA rules can detect this method. For example, a rule can match on these specific bytes:

65 48 8B 04 25 60 00 00 00

//mov    rax,QWORD PTR gs:0x60

Some rules aim to detect x64 or x86, but in many cases the detection is straightforward because fixed bytes need to be specified. So, to bypass static detection, the goal is to avoid fixed bytes and to generate different bytes depending on certain conditions.
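To make the idea of a fixed-byte rule concrete, here is a minimal sketch of what such a match amounts to. It is not how YARA is implemented; the function name `find_signature` and the sample buffer are hypothetical, and only the nine signature bytes come from the rule quoted above.

```python
# Fixed byte signature for: mov rax, QWORD PTR gs:0x60 (bytes from the rule above).
SIGNATURE = bytes.fromhex("65488b042560000000")


def find_signature(data, signature=SIGNATURE):
    """Return every offset in data where the exact signature occurs."""
    offsets = []
    start = data.find(signature)
    while start != -1:
        offsets.append(start)
        start = data.find(signature, start + 1)
    return offsets


# Hypothetical sample buffer: two nop bytes, the signature, then ret.
sample = b"\x90\x90" + SIGNATURE + b"\xc3"
print(find_signature(sample))  # [2]
```

Because every byte has to match exactly, a rule like this stops matching as soon as one byte of the sequence changes.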

Extra steps

The first method is very simple: add some extra steps that achieve the same result. In this particular case, I just add some operations before reading the PEB. For example, this is the code for a common way to obtain the PEB:

0000000000000000 <GetPEB>:
  65 48 8b 04 25 60 00    mov    rax,QWORD PTR gs:0x60
  00 00
  c3                      ret

But adding some extra steps, such as an add or other instructions, gives something like this:

0000000000000000 <GetPEB>:
   0:   48 33 d2                xor    rdx,rdx
   3:   48 83 c2 10             add    rdx,0x10
   7:   48 83 c2 50             add    rdx,0x50
   b:   65 48 8b 02             mov    rax,QWORD PTR gs:[rdx]
   f:   c3                      ret

Of course, this only concerns static analysis; against dynamic analysis it would be useless, given that the EDR will analyze the behavior or, in some cases, emulate the retrieval of the PEB. As you can see, the generated bytes are different from the previous method.

You can get the same address in many ways; for example, another way to do it is:

0000000000000000 <GetPEB>:
   0:   48 33 c0                xor    rax,rax
   3:   65 48 03 04 25 60 00    add    rax,QWORD PTR gs:0x60
   a:   00 00
   c:   c3                      ret

The important thing is to try to avoid the original bytes 65 48 8b 04 25 60 00. You can use instructions like sub, or, or any other instruction that allows you to move the value at gs:0x60 into the register. This technique is straightforward to implement, but as you can see, the instruction that accesses GS always starts with the same bytes 65 48 (the GS segment override followed by the REX.W prefix). This is not necessarily an issue, but some EDRs could base their detection on those bytes. Let's see in the next method how to solve this.
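As a rough illustration of why those shared bytes matter, the sketch below matches a signature with a wildcard byte against the three listings above. The helper names (`parse_pattern`, `scan`) and the `??` wildcard notation are assumptions made for this example, not the syntax of any particular product.

```python
def parse_pattern(text):
    """Turn 'AA ?? BB' into a list of ints, with None for wildcard bytes."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]


def scan(data, pattern):
    """Return every offset where the pattern matches, treating None as 'any byte'."""
    hits = []
    for pos in range(len(data) - len(pattern) + 1):
        if all(p is None or data[pos + i] == p for i, p in enumerate(pattern)):
            hits.append(pos)
    return hits


# Bytes copied from the three listings above.
mov_variant = bytes.fromhex("65488b042560000000c3")
add_variant = bytes.fromhex("4833c0654803042560000000c3")
indirect_variant = bytes.fromhex("4833d24883c2104883c25065488b02c3")

# Wildcarding only the opcode byte covers both the mov and the add form.
masked = parse_pattern("65 48 ?? 04 25 60 00 00 00")
print(scan(mov_variant, masked))       # [0]
print(scan(add_variant, masked))       # [3]
print(scan(indirect_variant, masked))  # []

# The two-byte prefix alone is present in all three listings.
prefix = parse_pattern("65 48")
print(scan(indirect_variant, prefix))  # [11]
```

A two-byte pattern like the last one is far too short to use on its own without many false positives, so in practice it would need to be combined with other conditions.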

RDGSBASE/RDFSBASE

When a process is created, a memory address is assigned to the PEB/TEB. This procedure cannot be controlled; it simply takes the first available address in memory. After assigning this address to the process, at some point, the kernel sets the GS segment base to the address of the TEB.

The RDGSBASE and RDFSBASE instructions read the GS and FS segment base, which here is the TEB address. Then, by adding the correct offset (0x60 on x64), you reach the TEB field that holds the PEB address.

0000000000000004 <GetPEB>:
   4:   f3 48 0f ae c8          rdgsbase rax
   9:   48 83 c0 60             add    rax,0x60
   d:   48 8b 00                mov    rax,QWORD PTR [rax]
  10:   c3                      ret

As you can see, these bytes retrieve the PEB, and, of course, you can change the registers and combine this technique with the previous one in order to have more steps and generate different bytes each time.
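Changing the register only changes a few bits of the encoding. The sketch below assumes the 64-bit RDGSBASE encoding documented in the Intel manual (F3, a REX prefix with W set, 0F AE, then a ModRM byte with mod=11 and reg=1) and shows a matcher that does not depend on the register. The function name `find_rdgsbase` is hypothetical.

```python
def find_rdgsbase(data):
    """Return offsets of 64-bit RDGSBASE encodings: F3 REX.W 0F AE /1 with mod=11."""
    hits = []
    for pos in range(len(data) - 4):
        prefix, rex, op1, op2, modrm = data[pos:pos + 5]
        if (prefix == 0xF3
                and rex & 0xF8 == 0x48          # REX prefix with W=1 (0x48..0x4F)
                and (op1, op2) == (0x0F, 0xAE)
                and modrm & 0xF8 == 0xC8):      # mod=11, reg=/1, any destination
            hits.append(pos)
    return hits


# Bytes from the listing above: rdgsbase rax; add rax,0x60; mov rax,[rax]; ret
listing = bytes.fromhex("f3480faec84883c060488b00c3")
print(find_rdgsbase(listing))                       # [0]

# Same instruction with r9 as destination (REX.B set, rm=001).
print(find_rdgsbase(bytes.fromhex("f3490faec9")))   # [0]

# RDFSBASE uses /0 instead of /1, so it is not reported by this matcher.
print(find_rdgsbase(bytes.fromhex("f3480faec0")))   # []
```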

Syscalls

Another technique involves using syscalls. Unlike other methods, using syscalls may be more susceptible to detection. Therefore, consider this technique as just a Proof of Concept. If you already know how syscalls work in Windows, you are aware that it can be complicated to populate parameters for some functions and prepare the data to pass to these syscalls.

By using the function NtQueryInformationProcess, you can obtain the PEB address without touching the GS register in your own code, because the request is serviced by the kernel. Remember, our objective is to modify the bytes to bypass rules used in static analysis.

For this example, I’m going to skip some steps, assuming that you already know how to get the Syscall ID (SSN). It’s important to mention that the SSN 0x19 is hardcoded and that the syscall instruction is invoked directly, i.e. it is a direct syscall. A different implementation could be more useful. Following the x64 calling convention, I was able to build the proper parameters and finally retrieve the PROCESS_BASIC_INFORMATION structure at rsp+0x30:

0000000000000004 <GetPEB>:
   4:   48 33 d2                xor    rdx,rdx
   7:   41 b9 30 00 00 00       mov    r9d,0x30
   d:   4c 8d 44 24 30          lea    r8,[rsp+0x30]
  12:   48 c7 c1 ff ff ff ff    mov    rcx,0xffffffffffffffff
  19:   4c 8b d1                mov    r10,rcx
  1c:   48 c7 c0 19 00 00 00    mov    rax,0x19
  23:   0f 05                   syscall
  25:   48 8b 44 24 38          mov    rax,QWORD PTR [rsp+0x38]
  2a:   c3                      ret

As you can see, we generate more bytes than in the previous techniques, but in a certain way, we also avoid using the GS/FS segments. Running this alongside the version that goes through the GS segment shows that, in both cases, the result is the same.

This last technique brings more issues than solutions: the call could be intercepted by an EDR, and improving it would require an indirect syscall, which increases the number of bytes. In any case, it could be used as an alternative way to retrieve the PEB address.
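The constants 0x30 and 0x38 in the listing above follow from the layout of PROCESS_BASIC_INFORMATION. The sketch below recomputes them under stated assumptions: a 64-bit layout with natural alignment, and the field names commonly used for this structure (Microsoft's public documentation marks some of these fields as reserved).

```python
# Assumed 64-bit layout of PROCESS_BASIC_INFORMATION: (field name, size in bytes).
FIELDS = [
    ("ExitStatus", 4),                    # NTSTATUS
    ("PebBaseAddress", 8),                # pointer
    ("AffinityMask", 8),                  # ULONG_PTR
    ("BasePriority", 4),                  # KPRIORITY (LONG)
    ("UniqueProcessId", 8),               # ULONG_PTR
    ("InheritedFromUniqueProcessId", 8),  # ULONG_PTR
]


def layout(fields):
    """Compute field offsets and total size, aligning each field to its own size."""
    offsets, cursor, max_align = {}, 0, 1
    for name, size in fields:
        cursor = (cursor + size - 1) // size * size  # round up to the alignment
        offsets[name] = cursor
        cursor += size
        max_align = max(max_align, size)
    total = (cursor + max_align - 1) // max_align * max_align
    return offsets, total


offsets, total = layout(FIELDS)
print(hex(total))                             # 0x30, the length passed in r9d
print(hex(offsets["PebBaseAddress"]))         # 0x8
print(hex(0x30 + offsets["PebBaseAddress"]))  # 0x38, the offset read from rsp
```

So the structure is written at rsp+0x30, and the PEB address sits 8 bytes into it, at rsp+0x38.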

Conclusion

The presented techniques are not new; actually, you could arrive at them using common sense or by reading the Intel manual. In any case, they could help avoid certain detections by EDRs. It’s important to mention that this post is focused on bypassing static analysis. In all the techniques, the GS/FS segment is still accessed in some way, but that access is only detected by behavioral analysis.

Another important consideration is that you can use these methods to improve the way you read other fields inside the PEB, such as the debugging flag or the image base address. Finally, it’s worth mentioning that I use x64 as a reference in all the code; for x86, the FS segment should be used, and the offsets are different depending on the field.

References

https://www.intel.com.br/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf

https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170

https://antonioparata.blogspot.com/2023/01/the-segment-memory-model-and-how-it.html