Skip to content

[BUG] Msgpack length parsing uses pointer cast instead of memcpy (unaligned access) #374

@MarkLee131

Description

@MarkLee131

The msgpack length field decoding at src/ucl_msgpack.c:964-970 reads uint16_t/uint32_t/uint64_t through a cast on a potentially unaligned pointer. This is UB per the C standard and causes SIGBUS on strict-alignment platforms (ARM, SPARC). The int/float parsers later in the same file already do this correctly via memcpy; the length-parsing path just wasn't updated to match. Affects 0.9.4 and earlier.

Description

The length fields are read with direct casts:

// ucl_msgpack.c:964-970
case 2:
    len = FROM_BE16(*(uint16_t *) p);   // unaligned
    break;
case 4:
    len = FROM_BE32(*(uint32_t *) p);   // unaligned
    break;
case 8:
    len = FROM_BE64(*(uint64_t *) p);   // unaligned
    break;

The pointer p points to the byte immediately after the type byte in the msgpack stream. Since msgpack is a byte-packed format, p has no alignment guarantee.

Notably, the integer/float value parsers in the same file already use the correct pattern:

// ucl_msgpack.c:1402 (correct)
memcpy(&iv16, pos, sizeof(iv16));
iv16 = FROM_BE16(iv16);

Reproduction

UBSAN detects this on msgpack input where a str16, str32, bin16, bin32, array16, array32, map16, or map32 length field falls at an unaligned offset. This requires a preceding entry to push the length field to an odd byte boundary.

Minimal reproducer (str16 key at unaligned offset):

#include <ucl.h>

int main(void) {
    /* msgpack: fixmap(2){
     *   fixstr(1)"a" : fixint(1),     -- 3 bytes, pushes offset to odd
     *   str16(3)"key" : fixint(2)     -- str16 length at byte offset 5 (odd!)
     * }
     * Layout: 82 a1 61 01 da 00 03 6b 65 79 02
     *                      ^^ ^^ ^^-- length field at offset 5 (unaligned) */
    unsigned char data[] = {
        0x82,              /* fixmap, 2 entries */
        0xa1, 0x61,        /* fixstr(1) "a" */
        0x01,              /* fixint 1 */
        0xda, 0x00, 0x03,  /* str16, length 3 (length bytes at offset 5-6) */
        0x6b, 0x65, 0x79,  /* "key" */
        0x02               /* fixint 2 */
    };

    struct ucl_parser *p = ucl_parser_new(0);
    ucl_parser_add_chunk_full(p, data, sizeof(data), 0,
                              UCL_DUPLICATE_APPEND, UCL_PARSE_MSGPACK);
    ucl_object_t *obj = ucl_parser_get_object(p);
    if (obj) ucl_object_unref(obj);
    ucl_parser_free(p);
    return 0;
}

Build with UBSAN (requires both library and test compiled with -fsanitize=alignment):

cmake -B build -DCMAKE_C_COMPILER=clang \
  -DCMAKE_C_FLAGS="-fsanitize=alignment -fno-sanitize-recover=all -g"
cmake --build build
clang -fsanitize=alignment -fno-sanitize-recover=all -g -I include \
  -o repro repro.c build/libucl.a
./repro
# runtime error: load of misaligned address 0x... for type 'uint16_t',
# which requires 2 byte alignment
# ucl_msgpack.c:964

Suggested Fix

Replace the pointer casts with memcpy, matching the pattern already used in the int/float parsers:

case 2: {
    uint16_t tmp;
    memcpy(&tmp, p, sizeof(tmp));
    len = FROM_BE16(tmp);
    break;
}
// Similarly for case 4 and case 8

Happy to send a PR if that helps.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions