[BUG] Config emitter roundtrip corruption: heredoc EOD injection and missing backslash escaping in single-quoted strings #370

@MarkLee131

Two escaping bugs in the UCL_EMIT_CONFIG emitter (src/ucl_emitter_utils.c) break emit-then-reparse roundtrips. First, ucl_elt_string_write_multiline() writes heredoc content without checking for the EOD terminator inside the string, so a string containing a literal \nEOD\n prematurely ends the heredoc block. Second, ucl_elt_string_write_squoted() escapes single quotes but not backslashes, so a string containing \' gets mangled on re-parse. Both affect 0.9.4 and earlier. (CWE-138)

Bug 1: Heredoc does not escape EOD terminator

ucl_elt_string_write_multiline() (line 178) writes string content verbatim between <<EOD\n and \nEOD delimiters. If the string itself contains \nEOD, the heredoc terminates prematurely on re-parse.

// ucl_emitter_utils.c:178-186
func->ucl_emitter_append_len("<<EOD\n", sizeof("<<EOD\n") - 1, func->ud);
func->ucl_emitter_append_len(str, size, func->ud);        // NO ESCAPING
func->ucl_emitter_append_len("\nEOD", sizeof("\nEOD") - 1, func->ud);

Reproducer

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ucl.h>

int main(void) {
    ucl_object_t *root = ucl_object_typed_new(UCL_OBJECT);
    /* String > 80 chars (triggers heredoc) with \nEOD embedded */
    const char *payload =
        "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
        "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
        "\nEOD\ninjected_key = true;\n<<EOD\nBBBBBBBBBBBB";
    ucl_object_t *val = ucl_object_fromstring_common(payload, strlen(payload), 0);
    ucl_object_insert_key(root, val, "test", 4, false);

    char *emitted = (char *)ucl_object_emit(root, UCL_EMIT_CONFIG);
    printf("=== EMITTED ===\n%s\n=== END ===\n", emitted);

    struct ucl_parser *p = ucl_parser_new(UCL_PARSER_DISABLE_MACRO);
    ucl_parser_add_string(p, emitted, strlen(emitted));
    ucl_object_t *reparsed = ucl_parser_get_object(p);

    if (!reparsed)
        printf("CONFIRMED: re-parse failed: %s\n", ucl_parser_get_error(p));

    ucl_parser_free(p);
    free(emitted);
    ucl_object_unref(root);
    return 0;
}

Build and run:

cc -g -I include -o repro_heredoc repro_heredoc.c build/libucl.a
./repro_heredoc

Output:

=== EMITTED ===
test = <<EOD
AAAA...AAAA
EOD
injected_key = true;
<<EOD
BBBBBBBBBBBB
EOD;
=== END ===
CONFIRMED: re-parse failed: 'key must begin with a letter'

The \nEOD\n in the string prematurely terminates the heredoc. injected_key = true; appears as a top-level directive in the emitted output.

Suggested fix

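Before emitting as heredoc, check whether the string contains the terminator sequence at the start of any line. If it does, fall back to JSON-escaped string output. Note that memmem() is a GNU/BSD extension rather than standard C, so a portable build may need an equivalent scan: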

if (memmem(str, size, "\nEOD", 4) != NULL || (size >= 3 && memcmp(str, "EOD", 3) == 0)) {
    ucl_elt_string_write_json(str, size, ctx);
    return;
}
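
For platforms without memmem(), the same condition can be checked with a plain loop. The following is a minimal sketch, not libucl code: heredoc_would_break() is a hypothetical helper name, and it assumes the terminator is always the fixed string EOD, as in the emitter excerpt above. Like the memmem() version, it is deliberately conservative and flags any line that merely starts with EOD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libucl: returns true when writing `str`
 * between <<EOD and EOD could close the heredoc early, i.e. when any line
 * of `str` (including the first) begins with "EOD". */
static bool heredoc_would_break(const char *str, size_t size)
{
    static const char term[] = "EOD";
    const size_t tlen = sizeof(term) - 1;
    size_t i;

    assert(str != NULL || size == 0);

    for (i = 0; i + tlen <= size; i++) {
        /* Only positions at the start of a line can match the terminator. */
        if (i != 0 && str[i - 1] != '\n') {
            continue;
        }
        if (memcmp(str + i, term, tlen) == 0) {
            return true;
        }
    }
    return false;
}
```

In ucl_elt_string_write_multiline() this would take the place of the memmem() condition: when it returns true, the caller would fall back to ucl_elt_string_write_json() as suggested above.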

Bug 2: Single-quoted emitter does not escape backslash

ucl_elt_string_write_squoted() (line 145) escapes ' to \' but does not escape pre-existing \ characters. Per the README:

All values passed in single quoted strings are NOT escaped, with two exceptions: a single ' character just before \ character, and a newline character just after \ character that is ignored.

The parser treats \' as an escaped single-quote and \\ as an escaped backslash. But the emitter only produces \' for quotes, not \\ for backslashes. A string containing the literal sequence \' is emitted as ...\\'..., which the parser interprets as escaped-backslash + end-of-string.

Reproducer

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ucl.h>

int main(void) {
    ucl_object_t *root = ucl_object_typed_new(UCL_OBJECT);
    /* String with literal backslash + single-quote */
    ucl_object_t *val = ucl_object_fromstring_common("hello\\'world", 12, 0);
    val->flags |= UCL_OBJECT_SQUOTED;
    ucl_object_insert_key(root, val, "test", 4, false);

    char *emitted = (char *)ucl_object_emit(root, UCL_EMIT_CONFIG);
    printf("Emitted: %s\n", emitted);

    struct ucl_parser *p = ucl_parser_new(UCL_PARSER_DISABLE_MACRO);
    ucl_parser_add_string(p, emitted, strlen(emitted));
    ucl_object_t *reparsed = ucl_parser_get_object(p);

    if (!reparsed)
        printf("CONFIRMED: re-parse failed: %s\n", ucl_parser_get_error(p));

    ucl_parser_free(p);
    free(emitted);
    ucl_object_unref(root);
    return 0;
}

Build and run:

cc -g -I include -o repro_squote repro_squote.c build/libucl.a
./repro_squote

Output:

Emitted: test = 'hello\\'world';
CONFIRMED: re-parse failed: delimiter is missing

Suggested fix

In ucl_elt_string_write_squoted(), escape \ to \\ in addition to the existing ' to \':

if (*p == '\\') {
    // flush accumulated, then emit "\\\\"
}
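
To make the intended escaping concrete, here is a minimal standalone sketch of the rule. It is not the libucl emitter: squote_escape() is a hypothetical helper that writes into a caller-supplied buffer instead of going through func->ucl_emitter_append_len(), and it assumes, as described above, that the parser reads \\ as a backslash and \' as a quote.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not libucl's emitter: writes `in` as the body of a
 * single-quoted string, prefixing every backslash and single quote with a
 * backslash. Returns the number of bytes written, or (size_t)-1 if `out`
 * is too small. The surrounding quotes are left to the caller. */
static size_t squote_escape(const char *in, size_t len, char *out, size_t cap)
{
    size_t i, o = 0;

    assert(in != NULL || len == 0);
    assert(out != NULL || cap == 0);

    for (i = 0; i < len; i++) {
        const char ch = in[i];
        const int needs_escape = (ch == '\\' || ch == '\'');
        const size_t need = needs_escape ? 2 : 1;

        if (cap - o < need) {
            return (size_t)-1; /* output buffer exhausted */
        }
        if (needs_escape) {
            out[o++] = '\\';
        }
        out[o++] = ch;
    }
    return o;
}
```

With this rule the 12-byte reproducer string (hello, backslash, quote, world) becomes 14 bytes: hello, three backslashes, a quote, then world. A parser that decodes \\ and \' as described should recover the original bytes from that.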

Both fixes are straightforward. Happy to send a PR if that helps.
