Investigate: Map '\n' to EBCDIC LF (0x25) instead of NEL (0x15) in c2asm370 #26

@mgrossmann

Summary

Investigate changing c2asm370 (GCC 3.2.3 fork) to generate EBCDIC LF (0x25) instead of NEL (0x15) for the C character literal \n. This would eliminate the NL/NEL asymmetry that causes translation bugs across the entire mvslovers ecosystem.

Background

All mvslovers projects share a fundamental EBCDIC character encoding problem: the C compiler (c2asm370) generates EBCDIC NEL (0x15) for \n. This follows IBM convention, but it causes cascading translation problems in every component.

The NL/NEL Asymmetry

IBM-1047 (z/OS USS / Zowe standard) maps asymmetrically:

ASCII LF  (0x0A) → atoe → EBCDIC NEL (0x15)
EBCDIC NEL(0x15) → etoa → ASCII LF  (0x0A)   ← roundtrip OK

BUT:
EBCDIC LF (0x25) → etoa → ASCII NEL (0x85)   ← NOT LF!
ASCII NEL (0x85) → atoe → EBCDIC NEL(0x15)   ← NOT 0x25!

CP037, by contrast, is symmetric:

ASCII LF  (0x0A) ↔ EBCDIC LF  (0x25)  ← roundtrip clean
ASCII NEL (0x85) ↔ EBCDIC NEL (0x15)  ← roundtrip clean
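
The two mappings above can be checked mechanically. The following is a minimal sketch, not the project's translation code: it models only the four LF/NEL entries listed above, and every function name in it is hypothetical. It tests whether an EBCDIC byte survives the trip to the ASCII side and back.

```c
/* Only the four LF/NEL entries quoted above are modelled. Every other byte
   passes through unchanged, which a real code page table does not do. */

/* IBM-1047 as described above: LF and NEL both land on EBCDIC 0x15. */
unsigned char ibm1047_atoe(unsigned char c)
{
    return (c == 0x0A || c == 0x85) ? 0x15 : c;
}

unsigned char ibm1047_etoa(unsigned char c)
{
    if (c == 0x15) return 0x0A;   /* EBCDIC NEL -> LF  */
    if (c == 0x25) return 0x85;   /* EBCDIC LF  -> NEL */
    return c;
}

/* CP037 as described above: LF <-> 0x25 and NEL <-> 0x15. */
unsigned char cp037_atoe(unsigned char c)
{
    if (c == 0x0A) return 0x25;
    if (c == 0x85) return 0x15;
    return c;
}

unsigned char cp037_etoa(unsigned char c)
{
    if (c == 0x25) return 0x0A;
    if (c == 0x15) return 0x85;
    return c;
}

typedef unsigned char (*xlat_fn)(unsigned char);

/* Returns 1 if the EBCDIC byte comes back unchanged after etoa then atoe. */
int roundtrips(xlat_fn etoa, xlat_fn atoe, unsigned char ebcdic)
{
    return atoe(etoa(ebcdic)) == ebcdic;
}
```

Calling roundtrips(ibm1047_etoa, ibm1047_atoe, 0x25) returns 0 in this model, while the same call with the CP037 pair returns 1 for both 0x15 and 0x25.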

But because the compiler generates 0x15 for \n, even CP037 tables need hacks. The HTTPD 3.3.x httpetoa.c had a switch override specifically for this:

case 0x15:          /* EBCDIC NL (0x15) */
    buf[i] = 0x0A;  /* ASCII LF */
    break;

Concrete Bug Example

When HTTPD sends http_printf(httpc, "HTTP/1.0 200 OK\r\n"):

  1. Compiler generates: 0x0D 0x15 (CR + NEL)
  2. Pure CP037 table translates 0x15 → 0x85 (Latin-1 NEL)
  3. Browser receives CR + 0x85 — not recognized as line ending
  4. HTTP response headers unparseable → empty response

Current workaround: switch hack in httpetoa.c, or modified tables that are no longer standard CP037.
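
The failure and the workaround can be reproduced on just the two trailing bytes of the header line. This is a sketch under stated assumptions: cp037_etoa_partial, translate_etoa and ends_with_crlf are hypothetical names, and the mapping covers only the bytes mentioned in this issue rather than the full CP037 table.

```c
#include <stddef.h>

/* Partial "pure CP037" EBCDIC-to-ASCII mapping. Only the bytes from the
   example above are modelled; all other bytes pass through unchanged. */
unsigned char cp037_etoa_partial(unsigned char c)
{
    switch (c) {
    case 0x15: return 0x85;   /* EBCDIC NEL -> Latin-1 NEL */
    case 0x25: return 0x0A;   /* EBCDIC LF  -> ASCII LF    */
    default:   return c;      /* CR is 0x0D on both sides  */
    }
}

/* Translate buf in place. If nl_override is nonzero, 0x15 is forced to
   ASCII LF, mimicking the switch hack described above. */
void translate_etoa(unsigned char *buf, size_t len, int nl_override)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (nl_override && buf[i] == 0x15)
            buf[i] = 0x0A;
        else
            buf[i] = cp037_etoa_partial(buf[i]);
    }
}

/* Returns 1 if the buffer ends in ASCII CR LF, as an HTTP header line must. */
int ends_with_crlf(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[len - 2] == 0x0D && buf[len - 1] == 0x0A;
}
```

With the bytes 0x0D 0x15 as input, ends_with_crlf reports failure after a plain translation and success once nl_override is set. The override works, but only by making the translation deviate from the standard table.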

The Proposed Fix

Change c2asm370 so \n produces 0x25 (EBCDIC LF) instead of 0x15 (EBCDIC NEL).

With this change:

  • "\r\n" becomes 0x0D 0x25 in the compiled EBCDIC binary
  • CP037 translates 0x25 → 0x0A (ASCII LF) directly — no hack needed
  • Standard CP037 tables achieve 256/256 roundtrip
  • No switch overrides, no modified tables, no codepage confusion
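
The 256/256 roundtrip claim is easy to verify for any pair of tables. The helper below is a hypothetical sketch: it assumes the translation tables are available as two 256-entry byte arrays, which is an assumption about how the project stores them, and it does not contain the CP037 tables themselves.

```c
/* Counts how many of the 256 EBCDIC byte values survive
   EBCDIC -> ASCII -> EBCDIC unchanged. A fully symmetric
   table pair yields 256. */
int count_roundtrips(const unsigned char etoa[256],
                     const unsigned char atoe[256])
{
    int e;
    int ok = 0;

    for (e = 0; e < 256; e++) {
        if (atoe[etoa[e]] == e)
            ok++;
    }
    return ok;
}
```

A test could call count_roundtrips on the tables actually shipped and fail the build if the result is anything other than 256, which would catch a modified or hacked table early.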

Risk Analysis

Risk 1 — Hardcoded 0x15 in crent370

If any code uses 0x15 as a hex literal (instead of \n), it breaks: the meaning of \n changes but the hex literal does not.

Action required:

grep -rn "0x15\|\\\\x15" src/ include/

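Every hit must be classified as either "the value of newline" (must change to 0x25) or "the value of NEL" (stays 0x15). Note that the pattern above does not catch other spellings of the same value, such as octal \025 or decimal 21.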

Risk 2 — MVS system interface expectations

WTO, TPUT/TGET, spool I/O may expect 0x15 as newline. If wtof("message\n") sends 0x25 instead of 0x15, console output might break.

Action required: Identify all MVS system interfaces in crent370 that process text with newlines. Test each with 0x25.

Risk 3 — TCP/IP stack (DYN75)

The socket layer uses CP037 internally. If any part of DYN75 relies on \n being 0x15, networking could break.

Action required: Audit DYN75 socket code for newline handling.

Risk 4 — Existing data on disk

Files written by code built with the old compiler contain 0x15 as newline. Code built with the new compiler compares against 0x25 when it looks for \n, so text file reads would no longer find the old line endings.

Mitigation: MVS datasets use record-oriented I/O (LRECL-delimited, not newline-delimited). UFS files are the main concern — after a full rebuild, new files are consistent. Old UFS files may need migration or a compatibility shim.
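
One possible shape for such a migration or shim is a pass that rewrites old newline bytes in a buffer read from a UFS file. This is only a sketch: migrate_newlines and the two macros are hypothetical names, and it assumes the data is text in which 0x15 never legitimately means NEL. Binary files would be corrupted by it.

```c
#include <stddef.h>

#define OLD_NEWLINE 0x15   /* what the current c2asm370 emits for '\n' */
#define NEW_NEWLINE 0x25   /* what the proposed c2asm370 would emit    */

/* Rewrites every OLD_NEWLINE byte in buf to NEW_NEWLINE.
   Returns the number of bytes that were changed. */
size_t migrate_newlines(unsigned char *buf, size_t len)
{
    size_t i;
    size_t changed = 0;

    for (i = 0; i < len; i++) {
        if (buf[i] == OLD_NEWLINE) {
            buf[i] = NEW_NEWLINE;
            changed++;
        }
    }
    return changed;
}
```

The returned count lets a migration tool report which files actually contained old-style newlines. Whether this runs once as a batch migration or on every read as a shim is a design decision this issue leaves open.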

Risk 5 — Full ecosystem rebuild required

Everything must be recompiled: crent370, ufs370, lua370, mqtt370, httpd, mvsmf, ftp370, ufsd. No mixing old-compiler and new-compiler code — \n would have different values.

Action required: Plan coordinated release. CI must enforce consistent compiler version.
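
One way CI could detect mixed builds is a small probe compiled into each component that reports the byte its compiler assigned to \n. The sketch below is an assumption about how such a check might look; newline_byte and newline_kind are hypothetical names, not existing crent370 functions.

```c
/* Byte value that the compiler which built this translation unit
   assigned to the character literal '\n'. */
unsigned char newline_byte(void)
{
    return (unsigned char)'\n';
}

/* Human-readable label for the newline values discussed in this issue. */
const char *newline_kind(unsigned char c)
{
    switch (c) {
    case 0x15: return "EBCDIC NEL";   /* current c2asm370 behaviour  */
    case 0x25: return "EBCDIC LF";    /* proposed c2asm370 behaviour */
    case 0x0A: return "ASCII LF";     /* typical ASCII host compiler */
    default:   return "unknown";
    }
}
```

A build step could compare newline_byte() across all linked components and refuse to package a release in which the values differ.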

Risk 6 — Assembler code mismatch

Hand-written IFOX00 assembler uses EBCDIC constants directly (DC C\n etc.). These are unaffected by a compiler change, but the mismatch between C code (\n=0x25) and ASM code (possibly using 0x15) could cause subtle bugs.

Action required: Audit all .asm files across the ecosystem for NL/NEL constants.

Suggested Approach

  1. Audit crent370 — grep for hardcoded 0x15 and \x15, classify each hit
  2. Audit DYN75 — socket code newline handling
  3. Audit assembler — all .asm files across projects for NL constants
  4. Prototype — change c2asm370 character mapping, rebuild crent370, run basic tests
  5. Integration test — rebuild httpd + mvsmf, run full curl test suite
  6. Decision — go/no-go based on findings
  7. Coordinated release — all projects rebuilt with new compiler

Context

Identified during the HTTPD 4.0.0 codepage translation redesign. The current workaround is a post-translation NL fix in the HTTP transport layer (httpxlat.c). This issue (Option D) would make all workarounds unnecessary.

Cross-reference: mvslovers/httpd HTTPD 4.0.0 codepage work.
