Summary
Investigate changing c2asm370 (GCC 3.2.3 fork) to generate EBCDIC LF (0x25) instead of NEL (0x15) for the C character literal \n. This would eliminate the NL/NEL asymmetry that causes translation bugs across the entire mvslovers ecosystem.
Background
All mvslovers projects share a fundamental EBCDIC character encoding problem: the C compiler (c2asm370) generates EBCDIC NEL (0x15) for \n. This is IBM convention, but it creates cascading translation problems in every component.
The NL/NEL Asymmetry
IBM-1047 (z/OS USS / Zowe standard) maps asymmetrically:
ASCII LF (0x0A) → atoe → EBCDIC NEL (0x15)
EBCDIC NEL (0x15) → etoa → ASCII LF (0x0A) ← roundtrip OK
BUT:
EBCDIC LF (0x25) → etoa → Latin-1 NEL (0x85) ← NOT LF!
Latin-1 NEL (0x85) → atoe → EBCDIC NEL (0x15) ← NOT 0x25!
CP037, by contrast, is symmetric:
ASCII LF (0x0A) ↔ EBCDIC LF (0x25) ← roundtrip clean
Latin-1 NEL (0x85) ↔ EBCDIC NEL (0x15) ← roundtrip clean
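The mappings listed above can be sketched in C for just the two newline code points. This is an illustration under stated assumptions, not the project's translation tables: all function names are hypothetical, and every byte other than LF/NEL is passed through unchanged, which real codepage tables do not do. Only 0x15 and 0x25 are meaningful inputs to the roundtrip helpers.

```c
/* Sketch only: covers the newline code points and passes every other
 * byte through unchanged. All names here are hypothetical. */

/* IBM-1047-style mapping, EBCDIC -> ASCII/Latin-1, as described above. */
unsigned char etoa_1047_nl(unsigned char e)
{
    switch (e) {
    case 0x15: return 0x0A; /* EBCDIC NEL -> LF  */
    case 0x25: return 0x85; /* EBCDIC LF  -> NEL */
    default:   return e;
    }
}

/* IBM-1047-style mapping, ASCII/Latin-1 -> EBCDIC, as described above. */
unsigned char atoe_1047_nl(unsigned char a)
{
    switch (a) {
    case 0x0A: return 0x15; /* LF  -> EBCDIC NEL */
    case 0x85: return 0x15; /* NEL -> EBCDIC NEL (the asymmetry) */
    default:   return a;
    }
}

/* CP037-style mapping: LF <-> 0x25 and NEL <-> 0x15 in both directions. */
unsigned char etoa_037_nl(unsigned char e)
{
    switch (e) {
    case 0x15: return 0x85;
    case 0x25: return 0x0A;
    default:   return e;
    }
}

unsigned char atoe_037_nl(unsigned char a)
{
    switch (a) {
    case 0x0A: return 0x25;
    case 0x85: return 0x15;
    default:   return a;
    }
}

/* 1 if the EBCDIC byte survives EBCDIC -> ASCII -> EBCDIC unchanged. */
int roundtrips_1047(unsigned char e)
{
    return atoe_1047_nl(etoa_1047_nl(e)) == e;
}

int roundtrips_037(unsigned char e)
{
    return atoe_037_nl(etoa_037_nl(e)) == e;
}
```

With these definitions, 0x15 survives the roundtrip under both mappings, while 0x25 survives only under the CP037-style mapping.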
But because the compiler generates 0x15 for \n, even CP037 tables need hacks. The HTTPD 3.3.x httpetoa.c had a switch override specifically for this:
case n: /* EBCDIC NL (0x15) */
buf[i]=0x0A; /* ASCII LF */
break;
Concrete Bug Example
When HTTPD sends http_printf(httpc, "HTTP/1.0 200 OK\r\n"):
- Compiler generates: 0x0D 0x15 (CR + NEL)
- Pure CP037 table translates 0x15 → 0x85 (Latin-1 NEL)
- Browser receives CR + 0x85 — not recognized as line ending
- HTTP response headers unparseable → empty response
Current workaround: switch hack in httpetoa.c, or modified tables that are no longer standard CP037.
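The failure and the workaround can be reproduced on the byte level with a small sketch. It is not the httpetoa.c code: the function name and the `nl_override` flag are hypothetical, and only CR, NEL and LF are handled (all other bytes pass through), so it is not a usable CP037 table. Explicit byte values are used instead of "\r\n" because a host compiler would emit ASCII for the literal.

```c
#include <stddef.h>

/* Hypothetical sketch: translate EBCDIC bytes to ASCII/Latin-1 in place.
 * nl_override == 0: pure CP037 behaviour for 0x15 (becomes 0x85).
 * nl_override != 0: the switch-style hack (0x15 becomes ASCII LF). */
void etoa_sketch(unsigned char *buf, size_t len, int nl_override)
{
    size_t i;

    for (i = 0; i < len; i++) {
        switch (buf[i]) {
        case 0x15: /* EBCDIC NEL, what the compiler emits for \n today */
            buf[i] = nl_override ? 0x0A : 0x85;
            break;
        case 0x25: /* EBCDIC LF */
            buf[i] = 0x0A;
            break;
        default:   /* CR is 0x0D on both sides; other bytes not modelled */
            break;
        }
    }
}
```

Feeding the two bytes 0x0D 0x15 through this function without the override yields 0x0D 0x85, the sequence the browser fails to recognize; with the override it yields 0x0D 0x0A.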
The Proposed Fix
Change c2asm370 so \n produces 0x25 (EBCDIC LF) instead of 0x15 (EBCDIC NEL).
With this change:
- "\r\n" becomes 0x0D 0x25 in the compiled EBCDIC binary
- CP037 translates 0x25 → 0x0A (ASCII LF) directly, no hack needed
- Standard CP037 tables achieve 256/256 roundtrip
- No switch overrides, no modified tables, no codepage confusion
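The 256/256 roundtrip claim is something a test could check mechanically against whatever tables a project ships. A minimal sketch, assuming the tables are plain 256-byte arrays (the function name and parameter layout are hypothetical, not taken from any mvslovers source):

```c
/* Count the EBCDIC code points e for which atoe[etoa[e]] == e.
 * A result of 256 means every byte survives EBCDIC -> ASCII -> EBCDIC. */
int count_roundtrip(const unsigned char etoa[256],
                    const unsigned char atoe[256])
{
    int ok = 0;
    int e;

    for (e = 0; e < 256; e++) {
        if (atoe[etoa[e]] == (unsigned char)e)
            ok++;
    }
    return ok;
}
```

Any table pair in which two EBCDIC bytes map to the same ASCII byte, as happens when an override forces 0x15 to 0x0A on top of an otherwise bijective table, returns less than 256.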
Risk Analysis
Risk 1 — Hardcoded 0x15 in crent370
If any code uses 0x15 as a hex literal (instead of \n), it breaks: the meaning of \n changes but the hex literal does not.
Action required:
grep -rn "0x15\|\\\\x15" src/ include/
Every hit must be classified: "the value of newline" (must change to 0x25) vs. "the value of NEL" (stays 0x15).
Risk 2 — MVS system interface expectations
WTO, TPUT/TGET, and spool I/O may expect 0x15 as newline. If wtof("message\n") sends 0x25 instead of 0x15, console output might break.
Action required: Identify all MVS system interfaces in crent370 that process text with newlines. Test each with 0x25.
Risk 3 — TCP/IP stack (DYN75)
The socket layer uses CP037 internally. If any part of DYN75 relies on \n being 0x15, networking could break.
Action required: Audit DYN75 socket code for newline handling.
Risk 4 — Existing data on disk
Files written by code built with the old compiler contain 0x15 as newline. Code built with the new compiler compares against \n = 0x25, so newline detection fails when it reads those old text files.
Mitigation: MVS datasets use record-oriented I/O (LRECL-delimited, not newline-delimited). UFS files are the main concern — after a full rebuild, new files are consistent. Old UFS files may need migration or a compatibility shim.
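One possible shape for such a migration step or compatibility shim is a pass that rewrites the old newline byte in a text buffer. This is only a sketch: the function name is hypothetical, it assumes the buffer holds EBCDIC text, and it cannot tell a legacy newline from a genuine NEL, so it must not be applied to binary files.

```c
#include <stddef.h>

/* Hypothetical shim: rewrite legacy newline bytes (0x15) to the new
 * value (0x25) in place. Returns the number of bytes changed.
 * Assumes EBCDIC text; every 0x15 is treated as a newline. */
size_t migrate_legacy_newlines(unsigned char *buf, size_t len)
{
    size_t i;
    size_t changed = 0;

    for (i = 0; i < len; i++) {
        if (buf[i] == 0x15) {
            buf[i] = 0x25;
            changed++;
        }
    }
    return changed;
}
```

The return value gives a cheap way to detect whether a file was written by old-compiler code at all: a count of zero means nothing needed migrating.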
Risk 5 — Full ecosystem rebuild required
Everything must be recompiled: crent370, ufs370, lua370, mqtt370, httpd, mvsmf, ftp370, ufsd. No mixing old-compiler and new-compiler code — \n would have different values.
Action required: Plan coordinated release. CI must enforce consistent compiler version.
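One way a test suite might detect mixed builds is for each component to report the value its compiler assigned to \n and for the caller to compare it with its own. The sketch below shows only the idea; both function names are hypothetical and nothing like them is known to exist in crent370.

```c
/* Value of '\n' as seen by the compiler that built this translation unit.
 * With the current c2asm370 this would be 0x15, after the proposed change
 * 0x25, and 0x0A on an ASCII host compiler. */
int newline_value(void)
{
    return '\n';
}

/* Returns 1 when the value reported by another component (built
 * separately) matches the value in this translation unit, 0 otherwise. */
int newline_abi_consistent(int other_value)
{
    return other_value == newline_value();
}
```

In a real setup the library and the application would each carry their own copy of the probe, so that a mismatch shows up when an old-compiler library is linked into new-compiler code.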
Risk 6 — Assembler code mismatch
Hand-written IFOX00 assembler uses EBCDIC constants directly (DC C\n etc.). These are unaffected by a compiler change, but the mismatch between C code (\n=0x25) and ASM code (possibly using 0x15) could cause subtle bugs.
Action required: Audit all .asm files across the ecosystem for NL/NEL constants.
Suggested Approach
- Audit crent370 — grep for hardcoded 0x15 and \x15, classify each hit
- Audit DYN75 — socket code newline handling
- Audit assembler — all .asm files across projects for NL constants
- Prototype — change c2asm370 character mapping, rebuild crent370, run basic tests
- Integration test — rebuild httpd + mvsmf, run full curl test suite
- Decision — go/no-go based on findings
- Coordinated release — all projects rebuilt with new compiler
Context
Identified during the HTTPD 4.0.0 codepage translation redesign. The current workaround is a post-translation NL fix in the HTTP transport layer (httpxlat.c). This issue (Option D) would make all workarounds unnecessary.
Cross-reference: mvslovers/httpd HTTPD 4.0.0 codepage work.