Commit bc9f11c

feat: unified ASCII/EBCDIC translation subsystem (httpxlat)
Replace the hybrid CP037/IBM-1047 translation tables (asc2ebc.c, ebc2asc.c, httpetoa.c, httpatoe.c) with a single httpxlat module providing three codepage pairs:

- CP037 (default) — symmetric NL/LF roundtrip, no data corruption
- IBM-1047 — for Open Systems / z/OS USS conventions (Zowe, mvsmf)
- LEGACY — exact 3.3.x behavior preserved for backward compatibility

The server-default codepage is selected at startup via http_xlate_init(). CGI modules can choose a specific codepage per call through the HTTPX vector table (http_xlate + codepage pair pointers appended at offsets 0x110-0x11C). Existing http_etoa/http_atoe stay at HTTPX offsets 0x74/0x78 with unchanged signatures. Global pointers asc2ebc/ebc2asc are retained for backward compatibility with old httpd code.

Fixes the pipe character corruption bug where EBCDIC 0x4F was mapped to ASCII ']' instead of '|' in the old hybrid tables.

Fixes #30
1 parent 38d576c commit bc9f11c

14 files changed: +1257 -132 lines changed

doc/httpxlat-integration.md

Lines changed: 191 additions & 0 deletions
@@ -0,0 +1,191 @@
# HTTPD Translation Subsystem — Integration Guide

## New Files

- `src/httpxlat.c` — All tables, codepage management, http_xlate(), httpetoa(), httpatoe()
- `include/httpxlat.h` — HTTPCP typedef, function declarations

## Files to DELETE

- `src/asc2ebc.c` — Replaced by tables in httpxlat.c
- `src/ebc2asc.c` — Replaced by tables in httpxlat.c
- `src/httpetoa.c` — httpetoa() now in httpxlat.c (no switch hacks)
- `src/httpatoe.c` — httpatoe() now in httpxlat.c

## Files to MODIFY

Both `httpd.h` and `httpcgi.h` define their own copy of `struct httpx` and
the macro layer. **Both must be updated in sync.**

- `httpd.h` — used by the server itself (full crent370/libufs includes,
  `extern` declarations with `asm()` labels, `#ifndef HTTP_PRIVATE` guard)
- `httpcgi.h` — used by external CGI modules (lightweight, no crent370
  dependency, macros only)

### include/httpd.h

1. Add `#include "httpxlat.h"` near the top (after the existing includes)

2. Add the `HTTPCP` typedef alongside the other typedefs (after `HTTPCGI`):

```c
typedef struct httpcp HTTPCP; /* Codepage pair */
```

3. In the HTTPX struct, **append at end** (after mqtc_pub at offset 0x10C):

```c
unsigned char *(*http_xlate)(unsigned char *, int, const unsigned char *);
                       /* 110 translate with explicit table */
HTTPCP *xlate_cp037;   /* 114 CP037 codepage pair */
HTTPCP *xlate_1047;    /* 118 IBM-1047 codepage pair */
HTTPCP *xlate_legacy;  /* 11C legacy hybrid codepage pair */
```

4. Add an `extern` with an `asm()` label (near the other extern declarations):

```c
extern unsigned char *http_xlate(unsigned char *, int, const unsigned char *) asm("HTTPXLAT");
```

5. In the `#ifndef HTTP_PRIVATE` macro section, add the CGI-accessible macro:

```c
#define http_xlate(buf,len,tbl) \
    ((httpx->http_xlate)((buf),(len),(tbl)))
```

6. Keep the `extern` declarations for the old globals (lines 81-82) — they
   still exist (defined in httpxlat.c for backward compat) but should not
   be used in new code:

```c
extern UCHAR *ebc2asc; /* backward compat — points to active default */
extern UCHAR *asc2ebc; /* backward compat — points to active default */
```

### include/httpcgi.h

1. Add `HTTPCP` typedef alongside the other forward declarations:

```c
typedef struct httpcp HTTPCP; /* Codepage pair — opaque */
```

2. In the HTTPX struct, **append at end** (after mqtc_pub at offset 0x10C):

```c
unsigned char *(*http_xlate)(unsigned char *, int, const unsigned char *);
                       /* 110 translate with explicit table */
HTTPCP *xlate_cp037;   /* 114 CP037 codepage pair */
HTTPCP *xlate_1047;    /* 118 IBM-1047 codepage pair */
HTTPCP *xlate_legacy;  /* 11C legacy hybrid codepage pair */
```

3. Add `http_xlate` macro (after the `mqtc_pub` macro):

```c
#define http_xlate(buf,len,tbl) \
    ((httpx->http_xlate)((buf),(len),(tbl)))
```
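
The macro resolves against whatever variable named `httpx` is in scope at the call site, which is why CGI code needs a local `httpx` pointer before calling it. The following is a minimal sketch of that dispatch, not the server's code: `DEMO_HTTPX`, `demo_http_xlate` and `demo_xlate_impl` are hypothetical stand-ins, and the in-place, return-the-buffer behavior of the implementation is an assumption based only on the signature shown above.

```c
#include <stddef.h>

/* Hypothetical one-slot stand-in for the HTTPX vector. */
typedef struct demo_httpx {
    unsigned char *(*http_xlate)(unsigned char *, int, const unsigned char *);
} DEMO_HTTPX;

/* Same shape as the guide's macro: it expands against the variable
 * named `httpx` that is visible where the macro is used. */
#define demo_http_xlate(buf,len,tbl) \
    ((httpx->http_xlate)((buf),(len),(tbl)))

/* Assumed behavior: translate len bytes in place through a
 * 256-entry table and return the buffer. */
unsigned char *
demo_xlate_impl(unsigned char *buf, int len, const unsigned char *tbl)
{
    int i;

    if (buf == NULL || tbl == NULL) return buf;
    for (i = 0; i < len; i++) {
        buf[i] = tbl[buf[i]];
    }
    return buf;
}
```

A caller would set `httpx` to point at a vector whose `http_xlate` slot holds `demo_xlate_impl`, then invoke `demo_http_xlate(buf, len, table)`.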

### src/httpx.c

Add the four new entries at the end of the static vector:

```c
    mqtc_pub,      /* 10C mqtc_pub() */
    http_xlate,    /* 110 http_xlate() */
    &http_cp037,   /* 114 CP037 codepage pair */
    &http_cp1047,  /* 118 IBM-1047 codepage pair */
    &http_legacy,  /* 11C legacy hybrid codepage pair */
};
```

### src/httpconf.c (Lua version, until Parmlib replaces it)

Add an `http_xlate_init()` call during startup:

```c
/* After process_httpd() or at the end of http_config(): */
http_xlate_init("CP037"); /* or read from Lua/config */
```

When Parmlib replaces Lua, the parser reads the `CODEPAGE` keyword and calls
`http_xlate_init(value)`.

### Parmlib addition

```
CODEPAGE CP037 # default if omitted
# CODEPAGE IBM1047
# CODEPAGE LEGACY
```
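
As a rough illustration of how a `CODEPAGE` value could be resolved to one of the three pairs, here is a minimal sketch. The function `demo_codepage_id` and its enum are hypothetical; how the real `http_xlate_init()` treats unknown names or different letter case is not stated in this guide, so the exact-match comparison and the "unknown" result below are assumptions.

```c
#include <string.h>

/* Hypothetical identifiers for the three codepage pairs. */
enum demo_cp_id {
    DEMO_UNKNOWN = -1,
    DEMO_CP037   = 0,
    DEMO_IBM1047 = 1,
    DEMO_LEGACY  = 2
};

/* Map a CODEPAGE keyword value to an identifier.
 * A missing or empty value selects CP037, matching the
 * "default if omitted" note in the Parmlib fragment.
 * Exact, case-sensitive matching is an assumption. */
int
demo_codepage_id(const char *name)
{
    if (name == NULL || name[0] == '\0') return DEMO_CP037;
    if (strcmp(name, "CP037") == 0)      return DEMO_CP037;
    if (strcmp(name, "IBM1047") == 0)    return DEMO_IBM1047;
    if (strcmp(name, "LEGACY") == 0)     return DEMO_LEGACY;
    return DEMO_UNKNOWN;
}
```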

## How mvsMF Uses This

### Before (private tables in xlate.c)

```c
#include "xlate.h" /* mvsMF private IBM-1047 tables */
mvsmf_atoe(buf, len);
mvsmf_etoa(buf, len);
```

### After (via HTTPX vector)

```c
/* Use server default (what CODEPAGE says in Parmlib) */
http_etoa(buf, len);
http_atoe(buf, len);

/* Or use explicit codepage */
HTTPX *httpx = http_get_httpx(session->httpd);
http_xlate(buf, len, httpx->xlate_1047->etoa);
http_xlate(buf, len, httpx->xlate_cp037->atoe);
```

mvsMF can then delete `src/xlate.c` and `include/xlate.h`.

## Key Design Decisions

1. **CP037 as default** — Symmetric NL/LF roundtrip. No data corruption.
2. **Legacy tables preserved exactly** — Including the httpetoa.c switch
   behavior baked into legacy_etoa[0x15]=0x0A. No existing behavior lost.
3. **HTTPX backward compatible** — http_etoa/http_atoe stay at offsets
   0x74/0x78 with identical signature. New entries appended at end.
4. **512 bytes per codepage** (256 atoe + 256 etoa). Total: 1,536 bytes
   for all three. Negligible.
5. **Global pointers asc2ebc/ebc2asc retained** — Old httpd code that
   references them directly (httpgets.c, httpdeco.c, etc.) continues
   to work. They point to the active default tables.
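
Decisions 4 and 5 can be pictured with a small sketch. The layout below is an assumption: the guide only shows the members `atoe` and `etoa` being used, so `DEMO_CP`, `demo_asc2ebc`, `demo_ebc2asc` and `demo_set_default` are hypothetical names, and the real `struct httpcp` may hold pointers or extra fields instead of two inline arrays.

```c
#include <stddef.h>

/* Hypothetical codepage pair: two 256-byte tables, 512 bytes total. */
typedef struct demo_cp {
    unsigned char atoe[256];   /* ASCII  -> EBCDIC */
    unsigned char etoa[256];   /* EBCDIC -> ASCII  */
} DEMO_CP;

/* Stand-ins for the retained globals asc2ebc / ebc2asc. */
unsigned char *demo_asc2ebc = NULL;
unsigned char *demo_ebc2asc = NULL;

/* Point the legacy-style globals at the tables of the chosen
 * default pair, so code that indexes them directly keeps working. */
void
demo_set_default(DEMO_CP *cp)
{
    demo_asc2ebc = cp->atoe;
    demo_ebc2asc = cp->etoa;
}
```

With this layout `sizeof(DEMO_CP)` is 512, and three pairs occupy 1,536 bytes, matching the figure in decision 4.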

## Roundtrip Verification (Critical Characters)

### CP037 (default)
```
ASCII LF (0x0A) -> atoe -> 0x25 -> etoa -> 0x0A OK
ASCII |  (0x7C) -> atoe -> 0x4F -> etoa -> 0x7C OK
ASCII [  (0x5B) -> atoe -> 0xAD -> etoa -> 0x5B OK
ASCII ]  (0x5D) -> atoe -> 0xBD -> etoa -> 0x5D OK
ASCII ^  (0x5E) -> atoe -> 0xB0 -> etoa -> 0x5E OK
ASCII \  (0x5C) -> atoe -> 0xE0 -> etoa -> 0x5C OK
```
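
A check of this kind can be automated. The sketch below is only an illustration: it fills two otherwise empty tables with the six pairs listed for CP037 and counts the characters that fail the atoe-then-etoa roundtrip. The names `demo_pairs`, `demo_fill_tables` and `demo_roundtrip_failures` are hypothetical, and a real test would run against the full tables in httpxlat.c rather than this fragment.

```c
#include <string.h>

/* ASCII/EBCDIC pairs copied from the CP037 list in this guide. */
static const unsigned char demo_pairs[][2] = {
    {0x0A, 0x25}, {0x7C, 0x4F}, {0x5B, 0xAD},
    {0x5D, 0xBD}, {0x5E, 0xB0}, {0x5C, 0xE0}
};
#define DEMO_NPAIRS ((int)(sizeof(demo_pairs) / sizeof(demo_pairs[0])))

/* Fill two zeroed 256-byte tables with only the listed pairs. */
void
demo_fill_tables(unsigned char atoe[256], unsigned char etoa[256])
{
    int i;

    memset(atoe, 0, 256);
    memset(etoa, 0, 256);
    for (i = 0; i < DEMO_NPAIRS; i++) {
        atoe[demo_pairs[i][0]] = demo_pairs[i][1];
        etoa[demo_pairs[i][1]] = demo_pairs[i][0];
    }
}

/* Count listed ASCII bytes that do not survive atoe followed by etoa. */
int
demo_roundtrip_failures(const unsigned char *atoe, const unsigned char *etoa)
{
    int i, bad = 0;

    for (i = 0; i < DEMO_NPAIRS; i++) {
        unsigned char a = demo_pairs[i][0];
        if (etoa[atoe[a]] != a) bad++;
    }
    return bad;
}
```

Overwriting `etoa[0x4F]` with `0x5D` reproduces the pipe corruption described in the commit message, and the failure count then rises from 0 to 1.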

### IBM-1047
```
ASCII LF   (0x0A) -> atoe -> 0x25 -> etoa -> 0x85 ASYMMETRIC (known)
EBCDIC NEL (0x15) -> etoa -> 0x0A (NEL -> LF, 1047 convention)
ASCII |    (0x7C) -> atoe -> 0x4F -> etoa -> 0x7C OK
ASCII [    (0x5B) -> atoe -> 0xAD -> etoa -> 0x5B OK
ASCII ]    (0x5D) -> atoe -> 0xBD -> etoa -> 0x5D OK
```

### LEGACY (preserved 3.3.x behavior)
```
ASCII | (0x7C) -> atoe -> 0x6A -> etoa -> 0x7C OK (self-consistent but non-standard)
ASCII [ (0x5B) -> atoe -> 0xAD -> etoa -> 0x5B OK
ASCII ] (0x5D) -> atoe -> 0xBD -> etoa -> 0x5D OK
EBCDIC 0x4A -> etoa -> 0x5B [ (maps to [ like 0xAD — double mapping)
EBCDIC 0x4F -> etoa -> 0x5D ] (maps to ] — legacy bug preserved)
```
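
The double mappings in the LEGACY table can be detected mechanically by counting how many EBCDIC bytes translate to the same ASCII byte. This is a minimal sketch using only the five LEGACY etoa entries listed here; `demo_legacy_fragment` and `demo_count_sources` are hypothetical helpers, and the remaining table entries are left at zero purely for illustration.

```c
#include <string.h>

/* Fill a zeroed table with only the LEGACY etoa entries from this guide. */
void
demo_legacy_fragment(unsigned char etoa[256])
{
    memset(etoa, 0, 256);
    etoa[0x6A] = 0x7C;   /* |                         */
    etoa[0xAD] = 0x5B;   /* [                         */
    etoa[0xBD] = 0x5D;   /* ]                         */
    etoa[0x4A] = 0x5B;   /* [ again: double mapping   */
    etoa[0x4F] = 0x5D;   /* ] again: the legacy bug   */
}

/* Count how many EBCDIC bytes translate to the given ASCII byte. */
int
demo_count_sources(const unsigned char *etoa, unsigned char ascii)
{
    int i, n = 0;

    for (i = 0; i < 256; i++) {
        if (etoa[i] == ascii) n++;
    }
    return n;
}
```

A count above 1 for any printable ASCII byte indicates that etoa is not reversible for that character, which is the situation the CP037 default is meant to avoid.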
