Writing 16-Bit Tiny Model Standalone C Program With Open Watcom
A few days ago I heard that Open Watcom can generate "pure" 16-bit code from C source code, so I decided to try setting up a workflow with it.
Explicit FAR pointer
The 8086 has a 20-bit address bus, but its registers are 16-bit, so it breaks a 20-bit address into two 16-bit parts called the "segment" and the "offset" and combines them using the formula ADDR = SEGMENT * 0x10 + OFFSET. This yields two different types of pointer: the FAR pointer (which involves both the "segment" and the "offset" part) and the NEAR pointer (which only involves the "offset" part). With the "tiny" model, Open Watcom C treats all pointers as NEAR pointers by default, so in order to refer directly to a "full" location in memory one must explicitly use a FAR pointer. For example, when compiled with Open Watcom, this piece of code will generate the wrong result:
/* Compile with `wcc -0 -ms [filename]` */
void main(void)
{
    /* Notice that it's 0xb8000000, which wouldn't fit in a single 16-bit register */
    char* x = (char*) 0xb8000000;
    x[0] = 'A';
}
(With Open Watcom you have to write the pointer value in SEGMENT:OFFSET form as well, with the segment in the upper 16 bits and the offset in the lower 16 bits, instead of the actual result of SEGMENT * 10h + OFFSET; so to point to the VGA text mode video memory one should use 0xb8000000 instead of 0xb8000.)
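As a rough illustration of the formula and of the packed SEGMENT:OFFSET form, here is a minimal sketch in portable C (not Watcom-specific, so it can be tried with any hosted compiler). The function names `linear_address`, `pack_far` and `check_vga_values` are made up for this example and are not part of Open Watcom.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode translation: ADDR = SEGMENT * 0x10 + OFFSET.
   The 20-bit wraparound of a real 8086 is not modelled here. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Pack SEGMENT:OFFSET into one 32-bit value:
   segment in the high word, offset in the low word. */
uint32_t pack_far(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 16) | offset;
}

/* Checks the two values quoted in the text for the text mode buffer. */
void check_vga_values(void)
{
    /* B800:0000 is physical address 0xb8000 ... */
    assert(linear_address(0xb800, 0x0000) == 0xb8000UL);
    /* ... but the literal written in the C source is 0xb8000000. */
    assert(pack_far(0xb800, 0x0000) == 0xb8000000UL);
}
```

Calling `check_vga_values()` from a test program should pass silently: the same memory location has the physical address 0xb8000 and the packed literal 0xb8000000.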
Disassemble the object file generated from the NEAR pointer example above using wdis [obj-filename]:
Segment: _TEXT BYTE USE16 0000000E bytes
0000                          main_:
0000  B8 04 00                  mov         ax,0x0004
0003  E8 00 00                  call        __STK
0006  53                        push        bx
0007  31 DB                     xor         bx,bx
0009  C6 07 41                  mov         byte ptr [bx],0x41
000C  5B                        pop         bx
000D  C3                        ret
You can see that it's XOR BX, BX (which is a faster equivalent of MOV BX, 0) and the 0xb800 part is nowhere to be seen (the __STK call is for stack overflow checking only); this is because x is being treated as a NEAR pointer. To fix this you need to use char far *:
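void main(void)
{
    char far* x = (char far *) 0xb8000;
    x[0] = 'A';
}

This yields the following result. Both halves of the pointer now show up: the offset goes into BX and the segment into ES. Note that this listing was built from the literal 0xb8000, so it shows 0x000b:0x8000; with 0xb8000000, as in the full code below, the segment would be 0xb800 and the offset 0.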
Segment: ex01_TEXT BYTE USE16 00000017 bytes
0000                          main_:
0000  B8 06 00                  mov         ax,0x0006
0003  9A 00 00 00 00            call        __STK
0008  53                        push        bx
0009  BB 00 80                  mov         bx,0x8000
000C  B8 0B 00                  mov         ax,0x000b
000F  8E C0                     mov         es,ax
0011  26 C6 07 41               mov         byte ptr es:[bx],0x41
0015  5B                        pop         bx
0016  CB                        retf
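To read the two listings side by side, it may help to model how a 32-bit literal ends up in the registers. The sketch below is portable C and only mimics what the listings suggest: a NEAR pointer keeps the low 16 bits, a FAR pointer splits the value into a high and a low word. The names `near_offset`, `far_segment`, `far_offset` and `check_listings` are hypothetical helpers, and the truncation rule is inferred from the disassembly, not quoted from the Watcom documentation.

```c
#include <assert.h>
#include <stdint.h>

/* NEAR pointer: only an offset, so only the low 16 bits survive. */
uint16_t near_offset(uint32_t value)
{
    return (uint16_t)(value & 0xffffu);
}

/* FAR pointer: the high word is the segment ... */
uint16_t far_segment(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* ... and the low word is the offset. */
uint16_t far_offset(uint32_t value)
{
    return (uint16_t)(value & 0xffffu);
}

void check_listings(void)
{
    /* First listing: 0xb8000000 as NEAR leaves offset 0 (XOR BX, BX). */
    assert(near_offset(0xb8000000UL) == 0x0000);

    /* Second listing: 0xb8000 as FAR gives AX = 0x000b, BX = 0x8000. */
    assert(far_segment(0x000b8000UL) == 0x000b);
    assert(far_offset(0x000b8000UL) == 0x8000);

    /* The intended literal 0xb8000000 gives segment 0xb800, offset 0. */
    assert(far_segment(0xb8000000UL) == 0xb800);
    assert(far_offset(0xb8000000UL) == 0x0000);
}
```

Under these assumptions, the values in both listings follow directly from the literal in the C source, which is why the choice between 0xb8000 and 0xb8000000 matters.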
Replacing the wrapper
Watcom C links the object file compiled from the C source code against a certain "wrapper" file when generating an executable. This wrapper file is actually the "real" entry point of the program. Watcom's documentation says:
Tiny memory model programs are created by compiling all modules with the small memory model option and linking in the special initialization file "CSTART_T.OBJ". This file is found in the Open Watcom C/C++ LIB286\DOS directory. It must be the first object file specified when linking the program.
This CSTART_T.OBJ contains DOS-specific code (it uses the DOS syscall INT 21h), so we cannot have a "real" standalone program unless we force the linker to use our own wrapper. A proper wrapper for Watcom C needs to meet the following criteria:
- The wrapper file needs to export a _cstart_ label, because this will be the real entry point of the overall program. Watcom C appends an underscore (_) to the name of a function when compiling, so to call the main function in C we need to call main_ in _cstart_. This also means you don't actually have to provide a main function in the C code; you can make the wrapper call entry_ and the main entry point in C will be the function entry. But of course you'll want to just call main_ to avoid possible confusion.
- The wrapper file needs to export a _small_code_ label. This label does absolutely nothing per se, but the linker uses it for some kind of checking (probably to check that the memory model is consistent between object files).
- The wrapper file also needs to export a __STK label, even if you're not using or implementing any stack overflow checking, because the compiler emits calls to it by default. It becomes optional if you skip generating that code with the -s command line option.
Full code
wrapper.asm
.8086
.model tiny
.code

;; this ORG line is necessary for .COM files so that the data
;; section will not be referred to by wrong locations.
ORG 100h

;; reference to the `main` function in C.
extern main_: near ptr

;; export these symbols
public _cstart_, __STK, _small_code_

;; required for the linker; defines the exported _small_code_ label.
_small_code_ label near

;; program starting point.
;; there shouldn't be any data or instructions before this point
_cstart_:
    ;; enable interrupts and set up the stack pointer. normally
    ;; these two would be done by DOS when loading a .COM file.
    ;; these instructions are here just to make sure.
    ;; because we're using the tiny model we don't need
    ;; to set up SS.
    STI
    MOV SP, 0FFFEh

    ;; call the main function in C.
    CALL main_

    ;; because this is for a .COM file under DOS, one should have
    ;; a RET instruction here. should be replaced with things
    ;; like HLT or an infinite loop for "real" standalone programs.
    RET

__STK:
    ;; stack overflow checking.
    ;; this does nothing but you can add your own checking here.
    RET

end _cstart_
Assemble with wasm wrapper.asm.
main.c
void main(void)
{
    char far* x = (char far*) 0xb8000000;
    x[0] = 'A';
    x[1] = 0x70;
    return;
}
Compile with wcc -0 -ms main.c.
main.lnk
format dos com
option map
name main.com
file wrapper.o, main.o
Link with wlink @main.lnk (the @ is necessary). The option map directive generates the memory map of the linked executable. This linker script will generate main.com (as specified by the name directive). Executing main.com will put an 'A' at the top left corner.
This is a screenshot of the generated .COM file running; notice the uppercase letter 'A' on a white background at the top left corner.
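The two writes in main.c (x[0] = 'A' and x[1] = 0x70) rely on the layout of the colour text mode buffer: each screen cell takes two bytes, the character followed by an attribute byte. The sketch below, in portable C, shows how the offset of a cell and the attribute value could be computed. It assumes the standard 80-column mode, and `cell_offset`, `make_attribute` and `check_main_c_values` are illustrative names, not Open Watcom functions.

```c
#include <assert.h>
#include <stdint.h>

#define TEXT_COLUMNS 80u /* assumption: standard 80x25 text mode */

/* Byte offset of the cell at (row, column); every cell is 2 bytes:
   the character at the offset, the attribute at offset + 1. */
uint16_t cell_offset(uint16_t row, uint16_t column)
{
    return (uint16_t)((row * TEXT_COLUMNS + column) * 2u);
}

/* Attribute byte: background colour in bits 4-6, foreground in bits 0-3. */
uint8_t make_attribute(uint8_t foreground, uint8_t background)
{
    return (uint8_t)(((background & 0x07u) << 4) | (foreground & 0x0fu));
}

void check_main_c_values(void)
{
    /* The top left cell starts at offset 0, as used by x[0] and x[1]. */
    assert(cell_offset(0, 0) == 0);
    /* 0x70 is foreground 0 (black) on background 7 (light gray). */
    assert(make_attribute(0, 7) == 0x70);
}
```

With the far pointer x from main.c, writing a character at another position would then look like x[cell_offset(row, column)] = ch followed by x[cell_offset(row, column) + 1] = attribute.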
© Sebastian Higgins 2021 All Rights Reserved.
Content on this page is distributed under the CC BY-NC-SA 4.0 license unless further specified.
Last update: 2021.9.14