Writing a 16-Bit Tiny Model Standalone C Program With Open Watcom

A few days ago I heard that Open Watcom is able to generate "pure" 16-bit code from C source code, so I decided to try setting up a workflow with it.

Explicit FAR pointer

The 8086 has a 20-bit address bus, but its registers are 16 bits wide, so it breaks a 20-bit address into two 16-bit parts called the "segment" and the "offset" and combines them using the formula ADDR = SEGMENT * 0x10 + OFFSET. This gives two different kinds of pointer: the FAR pointer (which carries both the "segment" and the "offset" part) and the NEAR pointer (which carries only the "offset" part). In the "tiny" model Open Watcom C treats all pointers as NEAR pointers by default, so in order to refer directly to a "full" location in memory one must explicitly use a FAR pointer. For example, when compiled with Open Watcom, this piece of code will generate the wrong result:

/* Compile with `wcc -0 -ms [filename]` */
void main(void) {
    /* Notice that it's 0xb8000000, which wouldn't fit in a single 16-bit register */
    char* x = (char*) 0xb8000000;
    x[0] = 'A';
}

(With Open Watcom the constant has to be written in SEGMENT:OFFSET form, with the segment in the upper 16 bits and the offset in the lower 16 bits, rather than as the actual result of SEGMENT * 10h + OFFSET; so to point to the VGA text mode video memory one should use 0xb8000000 instead of 0xb8000.)
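The address formula and the constant format described above can be sketched in portable C. This is only an illustration that runs on any host compiler, not Open Watcom code; the function names linear_address and far_constant are made up for this example.

```c
#include <stdint.h>

/* Linear address of a real-mode SEGMENT:OFFSET pair:
   ADDR = SEGMENT * 0x10 + OFFSET.
   The result is masked to 20 bits, the width of the 8086 address bus. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* 32-bit constant to cast to a far pointer:
   segment in the upper 16 bits, offset in the lower 16 bits. */
static uint32_t far_constant(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 16) | offset;
}
```

For the VGA text buffer at 0xb800:0x0000, linear_address gives 0xb8000 while far_constant gives 0xb8000000, which is the value the cast needs. Several pairs can name the same location: 0xb000:0x8000 also maps to 0xb8000.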

Disassemble the generated object file using wdis [obj-filename]:

Segment: _TEXT BYTE USE16 0000000E bytes
0000                            main_:
0000  B8 04 00                          mov             ax,0x0004
0003  E8 00 00                          call            __STK
0006  53                                push            bx
0007  31 DB                             xor             bx,bx
0009  C6 07 41                          mov             byte ptr [bx],0x41
000C  5B                                pop             bx
000D  C3                                ret

You can see that the offset is loaded with XOR BX, BX (which is a faster equivalent of MOV BX, 0) and the 0xb800 part is nowhere to be seen (the call to __STK is only for stack overflow checking). This is because x is being treated as a NEAR pointer, so only the low 16 bits of the constant are kept. To fix this you need to use char far *:

void main(void) {
    char far* x = (char far *) 0xb8000;
    x[0] = 'A';
}

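This yields the listing below, in which the pointer is loaded into ES:BX. Note that the constant 0xb8000 used above is split into segment 0x000b and offset 0x8000, which is not the VGA text buffer; with 0xb8000000 the segment would be 0xb800 and the offset 0. The far CALL and RETF and the ex01_TEXT segment name also suggest that this particular listing was produced with a far-code memory model rather than -ms: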

Segment: ex01_TEXT BYTE USE16 00000017 bytes
0000                            main_:
0000  B8 06 00                          mov             ax,0x0006
0003  9A 00 00 00 00                    call            __STK
0008  53                                push            bx
0009  BB 00 80                          mov             bx,0x8000
000C  B8 0B 00                          mov             ax,0x000b
000F  8E C0                             mov             es,ax
0011  26 C6 07 41                       mov             byte ptr es:[bx],0x41
0015  5B                                pop             bx
0016  CB                                retf
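The ES:BX values in the listing above can be reproduced by hand. Below is a minimal sketch in portable C (not Open Watcom specific; the struct and function names are made up for illustration). It assumes the compiler takes the upper 16 bits of the cast constant as the segment and the lower 16 bits as the offset, which is what the listing shows.

```c
#include <stdint.h>

/* The two halves of a far pointer, as loaded into ES and BX. */
struct far_parts {
    uint16_t segment;
    uint16_t offset;
};

/* Split a 32-bit constant the way the listing does:
   upper word -> segment, lower word -> offset. */
static struct far_parts split_far_constant(uint32_t value)
{
    struct far_parts parts;
    parts.segment = (uint16_t)(value >> 16);
    parts.offset = (uint16_t)(value & 0xFFFFu);
    return parts;
}

/* 20-bit linear address that the write through the pointer reaches. */
static uint32_t target_address(struct far_parts parts)
{
    return (((uint32_t)parts.segment << 4) + parts.offset) & 0xFFFFFu;
}
```

With 0xb8000 this gives 0x000b:0x8000, a linear address of 0x080b0, whereas 0xb8000000 gives 0xb800:0x0000, a linear address of 0xb8000.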

Replacing the wrapper

Watcom C will link the object file compiled from the C source code against a certain "wrapper" file when generating an executable. This wrapper file is actually the "real" entry point of the program. Watcom's documentation says:

Tiny memory model programs are created by compiling all modules with the small memory model option and linking in the special initialization file "CSTART_T.OBJ". This file is found in the Open Watcom C/C++ LIB286\DOS directory. It must be the first object file specified when linking the program.

This CSTART_T.OBJ contains DOS-specific code (it uses the DOS syscall INT 21h), so we cannot have a "real" standalone program unless we force the linker to use our own wrapper. A proper wrapper for Watcom C needs to meet the following criteria:

  • The wrapper file needs to export a _cstart_ label because this will be the real entry point of the overall program. Watcom C appends an underscore _ to the name of the function when compiling, so to call the main function in C we need to call main_ in _cstart_. This also means you don't actually have to provide a main function in the C code; you can make the wrapper call entry_, and the entry point in C will then be the function entry. But of course you'll want to just call main_ to avoid possible confusion.
  • The wrapper file needs to export a _small_code_ label. This label does absolutely nothing per se, but the linker uses it for some kind of "checking" (probably to check that the memory model is consistent between object files).
  • The wrapper file also needs to export a __STK label, even if you're not implementing any stack overflow checking, because the compiler emits calls to it (as seen in the listings above). This requirement goes away if you skip generating the stack checking code with the -s command line option.

Full code


wrapper.asm

        .model tiny

        ;; open the code segment.
        .code

        ;; this ORG line is necessary for .COM files so that the data
        ;; section will not be referred to by wrong locations.
        ORG 100h

        ;; reference to the `main` function in C.
        extern main_: near ptr

        ;; export these symbols
        public _cstart_, __STK, _small_code_

        ;; required for the linker to recognize
_small_code_ label near

        ;; program starting point.
        ;; there shouldn't be any data or instructions before this point
_cstart_:
        ;; enable interrupts and set up the stack pointer. normally
        ;; these two would be done by DOS when loading a .COM file.
        ;; these instructions are here just to make sure.
        ;; because we're using the tiny model we don't need
        ;; to set up SS.
        STI
        MOV SP, 0FFFEh

        ;; call the main function in C.
        CALL main_

        ;; because this is for a .COM file under DOS, one should have
        ;; a RET instruction here. it should be replaced with things
        ;; like HLT or an infinite loop for "real" standalone programs.
        RET

        ;; stack overflow checking.
        ;; this does nothing but you can add your own checking here.
__STK:
        RET

        end _cstart_

Assemble with wasm wrapper.asm.


main.c

void main(void) {
    char far* x = (char far*) 0xb8000000;
    x[0] = 'A';
    x[1] = 0x70;
}

Compile with wcc -0 -ms main.c.


main.lnk

format dos com
option map
name main.com
file wrapper.o, main.o

Link with wlink @main.lnk (the @ is necessary). The option map directive generates the memory map of the linked executable. This linker script will generate main.com (as specified by the name directive). Executing main.com will put an 'A' at the top left corner of the screen.

Screenshot of the generated .COM file running: notice the uppercase letter 'A' with white background at the top left corner.
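The two bytes written by main.c follow the layout of a text mode screen cell: a character byte followed by an attribute byte, with the foreground color in the low nibble of the attribute and the background color in the high nibble. The following is a rough sketch of that layout in portable C, assuming the usual 80-column color text mode; the helper names and the TEXT_COLUMNS constant are hypothetical and not part of Open Watcom.

```c
#include <stdint.h>

/* Assumption: 80 columns per row, as in the common 80x25 text mode. */
#define TEXT_COLUMNS 80u

/* Build an attribute byte: background in bits 4-7, foreground in bits 0-3.
   Depending on the video mode settings, bit 7 may mean "blink" instead of
   a bright background. */
static uint8_t text_attribute(uint8_t foreground, uint8_t background)
{
    return (uint8_t)(((background & 0x0Fu) << 4) | (foreground & 0x0Fu));
}

/* Byte offset of the character byte of a cell inside the text buffer;
   the attribute byte sits at this offset plus one. */
static uint16_t cell_offset(unsigned row, unsigned column)
{
    return (uint16_t)((row * TEXT_COLUMNS + column) * 2u);
}
```

Here text_attribute(0, 7) evaluates to 0x70, the value stored in x[1]: black text (color 0) on a light gray background (color 7), which is the "white" background of the 'A'. With the far pointer from main.c, writing to x[cell_offset(0, 1)] and the byte after it would fill the second cell of the first row.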



© Sebastian Higgins 2021 All Rights Reserved.
Content on this page is distributed under the CC BY-NC-SA 4.0 license unless further specified.
Last update: 2021.9.14