Mini16 Part 1: FAT12, Near & Far Pointers, and Thinking Before Coding

I was so scared that I would've wasted 7 days for absolutely nothing - a fate even worse than wasting 7 days on a 16-bit toy operating system (which, of course, is also the same as nothing). I finally finished something presentable at the end of this week.

The current state of things

I have another floppy disk image, created with MS-DOS, mounted as the B: drive. `@read_file` now reads to 1000:0000. All the "NIHAO" entries are folders. The sigma sign seems to be the "deleted" mark used by MS-DOS.
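
For reference, here is a rough sketch of how a raw directory entry could be sorted into "deleted", "folder" and "file". It relies on the FAT on-disk layout (32-byte entries, attribute byte at offset 11, first name byte 0xE5 for a deleted entry, which code page 437 happens to render as a sigma). The names `classify_entry` and `EntryKind` are made up for this example and are not Mini16's actual code; volume labels and long-file-name entries are deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

enum { ATTR_OFFSET = 11, ATTR_DIRECTORY = 0x10 };

typedef enum { ENTRY_END, ENTRY_DELETED, ENTRY_DIR, ENTRY_FILE } EntryKind;

// Classify one raw 32-byte FAT directory entry (hypothetical helper).
static EntryKind classify_entry(const uint8_t *entry)
{
    assert(entry != 0);
    if (entry[0] == 0x00) return ENTRY_END;      // unused, and nothing follows it
    if (entry[0] == 0xE5) return ENTRY_DELETED;  // the "sigma" mark
    if (entry[ATTR_OFFSET] & ATTR_DIRECTORY) return ENTRY_DIR;
    return ENTRY_FILE;                           // simplification: everything else
}
```

A directory listing would then walk the sector 32 bytes at a time, stop at `ENTRY_END` and skip `ENTRY_DELETED`.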

I switched from directly manipulating video memory to teletype output (INT 10h, AH=0Eh).
I tried to implement LBA addressing for disk access but failed. This matters because legacy BIOS Cylinder-Head-Sector (CHS) addressing combined with the IDE interface only gives you access to the first ~504 MiB (about 528 MB) of the disk (more info about this can be read here).
You can shut down now (but reboot is still broken under QEMU).
Partial FAT12 support.
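
The ~504 figure comes from taking the smaller limit of each field: the BIOS allows 1024 cylinders and 63 sectors per track, IDE allows 16 heads. The sketch below does that arithmetic and also shows the usual logical-sector-to-CHS conversion that a CHS-only disk routine needs. `Chs`, `chs_limit_bytes` and `lba_to_chs` are names invented for this example; the 2 heads / 18 sectors per track in the usage note assume a standard 1.44 MB floppy.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t cylinder;
    uint8_t head;
    uint8_t sector;  // 1-based, as INT 13h expects
} Chs;

// 1024 cylinders (BIOS) * 16 heads (IDE) * 63 sectors (BIOS) * 512 bytes.
static uint32_t chs_limit_bytes(void)
{
    return 1024UL * 16UL * 63UL * 512UL;
}

// Convert a 0-based logical sector number into a CHS triple.
static Chs lba_to_chs(uint32_t lba, uint8_t heads, uint8_t sectors_per_track)
{
    Chs r;
    assert(heads != 0 && sectors_per_track != 0);
    r.cylinder = (uint16_t)(lba / ((uint32_t)heads * sectors_per_track));
    r.head = (uint8_t)((lba / sectors_per_track) % heads);
    r.sector = (uint8_t)(lba % sectors_per_track + 1);
    return r;
}
```

`chs_limit_bytes()` gives 528,482,304 bytes, which is exactly 504 MiB. With floppy geometry, `lba_to_chs(19, 2, 18)` lands on cylinder 0, head 1, sector 2.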

I'll probably rewrite the whole OS after isolating the few necessities - the bootloader, the linker script, the keyboard handler and the FAT12 driver. Combined, these parts can serve as the basis for a 16-bit toy operating system. I have a few other ideas that are drastically different from the conventional approach; I'll probably stop working on Mini16 before realizing those ideas.

The near and far pointer hassle

The fact that I never had any consistent rules about which kind of pointer to use finally came back to bite me. When does the difference between near and far pointers become a problem? When you need to work across multiple segments. When your kernel fits within one segment (64 KiB), it's better to use near pointers internally and only use far pointers when implementing syscalls. 32-bit protected mode with a flat memory model does not have this issue because there's just one kind of pointer.
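
To make the difference concrete: in real mode a far pointer carries its own segment, while a near pointer is only an offset into whatever segment is currently assumed. The sketch below models both as plain integers so it runs on any host compiler; `FarPtr`, `far_to_linear`, `near_to_linear` and `linear_to_far` are illustrative names, not types from Mini16 or from any particular 16-bit compiler.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t segment;
    uint16_t offset;
} FarPtr;

// Real-mode address translation: linear = segment * 16 + offset.
static uint32_t far_to_linear(FarPtr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

// A near pointer is just an offset; the segment is implied (e.g. DS).
static uint32_t near_to_linear(uint16_t implied_segment, uint16_t near_offset)
{
    FarPtr p;
    p.segment = implied_segment;
    p.offset = near_offset;
    return far_to_linear(p);
}

// One of many segment:offset pairs for a linear address below 1 MiB.
static FarPtr linear_to_far(uint32_t linear)
{
    FarPtr p;
    assert(linear <= 0xFFFFFUL);
    p.segment = (uint16_t)(linear >> 4);
    p.offset = (uint16_t)(linear & 0xF);
    return p;
}
```

So 1000:0000 is linear 0x10000, and so is 0FFF:0010. The same near offset means a different byte as soon as the implied segment changes, which is exactly where mixing the two kinds of pointer goes wrong.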

FAT12

As someone who has never worked in a RAM-constrained environment, I have never felt so much pain before. Maybe I'm just this bad at programming.

Thinking before coding

File system drivers aren't the kind of thing where you can just write a "Minimum Viable Product" and iterate on it: the full spec *is* the MVP, so unless you've done everything, you have nothing. After a few days of mindlessly generating garbage and getting nowhere, I decided to start over and write stuff like this:

// find next cluster with a FATClusterPointer
// NOTE: FAT values are pointers pointing to the next cluster
// 0.  require a DriveParameter(dp)
//             a FATDescriptor(desc)
//         and a FATClusterPointer(cp)
// 1.  calculate how many clusters there are in 3 sectors (n_cluster_in_group):
//         (3 * desc.bytes_per_sector * 8) / 12
//     (bytes to bits, 12 bits per cluster entry; 1024 for 512-byte sectors)
// 2.  calculate which group cp.cluster_id is in (group_current)
// 3.  calculate which group cp.cluster_value is in (group_next)
// 4.  if group_current != group_next:
//         load fat table (group group_next)
// 5.  calculate the offset
//     this is fat12, so things are trickier. for fat12
//     the clusters are arranged like this:
//             AB CD EF GH IJ KL ...
//         three bytes map to 2 clusters & the whole thing is in little-endian,
//         so it's actually:
//             cluster 1 & 0: EFCDAB --> 0xEFC 0xDAB
//                 cluster 0: offset 1 lower nibble ~ offset 0 byte
//                 cluster 1: offset 2 byte ~ offset 1 higher nibble
//             cluster 3 & 2: KLIJGH --> 0xKLI 0xJGH
//                 cluster 2: offset 4 lower nibble ~ offset 3 byte
//                 cluster 3: offset 5 byte ~ offset 4 higher nibble
//     we should make next_cluster_id be next_cluster_id % length_of_group_in_cluster first.
//     let modded_id = next_cluster_id % (desc.bytes_per_sector * 3 * 8 / 12)
//     so for even cluster id:
//         offset [modded_id / 2 * 3 + 1] lower nibble ~ offset [modded_id / 2 * 3] byte
//     for odd cluster id:
//         offset [modded_id / 2 * 3 + 2] byte ~ offset [modded_id / 2 * 3 + 1] higher nibble
// 6.  retrieve the cluster value.
// 7.  update cp.
// RETURNS 1 and doesn't do anything if at the end of the cluster chain.
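
The nibble juggling in step 5 can be sketched in a few lines of C. This is not the actual Mini16 code: `fat12_entry` and `fat12_is_end_of_chain` are hypothetical names, and the function assumes the relevant group of the FAT is already in memory and that `index` has already been reduced to a position within that buffer (the `modded_id` from the notes). The end-of-chain test uses the usual FAT12 convention that values 0xFF8 and above terminate a chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Read the 12-bit FAT entry number `index` from an in-memory FAT buffer.
// Every 3 bytes hold 2 entries, stored little-endian.
static uint16_t fat12_entry(const uint8_t *fat, size_t fat_len, uint16_t index)
{
    size_t base = (size_t)(index / 2) * 3;
    assert(base + 2 < fat_len);
    if (index % 2 == 0) {
        // even: byte 1's lower nibble ~ byte 0
        return (uint16_t)(fat[base] | ((fat[base + 1] & 0x0F) << 8));
    }
    // odd: byte 2 ~ byte 1's higher nibble
    return (uint16_t)((fat[base + 1] >> 4) | (fat[base + 2] << 4));
}

// Values 0xFF8..0xFFF mark the last cluster of a chain.
static int fat12_is_end_of_chain(uint16_t value)
{
    return value >= 0xFF8;
}
```

Feeding it the bytes AB CD EF from the notes gives 0xDAB for entry 0 and 0xEFC for entry 1, matching the layout written out above.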

It works like a charm. The coding was way easier after I wrote stuff down, and there were way fewer bugs once the coding was done.