The Design of Flowmark, a Text Macro Language


I had this idea after writing the last blog post. The book Etudes for Programmers contains a chapter on "automatic text formatting", and it mentions that the book was typeset in a variant of TRAC. That got me thinking: what if I somehow combined the two? I'd make it generate PostScript (since that's the only language I've had experience with and it seemed like the easiest to generate code for), and since TRAC is a text macro language, the necessary typesetting primitives can themselves be text macros that expand into PostScript code. It all seemed very natural, so I decided that this was worth doing. This blog post is about the language side of things.

Introduction

A taste of Flowmark

The following is an example of the factorial function.

\def(Factorial,(\
  \ifeq.int(<1>,0,\
    1,\
    (\ifeq.int(<1>,1,1,(\mult.int(<1>,\call(Factorial,\sub.int(<1>,1))))))\
  )\
));
\init.macro(Factorial);
\print(\call(Factorial,5));

The following is an example of a solution to the Tower of Hanoi problem.

\def.free($,(\print((
))));
\def(Hanoi,\
  (\ifeq.int(<1>,0,,\
    (\ifeq.int(<1>,1,\
      (\print(Move from <from> to <to>)$),\
      (\call(Hanoi,\sub.int(<1>,1),<from>,<via>,<to>)\
      \print(Move from <from> to <to>)$\
      \call(Hanoi,\sub.int(<1>,1),<via>,<to>,<from>))\
    ))\
  ))\
);
\init.macro(Hanoi,,from,to,via);
\print(\call(Hanoi,3,A,C,B));
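
For readers who don't know TRAC, here is a rough Python rendering of the same recursion. It is only a cross-check of what the macro above does, not part of Flowmark; the function name hanoi and its parameter names are made up for this sketch, and it returns the moves as a list instead of printing them as it goes.

```python
def hanoi(n, source, target, spare):
    """Return the moves that transfer n disks from source to target."""
    if n == 0:
        return []
    if n == 1:
        return [f"Move from {source} to {target}"]
    return (
        hanoi(n - 1, source, spare, target)      # park n-1 disks on the spare peg
        + [f"Move from {source} to {target}"]    # move the largest disk
        + hanoi(n - 1, spare, target, source)    # put the n-1 disks back on top
    )


# Same arguments as \call(Hanoi,3,A,C,B): from=A, to=C, via=B.
for move in hanoi(3, "A", "C", "B"):
    print(move)
```

For three disks this gives seven moves, the first being "Move from A to C".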

Differences between Flowmark and TRAC T64

  • The hash # that starts a function call is replaced with a backslash \.
  • Function call syntax is slightly different (\func(arg1,arg2,...) vs. #(func,arg1,arg2,...)).
  • The default meta character in Flowmark is the semicolon ; instead of the apostrophe '.
  • In T64 spaces are preserved, forcing much TRAC source code to be left-aligned. In Flowmark, any consecutive whitespace after a backslash \ is ignored altogether, which allows one to indent code; if whitespace is needed, one can always use protective parentheses.
  • The at-sign @ serves as a kind of "global escape character": the character following an at-sign @ (except inside protective parentheses) is guaranteed to be retained regardless of any syntax rules and previously defined macros.
  • The way to define text macros is slightly different; in Flowmark it's like a combination of T64 and T84. (explained later)
  • It's possible to extend the processing algorithm to a limited extent by using something called a freeform macro. (explained later)
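
To make the whitespace and at-sign rules a bit more concrete, here is a small Python sketch. It is not how the real processor is written; strip_continuations is a made-up helper, and it deliberately ignores protective parentheses, so it only models text outside of them. It assumes that a backslash directly followed by whitespace swallows the whole whitespace run, and that an at-sign shields the single character after it.

```python
def strip_continuations(source):
    """Drop each backslash-plus-whitespace run; keep @-escaped characters.

    Simplification (assumption): protective parentheses are not tracked.
    """
    out = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "@" and i + 1 < len(source):
            # The escaped character is kept verbatim, together with its @,
            # so that a later stage can still see that it was escaped.
            out.append(source[i:i + 2])
            i += 2
        elif ch == "\\" and i + 1 < len(source) and source[i + 1].isspace():
            # Skip the backslash and every whitespace character after it.
            i += 1
            while i < len(source) and source[i].isspace():
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_continuations("\\print(a,\\\n    b);"))
```

With this, an indented multi-line call collapses to the same text as its one-line form, while a backslash that starts a function name is left alone.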

Defining text macros in Flowmark

In Flowmark there are two kinds of macros: normal macros and freeform macros.

Defining a normal macro

A normal macro is the same as a plain old macro in T64. In Flowmark, defining a normal macro is done in two steps:

  • Define a form with \def;
  • Turn the defined form into a macro with \init.macro.

The syntax for normal macros in Flowmark is taken from TRAC T84: gaps are represented by integers surrounded with angle brackets <>. For example:

\def(STR,(The quick brown <2> jumps over the lazy <1>.));
\init.macro(STR);

is equivalent to this in T64:

#(ds,STR,(The quick brown FOX jumps over the lazy DOG.))'
#(ss,STR,DOG,FOX)'

One can also use named gaps like this:

\def(STR,(The quick brown <FOX> jumps over the lazy <DOG>.));
\init.macro(STR,DOG,FOX);

The end result is the same.
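
Here is a minimal Python sketch of how gap filling could work for both the numbered and the named variant. It is an illustration, not the actual implementation: expand_form and its parameters are hypothetical names, and it assumes that a missing argument expands to the empty string and that an empty name (as in \init.macro(Hanoi,,from,to,via)) leaves that gap reachable only by its number.

```python
import re


def expand_form(form, args, names=()):
    """Fill the gaps of a form with call arguments (illustrative only)."""
    # Map each non-empty gap name to its 1-based position.
    positions = {name: i for i, name in enumerate(names, start=1) if name}

    def fill(match):
        key = match.group(1)
        if key.isascii() and key.isdigit():
            index = int(key)            # numbered gap such as <2>
        elif key in positions:
            index = positions[key]      # named gap such as <FOX>
        else:
            return match.group(0)       # not a gap: leave the text untouched
        # Assumption: a missing argument expands to the empty string.
        return args[index - 1] if 1 <= index <= len(args) else ""

    return re.sub(r"<([^<>]+)>", fill, form)


numbered = "The quick brown <2> jumps over the lazy <1>."
named = "The quick brown <FOX> jumps over the lazy <DOG>."
print(expand_form(numbered, ["dog", "fox"]))
print(expand_form(named, ["dog", "fox"], names=("DOG", "FOX")))
```

Both calls print the same sentence, since DOG names gap 1 and FOX names gap 2.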

Pieces

A piece (in Flowmark terminology) is a minimal semantically meaningful substring. A piece can be one of the following:

  • A single character that is not part of any special construct (either because it naturally isn't, or because it is escaped with an at-sign @);
  • A function call, either active or neutral;
  • A whitespace escape sequence;
  • A freeform macro name (explained later).

The concept of a piece is quite important in Flowmark, as we'll see very soon.

Forward-reading

Flowmark supports forward-reading, which allows text macros to read the upcoming source text themselves instead of delegating the reading to the processing algorithm. This is similar to reader macros in Lisps, the difference being that forward-reading occurs at runtime.

Freeform macro

A freeform macro is a kind of "special text macro" that is expanded directly by the processing algorithm as it runs, instead of through full or partial calling (i.e. by primitives like call and recite.*).

The name for a freeform macro can only contain the following characters:

  • A hash #;
  • A tilde ~;
  • A backtick `;
  • A dollar sign $;
  • A percent sign %;
  • A circumflex ^;
  • An ampersand &;
  • An underscore _;

Although freeform macros cannot take an argument list, they can still handle the upcoming text by expanding into forward-reading primitives. Consider this example, which defines syntax sugar for superscripts, subscripts and math mode in a possible typesetting library; one would define the freeform macros ^, _ and $$ as follows:

\def.free($$,(\toggle(mode.math)));
\def.free(^,(\format.superscript(\\next.piece)));
\def.free(_,(\format.subscript(\\next.piece)));
$$a^2+b^2=c^2$$
$$e^(\pi i)=-1$$
$$A_i + B_(ij) <= C_k$$

This would be equivalent to:

\toggle(mode.math)a\format.superscript(2)+b\format.superscript(2)=c\format.superscript(2)\toggle(mode.math)
\toggle(mode.math)e\format.superscript(\pi i)=-1\toggle(mode.math)
\toggle(mode.math)A\format.subscript(i) + B\format.subscript(ij) <= C\format.subscript(k)\toggle(mode.math)
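
The expansion above can be mimicked with a short Python sketch. This is a toy model, not the real processing algorithm: next_piece, expand_freeform and the {piece} placeholder (standing in for \\next.piece) are all made up for illustration. It assumes that a group in protective parentheses is read as one piece with the outer parentheses stripped, which is inferred from the example and not from the list of pieces; function calls and whitespace escape sequences are not treated as pieces here, and the piece itself is not processed again.

```python
def next_piece(text, pos):
    """Return (piece, new_pos) for the piece starting at text[pos]."""
    if pos >= len(text):
        return "", pos
    if text[pos] == "(":
        depth = 0
        for end in range(pos, len(text)):
            if text[end] == "(":
                depth += 1
            elif text[end] == ")":
                depth -= 1
                if depth == 0:
                    # Strip the outer protective parentheses.
                    return text[pos + 1:end], end + 1
        raise ValueError("unbalanced parentheses")
    # Otherwise the piece is a single character.
    return text[pos], pos + 1


def expand_freeform(text, macros):
    """Expand freeform macros in a single left-to-right pass."""
    # Try longer names first so that "$$" is matched before a shorter "$".
    names = sorted(macros, key=len, reverse=True)
    out, pos = [], 0
    while pos < len(text):
        name = next((n for n in names if text.startswith(n, pos)), None)
        if name is None:
            out.append(text[pos])       # ordinary character: copy it
            pos += 1
            continue
        pos += len(name)
        body = macros[name]
        if "{piece}" in body:
            # Forward-reading: the macro consumes the upcoming piece.
            piece, pos = next_piece(text, pos)
            body = body.replace("{piece}", piece)
        out.append(body)
    return "".join(out)


MACROS = {
    "$$": r"\toggle(mode.math)",
    "^": r"\format.superscript({piece})",
    "_": r"\format.subscript({piece})",
}

print(expand_freeform("$$A_i + B_(ij) <= C_k$$", MACROS))
```

The point of the sketch is that the macro name itself carries no arguments; whatever it needs, it pulls from the text that follows it.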

Implementation

A work-in-progress implementation in Nim can be found here. Have to stop working on it for now; got other stuff to do.



Last update: 2024.1.13