The History of Cleword (As I Remembered It)


NOTE: This has been written over a long period of time, so some parts of it might sound incoherent.

"Cleword" is the name of a series of DSLs made by Forchange Technology Co. Ltd, an internet education company founded in Amoy that later moved to Shenzhen (from this point forwards I shall refer to it as "the Inc" for brevity). The DSLs are for authoring educational content on content presentation systems (I'll call them "engines" for brevity) made by the same company. The latest installment of Cleword exists in the form of multiple command languages implemented as a subset of YAML, used with the engines hosted on a public website named "ClewordPub", which is owned by the Inc.

Cleword is the most important work I did throughout 2019~2021, and it's honestly embarrassing that this is the case. The things I'm about to disclose here are technically still under NDA (or at least I believe so), and the act of writing all of this down was not approved by anyone from the Inc. The reason I'm choosing to break the law here is that I recently broke the partition table of my laptop's hard drive and lost a nearly completed draft of this article, which I had decided to publish once the Inc went out of business. This particular part of my memory has been fading quite a lot since I first started that version of the article, and I'm not willing to let these memories completely disappear; I'm also getting tired of keeping it a secret. I'm writing all of this down while I can still remember it, so that people in the future can at least gain some sort of insight from it.

The beginning of Cleword (2017)

I started working for the Inc as an intern in August 2018, and the first version of Cleword already existed by then; it was first created as a browser plugin for Chrome in 2017. To understand why this is the case, I must first explain how the writing team of the Inc pushed content out back then:

  • The dev team made the engine as a webapp. Imagine it being like a chat window with a pre-determined conversation which you can advance at your own pace. In the very beginning this was their whole shtick - a thing that you can advance at your own pace. I'm not saying this is unimpressive or a bad idea - to provide educational material for an undetermined number of people, you'll eventually have to allow people to consume it at their own speed.
  • The job of the writing team was to come up with the content of the conversation. Before Cleword, they were using an online multi-user collaborative rich-text editor; they needed collaboration features and they were familiar with things like Microsoft Word, which made it a perfectly reasonable choice.
  • The dev team, on the other hand, made a configuration system for the webapp which allowed them to put the text from the writing team into their database. (If you're lucky enough to have tried out StoryNexus from Failbetter Games yourself - it's like that, but for a conversation-like format.)
  • This created a gap between the creation of the content and the deployment of the content, and manually adding things line by line was slow and error-prone.

The first version of Cleword was designed to solve this problem. As a browser plugin, it directly extracted the text from the online rich-text editor the writing team was using, parsed it, and stored it into the database. The act of parsing required the writing team to at least follow some kind of syntax & grammar, so a syntax was defined in the form of the implementation; this was the beginning of the language. The CTO of the Inc came up with the name "Cleword", which is a portmanteau of "clever" and "Word" (which refers to Microsoft Word).

This solution worked, but awkwardly: it relied on knowledge about that one specific online rich-text editor, which at that time tended to change a lot and thus break the plugin. So they decided that Cleword should be based on plain text instead, treated like a programming language with a "compiler" of its own. Later, when the Inc decided to turn its direction towards teaching adults programming in Python, a new system for programming exercises was made, along with another DSL for it named "Cleword Exercise". This was the Cleword the Inc was using when I joined.

"Cleword 2" (2018)

Cleword 2 was another implementation of Cleword done by another person in the Cleword team as an experiment. The author, back then, was an acquaintance of mine, and it was thanks to his help that I got a job at the Inc. The implementation was written in Racket and made use of DrRacket, Racket's own rich-text editor. The idea was that at some point the writing team would need a simpler editor for authoring Cleword documents. The same person also suggested that the company should introduce Git (in the form of GitLab) to the writing team, to replace the collaboration functionality lost when transitioning from the aforementioned online rich-text editor to the home-brew plain-text DSL (the editor was still commonly used among the writing team, but for drafting with multiple people rather than for the actual content of the materials). This version of Cleword, however, never saw wide usage within the Inc.

I'm not in contact with him these days, but I still think of him from time to time.

Cleword Index (2019.1)

My earliest contribution to the Cleword family was something called Cleword Index, which is a simple configuration language intended to configure the possible bundles of the courses they were selling. At the time they were teaching everyone Markdown in addition to Cleword, so I copied Markdown's syntax for headings and unordered lists for this. The first version was done in a few hours. The language was very simple and it didn't change much syntax-wise, because it was for a very config-y purpose.
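The actual Cleword Index syntax isn't reproduced in this post, so the following is only a guess at its general shape: a minimal Python sketch, assuming each Markdown-style `#` heading names a bundle and each `-` list item under it names a course in that bundle. The function name `parse_index` and the sample bundle and course names are made up for illustration.

```python
def parse_index(text: str) -> dict[str, list[str]]:
    """Parse a hypothetical Cleword-Index-like config into {bundle: [courses]}."""
    bundles: dict[str, list[str]] = {}
    current = None  # name of the bundle whose list is currently being read
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines carry no meaning
        if line.startswith("# "):
            # A heading opens a new bundle (assumed convention).
            current = line[2:].strip()
            bundles.setdefault(current, [])
        elif line.startswith("- "):
            # A list item adds a course to the most recent bundle.
            if current is None:
                raise ValueError(f"line {lineno}: list item before any heading")
            bundles[current].append(line[2:].strip())
        else:
            raise ValueError(f"line {lineno}: expected a heading or a list item")
    return bundles


# Made-up sample input, not taken from any real course catalogue.
SAMPLE = """\
# Python Starter Bundle
- Python Basics
- Functions and Modules

# Office Skills Bundle
- Spreadsheet Basics
"""

if __name__ == "__main__":
    print(parse_index(SAMPLE))
```

With only two line shapes to recognize, a parser like this really can be written in a few hours, which matches how config-y the language was.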

Cleword Questionary (2019.5)

Cleword Questionary was an extension to the classic Cleword language intended for a newly-made course feedback system. The idea was that at certain times during the course (mostly at the end of sections) there would be this questionnaire thing that asks how you feel about the content up to this point: maybe the content was worded in a confusing way so you couldn't understand it properly, maybe the tone of the language offended you a bit, maybe you were thoroughly satisfied and had nothing bad to say about it. Either way, the writing team needed some kind of feedback so that the content could be properly improved.

When it was decided that we should make the tools required to support the extension, the person who wrote the original implementation of Cleword (I shall refer to him as Mr. OG from now on) was getting married and had applied for leave for his wedding; the CTO (who technically was our tech lead at this point) agreed, so I was tasked with understanding his codebase and building the extension upon it. Also, for reasons I genuinely forgot, the deadline was insanely tight for a junior developer who hadn't even graduated yet (I was still an intern) to understand someone else's codebase well enough to add new things without breaking the existing ones. So I decided to re-implement Cleword and Cleword Exercise, and then build the extension upon my code instead of his. I actually managed to accomplish this, since the old Cleword wasn't a very complicated language, but my version failed many times when used to handle existing Cleword documents: a good part of the conventions the writing team was following were actually edge cases that depended on the specific behaviour of the old compiler, and there were no existing test cases within the old codebase. A few days later, he came back and completed the extension on his own with slightly different syntax; but the writing team at this time was still using my implementation, so to avoid confusion I chose to support his syntax as well. Then at some point I was told that I could stop maintaining mine, since the writing team had switched back to the old one.

Early attempt of a standardized language (2019.1 ~ 2019.6)

For whatever reason, the CTO was not content with Cleword as it was, and I was tasked with designing the "ultimate" version that would fit his and the CEO's vision of the ultimate online learning solution; it should be, in the words of the CTO, "one single language" so that the writing team wouldn't have to learn a completely new one whenever they decided to make a new engine. I believe this is actually impossible - for every never-seen-before new engine, they (i.e. the writing team) would definitely need to learn how it works before they can work with it. I managed to convince him that while we can't have only one single semantics (I suggested something like HyperCard and Visual Basic; but he refused, saying that it would require slightly too much effort for the writing team to learn how to use it and way too much effort for the dev team to implement it) we can still have one single syntax (not grammar; in this blog post I shall informally define "syntax" as "how things should be written" and "grammar" as "how things should be to make sense").

Actually, if you go down this "one single syntax" line of thought, you will eventually end up either returning to SGML and DTD or something similar, or returning to S-expr or something similar; and when you've reached that point, you'll ask yourself why we are re-inventing SGML or S-expr for the umpteenth time. It's not like HTML is some complicated incomprehensible magic stuff; people just got on with it like it was nothing back in the GeoCities era. I don't know why we (or at least I myself) were so hell-bent on the idea of "it must be easy to learn and easy to use" to the point where we implicitly assumed that the users (in this case, the writing team) wouldn't put any effort into learning. Later at some point, the team lead of product managers (I'll refer to him as Mr. TC, because he's an ex-Tencent employee) said something along the lines of "it's your job to design the language so easy that the writing team wouldn't have to pay effort learning it!" and I got unreasonably angry. I didn't lash out publicly though.

ClewordPlus (2019.7)

ClewordPlus is the second and the most important contribution to Cleword that I've made. It was a completely different language designed for a visual novel engine - we called it "scenario educational system" because it was intended to simulate office scenarios, but it was essentially a visual novel engine for mobile web. This language is very important in the history of Cleword because it's exactly at this point that I decided that Cleword's syntax should not be like Markdown but like a structured programming language instead. This decision was largely influenced by the scripting language of Ren'Py. Because we had been making Python courses before this, I decided to make the syntax indentation-sensitive, which, one year later, led to the bleak future of using YAML as the basis for Cleword; it's for this very reason that I often wonder if I made a bad choice here.

Cleword Native (2019.11)

Cleword Native is not a language per se; it's an attempt to port the engines to native, since all engines at this point were web applications. This project never came to fruition.

Cleword 2020 (2019.12)

Cleword 2020 was the first (and the last in the Inc) attempt at a properly standardized syntax for Cleword, although no compilers that implement it were actually in use. Later iterations of Cleword would, due to different reasons, adopt a simpler but different syntax.

At the turn of 2020

At the very end of 2019, a new guy joined the Inc; I shall refer to him as Mr. TL, which stands for Team Leader, for reasons that shall be clear later. He would play a very important role in the following story, so that's why I'm mentioning him here.

I still remember the last day of 2019: after leaving work, he and I had dinner at a local restaurant, and absolutely everything seemed just fine back then. O how the Lord enjoys messing with His creations' fate and absolutely destroys them in the process…

Three of a perfect(ly unfortunate) pair

Somewhere between late 2019 and early 2020 they were planning to revamp their original presentation system for teaching Python and combine it with elements from the visual novel engine of ClewordPlus. This new system was called "Shanguangdan" internally (which means "flash bomb" in Mandarin) and was later renamed to Project Guyu (because it got delayed until late April, a.k.a. the Guyu period). During this time, three Cleword compilers were in development.

Cleword Nova (2019.12 ~ 2020.7)

Cleword Nova was an implementation of Cleword 2020 with semantics tailored to this new system, developed by Mr. OG. For whatever reason he decided to learn Rust and develop it in Rust. He never completed it and the project was later dropped, mainly because he didn't finish it on time and I had to make a temporary implementation so that the writing team could have something to work with. As far as I know, Mr. OG later went on to create an online text editor designed specifically for the language supported by this temporary implementation, and it became one of the main tools used by the writing team.

shanclec (2020.1 ~ 2020.8?)

This version of Cleword never had an official name - I never gave it one. "shanclec" was the name of the executable, which was short for "Shanguangdan Cleword Compiler". The reason why this happened is that Shanguangdan itself was done in multiple parts - they first made a replica of their original system so that they could have something to launch to the public at the deadline they set for themselves, and only added new stuff afterwards. Since Cleword Nova failed, shanclec remained the official tool for this specific system. Sometimes people in the Inc referred to it as Cleword Nova after the original Cleword Nova failed, but I never approved of this. It was essentially two languages stitched together - one with a syntax similar to the classic Cleword of 2018 and one with a syntax similar to ClewordPlus.

Cleword Block & ClewordPub (2020.1? ~ 2020.3?)

Before the COVID lockdown, most of the actual work was done by me and Mr. OG; Mr. TL was, for a while, free to explore on his own. He made a few really interesting things, including the only Cleword family language that is actually a general-purpose programming language (which never got used, however; in retrospect, we really should have asked him to make a compiler generating WebAssembly). After that, he started to make his own implementation of Cleword 2020 as well; he also made the first version of ClewordPub, which would later become the last version of "canonical" Cleword.

Other stuff that happened

Another guy who used to work for Sourcebrella was hired. He did not contribute much to the design of the language per se, but he made some pretty interesting things during his time in the Inc. I'm mentioning him because I want no one in the original team to go unmentioned in this story, left behind the way the company left me behind on the news of restructuring (see below).

Dev team restructuring & company feud

Now, if I remember correctly, around this time the CTO was having a bit of self-doubt and was actually planning to give the position to someone else; he kept the position of course, but when he asked me about this around April 2019 I refused, since my self-esteem was at an all-time low after a big fail at the presentation of my undergrad graduation project (I still haven't forgiven the ones who judged it to this day).

He kept looking for his successor after COVID - after the first few months of 2020 there weren't big outbreaks in Shenzhen, so we went back to the office for work. I noticed quite a few new faces during that time:

  • There was this guy who, if my research was indeed correct, used to run a startup himself until the idea failed to work out and the company eventually folded (they were doing a TripAdvisor-like thingy for the Chinese with destinations limited to China, which is as stupid an idea as you're going to get). We used an IM designed for enterprises by Alibaba, and I saw the position he held was "CEO counselor" (or "top-level counselor"; I forgot which). To this very day I still don't know why the CTO decided to hire him and why both the CEO and the CTO trusted him that much; in my honest opinion, people who couldn't figure out why the TripAdvisor thing wouldn't work upon first hearing it shouldn't bother with starting a business in the first place, let alone being a fucking counselor for some other way-more-successful business.
  • Other than this counselor guy there were other people as well; the CTO was really into the concept of (or at least the phrase) "software architect" and he hired a few new people, giving them "architect" titles: one "Frontend Architect", one "Backend Architect" and one "Data Architect" (who dealt with the data mining and machine learning part; the Inc used to have a data mining team. I never quite figured out what exactly its purpose was, probably something related to making business decisions). I'm sure they were decent in their own areas of expertise, but I never liked them and they probably never liked me either.
  • Other than the architects there was also this Product Manager guy from Tencent. We already had quite a few ex-Tencent product managers, nearly all of whom were unimaginative and incompetent. But the feud itself did not involve this new guy, so I will not talk about him in this blog post.

The reorganization was done without me knowing anything - in fact, I believe I was the last one in the dev team to know anything about it and the only one who didn't know throughout the entire reorganization process, which in itself was a pretty insane accomplishment. The CTO never discussed it with me (he used to ask me for opinions on things before); Mr. TL recommended himself to be the leader of the new team under the new structure and settled everything behind my back. When I caught wind of this, it was already too late to say anything or change anything.

Under normal circumstances, I was fine being a subordinate - or so I thought when the CTO asked me if I was interested in taking his place; but, while Mr. TL was indeed a competent programmer, he could not handle the pressure from the CTO who seemed to trust him so much and the pressure from all the new architects the CTO trusted so much; and most important of all, that counselor guy seemed to have decided to have a feud with him and the whole Cleword team. Mr. TL had a long history of being so awfully direct with his words that one might start suspecting him of being autistic (not that it's inherently immoral; I'm sometimes like that myself), and after a few clashes with the counselor guy and the architects, he decided that it was enough and he couldn't take it anymore.

So one day in late April (or early May) he asked me to have a short meeting with him during the 2-hour lunch break & napping time, in which he told me everything about this whole fuss and said he wanted me to take his place. That counselor guy, he concluded, was trying to frame us as the culprits of the failed launch of Project Guyu because we "weren't efficient enough" (which was stupid, because I delivered shanclec on time) and was trying to get the CTO to lay us off. He also asked me to go to the next dev team leader meeting with him, which I agreed to. In the afternoon after that meeting I quickly came up with a plan, basically telling people that:

  • We were not actually "not efficient enough" (whatever that means), as proven by shanclec being delivered on time;
  • The reason why we seemed to be "not efficient enough" was that we were not given enough freedom; it was not that "they have the least amount of work and their performance was still bad", it was the opposite - because we were restricted to working on the DSL only, we couldn't fix issues that existed on the global scale.
  • I'll spare you the details of the other issues we'd spotted on the global scale (or at least claimed to have spotted) but I'll tell you this one, because it's the only one I still remember:

    • Before the writing team came into play, there was a separate "education research" team whose main job was to come up with prototype courses (including the content and the form of the system) and run small-scale tests.
    • The way they did this was to make the prototype with Microsoft PowerPoint, record everything with a camera during the test, play that footage back and try to analyze the testers' reactions to things.
    • This was awfully inefficient; one could easily spot that there was room for improvement:
      • A prototyping tool and an automatic user behaviour collecting & analyzing tool could obviously be made here;
      • A "prototype-to-front-end" conversion tool could also be made and used to kickstart the "traditional" development process, which would eliminate the need for a separate "product manager", whose job at that point was only converting the ideas from the education research team into something the dev team could understand and use as a specification for development.

    This establishes a structure of two iterations: a small one within the education research team and the beta testers, and a large one that involves the writing team and the whole dev team. Now that I think about it, maybe this was one of the reasons why all the ex-Tencent product managers within the Inc secretly disliked me.

I have no idea why I chose to contribute some of my best ideas to date to this group of people who completely disregarded me, but I contributed anyway. None of my ideas were actually implemented, however; I tried to make a few demos in my own free time, but none of them were ever finished.

Cleword Block (2020.3 ~ 2020.6)

During the time I was working on shanclec, Mr. TL, together with the CTO, had been pushing a new implementation of Cleword on the writing team; I forgot the details of it, but Mr. TL also had a conflict with the writing team which I single-handedly defused.

The company feud, part 2

So after the conversation with Mr. TL I went to the dev team leader meeting. I had chronic headaches back then due to a plethora of reasons, and being stuck within a tiny meeting room filled with 10+ people's worth of CO2 for hours definitely didn't help; by the end of the meeting I could barely think, and the CTO, once again, decided everything himself. Now that I think about it, it didn't feel like I was actually leading anyone (Mr. TL "handed his position to me" so I was technically "supposed to be" the team leader; it definitely didn't feel like that was the case) or having power over anything; it felt like I was being used as a piece of asswipe just to clean up the mess made by Mr. TL. But at the end of the day, the conflict was dealt with, the situation was defused, and everyone could get back on track. This is still fine, I thought, despite the outcome being severely lacking and not as conclusive as I would have liked.

Me officially quitting the Cleword team

One day, Mr. TC invited a group of people from across the whole Inc to discuss the "future directions of Cleword". I, being the one guy who should've had the most say in this matter, was of course invited as well. I prepared two designs for future Cleword - one was the dual-iteration proposal I'd been perfecting in my spare time, and the other was designed only as a tool to grab as much power as we could in this war of workplace politics, which back then I believed was still going on. The latter, of course, was subpar and not as compelling as the other ideas I'd come up with. After failing to convince anyone to adopt my solution, I tried to discuss things with Mr. TL, because I still thought there was a war to be fought; but this time, because Mr. TL and his Cleword Block were being backed by the CTO himself, all the things I'd planned were no longer considered necessary by him. I forgot the details, but we had one last disagreement, and to me, that was enough. I had a mental breakdown and declared in the group chat that I no longer wanted to be involved in this - no one in this company would actually listen to what I had to say when it came to Cleword now, so why should I care?

I got assigned to do other work later. For a short while, it was genuinely liberating. I had much schadenfreude watching them trying to force their subpar tools upon everyone; they may have been stupid and wasting the potential of what they had, but that no longer had anything to do with me.

Cleword YAML (2020.6? ~ now)

Cleword YAML, or CleYAML for short, was a re-implementation of Cleword Block using YAML as its base syntax instead of the old Cleword syntax. Using YAML was, according to the account of Mr. TL, suggested by the Data Architect, and was accepted by everyone under the false premise of "the writing team would not want to learn a new skill that is only for this company and couldn't be transferred to their next job" while completely ignoring the fact that the writing team likely wouldn't use YAML in their future jobs either. This is the only Cleword language to date that was ever released to the public alongside a newly rewritten ClewordPub.

I shall now write down how the current ClewordPub works in 2021:

  • The course documents are written in YAML. Each document is checked against a schema (declared alongside the system the document is written for) and converted to data by CleYAML.
  • The documents themselves are stored on a GitLab instance. ClewordPub reads and stores these documents through GitLab's API.
  • The data output by CleYAML is then sent to the presentation system through the window's message event (see the MDN documentation).
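To make the first and last steps a bit more concrete, here is a minimal Python sketch of the "check against a schema, then hand over as data" idea. Everything specific in it is an assumption for illustration, not CleYAML's real design: the schema format, the command names `say` and `choice`, the single-key-mapping convention, and the message envelope are all hypothetical. The document is represented as the plain list of dicts a YAML loader would produce, and the final JSON string stands in for the payload the browser side would pass to `window.postMessage`.

```python
import json

# Hypothetical schema: command name -> {field name: expected Python type}.
SCHEMA = {
    "say": {"speaker": str, "text": str},
    "choice": {"prompt": str, "options": list},
}


def check_document(doc, schema):
    """Return a list of error strings; an empty list means the document passes."""
    if not isinstance(doc, list):
        return ["document must be a list of commands"]
    errors = []
    for i, command in enumerate(doc):
        # Assumed convention: each command is a single-key mapping {name: fields}.
        if not isinstance(command, dict) or len(command) != 1:
            errors.append(f"command {i}: expected a single-key mapping")
            continue
        name, fields = next(iter(command.items()))
        if name not in schema:
            errors.append(f"command {i}: unknown command {name!r}")
            continue
        if not isinstance(fields, dict):
            errors.append(f"command {i}: {name} should contain a mapping of fields")
            continue
        for field, expected in schema[name].items():
            if field not in fields:
                errors.append(f"command {i}: {name} is missing {field!r}")
            elif not isinstance(fields[field], expected):
                errors.append(
                    f"command {i}: {name}.{field} should be {expected.__name__}"
                )
    return errors


def to_message(doc, schema):
    """Validate the document, then serialize it as a (hypothetical) message payload."""
    errors = check_document(doc, schema)
    if errors:
        raise ValueError("; ".join(errors))
    return json.dumps({"type": "cleword-document", "commands": doc})


if __name__ == "__main__":
    lesson = [
        {"say": {"speaker": "Teacher", "text": "Welcome to the course."}},
        {"choice": {"prompt": "Ready?", "options": ["Yes", "Not yet"]}},
    ]
    print(to_message(lesson, SCHEMA))
```

Since the schema is declared alongside the engine, the same checker can in principle serve every engine; only the `SCHEMA` table would differ.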

I, obviously, was not satisfied with this. I kept trying to come up with a completely new architecture that was superior and worthy of being a competitor to the US government, the only player in the eLearning scene to date that I deem worth my time. I never succeeded at this during my time in the Inc - way too many things needed to be learned, made, tested and thrown away, and I didn't have enough time. Actually, I am always reluctant to work on this kind of thing - such projects are time-consuming, they are likely to fail and most of them eventually will, and I was never in a position safe enough to just absorb the damage when it eventually didn't work out.

Aftermath

I got laid off by the company in the summer of 2021 after working on a failed experimental project sabotaged by the newly-hired ex-Tencent product manager mentioned above (which is a fucking story on its own, so I won't bother you with the details here). After that, I:

  • made a VSCode plugin which became the internal tool replacing ClewordPub's own online text editor;
  • made a few features for ClewordPub, but as far as I know the features never got deployed;
  • helped write code for a few courses;
  • attempted to create a new toolchain to replace CleYAML; I never finished it, however;
  • attempted to port ClewordPlus to ClewordPub so that they could have something to use or build upon if they were going to move everything to ClewordPub; I later got told that they didn't need that.
  • attempted to make my own engine, which spawned Csardas (now lost) and CsardasLite.

But it was obvious at that time that I no longer had any power over anything related to Cleword, so I eventually stopped. The counselor guy left the company before I got laid off (good riddance); the "architects" were all gone not long after I left. Now, at this point in time, I have no reason to believe Cleword is still a living project, since none of the original designers are still in the Inc; the old software might still be used internally, but I have no way of knowing. The ClewordPlus/shanclec syntax was inherited by the OpenIndented Markup Language, which was made by me and intended to be used in my other personal projects. Mr. TL later went on to work for another company; I used to talk with him from time to time, but I've since cut ties with him over political disagreements. I'm no longer in contact with other team members and personnel from the Inc, and especially won't be after this blog post; when I wrote everything down, it was painfully obvious how many times I was betrayed and stabbed in the back.

Other than all the things I've just written down above, there's also this one thing that I still remember. When I joined the Inc, my job did not fit into any part of the common "front-end" / "back-end" classification, so I was assigned to the front-end team because my work was closer to front-end than back-end. One day, the front-end team leader decided to host a learning session about this new-fangled thing called the "React functional component" and everyone in the team could join; I, being assigned to the front-end team, decided to go as well. But when I showed up in the meeting room for the learning session, no fewer than 3 people began to ask each other why the fuck I was there, and someone audibly said "I dunno he's technically in the front-end team I guess". Was I just never welcome? For the life of me, even to this day I still can't shake off that feeling of shame and embarrassment.

What now?

I'm planning to pick it up again, this time planning everything myself, or at least with a completely different team; I no longer trust anyone from that company.



Last update: 2024.6.10