First Spaceship on Venus

In one of my previous posts, I wrote about a tool called Lem. This tool is designed to solve some of the challenges of writing about code.

The main idea is to have a way to discuss the code while it is changing. Even when I am writing about a finished project, I keep making small fixes and changes as I go. For me, hobby projects are like home renovations – you can only start them.

I would also like a way to lay out and preview the outcome in WYSIWYG mode. It is especially valuable if the content is independent of the rendering: if, at some point, I want to change the platform that hosts my blog, it will be a matter of recompiling the content for the new frontend.

To sum up, I need a tool that can convert a repository into easy-to-read rich text. While the ease of reading depends solely on my writing skills, I definitely want some help with the rich text part. With this in mind, I went looking for inspiration.

Introducing LemV2

The original idea of LemV1 was based on the concept of Literate Programming – an approach proposed by none other than Donald Knuth, who I hope needs no introduction from me.

Literate programming is a programming paradigm in which a computer program is given an explanation of its logic in a natural language, such as English, interspersed with snippets of macros and traditional source code.

While implementing my take on the idea, I noticed that the most significant chunk of work was related to the common elements of structured text: links, tables, lists, styles, etc. It was pretty noticeable that I was in the tedious process of reinventing the wheel.

Thankfully, the main force behind any progress, laziness, offered its help in the form of somebody else’s solution: adding Markdown support to the project.

I also removed all non-essential things from the code – "architecture," dependency injection, abstraction layers – all those things that prevent enterprise solutions from becoming normal working products.

In this, hopefully short, article, I want to walk you through the implementation details of LemV2. With luck, that will be interesting for you and beneficial for me in the form of feedback.

Caesar of programming: doing multiple things at the same time

As with any other program, it has to start somewhere. Following a longstanding tradition, the entry point for LemV2 is called main. There are a couple of things to note:


The program operates on the notion of a scenario – a file describing a backbone for the article. It consists of a mix of Markdown and commands. A command is a way to perform a meta operation on Markdown: for example, to include a code snippet.

com/gzozulin/LemV2App.kt::main

The main function starts by measuring the execution time

fun main() {
    val millis = measureTimeMillis {
        runBlocking {

We list all the scenarios in the input folder and launch tasks asynchronously:

            val scenarios = inputFolder.list()!!
            val deferred = mutableListOf<Deferred<Unit>>()
            for (scenario in scenarios) {
                val scenarioFile = File(inputFolder, scenario)
                val outputFile = File(outputFolder, "$scenario.html")
                check(scenarioFile.exists() && scenarioFile.isFile)
                deferred.add(async(Dispatchers.Default) { renderScenario(scenarioFile, outputFile) })
            }

Joining the launched tasks:

            deferred.awaitAll()
        }
    }

Here I usually notice that my optimizations are futile

    println("Finished in %.2f seconds".format(millis / 1000f))
}

Each scenario is handled in isolation:

com/gzozulin/LemV2App.kt::renderScenario

The scenario file starts with a set of arguments: GitHub URL, repo path, etc.

private fun renderScenario(scenario: File, output: File) {
    val (root, url, lines) = extractArguments(scenario)

Next, we want to identify and apply the meta commands

    val withCommands = identifyCommands(lines)
    val onlyMarkdown = applyCommands(withCommands, root, url)
    lateinit var htmlSnippets: List<SnippetHtml>

When we have the final Markdown, it can be rendered to HTML in parallel

    runBlocking {
        val deferred = mutableListOf<Deferred<SnippetHtml>>()
        for (snippetMarkdown in onlyMarkdown) {
            deferred.add(async(Dispatchers.Default) { renderHtml(snippetMarkdown) })
        }
        htmlSnippets = deferred.awaitAll()
    }

And flushed into an output file

    renderFile(output, htmlSnippets)
}

Now, let us have a look at how the commands are identified and applied.

Going meta: identifying and applying commands

Commands play a crucial role in the application. Their primary purpose is to glue together the code and the explanation.

They also have a secret goal: it would be rather unfair to just wrap a third-party Markdown library and call it my project. But now, I can feel good about my accomplishment.

Back to commands – before I can apply them, I need to identify them inside the scenario file.

com/gzozulin/LemV2App.kt::identifyCommands

Everything is straightforward here – we just check if the line starts with the @ symbol

private fun identifyCommands(lines: List<String>) = lines
    .map { if (it.startsWith(COMMAND_PREFIX)) SnippetCommand(it) else SnippetMarkdown(it) }
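
To make the classification step concrete, here is a minimal, self-contained sketch. The snippet types, the `markdown` field name, and the value of `COMMAND_PREFIX` are assumptions made for illustration (the article only says that commands start with the @ symbol), and the scenario text is made up:

```kotlin
// Hypothetical stand-ins for the snippet types used in the article.
sealed interface Snippet
data class SnippetCommand(val command: String) : Snippet
data class SnippetMarkdown(val markdown: String) : Snippet

// Assumed value: the article only says that commands start with the @ symbol.
const val COMMAND_PREFIX = "@"

// Every line becomes either a command or a plain Markdown snippet.
fun identifyCommands(lines: List<String>): List<Snippet> =
    lines.map { if (it.startsWith(COMMAND_PREFIX)) SnippetCommand(it) else SnippetMarkdown(it) }

fun main() {
    // A made-up scenario body: two Markdown lines and one command.
    val scenario = """
        # Parsing locations
        The parsing itself is straightforward:
        @include def ~/com.gzozulin.LemV2App::parseLocation
    """.trimIndent().lines()

    for (snippet in identifyCommands(scenario)) {
        println(snippet)
    }
}
```

Note that only a prefix at the very start of the line counts: an @ in the middle of a sentence stays ordinary Markdown.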

The next step is to apply them. Since the result of each command can be evaluated independently, this is another good place to fork the execution:

com/gzozulin/LemV2App.kt::applyCommands

The first part of this routine launches every command asynchronously

private fun applyCommands(snippets: List<Snippet>, root: File, url: String): List<SnippetMarkdown> {
    val result = mutableListOf<SnippetMarkdown>()
    lateinit var handled: ArrayList<List<SnippetMarkdown>>
    runBlocking {
        val deferred = mutableListOf<Deferred<List<SnippetMarkdown>>>()
        for (snippet in snippets) {
            if (snippet is SnippetCommand) {
                deferred.add(async (Dispatchers.Default) { applyCommand(snippet, root, url) })
            }
        }
        handled = ArrayList(deferred.awaitAll())
    }

When all of the results are available, we can assemble them into a list in the same order as the original snippets

    for (snippet in snippets) {
        if (snippet is SnippetCommand) {
            result.addAll(handled.removeAt(0))
        } else {
            result.add(snippet as SnippetMarkdown)
        }
    }
    return result
}
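
As an aside, the fork and the reassembly can be folded into one pass. The sketch below is an alternative, not the code from the repository: it wraps every snippet (command or not) in a task, so that `awaitAll` returns the results in input order and no second loop is needed. The snippet types and the `expand` function are hypothetical stand-ins, and the example assumes the kotlinx.coroutines library is on the classpath:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.runBlocking

// Hypothetical stand-ins for the snippet types used in the article.
sealed interface Snippet
data class SnippetMarkdown(val markdown: String) : Snippet
data class SnippetCommand(val command: String) : Snippet

// Hypothetical replacement for applyCommand: a real command would read the repository.
fun expand(command: SnippetCommand): List<SnippetMarkdown> =
    listOf(SnippetMarkdown("expanded ${command.command}"))

suspend fun applyAll(snippets: List<Snippet>): List<SnippetMarkdown> = coroutineScope {
    snippets
        .map { snippet ->
            async(Dispatchers.Default) {
                when (snippet) {
                    is SnippetCommand -> expand(snippet)
                    is SnippetMarkdown -> listOf(snippet)
                }
            }
        }
        .awaitAll()   // results come back in the order the tasks were created
        .flatten()
}

fun main(): Unit = runBlocking {
    val snippets = listOf(
        SnippetMarkdown("intro"),
        SnippetCommand("@include def A::b"),
        SnippetMarkdown("outro")
    )
    applyAll(snippets).forEach { println(it.markdown) }
}
```

The trade-off is a few extra trivial tasks for plain Markdown lines in exchange for not having to track which results belong to which command.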

com/gzozulin/LemV2App.kt::applyCommand

To apply a specific command, I just switch on its label and call the appropriate method

private fun applyCommand(command: SnippetCommand, root: File, url: String): List<SnippetMarkdown> {
    try {
        val cmdBody = command.command.removePrefix(COMMAND_PREFIX)
        val split = cmdBody.split(whitespacePattern)
        val location = parseLocation(root, url, split[2])
        return when (split[1]) {
            COMMAND_DECL -> includeDecl(location)
            COMMAND_DEF -> includeDef(location)
            else -> TODO()
        }

Since the commands can contain literally anything, errors are bound to happen, so we prepare for them

    } catch (th: Throwable) {
        // Keep the original cause so that the failure is easier to debug
        throw IllegalStateException("Failed to handle command: ${command.command}", th)
    }
}
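
The parsing half of this routine can be sketched in isolation. In the example below, `ParsedCommand` is a hypothetical holder, and the values of `COMMAND_PREFIX`, `COMMAND_DECL`, and `COMMAND_DEF` are assumptions (only "def" appears in the sample command in this article). The sketch validates the shape of the command up front, so a malformed line produces a descriptive message rather than an index error:

```kotlin
// Assumed constants: the article does not show their actual values.
const val COMMAND_PREFIX = "@"
const val COMMAND_DECL = "decl"
const val COMMAND_DEF = "def"

// Hypothetical holder for the three whitespace-separated parts of a command.
data class ParsedCommand(val action: String, val kind: String, val location: String)

private val whitespacePattern = Regex("\\s+")

fun parseCommand(line: String): ParsedCommand {
    val parts = line.removePrefix(COMMAND_PREFIX).trim().split(whitespacePattern)
    require(parts.size == 3) { "Expected 'action kind location', got: $line" }
    val (action, kind, location) = parts
    require(kind == COMMAND_DECL || kind == COMMAND_DEF) { "Unknown snippet kind '$kind' in: $line" }
    return ParsedCommand(action, kind, location)
}

fun main() {
    println(parseCommand("@include def ~/com.gzozulin.LemV2App::parseLocation"))
    // A malformed command fails with a descriptive message.
    println(runCatching { parseCommand("@include def") }.exceptionOrNull()?.message)
}
```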

At this point, we have the final Markdown, which we will render into HTML, but let us first look at how the commands work behind the scenes – no stone should be left unturned.

Looking for a needle in a haystack: extracting code snippets

Since most of the commands operate on the repository, naturally, we need to find the location of the requested code snippet.

Locations can be pretty intricate: I allow the home symbol ~, which expands to the source code root directory.

Modules are also allowed. Here are some examples of locations:


A whole command can look like this: include def ~/com.gzozulin.LemV2App::parseLocation

The parsing of the location itself is a straightforward process:

com/gzozulin/LemV2App.kt::parseLocation
private fun parseLocation(root: File, url: String, location: String): Location {
    val noDots = location.replace(".", "/")
    val withHome = noDots.replace("~", HOME_DIR)
    val (filename, identifier) = withHome.split("::")
    val file = File("$filename.kt")
    check(file.exists()) { "File does not exist: $location" }
    return Location(root, file, identifier, url)
}
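
To see what each step does to the sample location from above, here is a pure string version of the same transformation. It is a sketch: the value of `HOME_DIR` is an assumption (the article does not show it), and `resolveLocation` is a hypothetical helper that skips the file system check and simply returns the file path and the identifier:

```kotlin
// Assumed value: the article does not show what HOME_DIR points to.
const val HOME_DIR = "src/main/kotlin"

// Pure string version of the transformation: returns the file path and the identifier.
fun resolveLocation(location: String): Pair<String, String> {
    val noDots = location.replace(".", "/")        // package dots become path separators
    val withHome = noDots.replace("~", HOME_DIR)   // ~ expands to the source root
    val parts = withHome.split("::")
    require(parts.size == 2) { "Expected 'path::identifier', got: $location" }
    return "${parts[0]}.kt" to parts[1]
}

fun main() {
    val (path, identifier) = resolveLocation("~/com.gzozulin.LemV2App::parseLocation")
    println(path)        // src/main/kotlin/com/gzozulin/LemV2App.kt
    println(identifier)  // parseLocation
}
```

The order of the replacements matters: the dots are replaced before the home symbol is expanded, so a home directory that itself contains dots is left intact.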

When the code snippet is located, we can extract it. It is possible to obtain either the declaration or the definition. The declaration includes, unsurprisingly, only the declaration of the entity. For a method, that is its name, parameters, and return type. The definition, on the other hand, also includes the method body. Similar rules apply to classes and other entities.
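
The difference between the two is easy to show on a toy function. The sketch below is only an illustration and not how Lem does it: instead of a real parser, it uses a naive string cut at the opening brace, and both `definition` and `declarationOf` are made-up names:

```kotlin
// The full definition of a small function, held as plain text for the demo.
val definition = """
    fun area(width: Int, height: Int): Int {
        return width * height
    }
""".trimIndent()

// Naive stand-in for the parser: keep everything before the opening brace of the body.
// It would break on expression bodies or on braces inside default arguments.
fun declarationOf(definition: String): String = definition.substringBefore("{").trim()

fun main() {
    println(declarationOf(definition))  // fun area(width: Int, height: Int): Int
}
```

A string cut like this falls apart quickly on real code, which is exactly why a proper parser is worth the setup.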

For the sake of brevity, I only show how this works for the declaration. The definition works in precisely the same way.

com/gzozulin/LemV2App.kt::includeDecl
private fun includeDecl(location: Location): List<SnippetMarkdown> {
    lateinit var tokens: List<Token>
    var line = 0
    val file = File(location.root, location.file.path)

To parse and understand Kotlin code, I am using a tool called ANTLR. Their marketing team outlines the following advantages of the framework: "ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files." I do not want to go into too many details about parsing – grammar files are available for most modern languages. If your language of choice is not on that long list, reconsider its advantages.

    kotlinParserCache.useParser(file) { parser, stream ->
        val declaration = locateKotlin(parser, location)
        line = declaration.start.line
        val firstToken = findFirstToken(stream, declaration)
        val lastToken = findLastToken(stream, declaration)
        tokens = stream.get(firstToken, lastToken)
    }
    val result = mutableListOf<SnippetMarkdown>()

At this point, we can also create a convenient link pointing to the GitHub repo.

    result.add(createHeaderLink(location, line))
    result.addAll(extractStatements(tokens))
    return result
}

After all of the commands are applied, the resulting Markdown is ready to be parsed and rendered into HTML. The process is trivial and therefore omitted.

Final thoughts: good enough

While working on this article, I was able to use Lem to the full extent. Surprisingly, it is quite a comfortable tool to work with.

For the WYSIWYG part, I am using an external editor called StackEdit. It works quite well and has a lot of useful features. I am copying the content from StackEdit just before the ‘compilation’ by Lem. If anybody is interested, here is the scenario file for this article.

All of the scenario files and the resulting HTML are kept under Git in a single repo. That allows me to track changes and notice unexpected quirks in the HTML rendering. For example, if some API has changed and the change is not reflected in the corresponding scenario file, I will notice it immediately after recompilation.

There are, of course, things to improve. For example, I would like to move the scenarios into a separate repo. An external interface for the tool would also be excellent, since right now you need to fire up Gradle to perform the rendering.

I would also like to expand the command set. I am thinking about commands to include common HTML fragments like a table of contents, Twitter badges, maybe annoying ads – that sort of thing. In any case, it does what it was designed to do, and I mostly like it as it is now.

I hope you liked the article and found it entertaining enough. Have a great time, and see you again on the pages of the Journal. Meanwhile, let me recompile this article just one more time:

Finished in 5.22 seconds