Asciidoc to Markdown

I have some free time, so I am trying to contribute to the clojure-cookbook project. (It’s really interesting; you should check it out.)

I wrote a couple of very simple recipes, and now I am waiting for proofreading and corrections. If you want to help:

Deploy on lein

Database up and running

Meanwhile, I thought you might be interested in those recipes. They are entry-level, so they will mostly help people who are very new to Clojure; however, if you know even a little Clojure you might find them interesting, and you could help me by reviewing either my English or my Clojure.

But there is a problem: the clojure-cookbook project uses the .asciidoc format (it is similar to Markdown but more powerful, and a better fit for writing a printed book), while my blog runs on Jekyll, which uses the Markdown format. Jekyll is considering support for other file formats, but it is not ready yet.

So I tried to work out a solution myself and wrote a few lines of code.

The script converts basic asciidoc to basic markdown. Nothing fancy at all, but it takes care of titles, code blocks, and inline code, which is enough for my recipes.

It uses just regexes and it is really naive, but it works.

You can check the code on GitHub.

As you can see, the code is not very interesting; it is mainly regex application.

One thing may be interesting, however: how I managed to “translate” the titles.

In asciidoc, titles have this syntax:

= H1 TITLE 
== H2 TITLE
==== H4 TITLE

Markdown is very similar, but it uses # instead of =.

So I just needed a regex to change each = into a #. I am not a regex expert and I haven’t found a smarter way to do it than the one you can see below. I also store the regexes in a map, to keep each one linked to its replacement.

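;; Replacement string -> pattern, one entry per title level.
;; The keys all have different lengths, so sorting by key length
;; (longest first) puts the H6 rule first and the H1 rule last.
(def titles-regex
  (sorted-map-by #(> (count %1) (count %2))
                 "###### $1" #"====== +([^ \t\n\r\f\v].*?)"
                 "##### $1" #"===== +([^ \t\n\r\f\v].*?)"
                 "#### $1" #"==== +([^ \t\n\r\f\v].*?)"
                 "### $1" #"=== +([^ \t\n\r\f\v].*?)"
                 "## $1" #"== +([^ \t\n\r\f\v].*?)"
                 "# $1" #"= +([^ \t\n\r\f\v].*?)"))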

The problem is that if you apply the regexes in the wrong order, you will replace only the last = of each asciidoc title with a single #, which produces broken titles such as =# TITLE instead of ## TITLE.
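To make the ordering problem concrete, here is a minimal sketch that applies two of the patterns above to the same title in both orders. The names `h1`, `h2` and `apply-rules` are hypothetical helpers introduced only for this illustration; they are not part of the script.

```clojure
(require '[clojure.string :as str])

;; Two of the rules from titles-regex, written as [pattern replacement] pairs.
(def h1 [#"= +([^ \t\n\r\f\v].*?)"  "# $1"])
(def h2 [#"== +([^ \t\n\r\f\v].*?)" "## $1"])

(defn apply-rules
  "Applies each [pattern replacement] pair to text, in the given order."
  [text rules]
  (reduce (fn [t [pattern replacement]]
            (str/replace t pattern replacement))
          text
          rules))

;; H1 rule first: only the last = is replaced, and the H2 rule no longer matches.
(apply-rules "== Setup" [h1 h2]) ; => "=# Setup"

;; H2 rule first: the whole marker is replaced.
(apply-rules "== Setup" [h2 h1]) ; => "## Setup"
```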

You definitely need to start by looking for H6, then H5, and so on.

This is where the sorted map comes in: it sorts the regexes from the longest to the shortest while keeping each one linked to its replacement.
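As a small illustration of how that ordering behaves, the sketch below builds a sorted map with the same kind of length comparator and inspects its keys. The names `by-length-desc` and `example` and the keyword values are placeholders chosen for this example only.

```clojure
;; Comparator on key length, longest first, as used by titles-regex.
(def by-length-desc #(> (count %1) (count %2)))

;; The insertion order is deliberately shuffled.
(def example
  (sorted-map-by by-length-desc
                 "# $1"   :h1
                 "### $1" :h3
                 "## $1"  :h2))

(keys example)        ; => ("### $1" "## $1" "# $1")
(get example "## $1") ; => :h2

;; Caveat: the comparator only looks at length, so two keys of the
;; same length count as the same key and the map keeps a single entry.
(count (sorted-map-by by-length-desc "ab" 1 "cd" 2)) ; => 1
```

The caveat does not affect the title map, because every replacement string there has a different length, but it would matter if two rules ever shared a key length.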

Finally, one last little note: on GitHub it is possible to define a block of code using a syntax like:

``` language 
fancy code
```

This is not true here on Jekyll (they are working on it, though), where you have to use a Django-like template syntax.

My little script was written to let me publish here on Jekyll, so by default it generates the Jekyll syntax rather than the GitHub one; you can get the GitHub syntax by passing the --no-jekyll parameter.
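Here is a minimal sketch of the two outputs for one asciidoc source block. It assumes that the Django-like syntax is Jekyll's Liquid `highlight` tag; `convert-block`, `fence` and `sample` are hypothetical names used only for this illustration, not the script's own code.

```clojure
(require '[clojure.string :as str])

;; Three backticks, built at runtime so this example stays easy to embed.
(def fence (apply str (repeat 3 "`")))

(defn convert-block
  "Hypothetical helper: rewrites asciidoc [source,lang] blocks.
  With jekyll? true it emits Liquid highlight tags (assumed syntax),
  otherwise GitHub-style fenced blocks."
  [text jekyll?]
  (str/replace text
               ;; [\s\S]*? matches the code body across any number of lines
               #"\[source,(.*?)\]\n----\n([\s\S]*?)\n----"
               (if jekyll?
                 "{% highlight $1 %}\n$2\n{% endhighlight %}"
                 (str fence " $1\n$2\n" fence))))

(def sample "[source,clojure]\n----\n(def x 1)\n(inc x)\n----")

(convert-block sample true)
;; => "{% highlight clojure %}\n(def x 1)\n(inc x)\n{% endhighlight %}"

(convert-block sample false)
;; => the same two lines of code between a "fence clojure" line and a closing fence
```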

The whole code:

(ns asciidoc-to-markdown.core
  (:require [clojure.string])
  (:use [clojure.tools.cli :only [cli]]))

(def titles-regex
  (sorted-map-by #(> (count %1) (count %2))
                 "###### $1" #"====== +([^ \t\n\r\f\v].*?)"
                 "##### $1" #"===== +([^ \t\n\r\f\v].*?)"
                 "#### $1" #"==== +([^ \t\n\r\f\v].*?)"
                 "### $1" #"=== +([^ \t\n\r\f\v].*?)"
                 "## $1" #"== +([^ \t\n\r\f\v].*?)"
                 "# $1" #"= +([^ \t\n\r\f\v].*?)"))

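(defn title [text]
  ;; Apply every rule in map order, i.e. longest marker first.
  ;; The map keys are the replacement strings, the values the patterns.
  (reduce (fn [text replacement]
            (clojure.string/replace text (get titles-regex replacement) replacement))
          text
          (keys titles-regex)))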

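(defn source [text & jekyll]
  (clojure.string/replace text
                          ;; [\s\S]*? lets the code body span any number of lines
                          #"\[source,(.*?)\]\n----\n([\s\S]*?)\n----"
                          (if (first jekyll)
                            ;; Jekyll: Liquid highlight tags
                            "{% highlight $1 %}\n$2\n{% endhighlight %}"
                            ;; GitHub: fenced code block
                            "``` $1\n$2\n```")))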

(defn inline-code [text]
  (clojure.string/replace text
                          #"\+(.*?)\+"
                          "`$1`"))

(defn -main [input output & args]
  ;; cli returns [parsed-options remaining-args usage-banner]
  (let [[opts extra banner]
        (cli args
             ["-h" "--help" "Show help" :default false]
             ["-j" "--jekyll" "Make jekyll ready markdown file" :flag true :default true])]
    (println opts extra banner)
    (println (:jekyll opts))
    (spit output (-> input
                     slurp
                     (source (:jekyll opts))
                     title
                     inline-code))))
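As a quick check of the inline rule, the sketch below repeats `inline-code` on its own (with `clojure.string` aliased as `str` for brevity) and runs it on two sample strings. The second string shows a limitation worth knowing: any pair of + signs on a line is treated as inline-code delimiters, so plain arithmetic in the prose gets rewritten too.

```clojure
(require '[clojure.string :as str])

;; Same rule as in the script: +text+ becomes `text`.
(defn inline-code [text]
  (str/replace text #"\+(.*?)\+" "`$1`"))

(inline-code "Call +slurp+ to read the file.")
;; => "Call `slurp` to read the file."

;; A pair of plain + signs is also matched.
(inline-code "1 + 2 + 3")
;; => "1 ` 2 ` 3"
```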

If you are interested, you can download everything from GitHub and run it to convert basic asciidoc into Markdown.

    git clone git@github.com:siscia/asciidoc-to-markdown.git
    cd asciidoc-to-markdown
    lein run input-file.asciidoc output-file.md

I am available for freelance work. I specialize in IoT and distributed fault-tolerant systems; if you are interested in working with me, you can get in touch here: simone [at] mweb [dot] biz