Clojure reduce: one case for text processing


As an exercise, I got illustrate.clj to illustrate Clojure files, but my original idea was to annotate the org-mode files of my blog posts. It's not uncommon for a blog post to contain code snippets.

That feature was missing until last night: I wasn't sure how to implement it properly, and I didn't have enough time.

For example, I may have an org-mode file like this:

sum of two numbers:
#+begin_src clojure
(+ 1 2)
#+end_src

After running illustrate.clj, I want a result comment (;; => 3) after each top-level form:

sum of two numbers:
#+begin_src clojure
(+ 1 2)
;; => 3
#+end_src

I come from a C++ and Python background, where the common way to do this would be:

  1. find the beginning of each source block (#+begin_src) and its corresponding end

  2. extract and illustrate the source blocks

  3. join the new source blocks back with the surrounding content

The Clojure way is similar, but uses reduce:

  1. split the file content into lines

  2. reduce the lines to the final content

    Use a state to carry the content before a block and the source block along the way. When a source block ends, call illustrate-string to illustrate it and append the resulting string to the previous content.
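
As a smaller illustration of the same idea, here is a minimal sketch of a reduce that carries state across lines. It only collects the bodies of source blocks and does not illustrate anything. The function name collect-src-blocks and the use of a map for the state are my own choices for this example, not part of illustrate.clj.

```clojure
(require '[clojure.string :as str])

(defn collect-src-blocks
  "Hypothetical helper: return a vector with the body of every
  #+begin_src ... #+end_src block found in text."
  [text]
  (let [step (fn [{:keys [in-block?] :as state} line]
               (cond
                 ;; the block ends: store the accumulated body and reset
                 (and in-block? (re-matches #"\s*#\+end_src" line))
                 (-> state
                     (update :blocks conj (:current state))
                     (assoc :in-block? false :current ""))

                 ;; inside a block: append the line to the current body
                 in-block?
                 (update state :current str line "\n")

                 ;; a block starts on this line
                 (re-matches #"\s*#\+begin_src.*" line)
                 (assoc state :in-block? true)

                 ;; ordinary text: nothing to collect
                 :else state))]
    (:blocks (reduce step
                     {:blocks [] :in-block? false :current ""}
                     (str/split-lines text)))))

(collect-src-blocks "intro\n#+begin_src clojure\n(+ 1 2)\n#+end_src\noutro\n")
;; => ["(+ 1 2)\n"]
```

A map with named keys makes each piece of state easier to read than positional access into a vector, at the cost of a little more typing.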

The function looks like this:

(defn illustrate-org-file
  "Add illustration comments to an org-mode file"
  [file new-file]
  (let [lines (clojure.string/split-lines (slurp file))
        ;; the state is [content in-block? src-block]
        [content] (reduce (fn [[content in-block? src-block] line]
                            (cond
                              ;; not in a src block: copy the line and
                              ;; note whether a Clojure block starts here
                              (not in-block?)
                              [(str content line "\n")
                               (boolean (re-matches #"\s*#\+begin_src\s+clojure" line))
                               ""]

                              ;; the block ends: illustrate it, then keep the end marker
                              (re-matches #"\s*#\+end_src" line)
                              [(str content (illustrate-string src-block) line "\n")
                               false
                               ""]

                              ;; still inside the block: append to the source block
                              :else
                              [content true (str src-block line "\n")]))
                          ["" false ""]
                          lines)]
    (spit new-file content)))
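
The function relies on illustrate-string from illustrate.clj, which is not shown here. To experiment with the reduce in isolation, a stand-in along the following lines might be enough. This is only a sketch under my own assumptions: the name illustrate-string-sketch is hypothetical, it prints each form with pr-str (so the original formatting and comments are lost), and the real illustrate-string may behave differently.

```clojure
(import '[java.io PushbackReader StringReader])

(defn illustrate-string-sketch
  "Hypothetical stand-in for illustrate-string: read every top-level
  form in src, evaluate it, and return the forms as text, each followed
  by a `;; => result` comment line."
  [src]
  (let [rdr (PushbackReader. (StringReader. src))
        eof (Object.)]                      ; unique end-of-input marker
    (loop [out ""]
      (let [form (read {:eof eof} rdr)]
        (if (identical? form eof)
          out
          (recur (str out
                      (pr-str form) "\n"
                      ";; => " (pr-str (eval form)) "\n")))))))

(illustrate-string-sketch "(+ 1 2)\n")
;; => "(+ 1 2)\n;; => 3\n"
```

The returned string ends with a newline, which is what the reduce above expects when it places the #+end_src line right after the illustrated block.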

Although it works, it's not as concise or elegant as I had hoped. I may improve the code in the future.

The complete code is in my babashka-tools repo.

