To get my hands dirty with Clojure, I am trying to find or implement Clojure counterparts of Python's string functions. Python has a powerful string API, and I want to see how Clojure compares in this area. That would be interesting.
As the cheatsheet shows, Clojure already provides most of them, but there are some that I have to implement myself, like title-case.
Along the way, I found it a little cumbersome to append the evaluation results by hand as comments after the forms, for example:
(defn title-case-idiomatic
  [str]
  (clojure.string/join " "
                       (map #(clojure.string/capitalize %)
                            (clojure.string/split str #" +"))))
;; => #'user/title-case-idiomatic
(title-case-idiomatic " hello world ")
;; => Hello World
If I change the function name or add more calling examples, I need to adjust the evaluation comments accordingly, which is tedious and easy to get wrong.
Though I have never had a chance to put it into practice, I was aware of the homoiconicity of Lisp, so why not write a Babashka script to do that automatically?
That would be an excellent opportunity to get familiar with Clojure, so let's dive into it.
P.S. I made a tweet thread along this journey:
What about a #clojure illustration tool for code snippets using #babashka ? pic.twitter.com/WLUqmOWStV
— whatacold (@whatacold) July 30, 2021
The Trivial Version
The idea is simple and takes three steps:
- Read the top-level forms
- Evaluate them
- Print the forms and their result in comments after them
The most challenging part is how to read top-level forms and evaluate them, but it didn't take me too long to get it working using edn/read and java.io.PushbackReader, based on this StackOverflow answer.
The core part of it is as follows:
(defn illustrate
  [file suffix]
  (with-open [in (java.io.PushbackReader. (clojure.java.io/reader file))]
    (let [edn-seq (repeatedly (partial edn/read {:eof :theend} in))]
      (spit (str file suffix)
            (with-out-str
              (dorun (map (fn [obj]
                            (println)
                            (prn obj)
                            (println "; => " (eval obj)))
                          (take-while (partial not= :theend) edn-seq))))))))
It works, and it only took me some 2 hours to implement it, which is fantastic.
But it also has a major pitfall: it can't preserve whitespace and comments, which makes it barely useful. The full version is at 3f7e639, if you're interested.
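To make the pitfall concrete, here is a minimal in-memory sketch of the same read, evaluate, print loop. The names read-forms and annotate are hypothetical and not part of the script; the sketch assumes only clojure.edn and java.io.StringReader, and it returns strings instead of writing a file.

```clojure
(require '[clojure.edn :as edn])

(defn read-forms
  "Hypothetical helper: reads every top-level form from the string `s`."
  [s]
  (with-open [in (java.io.PushbackReader. (java.io.StringReader. s))]
    ;; doall forces the lazy seq before with-open closes the reader
    (doall
     (take-while #(not= ::eof %)
                 (repeatedly #(edn/read {:eof ::eof} in))))))

(defn annotate
  "Hypothetical helper: pairs each form with its evaluation result."
  [s]
  (map (fn [form]
         (str (pr-str form) "\n; => " (pr-str (eval form))))
       (read-forms s)))

;; The input has extra spaces, a comment, and a blank line.
(annotate "(+ 1 2)   ; add\n\n(* 2 3)")
;; => ("(+ 1 2)\n; => 3" "(* 2 3)\n; => 6")
```

The reader hands back only data structures, so the comment, the extra spaces, and the blank line never reach the output. That is why a reader-based approach cannot round-trip a source file.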
Then how can I improve it?
Though I don't have much experience, I did read some blog posts about how to rewrite Clojure files using rewrite-clj, for example, Homoiconicity & Feature Flags — Martin Klepsch - www.martinklepsch.org.
So rewrite-clj sounds like the right tool.
The rewrite-clj Version
Reading the docs seemed like a good start. They briefly introduce how it works, some concepts, and the four major APIs: zip, node, parser, and Paredit API.
Then I thought I was good to go, but it turned out to be more complex than I expected.
The First Try that Failed
This is what I came up with at first:
(require '[rewrite-clj.zip :as z]
         '[rewrite-clj.node :as n]
         '[rewrite-clj.parser :as p])
;; a test string
(def data-string "
(defn my-function [a]
(* a 3))
(my-function 7)
")
;; parse code to nodes, create a zipper, and navigate to the first non-whitespace node
(def zloc (z/of-string data-string))
(loop [cur zloc
       left cur]
  (println "current string {{" (z/string cur) "}}")
  (if (z/end? cur)
    (println "final string: {{" (z/root-string left) "}}") ; XXX it doesn't work as expected!!!
    (recur (z/right (z/insert-right cur (p/parse-string "; test\n")))
           cur)))
The output shows that it only partially works; the last comment didn't get appended:
current string {{ (defn my-function [a]
(* a 3)) }}
current string {{ (my-function 7) }}
current string {{ nil }}
final string: {{
(defn my-function [a]
(* a 3)) ; test
(my-function 7)
}}
I spent a lot of time trying to figure out why it didn't work, but I failed.
Then I turned to the #rewrite-clj community on Slack for help, and @lread kindly gave me a working solution:
(loop [zloc zloc]
  (let [zloc (some-> zloc
                     (z/insert-right* (n/comment-node "; test"))
                     (z/insert-right* (n/newlines 1)))
        next-sib (z/right zloc)]
    (if next-sib
      (recur next-sib)
      (z/print-root zloc))))
That was a good starting point for me. Looking back, the lessons here are:
- The zipper, or zloc, of rewrite-clj is immutable, so be careful not to keep using a stale one after a modification. The problem above was that the cur in the last recur sexpr is stale after applying z/insert-right, and the correct version would be like this:
  (loop [cur zloc
         left cur]
    (println "current string {{" (z/string cur) "}}")
    (if (z/end? cur)
      (println "final string: {{" (z/root-string left) "}}")
      (let [cur (z/insert-right cur (p/parse-string "; test\n"))] ; here is the diff
        (recur (z/right cur) cur))))
- It's better to create nodes explicitly. At first, I thought (z/insert-right* (p/parse-string-all "\n; test\n")) would insert two nodes to the right, but it turned out that it didn't work. Instead, inserting a comment node and then a newline node did the trick, that is:
  (-> zloc
      (z/insert-right* (n/comment-node " test"))
      (z/insert-right* (n/newlines 1)))
- whitespace-node is different from newline-node in rewrite-clj. At first glance, my intuition was that whitespace nodes would also contain newlines.
- Keep the number of loop bindings as small as possible. A single binding is best, so that there is little for recur to care about.
- The let bindings are an excellent place to put some logic, so that the body of let stays concise.
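The first lesson can be seen in isolation with clojure.zip from the standard library. It is used here only as a stand-in for rewrite-clj's zipper, an assumption made so that the sketch needs no extra dependency; the names original and updated are illustrative.

```clojure
(require '[clojure.zip :as zip])

;; A location pointing at the first element of [1 2].
(def original (zip/down (zip/vector-zip [1 2])))

;; insert-right returns a NEW location; `original` is left untouched.
(def updated (zip/insert-right original :new))

(zip/root original) ; the stale location still sees [1 2]
(zip/root updated)  ; only the returned location sees [1 :new 2]
```

Any later step that keeps using the old location silently drops the edit, which is the same mistake as passing the pre-insert cur to recur.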
The Second Version
After getting familiar with rewrite-clj, I got back on track. The second version contains only 57 sloc, which is concise.
There are 3 ways to use it:
- illustrate.clj -i .new foo.clj, to illustrate foo.clj and write the result to foo.new.clj.
- illustrate.clj foo.clj, to illustrate foo.clj and overwrite it. Be careful! You'd better back up your file or put it under the control of git.
- cat foo.clj | illustrate.clj, to do it via a pipe. It could be handy if you use it with tools like Emacs/Vim.
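The naming rule of the -i option, where foo.clj plus the suffix .new becomes foo.new.clj, could be sketched as below. The helper suffixed-path is hypothetical and is not taken from the script; it assumes "/" as the path separator and falls back to appending the suffix when the file name has no extension.

```clojure
(require '[clojure.string :as str])

(defn suffixed-path
  "Hypothetical helper: inserts `suffix` before the file extension of `path`."
  [path suffix]
  (let [dot   (str/last-index-of path ".")
        slash (or (str/last-index-of path "/") -1)]
    ;; Only treat the dot as an extension separator when it sits inside
    ;; the file name and is not the leading dot of a hidden file.
    (if (and dot (> dot (inc slash)))
      (str (subs path 0 dot) suffix (subs path dot))
      (str path suffix))))

(suffixed-path "foo.clj" ".new")     ; => "foo.new.clj"
(suffixed-path "src/foo.clj" ".new") ; => "src/foo.new.clj"
```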
There is still one feature that I am leaving for someday: illustrating org-mode files that contain Clojure source blocks, which would be wonderful for writing blog posts with Hugo.
Anyway, the script is on GitHub.