String Manipulation in Clojure


Python's string APIs are powerful and concise (join, split, strip, to name a few), which is an important reason I use Python for a lot of scripting these days.

Since I have been learning Clojure recently, I wondered what string manipulation looks like in Clojure and how to implement equivalents of the Python functions.

I think it's an excellent opportunity to get familiar with Clojure. Before diving into the implementation, how do you declare a multi-line string?

Just do it literally:

(def greet-multiline "Hello,

    See you!")
;; => #'user/greet-multiline

(count greet-multiline)
;; => 20
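
The line breaks and the indentation inside the literal become part of the string, which is where the count of 20 comes from. A small sketch for inspecting the individual lines (the name `multiline-demo` is made up for this example, and the results are shown the way a plain REPL prints them):

```clojure
(require '[clojure.string :as string])

;; The same literal as above: both newlines and the four leading spaces
;; on the last line are part of the string.
(def multiline-demo "Hello,

    See you!")

(string/split-lines multiline-demo)
;; => ["Hello," "" "    See you!"]

;; Trim each line if the indentation is not wanted.
(map string/trim (string/split-lines multiline-demo))
;; => ("Hello," "" "See you!")
```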

Below are some string functions that I came up with along the way.

String creation

;; "a" + "b"
(defn s-add
  [& strings]
  ;; (apply str nil) already returns "", so no empty check is needed
  (apply str strings))
;; => #'user/s-add
(s-add "a" "b")
;; => ab

;; "a" * 10
;; https://stackoverflow.com/questions/5433691/how-to-repeat-string-n-times-in-idiomatic-clojure-way
(defn s-multiply
  [s cnt]
  (apply str (repeat cnt s)))
;; => #'user/s-multiply

(s-multiply "hello" 3)
;; => hellohellohello

(defn s-format
  ;; `& args` collects all arguments; a bare [:as args] is rejected because
  ;; `:as` is only valid inside a destructuring form
  [& args]
  (apply format args))
;; => #'user/s-format

(s-format "Hello there, %s" "bob")
;; => Hello there, bob
(s-format "1 + 1 = %d" (+ 1 1))
;; => 1 + 1 = 2
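
On the `:as` question: as far as I can tell, `&` is what makes a parameter vector variadic, while `:as` only has meaning inside a destructuring form. A minimal sketch of both (the names `s-format-rest` and `s-format-all` are made up for this example):

```clojure
;; `& args` collects every argument into a seq.
(defn s-format-rest
  [& args]
  (apply format args))

;; `:as` works once there is a destructuring form to attach it to,
;; here the vector that destructures the rest arguments.
(defn s-format-all
  [& [fmt :as args]]
  (if fmt
    (apply format args)
    ""))

(s-format-rest "%s-%s" "a" "b")
;; => "a-b"
(s-format-all "%d%%" 50)
;; => "50%"
(s-format-all)
;; => ""
```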

(defn s-join
  ([strings]
   (s-join strings " "))
  ([strings sep]
   (reduce (fn
             ([] "")                     ; for an empty list of strings
             ([a b] (str a sep b)))
           strings)))
;; => #'user/s-join

(s-join ["a" "b" "c"])
;; => a b c
(s-join ["a" "b" "c"] ", ")
;; => a, b, c
(s-join [])
;; =>
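
For comparison, the built-in `clojure.string/join` covers the same ground. Note that it takes the separator first and the collection second, the reverse of `s-join` above. A short sketch:

```clojure
(require '[clojure.string :as string])

;; The separator comes first, the collection second.
(string/join ", " ["a" "b" "c"])
;; => "a, b, c"

;; Without a separator the items are simply concatenated.
(string/join ["a" "b" "c"])
;; => "abc"

;; An empty collection yields an empty string.
(string/join ", " [])
;; => ""

;; Non-string items are converted with `str`.
(string/join "-" [1 2 3])
;; => "1-2-3"
```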

String properties

(defn s-len
  [s]
  (count s))
;; => #'user/s-len

(s-len "hello")
;; => 5

(defn s-count-char
  "Count the occurrences of the character `ch` in the string `s`."
  [s ch]
  ;; `s` and `ch` avoid shadowing clojure.core/str and clojure.core/char
  (reduce #(if (= ch %2)
             (+ %1 1)
             %1)
          0 s))
;; => #'user/s-count-char

(s-count-char "hello world" \o)
;; => 2
(s-count-char "hello world" \x)
;; => 0
(s-count-char "" \o)
;; => 0
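
The same count can also be written without an explicit `reduce`. A minimal sketch, assuming a string is treated as a sequence of characters (the name `s-count-char-filter` is made up for this example):

```clojure
(defn s-count-char-filter
  "Count how often the character `ch` occurs in the string `s`."
  [s ch]
  (count (filter #(= ch %) s)))

(s-count-char-filter "hello world" \o)
;; => 2
(s-count-char-filter "" \o)
;; => 0

;; `frequencies` counts every character in one pass.
(get (frequencies "hello world") \l 0)
;; => 3
```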

String predicates

(defn s-endswith?
  "Like Python's str.endswith; the boundaries are [start, end)."
  ;; or based on clojure.string/ends-with?
  ([s suffix]
   (s-endswith? s suffix 0 (count s)))
  ([s suffix start end]
   (let [start (max 0 start)
         end (max 0 (min end (count s)))
         from (- end (count suffix))]
     ;; the suffix must fit inside [start, end); this check also keeps
     ;; `subs` from throwing, e.g. for (s-endswith? "" "a")
     (and (<= start from)
          (= suffix (subs s from end))))))
;; => #'user/s-endswith?

(s-endswith? "abc" "bc")
;; => true

(defn s-isalpha
  [s]
  (if (empty? s)
    false
    (reduce #(if (not %1)
               false
               (or (<= (int \a) (int %2) (int \z))
                   (<= (int \A) (int %2) (int \Z))))
            true
            s)))
;; => #'user/s-isalpha

(s-isalpha "")
;; => false
(s-isalpha "abcABC")
;; => true
(s-isalpha "abcABC11")
;; => false
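
The `reduce` above keeps walking the string after the first non-letter. A sketch of a short-circuiting variant built on `every?`, keeping the same ASCII-only definition of a letter (the names `ascii-letter?` and `s-isalpha-every` are made up for this example):

```clojure
(defn ascii-letter?
  "True when `ch` is in a-z or A-Z."
  [ch]
  (or (<= (int \a) (int ch) (int \z))
      (<= (int \A) (int ch) (int \Z))))

(defn s-isalpha-every
  [s]
  ;; (seq s) is nil for an empty string, so `boolean` turns that into false;
  ;; `every?` stops at the first character that is not a letter.
  (boolean (and (seq s)
                (every? ascii-letter? s))))

(s-isalpha-every "")
;; => false
(s-isalpha-every "abcABC")
;; => true
(s-isalpha-every "abcABC11")
;; => false
```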

Find and replace

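;; clojure.string is not guaranteed to be loaded, so require it explicitly
(require '[clojure.string])

(defn s-find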
  ([s sub-str]
   (s-find s sub-str 0 (count s)))

  ([s sub-str start end]
   (let [pos (clojure.string/index-of s
                                      sub-str
                                      start)]
     (if (and pos
              (<= (+ pos (count sub-str)) end))
       pos
       -1))))
;; => #'user/s-find

(s-find "abc" "a")
;; => 0
(s-find "abc" "bc")
;; => 1
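
`s-find` has to check `pos` because `clojure.string/index-of` returns `nil`, not -1, when nothing is found. A short sketch of the underlying behaviour:

```clojure
(require '[clojure.string :as string])

(string/index-of "abc" "bc")
;; => 1

;; No match gives nil rather than Python's -1.
(string/index-of "abc" "x")
;; => nil

;; The optional third argument is the index to start searching from.
(string/index-of "abcabc" "bc" 2)
;; => 4

;; Mapping nil to -1 restores the Python convention.
(or (string/index-of "abc" "x") -1)
;; => -1
```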

(defn s-replace
  [s match replacement]
  (clojure.string/replace s match replacement))
;; => #'user/s-replace

;; by char/char
(s-replace "abc" \a \1)
;; => 1bc
;; by string/string
(s-replace "abc" "a" "1")
;; => 1bc
;; by pattern/string
(s-replace "abc" #"[a-z]" "1")
;; => 111
;; by pattern/function
(s-replace "abc" #"[a-z]" #(clojure.string/upper-case %1))
;; => ABC

String transformation

(defn s-center
  [s width fillchar]
  (let [len (count s)]
    (if (<= width len)
      s
      (let [llen (int (/ (- width len) 2))
            rlen (- width len llen)]
        (str (clojure.string/join (repeat llen fillchar))
             s
             (clojure.string/join (repeat rlen fillchar)))))))
;; => #'user/s-center

(s-center "hello" 10 \x)
;; => xxhelloxxx

(defn s-expandtabs
  ([s]
   (s-expandtabs s 8))
  ([s tabsize]
   (second
    (reduce (fn [state char]
              (if (not= \tab char)
                [(+ 1 (first state)) (str (second state) char)]
                (let [cur-index (first state)
                      next-index (+ cur-index (- tabsize (rem cur-index tabsize)))
                      nspaces (- next-index cur-index)]
                  [next-index (str (second state)
                                   (clojure.string/join (repeat nspaces \space)))])))
            [0 ""]
            s))))
;; => #'user/s-expandtabs

(s-expandtabs "\ta")
;; =>         a
(s-expandtabs "a\tb" 2)
;; => a b
(s-expandtabs "a\t\tb" 2)
;; => a   b
(count (s-expandtabs "a\tb" 2))
;; => 3
(count (s-expandtabs "a\t\tb" 2))
;; => 5

;; in an idiomatic way
(defn title-case-idiomatic
  [str]
  (clojure.string/join " "
                       (map #(clojure.string/capitalize %)
                            (clojure.string/split str #" +"))))
;; => #'user/title-case-idiomatic

(title-case-idiomatic " hello world ")
;; =>  Hello World

;; a.upper()
(defn s-upper
  [str]
  (clojure.string/upper-case str))
;; => #'user/s-upper

(s-upper "hello")
;; => HELLO

(defn s-lower
  [str]
  (clojure.string/lower-case str))
;; => #'user/s-lower

(s-lower "ABc")
;; => abc

(defn s-capitalize [str]
  (clojure.string/capitalize str))
;; => #'user/s-capitalize

(s-capitalize "abc DEf")
;; => Abc def

(defn s-lstrip
  [str]
  (clojure.string/triml str))
;; => #'user/s-lstrip

(s-lstrip "  cde .")
;; => cde .

(defn s-rstrip
  [str]
  (clojure.string/trimr str))
;; => #'user/s-rstrip

(s-rstrip "  cde ")
;; =>   cde

(defn s-strip
  [str]
  (clojure.string/trim str))
;; => #'user/s-strip

(s-strip "  cde ")
;; => cde

A very brief summary

Clojure offers three layers of string APIs; the cheatsheet at https://clojure.org/api/cheatsheet gives a quick overview:

  1. core API, e.g. count, str.

  2. clojure.string, e.g. join.

  3. Java API, e.g. (.toUpperCase "abc").

    Try to avoid these, as they are JVM-specific and therefore not portable to the JS or CLR runtimes.
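
To make the three layers concrete, here is a small sketch that calls each of them (the results are shown the way a plain REPL prints them):

```clojure
(require '[clojure.string :as string])

;; 1. core API
(str "abc" "def")
;; => "abcdef"
(count "abc")
;; => 3

;; 2. clojure.string
(string/upper-case "abc")
;; => "ABC"

;; 3. Java interop: same result here, but tied to the JVM
(.toUpperCase "abc")
;; => "ABC"
```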

By the way, if you're wondering how to add the evaluation results to the code snippets above, you are not alone. You can try my illustrate.clj, a babashka script that adds these comments quickly.

