My bold choice of blog post authoring tool: HTML

My personal website was recently out of commission for over a year. And for several years prior to that it was reachable but received no updates. The main reason I didn't modify the site was that I wasn't feeling very compelled to write out my thoughts on the web in those years. However, I believe a secondary reason was that I had never put in enough time to be comfortable with my blog generator tool.

In the past I have employed a succession of static site generation tools. From memory, I believe I have used Jekyll, Hakyll, Cryogen, Perun, and Stasis. Each tool had a convention for how you should organize your site content, and the tool would then create a site for you with various pieces of functionality automatically applied. Here are the blog post rules for Jekyll, as an example.

My problem was that I would update my site infrequently, so I would never quite internalize all of the little details of the site generator: locations, names, rules, etc.

Because my needs for a personal site generator were modest, I decided at last to eschew any library designed specifically for generating static sites and to put together the necessary code to serve basic HTML myself.

Why Not Markdown?

One thing that I couldn't help but notice is that every static-site/blog tool I have ever used encourages you to write your blog posts in Markdown files, which the tool will then convert into HTML for you. Why is this approach so ubiquitous? Markdown is a simpler format than HTML to be sure, but HTML is the target output, after all. An argument could be made that it's a trade-off: since a subset of no-frills HTML is enough for people to write a simple blog post, you should pick the format that is easier to write. But is HTML really so much more difficult and annoying that it's worth adding a lossy translation layer? Just so you don't have to write some <p>...</p> tags every so often? I'm not convinced. It's a safe bet that the text editor you are using to write blog posts has support for automatically closing brackets and inserting closing tags as you type. I've had no complaints so far in Emacs.

Authoring in HTML

I write each blog post as a valid HTML page with html, head, and body tags. The content of the body is the full inner HTML of my blog post. The head contains meta information, and could additionally contain links to CSS or JavaScript files to add styling or interactivity targeted at that specific blog post.

<html lang="en">
    <head>
        <title>My bold choice of blog post authoring tool: HTML</title>
        <meta content="Why I decided to skip Markdown and write web pages like my ancestors did." name="dc:abstract" />
        <meta content="computers, clojure" name="tags" />
    </head>
    <body>
        <h1>My bold choice of blog post authoring tool: HTML</h1>
        ...
    </body>
</html>

Writing HTML feels native, and it's nice to know that I can easily sprinkle in whatever wacky stuff I desire.

Putting It All Together

The post HTML I pasted above is not the same as the page you are currently visiting. That content needs to be wrapped in the rest of the site's markup, which is the same for each post. I first generate the outer template HTML of this site using the Hiccup Clojure library, and then use the excellent jsoup Java library to merge together the heads and the bodies of the two HTML pages.

(ns ...
  (:require
   [clojure.java.io :as io]
   [clojure.string :as string]
   [hiccup2.core :as h]
   [... :as views.shared])
  (:import
   (org.jsoup Jsoup)))

(defn wrap-html
  ([body] (wrap-html body nil))
  ([body title]
   (str
     (h/html
       [:html {"xmlns:dc" "http://purl.org/dc/terms/"}
        (cond-> (into [:head] (views.shared/head-resources))
          title (conj [:title title]))
        body]))))

(defn wrapper-html
  []
  (wrap-html
    [:body
     (views.shared/header "")
     [:main.inner
      [:article#blog-post]]]))

(defn- post-html
  [slug]
  (when slug
    (slurp (io/resource (format "blog/posts/%s.html" slug)))))

(defn post-page
  [{:keys [slug]}]
  (let [j-doc              (Jsoup/parse (wrapper-html))
        post               (post-html slug)
        post-j-doc         (Jsoup/parse post)
        post-head-children (-> post-j-doc (.select "head > *"))
        post-body-children (-> post-j-doc (.select "body > *"))]
    ;; merge heads
    (-> j-doc
        (.select "head")
        (.first)
        (.appendChildren post-head-children))
    ;; merge body
    (-> j-doc
        (.select "article")
        (.first)
        (.appendChildren post-body-children))
    (.toString j-doc)))
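
One edge case in the code above: `io/resource` returns nil for a slug with no matching file, and `slurp` throws when handed nil, so an unknown slug surfaces as an exception rather than as a missing page. Here is a minimal sketch of a more forgiving lookup, assuming the same `blog/posts/` resource layout. The name `safe-post-html` is hypothetical and not part of the site's code.

```clojure
(require '[clojure.java.io :as io])

(defn safe-post-html
  "Hypothetical variant of post-html: returns the post's HTML as a string,
  or nil when slug is nil or no matching resource is on the classpath."
  [slug]
  (when slug
    ;; some-> stops at the first nil, so slurp is never called with nil
    (some-> (io/resource (format "blog/posts/%s.html" slug))
            slurp)))

;; An unknown slug now yields nil instead of an exception.
(safe-post-html "no-such-post")
;; => nil
```

A caller such as `post-page` could then check for nil and respond with a "not found" page before handing anything to jsoup.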

Then, to make an index page like this one, I wrote a little more code that uses jsoup again to parse all of the HTML files and pull out the values I put into the meta tags of each post.
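
The function below calls a `post-slugs` helper that isn't shown in this post. As a rough sketch of what such a helper might look like, assuming the posts sit as `.html` files directly inside one directory on disk: the `posts-dir` parameter is an assumption made for illustration, since the real helper is called with no arguments and presumably knows where the posts live.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as string])

(defn post-slugs
  "Hypothetical helper: returns the slug (file name minus .html) of every
  HTML file directly inside posts-dir, sorted alphabetically."
  [posts-dir]
  (->> (.listFiles (io/file posts-dir)) ; nil when posts-dir is not a directory
       (map (fn [^java.io.File f] (.getName f)))
       (filter #(string/ends-with? % ".html"))
       ;; strip the ".html" suffix to get the slug
       (map #(subs % 0 (- (count %) (count ".html"))))
       sort))
```

Note that this lists files on disk. Posts that are loaded with `io/resource` from inside a packaged jar would need a different approach.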

(defn posts-by-tag
  [tag]
  (->> (post-slugs)
       (keep (fn [slug]
               (let [j    (Jsoup/parse (post-html slug))
                     tags (-> j
                              (.select "meta[name=tags]")
                              (first)
                              (.attr "content")
                              (string/split #",\s*")
                              (set))]
                 (when (or (= "all" tag)
                           (tags tag))
                   {:slug        slug
                    :title       (-> j (.select "title") (.text))
                    :description (-> j (.select "meta[name=dc:abstract]") first (.attr "content"))
                    :created     (-> j (.select "[property=dc:created]") first (.attr "content"))
                    :tags        tags}))))
       (sort-by :created #(compare %2 %1))))

#_(posts-by-tag "clojure")
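
Two small idioms in the function above are easy to miss: a Clojure set can be called as a function to test membership, and swapping the arguments to `compare` sorts newest first. The sketch below shows both on plain data with no jsoup involved. The sample posts, the `parse-tags` and `filter-by-tag` names, and the ISO-style date strings are all assumptions made for illustration.

```clojure
(require '[clojure.string :as string])

(defn parse-tags
  "Splits a comma-separated tags string into a set of tag names."
  [content]
  (set (string/split content #",\s*")))

;; Hypothetical post summaries, standing in for what jsoup would extract.
;; Dates are assumed to be ISO-style strings, which sort correctly as text.
(def sample-posts
  [{:slug "older" :created "2021-05-01" :tags (parse-tags "computers")}
   {:slug "newer" :created "2023-01-10" :tags (parse-tags "computers, clojure")}
   {:slug "other" :created "2022-01-15" :tags (parse-tags "clojure")}])

(defn filter-by-tag
  "Keeps posts carrying tag (or every post when tag is \"all\"), newest first."
  [tag posts]
  (->> posts
       ;; a set called as a function returns the element if present, else nil
       (filter (fn [{:keys [tags]}] (or (= "all" tag) (tags tag))))
       ;; swapping the comparator's arguments reverses the sort order
       (sort-by :created #(compare %2 %1))))

(map :slug (filter-by-tag "clojure" sample-posts))
;; => ("newer" "other")
```

If the created dates were stored in a format that does not sort as text, they would need to be parsed into date values before sorting.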


Writing this little blog generator did not require a tremendous amount of effort on my part, and in return I now have something that feels unbounded and more "mine" than any of my past setups.

September 4, 2023