Blog index and tags automation

Let's add tags to blog posts

Table of Contents

Tags.

Tags are a nice and easy way to organize posts without explicit search. In the simplest way, you have a list of tags in posts, and each tag links to a page with all posts having the corresponding tag. It is also helpful to have a page with all tags available in the blog. And, of course, I don't want to maintain the list of tags manually.

Automate tags.

Simple idea - go through all files, collect titles, dates and descriptions, render into org files. Easy peazy lemon squezy. Let me show you parts of function body. Prepare required pathes, list needed org files, map to collect data.

(let* ((base-dir (plist-get plist :base-directory) )
       (tag-dir (concat base-dir "/tags/"))
       (posts-dir (concat base-dir "/posts"))
       (post-org-files (directory-files-recursively posts-dir "\\.org$"))
       (tag-map (make-hash-table :test 'equal))
       (posts-list (list))
       )

Create folder for tags files. Than we go through each file and collect plists with data required to render tags files


(unless (file-directory-p tag-dir)
  (make-directory tag-dir t))

(dolist (file post-org-files)
  (with-temp-buffer
    (insert-file-contents file)
    (let* ((parsed-info (list :parse-tree (org-element-parse-buffer)))
	   (tags (split-string (my/org-get-property "TAGS" parsed-info)))
	   (title (my/org-get-property "TITLE" parsed-info))
	   (date (my/org-get-property "DATE" parsed-info))
	   (description (my/org-get-property "DESCRIPTION" parsed-info))
	   (published (my/org-get-property "PUBLISHED" parsed-info))
	   (preview (with-temp-buffer
		      (insert-file-contents file)
		      (my/get-first-two-meaningful-lines)))
	   )
      (if (and published (string= published "true"))
	  (progn
	    (setq link-plist
		  (list
		   :title title
		   :description description
		   :preview preview
		   :file file
		   :date date))

	    (push link-plist posts-list)

	    (dolist (tag tags)
	      (puthash tag (cons link-plist (gethash tag tag-map)) tag-map))

	    (setq posts-list
		  (sort posts-list
			(lambda (a b)
			  (date-less-p (plist-get b :date) (plist-get a :date))))))))))

Go through resulting map and render org files for each tag.


(maphash
 (lambda (tag link-plists)
   (let
       ((tag-file (concat tag-dir tag ".org")))
     (with-temp-file tag-file
       (insert (format "#+TITLE: Tag: %s\n" tag))
       (insert "#+OPTIONS: toc:nil\n\n")
       (insert (format "* %s\n" tag))

       (dolist (link-plist link-plists)
	 (let*
	     ((title (plist-get link-plist :title))
	      (description (plist-get link-plist :description))
	      (file (plist-get link-plist :file))
	      (date (plist-get link-plist :date))
	      (preview (plist-get link-plist :preview))
	      (relative-file (file-relative-name file (file-name-directory tag-file))))
	   (insert (format "** [[file:%s][%s]]\n" relative-file title))
	   (insert (format "%s\n" description))
	   (insert (format "#+BEGIN_QUOTE\n%s ...\n#+END_QUOTE\n" preview))
	   (insert (format "%s\n" (my/date-format date))))))))
 tag-map)

Resulting page example: @org-mode

Posts index.

Previously, I maintained a list of posts manually. Now that I have more than 5 posts, I no longer want to do it by hand. So, after introducing tags, I've decided to automate the page with the list of posts. The same idea, almost the same data, will be integrated into the same file. Pagination has not yet been implemented and is not planned.

Posts index automation.

Here is the part of function which renders list of posts and list of tags with number of posts in each tag category.

(let
	((index-file (concat base-dir "/posts.org")))
      (with-temp-file index-file
	(org-mode)

	(insert "#+TITLE: Alex Mikhailov - Blog\n")
	(insert "#+AUTHOR: Aleksandr Mikhailov\n")
	(insert "#+DESCRIPTION: Index page for my blog\n")
	(insert "#+OPTIONS: toc:nil\n\n")

	(insert "* Posts\n")

	(dolist (post posts-list)
	  (let*
	      ((title (plist-get post :title))
	       (description (plist-get post :description))
	       (file (plist-get post :file))
	       (date (plist-get post :date))
	       (preview (plist-get post :preview))
	       (relative-file (file-relative-name file (file-name-directory file))))

	    (insert (format "** [[file:./posts/%s][%s]]\n" relative-file title))
	    (insert (format "%s\n" description))
	    (insert (format "#+BEGIN_QUOTE\n%s ...\n#+END_QUOTE\n" preview))
	    (insert (format "%s\n" (my/date-format date)))
	    )
	  )

	(insert "* Tags\n")
	(maphash
	 (lambda (tag posts)
	   (insert (format "** [[file:./tags/%s.org][%s]] (%d) \n" tag tag (length posts)))          )
	 tag-map
	 )
	)

      )

Here is an example of resulting page: Posts

Cons.

Now, with the way the rendering function is integrated into the process, it is called when going through each Org file found by Org-Export. This introduces O(n2) complexity. It's not ideal, but never mind, I will redo it before reaching the 100th post.

Whole config.

As usual, between posts, I decided to tackle a whole bunch of small tasks, so here is the entire config. I'm thinking about splitting the exporter into modules and maybe open-sourcing it properly. A neat thing I've implemented is a way to filter out unpublished posts using a PUBLISHED property. This allows me to work on drafts without affecting the blog's current state.

#+TITLE: Blog index and tags automation
#+DATE: <2024-07-05 Fri>
#+DESCRIPTION: Let's add tags to blog posts
#+PUBLISHED: false
#+TAGS: @org-mode @elisp @tags
;; Load the publishing system
;; Configure environment
;;
(setq debug-on-error t)

(let ((default-directory  (concat "~/.config/emacs/.local/straight/build-" emacs-version "/")))
  (normal-top-level-add-subdirs-to-load-path))

(add-to-list 'custom-theme-load-path
	     (concat "~/.config/emacs/.local/straight/build-" emacs-version "/doom-themes"))
(add-to-list 'custom-theme-load-path (concat "~/.config/emacs/.local/straight/build-" emacs-version "/base16-theme"))
(add-to-list 'custom-theme-load-path (concat "~/.config/emacs/.local/straight/build-" emacs-version "/moe-theme"))


(require 'xml)
(require 'dom)
(require 'ox-publish)
(require 'ox-rss)
(require 'org)
(require 'esxml)
;; (require 'esxml-html)

;;
;;Variables
;;
(setq
 my/url "https://fidonode.me"
 my/web-export-path "./public"
 my/blog-src-path "./home/05 Blog"
 my/lang-substitution-map '(("elisp" . "lisp"))
 org-html-validation-link nil            ;; Don't show validation link
 org-html-htmlize-output-type nil
 org-src-fontify-natively t)

;;
;;Templates
;;
(defun my/footer (info)
  `(footer ((class .  "footer"))
    (div ((class . "container"))
	 (hr () )
	 (small ()
		(p () "Alex Mikhailov")
		(p () "Built with: "
		   (a ((href . "https://www.gnu.org/software/emacs/")) "GNU Emacs") " "
		   (a ((href . "https://orgmode.org/")) "Org Mode") " "
		   (a ((href . "https://picocss.com/")) "picocss")
		   )))))

(defun my/header (info)
  (let ((title-str (org-export-data (plist-get info :title) info)))
    `(header ((class . "header"))
      (div ((class . "container"))
	   (nav ()
		(ul ()
		    (li ()
			(strong () ,"Alex Mikhailov")))
		(ul ()
		    (li () (a ((href . "/index.html")) "About"))
		    (li () (a ((href . "/posts.html")) "Blog"))
		    (li () (a ((href . "/rss.xml")) "RSS"))
		    ))))))

(defun my/src-block (src-block contents info)
  "Translate SRC-BLOCK element into HTML.
 CONTENTS is nil. INFO is a plist holding contextual information."
  (let* (
	 (org-language (format "language-%s" (org-element-property :language src-block)))
	 (language (my/replace-substrings org-language))
	 (code (org-element-property :value src-block)))
    (esxml-to-xml
     `(pre ()
       (code ((class . ,language))
	     ,(org-html-encode-plain-text code)
	     )))))

(defun my/render-preview (file-name title description)
  (let* ((has-imagemagick (executable-find "magick"))
	 (full-file-path (file-truename(format "%s%s/resources/images/preview/%s.png" script-directory my/blog-src-path file-name )))
	 (file-dir (file-name-directory full-file-path))
	 (has-dir (file-directory-p file-dir))
	 (has-file (file-exists-p full-file-path))
	 (path-to-script-root (format "%shelpers" script-directory))
	 (path-to-script (format "%s/og_image_gen.sh" path-to-script-root)))
    (if (and has-imagemagick
	     (and description (not (string= description ""))))
	(progn
	  (when (not has-file)
	    (progn
	      (when (not has-dir)
		(make-directory file-dir t))
	      (shell-command (format "bash '%s' '%s' '%s' '%s' '%s'" path-to-script title description full-file-path path-to-script-root))
	      )
	    ))
      (message "Imagemagick is not installed. Preview generation skipped.")
      )))

(defun my/html-header (info)
  (let* ((title-str (org-export-data (plist-get info :title) info))
	 (description-str (org-export-data (plist-get info :description) info))
	 (file-path-str (org-export-data (plist-get info :input-file) info))
	 (base-directory-str (org-export-data (plist-get info :base-directory) info))
	 (file-name-str (file-relative-name file-path-str (format "%s/%s" script-directory base-directory-str)))
	 (img-link-str (format "%s/resources/images/preview/%s.png" my/url file-name-str))
	 (has-src-blocks (my/org-has-src-blocks-p info)))

    (my/render-preview file-name-str title-str description-str)

    (set-text-properties 0 (length title-str) nil title-str)
    (set-text-properties 0 (length description-str) nil description-str)
    (set-text-properties 0 (length img-link-str) nil img-link-str)


    `(head ()
      (meta ((charset . "utf-8")))
      (meta ((author . "Alex Mikhailov")))
      (meta ((name . "viewport")
	     (content . "width=device-width, initial-scale=1, shrink-to-fit=no")))
      (meta ((name . "color-scheme") (content . "light dark")))
      (meta ((http-equiv . "content-language") (content . "en-us")))
      ;; OG block
      (meta ((name . "description") (content .  ,description-str)))
      (meta ((property . "og:description") (content . ,description-str)))
      (meta ((property . "og:image") (content . ,img-link-str)))
      (meta ((property . "og:title") (content . ,title-str)))

      (meta ((name . "twitter:description") (content . ,description-str)))
      (meta ((name . "twitter:title") (content . ,title-str)))
      (meta ((name . "twitter:image") (content . ,img-link-str)))
      (meta ((name . "twitter:card") (content . "summary_large_image")))

      (link ((rel . "icon") (type . "image/x-icon") (href . "/resources/favicon.ico")))
      (link ((rel . "stylesheet") (type . "text/css") (href . "/resources/css/pico.sand.min.css")))
      (script ((defer . "true") (src . "https://umami.dokutsu.xyz/script.js") (data-website-id . "d52d9af1-0c7d-4531-84c6-0b9c2850011f")) ())
      (title () ,title-str)

      ,@(when has-src-blocks
	  (list
	   `(link ((id . "highlight-theme") (rel . "stylesheet") (type . "text/css")))
	   `(script ((src . "https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js")) ())
	   `(script ((src . "https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.9.0/build/languages/bash.min.js")) ())
	   `(script ((src . "https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.9.0/build/languages/lisp.min.js")) ())
	   `(script ((src . "/resources/js/theme-selector.js")) ())
	   )))))

(defun my/regular-template (contents info)
  `(main ((class . "container"))
    (raw-string ,contents)))

(defun my/blog-post-template (contents info)
  (let* ((title-str (org-export-data (plist-get info :title) info))
	 (description-str (org-export-data (plist-get info :description) info))
	 (tags (split-string  (my/org-get-property "TAGS" info)))
	 (tags-html (cl-map 'list (lambda (tag)
				    `(li () (mark () (a ((href . ,(format  "/tags/%s.html" tag)) (class . "secondary")) ,(format "%s" tag))))) tags))
	 )
    `(main ((class . "container blog-post"))
      (hgroup ()
	      (h1 () ,title-str)
	      (p () ,description-str)
	      (nav () (ul () (li () "Tags:") ,@tags-html))

	      )
      (raw-string ,contents)
      )))

(defun my/template (contents info)
  (let
      ((file-path-str (org-export-data (plist-get info :input-file) info)))

    (concat
     "<!DOCTYPE html>"
     (esxml-to-xml
      `(html ((lang . "en"))
	,(my/html-header info)
	(body ()
	      ,(my/header info)
	      ,(if (string-match-p "\/posts\/" file-path-str)
		   (my/blog-post-template contents info)
		 (my/regular-template contents info))
	      ,(my/footer info)
	      ))))))


(org-export-define-derived-backend 'my-html 'html
  :translate-alist '(
		     (template . my/template)
		     (src-block . my/src-block)
		     ))

(defun my/publish-to-html (plist filename pub-dir)
  "Publish an Org file to HTML using the custom backend."
  (org-publish-org-to 'my-html filename ".html" plist pub-dir))

;;
;;Build blog index
;;

(defun my/publish-blog-index (plist FF FFFF)
  "Process all Org files in 'posts' and 'tags' directories, create index file, and insert links."

  (let* ((base-dir (plist-get plist :base-directory) )
	 (tag-dir (concat base-dir "/tags/"))
	 (posts-dir (concat base-dir "/posts"))
	 (post-org-files (directory-files-recursively posts-dir "\\.org$"))
	 (tag-map (make-hash-table :test 'equal))
	 (posts-list (list))
	 )

    ;; Ensure tag directory exists
    (unless (file-directory-p tag-dir)
      (make-directory tag-dir t))

    ;; Scan all org files and collect tags
    (dolist (file post-org-files)
      (with-temp-buffer
	(insert-file-contents file)
	(let* ((parsed-info (list :parse-tree (org-element-parse-buffer)))
	       (tags (split-string (my/org-get-property "TAGS" parsed-info)))
	       (title (my/org-get-property "TITLE" parsed-info))
	       (date (my/org-get-property "DATE" parsed-info))
	       (description (my/org-get-property "DESCRIPTION" parsed-info))
	       (published (my/org-get-property "PUBLISHED" parsed-info))
	       (preview (with-temp-buffer
			  (insert-file-contents file)
			  (my/get-first-two-meaningful-lines)))
	       )
	  (if (and published (string= published "true"))
	      (progn
		(setq link-plist
		      (list
		       :title title
		       :description description
		       :preview preview
		       :file file
		       :date date))

		(push link-plist posts-list)

		(dolist (tag tags)
		  (puthash tag (cons link-plist (gethash tag tag-map)) tag-map))

		(setq posts-list
		      (sort posts-list
			    (lambda (a b)
			      (date-less-p (plist-get b :date) (plist-get a :date))))))))))

    (let
	((index-file (concat base-dir "/posts.org")))
      (with-temp-file index-file
	(org-mode)

	(insert "#+TITLE: Alex Mikhailov - Blog\n")
	(insert "#+AUTHOR: Aleksandr Mikhailov\n")
	(insert "#+DESCRIPTION: Index page for my blog\n")
	(insert "#+OPTIONS: toc:nil\n\n")

	(insert "* Posts\n")

	(dolist (post posts-list)
	  (let*
	      ((title (plist-get post :title))
	       (description (plist-get post :description))
	       (file (plist-get post :file))
	       (date (plist-get post :date))
	       (preview (plist-get post :preview))
	       (relative-file (file-relative-name file (file-name-directory file))))

	    (insert (format "** [[file:./posts/%s][%s]]\n" relative-file title))
	    (insert (format "%s\n" description))
	    (insert (format "#+BEGIN_QUOTE\n%s ...\n#+END_QUOTE\n" preview))
	    (insert (format "%s\n" (my/date-format date)))
	    )
	  )

	(insert "* Tags\n")
	(maphash
	 (lambda (tag posts)
	   (insert (format "** [[file:./tags/%s.org][%s]] (%d) \n" tag tag (length posts)))          )
	 tag-map
	 )
	)

      )

    ;; Create tag files and insert links
    (maphash
     (lambda (tag link-plists)
       (let
	   ((tag-file (concat tag-dir tag ".org")))
	 (with-temp-file tag-file
	   (insert (format "#+TITLE: Tag: %s\n" tag))
	   (insert "#+OPTIONS: toc:nil\n\n")
	   (insert (format "* %s\n" tag))

	   (dolist (link-plist link-plists)
	     (let*
		 ((title (plist-get link-plist :title))
		  (description (plist-get link-plist :description))
		  (file (plist-get link-plist :file))
		  (date (plist-get link-plist :date))
		  (preview (plist-get link-plist :preview))
		  (relative-file (file-relative-name file (file-name-directory tag-file))))
	       (insert (format "** [[file:%s][%s]]\n" relative-file title))
	       (insert (format "%s\n" description))
	       (insert (format "#+BEGIN_QUOTE\n%s ...\n#+END_QUOTE\n" preview))
	       (insert (format "%s\n" (my/date-format date))))))))
     tag-map)
    )
  )

;;
;;Sitemap/RSS
;;
(defun my/format-rss-feed-entry (entry style project)
  "Format ENTRY for the RSS feed.
 ENTRY is a file name.  STYLE is either 'list' or 'tree'.
 PROJECT is the current project."
  (cond ((not (directory-name-p entry))
	 (let* ((file (org-publish--expand-file-name entry project))
		(title (org-publish-find-title entry project))
		(date (format-time-string "%Y-%m-%d" (org-publish-find-date entry project)))
		(link (concat (file-name-sans-extension entry) ".html")))
	   (with-temp-buffer
	     (org-mode)
	     (insert (format "* [[file:%s][%s]]\n" file title))
	     (org-set-property "RSS_PERMALINK" link)
	     (org-set-property "RSS_TITLE" title)
	     (org-set-property "PUBDATE" date)
	     (let ((first-two-lines (with-temp-buffer
				      (insert-file-contents file)
				      (buffer-substring-no-properties
				       (point-min)
				       (progn (forward-line 2) (point))))))
	       (if (string-suffix-p "\n" first-two-lines)
		   (setq first-two-lines (substring first-two-lines 0 -1)))
	       (insert first-two-lines))
	     (goto-char (point-max))
	     (insert "...")
	     (buffer-string))))
	((eq style 'tree)
	 ;; Return only last subdir.
	 (file-name-nondirectory (directory-file-name entry)))
	(t entry)))

(defun my/format-rss-feed (title list)
  "Generate RSS feed, as a string.
 TITLE is the title of the RSS feed.  LIST is an internal
 representation for the files to include, as returned by
 `org-list-to-lisp'.  PROJECT is the current project."
  (concat "#+TITLE: " title "\n"
	  "#+STARTUP: showall \n\n"
	  (org-list-to-subtree list 1 '(:icount "" :istart ""))))

(defun my/publish-to-rss (plist filename pub-dir)
  "Publish RSS with PLIST, only when FILENAME is 'rss.org'.
 PUB-DIR is when the output will be placed."
  (if (equal "rss.org" (file-name-nondirectory filename))
      (org-rss-publish-to-rss plist filename pub-dir)))

;;
;;Helpers
;;

(defun cleanup-org-line (line)
  "Clean up Org-mode formatting from a single LINE."
  (let ((cleaned-line (org-no-properties line)))
    (setq cleaned-line (replace-regexp-in-string "^\\*+ " "" cleaned-line)) ; Remove heading stars
    (string-trim cleaned-line))) ; Remove leading/trailing whitespace

(defun my/get-first-two-meaningful-lines ()
  "Extracts the first two meaningful lines from an Org-mode buffer."
  (interactive)
  (save-excursion  ; Do not change the cursor position permanently
    (goto-char (point-min))  ; Start at the beginning of the buffer
    ;; Loop to skip comments and properties
    (while (looking-at "^[#*]")
      (forward-line 1))  ; Move to the next line until a meaningful line is found
    ;; Now read and return the first two meaningful lines

    (cleanup-org-line (buffer-substring-no-properties
		       (point)
		       (progn (forward-line 2) (point)))
		      ))
  )

(defun date-less-p (date1 date2)
  "Return t if DATE1 is less than DATE2.
 DATE1 and DATE2 should be strings in the format <YYYY-MM-DD Day>."
  (let* ((date1 (substring date1 1 11))  ; Extract the date part
	 (date2 (substring date2 1 11))
	 (time1 (apply 'encode-time (parse-time-string (concat date1 " 00:00:00"))))
	 (time2 (apply 'encode-time (parse-time-string (concat date2 " 00:00:00")))))
    (time-less-p time1 time2)))

(defun my/date-format (org-date)
  "Convert an Org-mode date string ORG-DATE to a formatted date string."
  (let* ((parsed-time (org-parse-time-string org-date))
	 (time (apply 'encode-time parsed-time)))
    (format-time-string "posted on %Y-%m-%d" time)))

(defun my/format-date-subtitle (file project)
  "Format the date found in FILE of PROJECT."
  (my/date-format (org-publish-find-date file project)))

(defun my/pt (var)
  "Print the value and type of VAR."
  (message "Value: %S, Type: %s" var (type-of var)))

(defun plist-keys (plist)
  "Return a list of keys in the property list PLIST."
  (let (keys)
    (while plist
      (setq keys (cons (car plist) keys))
      (setq plist (cddr plist)))
    (nreverse keys)))

(defvar script-directory
  (file-name-directory (or load-file-name buffer-file-name))
  "The directory where the current script is located.")

(defun my/org-get-property (property info)
  "Get the value of a PROPERTY in the Org document represented by INFO."
  (org-element-map (plist-get info :parse-tree) 'keyword
    (lambda (keyword)
      (when (string= (org-element-property :key keyword) property)
	(org-element-property :value keyword)))
    nil t))

(defun my/org-has-src-blocks-p (info)
  "Return t if the Org document represented by INFO has source code blocks."
  (org-element-map (plist-get info :parse-tree) 'src-block
    (lambda (src-block) t)
    nil t))

(defun my/replace-substrings (input-string)
  "Replace substrings in INPUT-STRING according to SUBSTITUTION-MAP."
  (let ((output-string input-string))
    (dolist (pair my/lang-substitution-map)
      (let ((old (regexp-quote (car pair)))
	    (new (cdr pair)))
	(setq output-string (replace-regexp-in-string old new output-string))))
    output-string))

;;
;;Clear folder with results
;;
(when (file-directory-p my/web-export-path)
  (delete-directory my/web-export-path t))
(mkdir my/web-export-path)


;;
;;Main blog configuration
;;
(setq org-publish-project-alist
      (list
       (list "blog-index"
	     :base-directory my/blog-src-path
	     :base-extension "org"
	     :recursive t
	     :publishing-directory my/web-export-path
	     :publishing-function 'my/publish-blog-index)
       (list "static"
	     :base-directory my/blog-src-path
	     :base-extension "css\\|js\\|png\\|jpg\\|jpeg\\|gif\\|pdf\\|ico\\|txt"
	     :publishing-directory my/web-export-path
	     :recursive t
	     :publishing-function 'org-publish-attachment
	     )
       (list "blog"
	     :recursive t
	     :base-directory my/blog-src-path
	     :publishing-directory my/web-export-path
	     :publishing-function 'my/publish-to-html
	     :html-html5-fancy t
	     :htmlized-source t
	     :with-author nil
	     :with-creator t
	     :with-toc t
	     :section-numbers nil
	     :time-stamp-file nil
	     )
       (list "blog-rss"
	     :author "Alex M"
	     :email "iam@fidonode.me"
	     :base-directory my/blog-src-path
	     :base-extension "org"
	     :recursive t
	     :exclude (regexp-opt '("rss.org" "index.org" "404.org" "posts.org"))
	     :publishing-function 'my/publish-to-rss
	     :publishing-directory my/web-export-path
	     :rss-extension "xml"
	     :html-link-home my/url
	     :html-link-use-abs-url t
	     :html-link-org-files-as-html t
	     :auto-sitemap t
	     :sitemap-filename "rss.org"
	     :sitemap-title "rss"
	     :sitemap-style 'list
	     :sitemap-sort-files 'anti-chronologically
	     :sitemap-function 'my/format-rss-feed
	     :sitemap-format-entry 'my/format-rss-feed-entry)
       ))


;; Generate the site output
(org-publish-all t)

(message "Build complete!")