simaec.net Web Publishing

Using Obsidian as CMS

Obsidian is an application for writing and organizing notes as a personal knowledge base. Notes are kept as Markdown files on the local drive. The application is used for everyday note-taking, writing and research, but it can easily be adapted to other purposes, such as a project management tool or, in our case, a content management system.

Key Takeaways

  • Obsidian is a convenient application to write and manage content for a website.
  • There are different options available to convert notes into HTML pages.
  • Obsidian is suitable for managing metadata for items in a large dataset.
  • A custom Python script is a convenient method to convert notes to HTML pages.

YouTube Video: Using Obsidian as CMS? Why and how Obsidian can serve as a headless CMS. Explaining my approach with talking points about Obsidian, Markdown and Python.

Why Using Obsidian as CMS?

Obsidian is a note-taking and knowledge management application; it was not designed to be used as a CMS. So why consider maintaining website content within an Obsidian vault? Primarily because Obsidian offers a well-designed interface that lets a content creator write and organize content efficiently. Here are some advantages of using Obsidian:

  • Writing notes in Markdown is the simplest way to write text while still providing basic, essential text formatting.
  • Notes are stored on the local drive as small text files instead of one large file stored locally or remotely.
  • References among notes are updated automatically.
  • Notes can be backed up efficiently to a private GitHub repository using GitHub Desktop or the command line.

How to Publish Web Pages?

Although Obsidian notes are Markdown files, a slim version of a markup language, these notes cannot be served directly on a website. The Markdown must be converted to HTML before the content can be displayed as a web page. Common options are:

  • Adding front-matter to a note and pushing it to a GitHub repository that is connected to a website via GitHub Pages, which then serves the note as a web page.
  • Running Jekyll locally to create static web pages, which can then be hosted on a regular hosting service.
  • Using the publishing service offered by Obsidian, which allows you to publish a selection of notes directly from your Obsidian vault to a website hosted by Obsidian.

These approaches won't work for us because we need control over web server settings that neither GitHub Pages nor the Obsidian publishing service provides. We publish websites using Firebase Hosting. A custom script written in Python converts Obsidian notes to HTML files, which are then deployed to the hosting platform.

Some Ground Rules

Use of Obsidian shouldn't rely on plugins. Although Obsidian plugins can expand the functionality, we prefer a purist approach using the base functionality of Obsidian.

Content written in Markdown files should adhere to basic Markdown syntax as closely as possible, avoiding unnecessary complexity.

How to Deal with Specially Formatted Text Blocks

We sometimes highlight text paragraphs. For the conversion of Markdown to HTML, you can associate a CSS class with a paragraph using native HTML syntax. Example:


<p class="warning">Text paragraph associated with a css class</p>

Oooh, I don't like this approach, but there is no alternative in sight for now. I would prefer the Jekyll approach.
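One possible alternative, sketched here as an assumption rather than as our current practice: the attr_list extension of Python's Markdown library (one of the extensions our script enables) accepts an attribute list on the last line of a paragraph, similar to the Jekyll syntax. The class name "warning" is taken from the HTML example above; the third-party "markdown" package must be installed.

```python
import markdown  # third-party package: pip install markdown

# A paragraph with an attribute list on its own last line.
# ".warning" is shorthand for class="warning".
note = """Text paragraph associated with a css class
{: .warning }
"""

# attr_list moves the attribute list onto the generated <p> element.
html = markdown.markdown(note, extensions=["attr_list"])
print(html)
```

The note stays closer to plain Markdown this way, at the cost of relying on syntax that Obsidian itself does not render.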

How to Deal with Structured Data?

One set of structured data covers the attributes of a web page, such as the content of the head meta tags (language, title, description, canonical URL, etc.). Declaring these parameters using front-matter is an obvious choice, but we chose another approach because front-matter comes with some disadvantages.

Our implementation keeps this data in the published note, introduced by an h3 heading named "Meta" and followed by a list in which each item holds the parameter name before a colon and the parameter value after it:

For this page


### Meta
- website: simaec.net
- path: /website-development/using-obsidian-as-cms/
- title: Using Obsidian as CMS
- description: How to implement Obsidian as content management system for Websites. 
- author: Karl-Heinz Müller
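Reading such a Meta section can be sketched as follows. This is a minimal illustration, not our production code: the function name parse_meta and the rule that the first line which is neither blank nor a list item ends the section are assumptions.

```python
import re


def parse_meta(note_text):
    """Return (meta, body): the '### Meta' list as a dict and the note without it."""
    meta = {}
    body_lines = []
    in_meta = False
    for line in note_text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"#{3}\s+Meta", stripped):
            in_meta = True  # the h3 heading opens the section
            continue
        if in_meta:
            # "- key: value", split at the first colon only
            item = re.fullmatch(r"-\s+([^:]+):\s*(.*)", stripped)
            if item:
                meta[item.group(1).strip()] = item.group(2).strip()
                continue
            if not stripped:
                continue  # blank lines inside the section are ignored
            in_meta = False  # assumption: anything else ends the section
        body_lines.append(line)
    return meta, "\n".join(body_lines).strip()


note = """# Using Obsidian as CMS

Some text.

### Meta
- website: simaec.net
- path: /website-development/using-obsidian-as-cms/
- title: Using Obsidian as CMS
"""

meta, body = parse_meta(note)
print(meta["path"])
print(body)
```

Splitting at the first colon only keeps values that contain colons themselves, such as URLs, intact.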

Another set of structured data is the content itself, or a section of a website. Our website faunaflora.photography, for example, publishes information about 150+ species, 500+ photos, 50+ videos and 1000+ log entries about in-field species identifications and observations. We store the data for each item in a separate note that follows a predetermined structure of headings, lists and text.

A Python script converts these notes into pandas data frames and then converts the data into web pages. The structure of this type of document is rigid but still provides enough flexibility to extend the information about an item.

As an example, here is the Markdown of a photo note, a close-up of an American Bullfrog (Rana catesbeiana):


## khm-20190624-1612-0000104643

![American Bull Frog](https://simaecnet.imgix.net/photos/khm-20190624-1612-0000104643.jpg?w=1200&h=400&fit=crop&auto=format,compress&crop=entropy "American Bull Frog")

![American Bull Frog](https://simaecnet.imgix.net/photos/khm-20190624-1612-0000104643.jpg?w=200&h=200&fit=crop&auto=format,compress&crop=entropy "American Bull Frog")


### Title

American Bull Frog (Rana Catesbeiana)

### Caption

American Bull Frog (Rana Catesbeiana) - Close Up

### Meta

- camera: NIKON D500
- lens: 90mm f/2.8
- exposure_time: 1/1000
- f_number: f/3.3
- focal_length: 90mm
- iso: 200
- topic: species
- species: American Bullfrog (Rana Catesbeiana)
- location: Parc Bernard-Landry
- city: LAVAL
- state: QC
- country: CA
- width: 4219
- height: 2808
- presentation: entropy
- tags: amphibians, frogs

The note contains two Markdown image lines. They use two different layouts, which lets us verify that the image rendering (presentation) is correct for the corresponding photo. Seeing the photo in the note also helps when writing the title and caption.

The Meta section contains structured data about the photo. Most of the content in this section has been extracted from the EXIF data of the photo.
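As a rough sketch of how a note with this structure could be flattened into one record, consider the following. The function name note_to_record, the choice of which fields become integers, and the shortened sample note are assumptions made for illustration.

```python
def note_to_record(note_text):
    """Flatten one photo note into a dict: id, title, caption and the Meta items."""
    record = {}
    section = None
    for line in note_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("!["):
            continue  # skip blank lines and the two preview images
        if stripped.startswith("### "):
            section = stripped[4:].strip().lower()  # title, caption or meta
        elif stripped.startswith("## "):
            record["id"] = stripped[3:].strip()  # the h2 heading holds the photo id
        elif section == "meta" and stripped.startswith("- "):
            key, _, value = stripped[2:].partition(":")
            record[key.strip()] = value.strip()
        elif section in ("title", "caption"):
            record[section] = stripped  # assumes single-line title and caption
    # Assumption: these fields are numeric; everything else stays a string.
    for key in ("width", "height", "iso"):
        if key in record:
            record[key] = int(record[key])
    if "tags" in record:
        record["tags"] = [tag.strip() for tag in record["tags"].split(",")]
    return record


# Shortened version of the photo note shown above.
note = """## khm-20190624-1612-0000104643

### Title

American Bull Frog (Rana Catesbeiana)

### Caption

American Bull Frog (Rana Catesbeiana) - Close Up

### Meta

- camera: NIKON D500
- iso: 200
- width: 4219
- height: 2808
- tags: amphibians, frogs
"""

record = note_to_record(note)
print(record)
```

A list of such records, one per note, could then be passed to pandas.DataFrame(records) to obtain a data frame with one row per photo.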

How to Deal with Media Files Like Photos and Videos?

Although an Obsidian vault can contain media files, optimized publishing of high-resolution media files requires a CDN. We keep ONLY the notes about media files in Obsidian. The media files themselves are stored outside of the Obsidian vault. Editing content about a media file, or linking a media file within a web page, can then be done within Obsidian; when the final web page is created, the reference to the media file is converted into the correct HTML snippet.

The script we use for this task has been adjusted to deal with high-resolution photos. The resulting HTML code is customized for the image rendering and CDN service (imgix) we are using.
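To illustrate the idea, here is a minimal sketch that builds an imgix URL and an img tag from a photo id. The URL parameters are copied from the image links in the example note; the function names, the mapping of the "presentation" meta value to the crop parameter, and the exact markup are assumptions, not the HTML our script actually generates.

```python
from html import escape
from urllib.parse import urlencode

# Host and path taken from the image links in the example note.
IMGIX_BASE = "https://simaecnet.imgix.net/photos"


def imgix_url(photo_id, width, height, crop="entropy"):
    """Build an imgix URL that crops the photo to width x height."""
    params = {
        "w": width,
        "h": height,
        "fit": "crop",
        "auto": "format,compress",
        "crop": crop,  # assumption: filled from the note's "presentation" value
    }
    # safe="," keeps the comma in "format,compress" unescaped.
    return f"{IMGIX_BASE}/{photo_id}.jpg?{urlencode(params, safe=',')}"


def img_tag(photo_id, alt, width, height, crop="entropy"):
    """Return an HTML img element; the attribute selection is hypothetical."""
    src = imgix_url(photo_id, width, height, crop)
    # escape() turns "&" into "&amp;" so the attribute value is valid HTML.
    return (
        f'<img src="{escape(src)}" alt="{escape(alt)}" '
        f'width="{width}" height="{height}" loading="lazy">'
    )


url = imgix_url("khm-20190624-1612-0000104643", 1200, 400)
tag = img_tag("khm-20190624-1612-0000104643", "American Bull Frog", 1200, 400)
print(url)
print(tag)
```

Keeping the URL construction in one function means a change of CDN parameters only has to be made in one place.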

Python Script

We developed Python scripts, custom static page generators, to create web pages from Obsidian notes. The script loads the content of all Markdown files within specified folders and converts them from Markdown into fully fledged, state-of-the-art static web pages.

In more detail, the script reads a file, cuts off the metadata section and converts the remaining part into HTML using Python's Markdown library with a few extensions (fenced_code, tables, attr_list). The metadata section is read line by line and converted into structured data.

Then, the script combines the metadata and the converted Markdown into a dictionary, which is rendered into HTML using Python's Jinja2 library and templates. The resulting code is saved in a folder containing a local copy of the website, which can be viewed on localhost and deployed to the corresponding Firebase Hosting project.
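The steps described above can be sketched roughly as follows. This is a simplified illustration, not the script itself: the template, the function names and the assumption that the Meta section sits at the end of the note are hypothetical, and DictLoader stands in for FileSystemLoader so that the example is self-contained. It requires the third-party packages markdown and jinja2.

```python
import markdown  # third-party: pip install markdown
from jinja2 import DictLoader, Environment  # third-party: pip install jinja2

# Hypothetical template; the real templates live in files on disk.
TEMPLATES = {
    "page.html": (
        "<!doctype html>\n"
        "<html>\n"
        "<head><title>{{ meta.title }}</title>\n"
        '<meta name="description" content="{{ meta.description }}"></head>\n'
        "<body>{{ content | safe }}</body>\n"
        "</html>\n"
    )
}


def split_meta(text):
    """Cut the '### Meta' list off a note (assumed to be its last section)."""
    body, _, meta_block = text.partition("### Meta")
    meta = {}
    for line in meta_block.splitlines():
        if line.startswith("- ") and ":" in line:
            key, value = line[2:].split(":", 1)
            meta[key.strip()] = value.strip()
    return meta, body


def render_note(text):
    """Convert one note into a complete HTML page."""
    meta, body = split_meta(text)
    content = markdown.markdown(
        body, extensions=["fenced_code", "tables", "attr_list"]
    )
    # autoescape protects the meta values; the converted Markdown is
    # already HTML and is therefore marked "safe" in the template.
    env = Environment(loader=DictLoader(TEMPLATES), autoescape=True)
    return env.get_template("page.html").render(meta=meta, content=content)


page = render_note(
    "# Hello\n\nSome *text*.\n\n"
    "### Meta\n- title: Using Obsidian as CMS\n- description: Demo\n"
)
print(page)
```

In a real run, the returned string would be written to the local copy of the website, under the path given in the Meta section.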

We use Python 3.8.x with the following libraries:


# standard library
import os
import re

# third-party
import lxml.etree
import lxml.html
import markdown
from jinja2 import Environment, FileSystemLoader

At the moment, we are still refactoring the Python deployment scripts. We will publish them later in a GitHub repository. If you would like to see the code now, please contact us.

Future Additions and Remarks

We still have to deal with media files (photos, videos, graphs) referenced within a content block of a note.

How about using Obsidian as CMS in a team?

Glossary

Front-matter is a section at the top of a Markdown file written in YAML ("YAML is a human-friendly data serialization language for all programming languages."), opened by a line of 3 dashes and separated from the rest of the content by another line of 3 dashes.
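For illustration, the following sketch shows what a note with front-matter looks like and how the block could be separated from the content. The sample values are reused from the Meta example of this page; the function name split_front_matter is hypothetical, and the YAML text is returned unparsed.

```python
def split_front_matter(text):
    """Return (front_matter, body); front_matter is the raw YAML text or None."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        # Look for the closing line of three dashes.
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return None, text  # no complete front-matter block found


note = """---
title: Using Obsidian as CMS
author: Karl-Heinz Müller
---
Obsidian is an application for writing and organizing notes.
"""

front_matter, body = split_front_matter(note)
print(front_matter)
print(body)
```

Turning the raw text into structured data would additionally require a YAML parser, for example yaml.safe_load from the PyYAML package.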