Again a post mostly for my own memory: how I set up my website with jekyll, github pages, a custom domain, and automatic publication listing.

Jekyll and Github pages basic setup

Mostly done in the past, so this is from memory:

  • jekyll is a static website rendering engine which takes input files (typically markdown), and outputs a pretty html static website in its _site/ subdir.
  • github pages (gh-pages) plays together really well with jekyll: you create a repository, initialize it with jekyll, check in the jekyll input files, and gh-pages will auto-render and serve.
  • The main homepage and index over the blog posts are “Front Matter” pages, written in markdown.
  • I use the kramdown dialect which seems pretty flexible & powerful.
  • I probably chose one of the simplest default themes, very plain and basic. Fits my personality.

Setting up DNS in GoDaddy

I registered my domain with GoDaddy (the biggest domain name registrar, as I read somewhere). Now I need to set it up so the domain name cleanly points to the github pages. I mostly followed this tutorial for the DNS setup.

My specific setup: I want the whole website to live under a subdomain (tom.sercu.me), with both sercu.me and www.sercu.me redirecting there. I set this up through Subdomain Forwarding in the GoDaddy dashboard. For the apex domain sercu.me to redirect to the tom subdomain, I just added a Subdomain Forwarding and filled in @ for the subdomain (I kinda guessed that this means the bare domain, i.e. an empty subdomain, in DNS speak, which it does). This actually became (non-sub) Domain Forwarding and overwrote the 4 github pages IPs with a GoDaddy IP.

Turns out that the A records with those IPs only seem to matter for the apex (root) domain, so for the subdomain just setting the CNAME in my DNS settings is enough: tom -> tomsercu.github.io. Confirmed: dig tom.sercu.me does list the 4 github IPs.
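
The same check as the dig command, as a minimal Python sketch. The four addresses below are the A-record IPs that GitHub documents for Pages as far as I know; treat them as an assumption and check the current GitHub docs. The function names are made up for illustration.

```python
import socket

# Assumption: the A-record addresses GitHub documents for GitHub Pages.
GITHUB_PAGES_IPS = {
    "185.199.108.153",
    "185.199.109.153",
    "185.199.110.153",
    "185.199.111.153",
}


def resolve_ipv4(hostname):
    """Resolve a hostname (following any CNAME) to its IPv4 addresses.

    Needs network access, so it is only defined here, not called.
    """
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def points_to_github_pages(addresses):
    """True if there is at least one address and all of them are GitHub Pages IPs."""
    addresses = set(addresses)
    return bool(addresses) and addresses <= GITHUB_PAGES_IPS
```

Usage would be something like points_to_github_pages(resolve_ipv4("tom.sercu.me")), which should come out True if the CNAME is set up as described above.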

Investigating load times

This seems pretty good https://tools.pingdom.com/#!/cH6lCN/tom.sercu.me

  • DNS 34ms, wait 13ms
  • some http to https redirect going on (got flagged):
    Remove the following redirect chain if possible:
    http://tom.sercu.me/
    https://tom.sercu.me/
    

    But I can’t easily find how to remove that redirect chain. Isn’t it up to the browser to query https immediately rather than first trying http?

  • It also suggests a longer browser caching lifetime for https://tom.sercu.me/css/main.css; maybe a jekyll setting?
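
To look at the redirect and the caching header without pingdom, something like this sketch should do. It uses only the Python standard library; head and max_age are hypothetical helper names, and nothing here is specific to jekyll or github pages.

```python
import http.client
import re


def head(host, path="/", secure=False):
    """Send one HEAD request without following redirects (needs network).

    Returns (status, Location header, Cache-Control header).
    """
    cls = http.client.HTTPSConnection if secure else http.client.HTTPConnection
    conn = cls(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location"), resp.getheader("Cache-Control")
    finally:
        conn.close()


def max_age(cache_control):
    """Return the max-age value in seconds from a Cache-Control header, or None."""
    match = re.search(r"max-age=(\d+)", cache_control or "")
    return int(match.group(1)) if match else None
```

Calling head("tom.sercu.me") would be expected to show a 3xx status with an https Location (the flagged redirect), and max_age on the Cache-Control value from head("tom.sercu.me", "/css/main.css", secure=True) gives the cache lifetime that was flagged.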

Fancier Jekyll stuff

  • header.html has a responsive site-nav thingy on the right side, which iterates over Jekyll site.pages, which is alphabetic by default. Instead I added a liquid filter to sort according to an order, set in the YAML front matter.

    {% assign sorted_pages = site.pages | sort:"order" %}
    {% for page in sorted_pages %}
    etc
    {% endfor %}
    
  • For references I use kramdown footnotes, simple enough.

  • For math, \(\LaTeX\) is supported by default through the MathJax javascript library. The library itself needs to be manually included on the page from CDN, see http://docs.mathjax.org/en/latest/start.html:

    <script type="text/javascript" async
      src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML">
    </script>
    
  • Setting defaults in _config.yml: layout page or post for regular pages and posts respectively, plus a default author for posts.
  • Using page.categories to organize posts between blog, paper (summaries), talks, media. Each of those 4 categories has its own index page, which is abstracted into a layout. A shortlist (most recent / selected) appears on the main page through liquid array filtering (slice to truncate, where to filter selected publications). The formatting of entries differs per category, but should be consistent between the full index pages and the truncated frontpage; this is abstracted into _includes/indexlist.html, where the category and list are passed in as arguments.
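
To make the shortlist logic explicit, here is the where + slice idea mirrored in Python. This is only an illustration of the data flow; on the site itself it is done with liquid filters. The selected key and the dict layout of a post are assumptions for the example.

```python
def shortlist(posts, category, limit=3, selected_only=False):
    """Pick the most recent posts of one category, like liquid's where + slice.

    posts: list of dicts with "categories" (list) and "date" (ISO string);
    an optional truthy "selected" key marks hand-picked entries (hypothetical).
    """
    # "where": keep only posts in the requested category
    items = [p for p in posts if category in p.get("categories", [])]
    if selected_only:
        items = [p for p in items if p.get("selected")]
    # newest first; ISO dates sort correctly as plain strings
    items.sort(key=lambda p: p["date"], reverse=True)
    # "slice": truncate to the shortlist length
    return items[:limit]
```

The full index page for a category would then just be the same call with a large limit, which is why the entry formatting can live in one shared include.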

Publications script

I wrote a small python script to automatically iterate over a list of all my publications. For each one it grabs the pdf out of my personal reference management system called ref and converts it to a thumbnail preview, based on Andrej Karpathy’s script. Title/author information is also taken from ref, and the abstract and publication date are pulled from arXiv based on the arXiv ID.

A yaml file is written per publication in _posts/pubs/.
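
Roughly what the arXiv part could look like, as a minimal sketch and not the actual script. It assumes the public arXiv export API, which returns an Atom feed; the function names, the front matter keys, and the date-slug filename pattern are all made up for illustration. The ref lookup and thumbnail steps are left out.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

ATOM = "{http://www.w3.org/2005/Atom}"
ARXIV_API = "http://export.arxiv.org/api/query?id_list={}"


def fetch_arxiv_feed(arxiv_id):
    """Download the Atom feed for one arXiv ID (needs network access)."""
    with urllib.request.urlopen(ARXIV_API.format(arxiv_id)) as resp:
        return resp.read()


def parse_arxiv_feed(feed_xml):
    """Extract abstract and publication date from an arXiv Atom feed."""
    entry = ET.fromstring(feed_xml).find(ATOM + "entry")
    if entry is None:
        raise ValueError("no entry in arXiv response")
    # collapse the hard line breaks arXiv puts inside abstracts
    abstract = " ".join(entry.findtext(ATOM + "summary", "").split())
    published = entry.findtext(ATOM + "published", "")[:10]  # YYYY-MM-DD
    return {"abstract": abstract, "date": published}


def write_pub_file(out_dir, slug, title, authors, meta):
    """Write one front-matter-only file for a publication.

    json.dumps is used for quoting, since JSON strings and lists are
    generally accepted as YAML values.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{meta['date']}-{slug}.md"
    lines = [
        "---",
        "title: " + json.dumps(title),
        "authors: " + json.dumps(authors),
        "date: " + meta["date"],
        "abstract: " + json.dumps(meta["abstract"]),
        "---",
        "",
    ]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```

For each publication the flow would be meta = parse_arxiv_feed(fetch_arxiv_feed(arxiv_id)), followed by write_pub_file("_posts/pubs", slug, title, authors, meta) with the title and authors coming from ref.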

Still todo

  • Change colors like cims website
  • Set up godaddy email fwd
  • Overview of posts/pages by tags.
  • Comments:
    • I first thought of using disqus which is easy but everything depends now on a third party providing scripts & hosting.
    • But my main problem is that I see many popular ML bloggers having most discussion about their blog posts playing out on twitter. So somehow I started dreaming about automatically pulling all twitter discussion to display in the comment section. Maybe together with an option of classical comments below the page.
    • Staticman seems perfect! (found through this blog post). Staticman uses some javascript and a node.js server to accept user comments, create a jekyll data file in the repo which holds the comment, and push them to github automatically, to be displayed with liquid iteration.
    • This would work well to merge twitter + comments, if I have a cron job running that keeps track of tweets according to some criterion, then auto-commits links to them to be displayed with some twitter plugin.
    • Bonus points because this would give my raspberry pi something useful to do: the cron job for tweets, and maybe even running the staticman node.js server? (opening up port 80 to the world.. shudder)
    • But hey my paternity leave is almost up so yeah.. this probably ain’t gonna happen anytime soon.
  • Actually write blog posts