Pelican Sitemap and Pagination

Posted on February 22, 2014 in blog • 1 min read

Pelican (the static site generator I use for this blog) doesn't seem to generate a sitemap on its own, so I spent a bit of time today looking for a way to add one that integrates easily with Pelican. Surely someone must have already solved this problem, right? It turns out there's already a plugin for it in the pelican-plugins repository, and it's really easy to use!

pelicanconf.py:

PLUGIN_PATHS = ['/path/to/cloned/pelican-plugins/repo']
PLUGINS = ['sitemap']

SITEMAP = {
    'format': 'xml',
    'priorities': {
        'articles': 0.5,
        'indexes': 0.5,
        'pages': 0.5
    },
    'changefreqs': {
        'articles': 'monthly',
        'indexes': 'daily',
        'pages': 'monthly'
    }
}

However, if you look at the sitemap it generates (e.g. my site's own sitemap), you'll notice that it only lists the top-level index.html page. It misses the additional index pages that Pelican generates when pagination is turned on (DEFAULT_PAGINATION set to some number of articles per page in pelicanconf.py), such as index2.html, index3.html, index4.html, and so on.
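To make the naming concrete, here's a rough sketch of which index files to expect for a given number of articles. The function name paginated_index_names is made up for illustration and is not part of Pelican. It assumes the index.html, index2.html, ... naming described above and ignores other pagination settings (such as DEFAULT_ORPHANS) that can change the page count.

```python
def paginated_index_names(article_count, per_page):
    """Return the index file names expected for a paginated article list.

    Hypothetical helper: assumes the first page is index.html and later
    pages are index2.html, index3.html, and so on.
    """
    # Ceiling division, with at least one page even when there are no articles.
    page_count = max(1, (article_count + per_page - 1) // per_page)
    names = ['index.html']
    names.extend('index{}.html'.format(n) for n in range(2, page_count + 1))
    return names


print(paginated_index_names(35, 10))
# ['index.html', 'index2.html', 'index3.html', 'index4.html']
```

Everything after the first entry in that list is what the sitemap leaves out.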

Luckily, the sitemap plugin is quite simple to grok and I hacked together a workaround for this in just a few minutes. If you stumbled upon this blog post and you have the same problem, it basically comes down to changing:

for standard_page_url in ['index.html', 'archives.html',
                          'tags.html', 'categories.html']:

to this:

standard_pages = ['index.html', 'archives.html',
                  'tags.html', 'categories.html']

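# Paginated index pages are named index2.html, index3.html, and so on;
# 100 is an arbitrary upper bound on the number of pages.
for i in range(2, 100):
    standard_pages.append('index{}.html'.format(i))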

for standard_page_url in standard_pages:

in sitemap/sitemap.py.

(Yes, I know that's rather ugly, but it works. You can find my pull request if you'd like to suggest a different solution.)
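One less hard-coded option, sketched here and not tested against the plugin, would be to look in the output directory for whatever paginated index files actually exist. The helper name find_paginated_indexes is my own invention, and it assumes the index pages sit directly in the output directory with the naming shown above.

```python
import os
import re

# Matches index2.html, index3.html, ... but not index.html itself.
PAGINATED_INDEX = re.compile(r'^index(\d+)\.html$')


def find_paginated_indexes(output_path):
    """Return paginated index file names found in output_path, in page order.

    Hypothetical helper, not part of Pelican or the sitemap plugin.
    """
    found = []
    for name in os.listdir(output_path):
        match = PAGINATED_INDEX.match(name)
        if match:
            # Keep the page number so index10.html sorts after index9.html.
            found.append((int(match.group(1)), name))
    return [name for _, name in sorted(found)]
```

The result could then be added to standard_pages in place of the fixed range. That only helps if the plugin knows the output directory and the index files have already been written by the time the sitemap is generated, and I haven't verified either.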