Add and use an image sitemap with Hugo

Not so long ago, I wrote about adding copyright information for images on a Hugo-based website. Along the way, I learned how to collect the images used in each post, list them, and use them in Schema markup for images.

Recently, while reading the updated Google image SEO best practices, I noticed a section called Use an image sitemap, which says:

“You can provide the URL of images we might not have otherwise discovered by submitting an image sitemap”.

I decided to see how I could implement that on my Hugo website.

I started by looking at Google's example of an image sitemap and decided to give it a try.
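
For reference, an image sitemap in the documented format looks roughly like the sketch below. The example.com URLs are placeholders for illustration, not taken from any real site; the two namespaces are the same ones used in the layout later in this post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <!-- One <url> entry per page; placeholder URLs for illustration only -->
  <url>
    <loc>https://example.com/sample-post/</loc>
    <!-- One <image:image> entry per image used on that page -->
    <image:image>
      <image:loc>https://example.com/images/featured.jpg</image:loc>
    </image:image>
    <image:image>
      <image:loc>https://example.com/images/inline.png</image:loc>
    </image:image>
  </url>
</urlset>
```

Each page gets a single url entry, and every image used on that page is listed inside it.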


Hugo is flexible about generating additional output files, so I decided to use that to create a sitemap for images.

Configuration

I started with specifying additional output for my sitemap in the Hugo configuration file (hugo.toml).

[outputFormats.imagessitemap]
  baseName  = 'imagessitemap'
  mediaType = 'application/xml'
  noUgly = true # default is false

Further, in the same file, I added it to my outputs.

[outputs]
  page = [ "html"]
  home = [ "html", "rss", "imagessitemap"]

Multilingual adjustment

If you are using a multilingual setup, you may need an additional step here.

My Polish site is served from the bare baseURL (/), while the English part lives under /en/.

The above output will generate the imagessitemap.xml file in / (Polish) and /en/ (English).

The problem is that, in a multilingual setup, /sitemap.xml is not a sitemap for the Polish part of the website. Instead, this file serves as a sitemap index listing the sitemaps in the language folders.

For the Polish part of the website, despite it being served from /, the sitemap with posts is /pl/sitemap.xml, and for the English part it is /en/sitemap.xml.
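
To make this concrete, the root /sitemap.xml on a two-language site like mine is a sitemap index along these lines. This is a simplified sketch: the domain is a placeholder, and the file Hugo actually generates may contain additional elements such as lastmod.

```xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Each <sitemap> entry points at one per-language sitemap; placeholder domain -->
  <sitemap>
    <loc>https://example.com/pl/sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/en/sitemap.xml</loc>
  </sitemap>
</sitemapindex>
```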

This may be a bit of OCD, but I would like my image sitemaps to be generated in the language folders as well, so the Polish one should be /pl/imagessitemap.xml. To do that, I need to add the following to the configuration file, in the section for my main language:

  [languages.pl.outputFormats.imagessitemap]
    baseName  = 'imagessitemap'
    mediaType = 'application/xml'
    noUgly = true # default is false
    path  = 'pl'

For the English part, this is not required, as the file will be generated inside the /en/ folder by default.


Layout

Now we need to create our custom output layout.

I created a file called home.imagessitemap.xml in the layouts\_default folder.

{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">

  {{ range .Site.RegularPages }}{{ if ne .Params.sitemap_exclude true }}
  {{ if or (.Params.featuredImage) (findRE `(?s)<img.+?>` .Content) }}
    {{- if .Permalink -}}
      <url>
        <loc>{{ .Permalink }}</loc>
        {{ if .Params.featuredImage }}

          {{- $ftimgsrc := "" -}}
          {{ if hasPrefix .Params.featuredImage "/" }}
              {{ $ftimgsrc = resources.Get .Params.featuredImage }}
          {{ else }}
              {{ if .Page.BundleType }}
                  {{ $ftimgsrc = .Page.Resources.GetMatch .Params.featuredImage }}
              {{ else }}
                  {{ $path := path.Join .Page.File.Dir .Params.featuredImage }}
                  {{ $ftimgsrc = resources.Get $path }}
              {{ end }}
          {{ end }}

          {{/* Emit the entry only when the featured image was actually resolved */}}
          {{ with $ftimgsrc }}
          <image:image>
            <image:loc>{{ .Permalink | safeURL }}</image:loc>
          </image:image>{{ end }}{{ end }}
          {{ if (findRE `(?s)<img.+?>` .Content) }}{{ range findRE `(?s)<img.+?>` .Content }}
            <image:image>
              <image:loc>{{ replaceRE `(?s).*src="(.+?)".*` "$1" . | absURL }}</image:loc>
            </image:image>{{ end }}
        {{ end }}
      </url>{{- end -}}
  {{ end }}
  {{ end }}{{ end }}
  
</urlset>

In this file, I reused the exclusion condition ({{ if ne .Params.sitemap_exclude true }}) to skip posts that have sitemap_exclude: true set in their front matter.

I followed it with an if or condition, so a post is listed only if it has either a featuredImage specified in the front matter or at least one image in its content, detected with the findRE function that I learned about before.

This way, the post URL, placed inside <loc>, will appear only if there are images to report.

By running Hugo locally, we can verify that the sitemap is there under localhost:1313/imagessitemap.xml or, on a multilingual site, under localhost:1313/en/imagessitemap.xml and localhost:1313/pl/imagessitemap.xml.

Announcing new sitemaps

The last step is to announce the sitemaps in such a way that Google and other search engines will be able to detect them.

The best approach is to list them in the robots.txt file, which is normally located in our static\ folder.

If you don’t know how to create a robots.txt file, read How to write and submit a robots.txt file at Google Search Central.

Below the other sitemaps that may already be there, I added the following:

Sitemap: https://dariusz.wieckiewicz.org/pl/imagessitemap.xml
Sitemap: https://dariusz.wieckiewicz.org/en/imagessitemap.xml

To speed things up, I also added them manually in the Sitemaps section of Google Search Console.

And that’s all!
