Drupal supports SEO fairly well out of the box. But there are a few things you should bear in mind if you want to create the best experience for the site editors who work with SEO.

In this blog post we'll list some of our main takeaways from the great session "Drupal SEO Pitfalls and How To Avoid Them" by Dropsolid. The session was held on day 3 of Drupal Dev Days in Ghent, which Digitalist attended.

Avoid “thin content” pages

Sites often have pages that are accessible on their own URLs even though they serve no purpose by themselves. Typical examples are an image content type that is only displayed in a carousel on your website, or a set of nodes, each containing an image and a name, that only exist as parts of a bigger landing page listing.

This content results in "thin content" pages being indexed by search engines. You don't want these URLs indexed: they are a waste of resources (bandwidth, crawl budget, database storage) and they drag your SEO score down.

A solution is the Rabbit Hole module. It provides multiple options to change the way search engines see a page, by controlling what should happen when the entity is viewed on its own page. To fix the pitfall above, we set these entities to "404 - page not found" so that no "thin content" pages get indexed by search engines.
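As a rough sketch, an exported Rabbit Hole behaviour setting for an "image" content type could look something like the snippet below. The exact config entity name and keys depend on the Rabbit Hole version you run, so treat it as illustrative rather than as copy-paste configuration.

    # config/sync/rabbit_hole.behavior_settings.node_type_image.yml (illustrative)
    id: node_type_image
    action: page_not_found    # serve a 404 when the node is requested on its own URL
    allow_override: false     # editors cannot pick a different behaviour per node

The same behaviour can of course be chosen per content type in the Rabbit Hole settings UI, which is usually where editors and site builders will set it.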

Make sure metadata is editable 

Sometimes pages on a website are not entities of their own, which makes it hard for your editors to edit their metadata.

As an example, consider a front page that is generated from different pieces of content, or a page that lists a collection of data, e.g. an employees page assembled from individual employee entities.

For an editor who wants to change SEO metadata such as the meta title and description, or simply adjust the XML sitemap settings for this specific page, this becomes a problem.

The solution is to make sure you always have an entity as the base for your content. Drupal core now ships with the Layout Builder module, which can help solve this, and modules like Paragraphs sort it out in the same way: there is always an entity to hold the page's SEO metadata.
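Once every page is backed by an entity, modules like Metatag can fill in sensible values from that entity via tokens. As a hedged illustration, an exported Metatag default for nodes typically looks something like this (the tokens shown are the common defaults, adjust them to your own fields):

    # config/sync/metatag.metatag_defaults.node.yml (illustrative)
    id: node
    label: Content
    tags:
      title: '[node:title] | [site:name]'
      description: '[node:summary]'
      canonical_url: '[node:url]'

Editors can then override these defaults per node through a Metatag field, which is exactly the editability we are after.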

You have an indexable internal search

Search engines like Google can index your search result pages, and Google might then serve one of your internal search result pages as the answer to a Google search. This again creates "thin content" pages and should be avoided.

Luckily, since you already made sure that every page is backed by an entity, the simple solution is to install the Metatag module and set the entity that holds your search results to "Prevent search engines from indexing this page" and "Prevent search engines from following links on this page". Problem solved.
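With those two options enabled on the entity behind your search page, the rendered HTML should end up with a robots meta tag along these lines:

    <meta name="robots" content="noindex, nofollow">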

As you can see, with some very simple architectural planning of your website you can get quite far in improving the possibilities for editors to optimize the site for search engines. And as a developer you can do even more.

More things to bear in mind

Some other common things you should always at least consider:

  • Use the Pathauto module to generate clean, readable and meaningful URLs.
  • Aggregate and minify JS/CSS files. (Faster is always better.)
  • Use image compression, either through Drupal core or, if you want to give your editors more control (but also more work), through tools like Squoosh.
  • Make sure assets are not accidentally blocked by robots.txt, e.g. the favicon or folders containing content that should be accessible to visitors on your site. A rule of thumb is to always let search bots access everything an end user would (see the robots.txt sketch after this list).
  • Use Google Search Console to check your website. To use this tool you need to verify ownership of the website. As a site owner you can verify this through a Google Analytics or Google Tag Manager account, and as a developer you can add a file or an HTML tag.
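On the robots.txt point above: Drupal core's default robots.txt already whitelists its own assets, and any rules you add for custom themes or file folders should follow the same pattern. A minimal sketch (the theme paths are just examples):

    # Assets must stay crawlable so search bots can render pages the way users see them
    User-agent: *
    Allow: /core/*.css$
    Allow: /core/*.js$
    Allow: /themes/custom/*.css$
    Allow: /themes/custom/*.js$
    # Directories that genuinely should stay out of search engines
    Disallow: /admin/
    Disallow: /core/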

Robots.txt disallow != noindex

Another thing worth mentioning is the difference between disallowing a URL in robots.txt and marking it "noindex" with a meta tag.

Robots.txt instructions affect crawling but not indexing, while a noindex directive in a meta tag affects indexing but not crawling.

This means, for example, that if you disallow a page in robots.txt, search engines like Google can still index the page if another website links to it; they just cannot crawl it. The result is a search result snippet without a title or description, since Google was never able to crawl the page.

So if you have an old page URL that you want to get rid of in Google, set it only to "noindex" with a meta tag and do not also disallow it in robots.txt, because the page still needs to be crawled for Google to find the noindex directive.
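To make the difference concrete, here is a hypothetical /old-page handled both ways (the path is just an example):

    # robots.txt: stops crawling, but the URL can still end up in the index
    # if other sites link to it (the title-less snippet described above)
    User-agent: *
    Disallow: /old-page

    <!-- On /old-page itself: removes it from the index, but only works
         while crawlers are still allowed to fetch the page and see the tag -->
    <meta name="robots" content="noindex">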

Avoiding sudden layout shifts improves user experience and your SEO

Visual stability issues happen, and Google uses the Cumulative Layout Shift (CLS) score to measure them on your website. You want to avoid unexpected layout shifts and jumping elements, both to improve the user experience and to improve your SEO score.

The most common causes of CLS to watch out for are:

  • Images without dimensions (see the example after this list)
  • Ads, embeds, and iframes without dimensions
  • Dynamically injected content
  • Web Fonts causing FOIT/FOUT
  • Actions waiting for a network response before updating DOM
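For the first point in the list, reserving space for images is usually as simple as keeping width and height attributes on the img tag, or reserving the box in CSS; the file name and class below are just examples:

    <!-- width/height let the browser reserve the space before the file has loaded -->
    <img src="/sites/default/files/hero.jpg" width="1200" height="600" alt="Hero image">

    /* or reserve the box in CSS */
    img.hero {
      aspect-ratio: 1200 / 600;
      width: 100%;
      height: auto;
    }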