Generating dynamic sitemaps with Next.js + React.js + next-sitemap

Gabriel Messias da Rosa

17 months ago / 10-minute read

React Next.js HTML CSS SEO TYPESCRIPT


A sitemap is a file that describes a website's structure to search engines: which pages should be indexed and when they were last updated. It is usually written in XML.

Google Sitemaps is one of several services Google provides to site owners. With it, you can quickly check your website's index statistics, crawl speed, page rank, and search statistics, which list the most-searched Google queries that lead users to your pages.

If your site is built with Next.js and React, you may have noticed that Next.js works very well with SSR (server-side rendering) and dynamic routes. But how do you make those dynamic routes part of the sitemap automatically? That's what we'll see today!

Basic configuration of next-sitemap

Assume we have the following site, which needs a sitemap for Google indexing.

Example website

We need to generate a sitemap containing the home, about, and users-list pages. The private-page must not be indexed, and each time a user is added to the list, the sitemap should update automatically to include that user's profile page.

For this, we are going to use a library called “next-sitemap”, which lets us generate sitemaps according to our needs. Install it with npm install next-sitemap (or yarn add next-sitemap if you are using yarn).

This library requires a configuration file to work.

In the root folder of the project, create a file called “next-sitemap.config.js”.

Adding the configuration file in the project root folder.

Then add a default configuration, just enough for basic operation:

/** @type {import('next-sitemap').IConfig} */
module.exports = {
  siteUrl: process.env.SITE_URL || "https://example.com",
  generateRobotsTxt: true, // (optional)
  // ...other options
};
Default next-sitemap configuration, provided by the library's official documentation.

next-sitemap loads environment variables from .env files by default.

The next step is to add a new script called postbuild to the package.json file. It runs next-sitemap right after every build, both on Vercel and locally, so a fresh sitemap is generated each time the site is built.

Have the postbuild script invoke the “next-sitemap” command.

Your scripts will look like this:

"scripts": {
  "dev": "next",
  "build": "next build",
  "start": "next start",
  "type-check": "tsc",
  "postbuild": "next-sitemap"
},
Adding “postbuild” to the Next scripts in package.json.

Time to build the project and see what this configuration gives us.

Note that right after the build, the terminal shows additional output detailing what next-sitemap generated.

Terminal output after the “postbuild” script has been added to package.json.

Inside the public folder, three new files were generated.

Build result with the “postbuild” script correctly configured.

With the project running locally (use the yarn dev or yarn start command), open http://localhost:3000/sitemap-0.xml

You'll see that all pages have been added to the sitemap, even the ones we don't want. Let's fix this!

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com</loc>
    <lastmod>2022-09-20T11:13:12.523Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2022-09-20T11:13:12.523Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/private-page</loc>
    <lastmod>2022-09-20T11:13:12.523Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users</loc>
    <lastmod>2022-09-20T11:13:12.523Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users/101</loc>
    <lastmod>2022-09-20T11:13:12.523Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users/102</loc>
    <lastmod>2022-09-20T11:13:12.523Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users/103</loc>
    <lastmod>2022-09-20T11:13:12.523Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users/104</loc>
    <lastmod>2022-09-20T11:13:12.523Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>
The sitemap-0.xml file right after the Next postbuild script.

Excluding routes from the sitemap and robots.txt with next-sitemap

To solve this, let's add more settings to our next-sitemap.config.js file.

Add the “exclude” option. Its value is an array of strings, each one a route you want to exclude from the sitemap.

/** @type {import('next-sitemap').IConfig} */
module.exports = {
  siteUrl: process.env.SITE_URL || "https://example.com",
  generateRobotsTxt: true,
  exclude: ["/private-page"],
  // ...other options
};
Configuration file changed to exclude the route “/private-page” from the sitemap-0.xml file in the next build.
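
To make the effect of “exclude” concrete, here is a minimal sketch of the filtering idea. The helper filterExcluded is hypothetical and only handles exact path matches; it is not next-sitemap's implementation, whose documentation also describes wildcard patterns.

```typescript
// Hypothetical helper for illustration only: keeps every path that is not
// listed in `exclude`. Exact matches only; no wildcard handling.
function filterExcluded(paths: string[], exclude: string[]): string[] {
  return paths.filter((path) => exclude.indexOf(path) === -1);
}

const allPaths = ["/", "/about", "/private-page", "/users", "/users/101"];
const sitemapPaths = filterExcluded(allPaths, ["/private-page"]);

console.log(sitemapPaths); // → ["/", "/about", "/users", "/users/101"]
```

Every other route stays in the sitemap, which is why only “/private-page” disappears from sitemap-0.xml on the next build.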

Since we are also generating a robots.txt, we must tell crawlers there as well that this route should not be indexed. To do so, add another option called robotsTxtOptions, whose value is an object. Inside it, use the “policies” option, an array of policy objects. To address all robots, set the “userAgent” key to “*”. Then set “allow” to the route you want robots to crawl; for the entire site, use the project's root route, which in this case is “/”. To block a route, do the same but use “disallow” instead of “allow”.

In the end, you will get a result similar to this:

/** @type {import('next-sitemap').IConfig} */
module.exports = {
  siteUrl: process.env.SITE_URL || "https://example.com",
  generateRobotsTxt: true,
  exclude: ["/private-page"],
  robotsTxtOptions: {
    policies: [
      { userAgent: "*", disallow: "/private-page" },
      { userAgent: "*", allow: "/" },
    ],
  },
};
Configuration of user agent access policies.

Build the project again and you will see the changes in robots.txt and sitemap-0.xml.

robots.txt

# *
User-agent: *
Disallow: /private-page

# *
User-agent: *
Allow: /

# Host
Host: https://example.com

# Sitemaps
Sitemap: https://example.com/sitemap.xml
Robots.txt file generated after configuring user agent access policies.
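
The mapping from the “policies” array to the robots.txt groups shown above can be sketched as follows. PolicySketch and renderPolicies are hypothetical names used only for illustration; they mirror just the keys used in this post and are not the library's own code.

```typescript
// Local shape for illustration; it covers only the keys used in this post.
interface PolicySketch {
  userAgent: string;
  allow?: string;
  disallow?: string;
}

// Hypothetical renderer: one "User-agent" group per policy, in array order.
function renderPolicies(policies: PolicySketch[]): string {
  return policies
    .map((policy) => {
      const lines: string[] = [`User-agent: ${policy.userAgent}`];
      if (policy.allow !== undefined) lines.push(`Allow: ${policy.allow}`);
      if (policy.disallow !== undefined) lines.push(`Disallow: ${policy.disallow}`);
      return lines.join("\n");
    })
    .join("\n\n");
}

const robotsBody = renderPolicies([
  { userAgent: "*", disallow: "/private-page" },
  { userAgent: "*", allow: "/" },
]);

console.log(robotsBody);
```

Because both policies target the same user agent, each one becomes its own group, which matches the two “User-agent: *” blocks in the generated file.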

sitemap-0.xml

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com</loc>
    <lastmod>2022-09-20T11:36:08.228Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2022-09-20T11:36:08.228Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users</loc>
    <lastmod>2022-09-20T11:36:08.228Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users/101</loc>
    <lastmod>2022-09-20T11:36:08.228Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users/102</loc>
    <lastmod>2022-09-20T11:36:08.228Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users/103</loc>
    <lastmod>2022-09-20T11:36:08.228Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>https://example.com/users/104</loc>
    <lastmod>2022-09-20T11:36:08.228Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>
The sitemap-0.xml file generated after configuring the exclusion of the project's private route.

Dynamic sitemap with Next SSR + next-sitemap

Suppose the user list changes constantly. To avoid rebuilding the project every time a new user registers, let's make the users' sitemap dynamic.

For this, we need to create a Next route: a folder inside Next's 'pages' folder. Name this folder 'server-sitemap.xml' and create an index file inside it (for example, index.tsx).

Folder structure that lets Next serve the dynamic sitemap via SSR.

Here's the trick: inside the index file, default-export a function called ServerSitemap that returns nothing.

const ServerSitemap = () => {};
export default ServerSitemap;
Simple Next route created.

Leave it as is; this function exists only so that Next lets us use getServerSideProps to our advantage.

With that in place, create a regular getServerSideProps function. Inside it, fetch the API and use the same data that the dynamic route uses, in this case the user ID.

Identifying which user attribute is used in Next's dynamic route. In this case, the ID.

Parse the JSON response from the fetch and store it in a constant, as in the example:

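import { GetServerSideProps } from "next";

export const getServerSideProps: GetServerSideProps = async (context) => {
  // Fetch the same user list that feeds the dynamic user route.
  const response = await fetch("http://localhost:3000/api/users");
  const users = await response.json();
  console.log("users", users);
  // Placeholder return so the function type-checks; replaced in the final step.
  return { props: {} };
};

// Empty page component: it exists only so Next accepts getServerSideProps.
const ServerSitemap = () => {};
export default ServerSitemap;
The Next getServerSideProps function making a GET request to the API and parsing its response as JSON.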

Next, create another constant called fields, typed as ISitemapField[] (you can import this type from the next-sitemap library). Set fields to the result of mapping over the users array returned by the API. Each mapped object must have at least the key loc (see ISitemapField for all the options), whose value is the site URL with the user ID appended. It can also have a key called lastmod, whose value is the current date formatted as an ISO string. As in the example below:

import { GetServerSideProps } from "next";
import { ISitemapField } from "next-sitemap";
import { UserType } from "../../src/types";

export const getServerSideProps: GetServerSideProps = async (context) => {
  const response = await fetch("http://localhost:3000/api/users");
  const users: UserType[] = await response.json();
  console.log("users", users);
  // One sitemap entry per user profile page.
  const fields: ISitemapField[] = users.map((user) => ({
    loc: `https://example.com/users/${user.id}`,
    lastmod: new Date().toISOString(),
  }));
  // Placeholder return so the function type-checks; replaced in the final step.
  return { props: {} };
};

const ServerSitemap = () => {};
export default ServerSitemap;
Configuring the fields to generate a list of objects with loc and lastmod and then generate a sitemap for each user profile page.
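
The mapping step can also be pulled out into a pure function, which makes it easy to test without running Next. This is a minimal sketch under stated assumptions: buildUserFields, UserSketch, and SitemapFieldSketch are hypothetical names, the user id is assumed to be a number, and the local field type stands in for ISitemapField from next-sitemap.

```typescript
// Local stand-ins for illustration. In the post, the field type is
// ISitemapField from next-sitemap and the user type is the project's UserType.
interface UserSketch {
  id: number; // assumption: the real UserType may use a different id type
}

interface SitemapFieldSketch {
  loc: string;
  lastmod?: string;
}

// Hypothetical pure helper: builds one sitemap entry per user.
// `now` is injectable so the result is deterministic in tests.
function buildUserFields(
  users: UserSketch[],
  siteUrl: string,
  now: Date = new Date()
): SitemapFieldSketch[] {
  const base = siteUrl.replace(/\/+$/, ""); // drop trailing slashes
  const lastmod = now.toISOString();
  return users.map((user) => ({
    loc: `${base}/users/${user.id}`,
    lastmod,
  }));
}

const userFields = buildUserFields(
  [{ id: 101 }, { id: 102 }],
  "https://example.com/",
  new Date("2022-09-20T11:36:08.228Z")
);

console.log(userFields);
```

Passing the site URL as a parameter also avoids hard-coding https://example.com inside the route; it could come from the same SITE_URL environment variable used in the configuration file.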

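Then return the result of next-sitemap's getServerSideSitemap function, passing the getServerSideProps context as the first parameter and the fields array as the second. This two-argument form is the one used by the library version in this post; if you are on a newer major version, check its documentation, since the helper for the pages directory may be named differently.

In the end, you should have something like this: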

import { GetServerSideProps } from "next";
import { getServerSideSitemap, ISitemapField } from "next-sitemap";
import { UserType } from "../../src/types";

export const getServerSideProps: GetServerSideProps = async (context) => {
  // Fetch the current user list on every request to this route.
  const response = await fetch("http://localhost:3000/api/users");
  const users: UserType[] = await response.json();
  console.log("users", users);
  // One sitemap entry per user profile page.
  const fields: ISitemapField[] = users.map((user) => ({
    loc: `https://example.com/users/${user.id}`,
    lastmod: new Date().toISOString(),
  }));
  // Respond with the XML sitemap built from the fields.
  return getServerSideSitemap(context, fields);
};

// Empty page component: it exists only so Next accepts getServerSideProps.
const ServerSitemap = () => {};
export default ServerSitemap;
Next function getServerSideProps with the necessary configuration to generate sitemaps according to the list of users.

This function generates an up-to-date sitemap whenever the sitemap route is accessed; in this case, the route is https://example.com/server-sitemap.xml

Checking the */server-sitemap.xml route, we notice that it behaves like an ordinary sitemap:

Dynamically generated sitemap according to the user list.
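
To give an idea of what the route responds with, the sketch below renders a list of fields as sitemap XML. It is only an illustration of the output format: renderSitemapXml, escapeXml, and EntrySketch are hypothetical names, and this is not the code next-sitemap runs internally.

```typescript
// Local stand-in for the entries passed to the sitemap helper.
interface EntrySketch {
  loc: string;
  lastmod?: string;
}

// Escape the characters that are not allowed raw inside XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical renderer: one <url> element per entry inside a <urlset>.
function renderSitemapXml(entries: EntrySketch[]): string {
  const urls = entries
    .map((entry) => {
      const lastmod = entry.lastmod
        ? `<lastmod>${escapeXml(entry.lastmod)}</lastmod>`
        : "";
      return `<url><loc>${escapeXml(entry.loc)}</loc>${lastmod}</url>`;
    })
    .join("");
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">${urls}</urlset>`
  );
}

const sitemapXml = renderSitemapXml([
  { loc: "https://example.com/users/101", lastmod: "2022-09-20T11:36:08.228Z" },
  { loc: "https://example.com/users/102" },
]);

console.log(sitemapXml);
```

Since the XML is rebuilt on each request, the entries always reflect whatever the API returns at that moment.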

Also note that our users mock contains only two active users, with another two commented out, and the generated sitemap reflects exactly that.

Mock data from users with users commented out for testing purposes.

If we add more users to our mock “API”, the sitemap changes as soon as we reload the server-sitemap.xml route.

User mocks:

Mock data of users with uncommented users for testing purposes. More users have been added.

XML with sitemap updated with new users dynamically:

Dynamically generated sitemap with the new user list.

This is an excellent way to streamline frequently changing sitemaps.
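
One point the walkthrough leaves open is how the static build and crawlers learn about the server-rendered route. The next-sitemap documentation suggests excluding that route from the static sitemap and listing it under robotsTxtOptions.additionalSitemaps. The sketch below shows that shape as a typed object; SitemapConfigSketch is a local stand-in for the library's IConfig, and in the real next-sitemap.config.js the same object would be assigned to module.exports.

```typescript
// Local structural type for illustration; the real type is IConfig from next-sitemap.
interface SitemapConfigSketch {
  siteUrl: string;
  generateRobotsTxt?: boolean;
  exclude?: string[];
  robotsTxtOptions?: {
    policies?: { userAgent: string; allow?: string; disallow?: string }[];
    additionalSitemaps?: string[];
  };
}

const siteUrl = "https://example.com";

// In next-sitemap.config.js, this object would be assigned to module.exports.
const sitemapConfig: SitemapConfigSketch = {
  siteUrl,
  generateRobotsTxt: true,
  // Keep the private page and the SSR sitemap route out of the static sitemap.
  exclude: ["/private-page", "/server-sitemap.xml"],
  robotsTxtOptions: {
    policies: [
      { userAgent: "*", disallow: "/private-page" },
      { userAgent: "*", allow: "/" },
    ],
    // Advertise the dynamic sitemap in robots.txt alongside the static one.
    additionalSitemaps: [`${siteUrl}/server-sitemap.xml`],
  },
};

console.log(sitemapConfig);
```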

We use all the technologies mentioned here at 80 Lines, including for this post.

You can see the code in this repository: https://github.com/GabrielMessiasdaRosa/dynamic-sitemap


References

https://github.com/iamvishnusankar/next-sitemap
