Skip to content

fix: use absolute URL for canonical link in redirect pages#6413

Open
adoodevv wants to merge 3 commits intoros2:rollingfrom
adoodevv:rolling
Open

fix: use absolute URL for canonical link in redirect pages#6413
adoodevv wants to merge 3 commits intoros2:rollingfrom
adoodevv:rolling

Conversation

@adoodevv
Copy link
Copy Markdown

Description

The redirect_html_fragment in conf.py was building the <link rel="canonical"> tag by concatenating html_baseurl with the output of get_relative_uri(), which returns a relative path. This produces broken canonical URLs on third-party mirrors that build without sphinx-multiversion, because the relative path contains .. segments that resolve incorrectly against the base URL.

For example, the mirror at ros.ncnynl.com was generating:

<link rel="canonical" href="https://docs.ros.org/en/How-To-Guides/Using-Custom-Rosdistro.html" />

instead of the correct:

<link rel="canonical" href="https://docs.ros.org/en/humble/How-To-Guides/Using-Custom-Rosdistro.html" />

The fix computes the canonical URL as an absolute path directly from html_baseurl and the canonical document's path, which is always correct regardless of how the site is built. The relative URI is kept for the <meta refresh> and JS redirect since those work within the built site structure.

A pytest test covering the regression and the fix is included.

Fixes #6112

Did you use Generative AI?

Yes, Claude (Anthropic) was used to help draft the tests.

Additional Information

The <meta refresh> and JS redirect intentionally still use the relative URI — only the <link rel="canonical"> needed to be changed to an absolute URL.

The redirect_html_fragment was building the canonical <link> by
concatenating html_baseurl with a relative URI from get_relative_uri().
This produces broken canonical URLs on mirrors that build without
sphinx-multiversion, because the relative path contains '..' segments
that resolve incorrectly against the base URL.

Fix by computing the canonical URL as an absolute path directly from
html_baseurl and the canonical document's path, which is always correct
regardless of how the site is built.

Fixes ros2#6112
@kscottz
Copy link
Copy Markdown
Collaborator

kscottz commented Apr 20, 2026

This produces broken canonical URLs on third-party mirrors that build without sphinx-multiversion, because the relative path contains .. segments that resolve incorrectly against the base URL.

Can you speak to this a bit more? Can you explain why you are running a third party mirror? We already have trouble getting search engines to point to the correct resources in the canonical documentation and multiple third party mirrors seems like it could exacerbate that issue. If it is a website performance issue that seems like something we can help solve.

@adoodevv
Copy link
Copy Markdown
Author

This produces broken canonical URLs on third-party mirrors that build without sphinx-multiversion, because the relative path contains .. segments that resolve incorrectly against the base URL.

Can you speak to this a bit more? Can you explain why you are running a third party mirror? We already have trouble getting search engines to point to the correct resources in the canonical documentation and multiple third party mirrors seems like it could exacerbate that issue. If it is a website performance issue that seems like something we can help solve.

Thanks for the review! To clarify - I'm not running a third-party mirror. This fix was prompted by the issue reported in #6112, where @tfoote observed that ros.ncnynl.com (a mirror outside the ROS team's control) was outranking the official docs in search results, and that mirror was generating a broken canonical tag:

<link rel="canonical" href="https://docs.ros.org/en/How-To-Guides/Using-Custom-Rosdistro.html" />

That URL 404s because it's missing the humble/ prefix. The bug is in our own conf.py - the canonical URL is assembled by concatenating html_baseurl with a relative path from get_relative_uri(), which produces a malformed absolute URL when a mirror builds without sphinx-multiversion.

The intent of this fix isn't to encourage mirrors - it's to make the canonical tag generation correct so that any mirror that does exist (regardless of whether we approve of it) at least points search engines to the right official page, rather than a 404. A correct canonical tag is the mechanism that tells search engines to prefer the official docs over the mirror, so fixing it actually helps the SEO problem you described rather than making it worse.

@kscottz
Copy link
Copy Markdown
Collaborator

kscottz commented Apr 20, 2026

@adoodevv thanks so much for the clarification and appreciate the help. I'll take a look.

@adoodevv
Copy link
Copy Markdown
Author

@adoodevv thanks so much for the clarification and appreciate the help. I'll take a look.

You're welcome. Thank you too.

Copy link
Copy Markdown
Contributor

@tfoote tfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this change doesn't resolve the issue.

If you follow the basic approach of make html in the README.md https://github.com/ros2/ros2_documentation#building-html

The resultant build doesn't include the distro in the canonical url testing with your branch

I see build/html/Guides/Ament-CMake-Documentation.html: <link rel="canonical" href="https://docs.ros.org/en/How-To-Guides/Ament-CMake-Documentation.html" />

Which should be

build/html/Guides/Ament-CMake-Documentation.html: <link rel="canonical" href="https://docs.ros.org/en/rolling/How-To-Guides/Ament-CMake-Documentation.html" />

The basic build needs to default in the distro version. We'll probably be best served to encode rolling as the default and then make sure the multiversion overrides it.

@adoodevv
Copy link
Copy Markdown
Author

Sorry this change doesn't resolve the issue.

If you follow the basic approach of make html in the README.md https://github.com/ros2/ros2_documentation#building-html

The resultant build doesn't include the distro in the canonical url testing with your branch

I see build/html/Guides/Ament-CMake-Documentation.html: <link rel="canonical" href="https://docs.ros.org/en/How-To-Guides/Ament-CMake-Documentation.html" />

Which should be

build/html/Guides/Ament-CMake-Documentation.html: <link rel="canonical" href="https://docs.ros.org/en/rolling/How-To-Guides/Ament-CMake-Documentation.html" />

The basic build needs to default in the distro version. We'll probably be best served to encode rolling as the default and then make sure the multiversion overrides it.

Fixed - html_baseurl now defaults to rolling so plain make html builds produce the correct versioned canonical URL. Verified locally: build/html/Guides/Ament-CMake-Documentation.html now shows https://docs.ros.org/en/rolling/How-To-Guides/Ament-CMake-Documentation.html.

Plain Running Sphinx v8.2.3
loading translations [en]... done
loading pickled environment... The configuration has changed (3 options: 'html_css_files', 'html_js_files', 'html_static_path')
done
building [mo]: targets for 0 po files that are out of date
writing output...
building [html]: targets for 0 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
reading sources...
looking for now-outdated files... none found
no targets are out of date.
preparing documents... done
copying assets...
copying downloadable files... [ 10%] The-ROS2-Project/Marketing/documents/ros2-brochure-a4-web.pdf
copying downloadable files... [ 20%] The-ROS2-Project/Marketing/documents/ros2-brochure-a4-print.pdf
copying downloadable files... [ 30%] The-ROS2-Project/Marketing/documents/ros2-brochure-ltr-web.pdf
copying downloadable files... [ 40%] The-ROS2-Project/Marketing/documents/ros2-brochure-ltr-print.pdf
copying downloadable files... [ 50%] Tutorials/Advanced/Discovery-Server/scripts/generate_discovery_packages.bash
copying downloadable files... [ 60%] Tutorials/Advanced/Discovery-Server/scripts/discovery_packets.py
copying downloadable files... [ 70%] Tutorials/Advanced/Discovery-Server/scripts/no_intraprocess_configuration.xml
copying downloadable files... [ 80%] Tutorials/Advanced/Simulators/Webots/Code/my_world.wbt
copying downloadable files... [ 90%] Tutorials/Intermediate/URDF/documents/r2d2.urdf.xml
copying downloadable files... [100%] Tutorials/Intermediate/URDF/documents/r2d2.rviz

copying static files...
Writing evaluated template result to /home/adoodevv/Documents/ros2_documentation/build/html/_static/documentation_options.js
Writing evaluated template result to /home/adoodevv/Documents/ros2_documentation/build/html/_static/language_data.js
Writing evaluated template result to /home/adoodevv/Documents/ros2_documentation/build/html/_static/basic.css
Writing evaluated template result to /home/adoodevv/Documents/ros2_documentation/build/html/_static/js/versions.js
Writing evaluated template result to /home/adoodevv/Documents/ros2_documentation/build/html/_static/copybutton.js
copying static files: done
copying extra files...
copying extra files: done
copying assets: done
generating indices... genindex done
writing additional pages... search done
dumping search index in English (code: en)... done
dumping object inventory... done
sphinx-sitemap-ros: sitemap.xml was generated for URL https://docs.ros.org/en/rolling in /home/adoodevv/Documents/ros2_documentation/build/html/sitemap.xml
build succeeded.

The HTML pages are in build/html. builds were leaving html_baseurl as
'https://docs.ros.org/en' (no version), causing redirect pages to
generate canonical links missing the distro prefix.

- Default html_baseurl to 'https://docs.ros.org/en/rolling' so basic
  builds produce correct canonical URLs out of the box
- Update smv_rewrite_configs to replace the full URL (not append) when
  sphinx-multiversion runs with a specific distro
- Fix redirect_html_fragment to use an absolute canonical_abs_url
  instead of concatenating base_url with a relative path

Fixes ros2#6112
@tfoote
Copy link
Copy Markdown
Contributor

tfoote commented Apr 24, 2026

Since the real fix is in the second commit. The tests clearly aren't catching the problem or confirming that they're resolved. And are the changes from the first commit necessary. They seem more complex that I would expect is necessary. Can you please review this to reduce it down to a minimum changes necessary and show that the tests fail before the change and pass after.

Previous tests reimplemented the redirect logic inside the test file,
meaning they would pass even if conf.py was wrong. Tests now read and
exercise the actual redirect_html_fragment template and format call
from conf.py directly, so they genuinely fail before the fix and pass
after.
@adoodevv
Copy link
Copy Markdown
Author

adoodevv commented Apr 25, 2026

Since the real fix is in the second commit. The tests clearly aren't catching the problem or confirming that they're resolved. And are the changes from the first commit necessary. They seem more complex that I would expect is necessary. Can you please review this to reduce it down to a minimum changes necessary and show that the tests fail before the change and pass after.

Rewrote the tests - they now read the actual redirect_html_fragment template from conf.py and exercise it directly, so they fail before the fix and pass after.

On whether both changes are necessary: the Guides/How-To-Guides/ redirect is cross-directory, so get_relative_uri returns ../How-To-Guides/Ament-CMake-Documentation.html. Without the canonical_abs_url change, concatenating that with html_baseurl would produce https://docs.ros.org/en/rolling/../How-To-Guides/.... Both fixes are needed - html_baseurl ensures the version is present, canonical_abs_url ensures no .. segments in the path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Canonical URL generation logic can lead to incorrect mirror results

4 participants