Comments (27)
🤯 Didn't expect it to have such an impact.
Finally this perf regression was a good thing
from docusaurus.
Just to add we've recently rolled back from 3.1 to 3.0.1 for this exact same issue (we also have a large site). Normally would take approx 45 mins to build, and with 3.1 moves to just over 2 hours.
However, maybe of interest, when we initially rolled back we updated our package-lock.json and noticed the build times stayed the same (close to 2 hours). Reverting to the original package-lock.json prior to our 3.1 upgrade that we used when originally on 3.0.1, the build went back to 45 mins.
I've just tried it again, and when using 3.0.1, and building without a package-lock.json to use the latest dependencies, the build time more than doubles.
As an aside, onBrokenAnchors: "ignore", made no difference for us (and we also fixed all the broken anchors).
from docusaurus.
First off, huge fan of Docusaurus. Wanted to comment along. This might be tangential, but we were also seeing a 2x increase in build times when upgrading from Docusaurus 3.0.1 to 3.1. We ended up downgrading back to 3.0.1. We use our own CI solution, Harness CI Enterprise.
3.0.1 Builds: 8-9 mins
3.1 Builds: 17-20 mins.
We would like to dig in a little further, if anyone on the Docusaurus project side can weigh in on the Broken Anchors feature [https://github.com//pull/9528]. If that feature has to build a list of the anchors, that step may take time on larger sites. We tried setting onBrokenMarkdownLinks to ignore, but I believe the process still runs, just without producing or throwing the output. Could ignore potentially skip the check entirely?
The big increase comes between Server Compile and the "done" hook.
[success] [webpackbar] Server: Compiled successfully in 7.36m
[SUCCESS] Generated static files in "build".
[INFO] Use `npm run serve` command to test your build locally.
Done in 983.84s.
Node Build Version: 18.19.0
Thanks for a great project!
from docusaurus.
Hey
No we didn't change anything recently that could lead to such a significant difference.
But your report is not clear enough.
What was the version of Docusaurus you used before exactly?
Was using 3.0.1.
How long did it take to build previously?
It took under 30 minutes.
Can you replicate this only on your computer, or also on CI such as GitHub Actions?
My Docusaurus site is pretty big and doesn't fit on CI machines. RAM usage used to spike to 14GB during the sealing process, and all CI machines crashed at this point.
Are we even sure it's Docusaurus fault? Your log shows that Done in 12618.98s. Please show us the time it takes executing only the Docusaurus build command, building just one language for example, and nothing else.
I am sure. Every other script finishes in under 1 minute, and it's only the Docusaurus build step that hangs.
How comes you are reporting using Node.js 16.4 while Docusaurus v3.0 requires Node 18?
I am using v18.17.1. Where did you get this information, may I ask?
from docusaurus.
@andrewgbell it looks like the build-time increase is not related to the 3.1 upgrade, but rather the upgrade of a transitive dependency that has a perf regression.
It would be super helpful for me to be able to see/run that upgrade myself and study the package-lock.json diff.
Can someone share a site / branch that builds faster in 3.0.1, and where I could reproduce the build time regression by upgrading?
Does the onBrokenAnchor always run but does not display results if ignore. If does not run, can weed that out.
@ravilach I'd recommend trying to set both onBrokenLinks: "ignore" and onBrokenAnchors: "ignore", because we only "bypass" the broken link checker if both are ignored atm.
I'll try to optimize that better in the future, but in the meantime the code looks like this:
if (onBrokenLinks === 'ignore' && onBrokenAnchors === 'ignore') {
    return;
}
const brokenLinks = getBrokenLinks({
    routes,
    collectedLinks: normalizeCollectedLinks(collectedLinks),
});
reportBrokenLinks({brokenLinks, onBrokenLinks, onBrokenAnchors});
Note: is it possible that you encounter longer build times only due to cache eviction?
We use Webpack with persistent caching, so rebuilds are supposed to be faster.
It may be that your site takes longer to build simply because the caches were empty.
In this case I suggest running docusaurus clear && docusaurus build on your "fast branch" to see if it becomes slower to build.
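To make the condition quoted above concrete, here is a minimal sketch of how the two settings combine. The helper name `shouldRunBrokenLinkChecker` is hypothetical and not part of Docusaurus; it only mirrors the `if` statement shown in the snippet.

```javascript
// Hypothetical helper (not a Docusaurus API): mirrors the early-return
// condition quoted above.
function shouldRunBrokenLinkChecker(onBrokenLinks, onBrokenAnchors) {
  // The checker is skipped only when BOTH settings are 'ignore'.
  return !(onBrokenLinks === 'ignore' && onBrokenAnchors === 'ignore');
}

console.log(shouldRunBrokenLinkChecker('ignore', 'ignore')); // false
console.log(shouldRunBrokenLinkChecker('throw', 'ignore')); // true
console.log(shouldRunBrokenLinkChecker('ignore', 'warn')); // true
```

In other words, setting only one of the two options to "ignore" still pays the full cost of the link check.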
from docusaurus.
node_modules/@docusaurus/core/lib/server/brokenLinks.js
@slorber Yes, that worked great! Replaced the file and ran it with:
onBrokenLinks: "warn",
onBrokenAnchors: "warn",
onBrokenMarkdownLinks: "throw",
And it built just as quick as earlier. Thanks for all your help with this!
from docusaurus.
Awesome news then, thanks for reporting
from docusaurus.
Hey
No we didn't change anything recently that could lead to such a significant difference.
But your report is not clear enough.
What was the version of Docusaurus you used before exactly?
How long did it take to build previously?
Can you replicate this only on your computer, or also on CI such as GitHub Actions?
What was the upgrade PR?
Are we even sure it's Docusaurus fault? Your log shows that Done in 12618.98s. Please show us the time it takes executing only the Docusaurus build command, building just one language for example, and nothing else.
How comes you are reporting using Node.js 16.4 while Docusaurus v3.0 requires Node 18?
from docusaurus.
Hey
No we didn't change anything recently that could lead to such a significant difference.
But your report is not clear enough.
What was the version of Docusaurus you used before exactly?
Was using 3.0.1.
How long did it take to build previously?
It took under 30 minutes.
Can you replicate this only on your computer, or also on CI such as GitHub Actions?
My Docusaurus site is pretty big and doesn't fit on CI machines. RAM usage used to spike to 14GB during the sealing process, and all CI machines crashed at this point.
Are we even sure it's Docusaurus fault? Your log shows that Done in 12618.98s. Please show us the time it takes executing only the Docusaurus build command, building just one language for example, and nothing else.
I am sure. Every other script finishes in under 1 minute, and it's only the Docusaurus build step that hangs.
How comes you are reporting using Node.js 16.4 while Docusaurus v3.0 requires Node 18?
I am using v18.17.1. Where did you get this information, may I ask?
+1 on the sealing process, where the resource usage/time seems to spike for us also. Anything added to that process from 3.0.1 -> 3.1, e.g On Broken Anchors? Thanks!
from docusaurus.
If it's coming from the brokenAnchor you may try to put onBrokenAnchors in docusaurus.config file to ignore
Maybe you can disable it in your CI but still have a build process somewhere that you run manually / every few times to check for broken links / anchors
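One way to follow this suggestion is to drive both options from an environment variable. The sketch below is only an illustration: `CHECK_LINKS` is a made-up variable name and `brokenLinkSettings` is a hypothetical helper; the option names and the "ignore" / "warn" / "throw" values are the ones discussed in this thread.

```javascript
// Sketch for a docusaurus.config.js. CHECK_LINKS is a made-up variable:
// set it to "true" in the occasional job that should verify links.
function brokenLinkSettings(env) {
  const check = env.CHECK_LINKS === 'true';
  return {
    // Both must be 'ignore' for the checker to be bypassed entirely.
    onBrokenLinks: check ? 'throw' : 'ignore',
    onBrokenAnchors: check ? 'warn' : 'ignore',
  };
}

const config = {
  // ...rest of the site config (title, url, presets, ...) omitted here
  ...brokenLinkSettings(process.env),
};

// In a real docusaurus.config.js this object would be the exported config.
console.log(config);
```

Regular CI builds then skip the check, while a scheduled or manual build with `CHECK_LINKS=true` still reports broken links and anchors.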
from docusaurus.
If it's coming from the brokenAnchor you may try to put onBrokenAnchors in docusaurus.config file to ignore
Maybe you can disable it in your CI but still have a build process somewhere that you run manually / every few times to check for broken links / anchors
Thanks @OzakIOne it's a great feature. Curious, we noticed the same behavior with ignore. Does the onBrokenAnchor check always run and just not display results when set to ignore? If it does not run, we can rule that out.
from docusaurus.
@anaclumos I tried using your repo before the upgrade (https://github.com/anaclumos/extracranial/tree/f144432acdfff55d741a1dbc568ae0b51dd052fe) but the usage of Bun package manager makes it inconvenient to troubleshoot.
First, when I run bun install on your repo with the latest Bun version, it seems to resolve to newer versions of Docusaurus dependency ranges, and modifies your bun.lockb file.
Then, the binary format of the lockfile makes it super inconvenient to inspect and diff.
Maybe I could try using the exact same version of Bun you are using, and it would not upgrade? For now I'm unable to troubleshoot this using your repo.
from docusaurus.
@andrewgbell it looks like the build-time increase is not related to the 3.1 upgrade, but rather the upgrade of a transitive dependency that has a perf regression.
It would be super helpful for me to be able to see/run that upgrade myself and study the package-lock.json diff.
Can someone share a site / branch that builds faster in 3.0.1, and where I could reproduce the build time regression by upgrading?
Unfortunately the repo isn't (yet) open source, but I can share the package-lock.json's from both runs if any use and potentially the build log files if there's anything particular you need?
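For anyone wanting to compare two lockfiles themselves, here is a minimal sketch that lists packages whose resolved version differs. It assumes npm lockfiles with a top-level `packages` map (lockfileVersion 2 or 3); the function name `diffLockfiles` and the sample package names are made up for illustration.

```javascript
// Compare two parsed package-lock.json objects and list version changes.
// Assumes the "packages" map found in lockfileVersion 2 and 3.
function diffLockfiles(before, after) {
  const changes = [];
  const beforePkgs = before.packages ?? {};
  const afterPkgs = after.packages ?? {};
  for (const [pkgPath, pkg] of Object.entries(afterPkgs)) {
    if (pkgPath === '') continue; // the root project entry
    const old = beforePkgs[pkgPath];
    if (old && old.version !== pkg.version) {
      changes.push({ path: pkgPath, from: old.version, to: pkg.version });
    }
  }
  return changes;
}

// Tiny inline example with made-up packages. With real files you would use
// JSON.parse(fs.readFileSync('package-lock.json', 'utf8')) for each side.
const fastLock = {
  packages: {
    'node_modules/dep-a': { version: '1.0.0' },
    'node_modules/dep-b': { version: '2.0.0' },
  },
};
const slowLock = {
  packages: {
    'node_modules/dep-a': { version: '1.0.0' },
    'node_modules/dep-b': { version: '2.1.0' },
  },
};
console.log(diffLockfiles(fastLock, slowLock));
```

The output is only a list of candidates; as noted in the reply below, finding the actual culprit still requires running builds.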
from docusaurus.
Unfortunately the repo isn't (yet) open source, but I can share the package-lock.json's from both runs if any use and potentially the build log files if there's anything particular you need?
@andrewgbell I'd have to run this locally myself, partially upgrading some libs in a dichotomic way to find out which transitive dep cause the problem. I doubt seeing a diff will be enough to identify the problem unfortunately, I need to run the code.
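The "dichotomic" approach mentioned here is essentially a bisection over the list of upgraded dependencies. The following is a rough sketch under strong assumptions: there is a single culprit, and applying the first N upgrades in a fixed order is slow exactly when the culprit is among them. `findCulprit` and `isSlowWith` are hypothetical names; in practice `isSlowWith` would install the given upgrades and time a real build.

```javascript
// Bisect an ordered list of candidate upgrades to find the one that makes
// the build slow. Assumes exactly one culprit exists in the list.
function findCulprit(candidates, isSlowWith) {
  let lo = 0; // applying candidates[0..lo) is known to be fast
  let hi = candidates.length; // applying candidates[0..hi) is known to be slow
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (isSlowWith(candidates.slice(0, mid))) {
      hi = mid; // culprit is among the first `mid` upgrades
    } else {
      lo = mid; // culprit comes later
    }
  }
  return candidates[lo];
}

// Simulated run with made-up package names; 'dep-c' plays the culprit.
const upgrades = ['dep-a', 'dep-b', 'dep-c', 'dep-d'];
const simulatedBuildIsSlow = (applied) => applied.includes('dep-c');
console.log(findCulprit(upgrades, simulatedBuildIsSlow)); // dep-c
```

Each step halves the remaining candidates, so even a few hundred changed dependencies need fewer than ten timed builds.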
from docusaurus.
@andrewgbell it looks like the build-time increase is not related to the 3.1 upgrade, but rather the upgrade of a transitive dependency that has a perf regression.
It would be super helpful for me to be able to see/run that upgrade myself and study the package-lock.json diff.
Can someone share a site / branch that builds faster in 3.0.1, and where I could reproduce the build time regression by upgrading?
Does the onBrokenAnchor always run but does not display results if ignore. If does not run, can weed that out.
@ravilach I'd recommend trying to set both onBrokenLinks: "ignore" and onBrokenAnchors: "ignore", because we only "bypass" the broken link checker if both are ignored atm.
I'll try to optimize that better in the future, but in the meantime the code looks like this:
if (onBrokenLinks === 'ignore' && onBrokenAnchors === 'ignore') {
    return;
}
const brokenLinks = getBrokenLinks({
    routes,
    collectedLinks: normalizeCollectedLinks(collectedLinks),
});
reportBrokenLinks({brokenLinks, onBrokenLinks, onBrokenAnchors});
Note: is it possible that you encounter longer build times only due to cache eviction?
We use Webpack with persistent caching, so rebuilds are supposed to be faster.
It may be that your site takes longer to build simply because the caches were empty.
In this case I suggest running docusaurus clear && docusaurus build on your "fast branch" to see if it becomes slower to build.
Unfortunately the repo isn't (yet) open source, but I can share the package-lock.json's from both runs if any use and potentially the build log files if there's anything particular you need?
@andrewgbell I'd have to run this locally myself, partially upgrading some libs in a dichotomic way to find out which transitive dep cause the problem. I doubt seeing a diff will be enough to identify the problem unfortunately, I need to run the code.
Ours is Open Source: https://github.com/harness/developer-hub if that helps. Currently on DS 3.0.1.
Here is the yarn.lock from the 3.1 upgrade: https://github.com/harness/developer-hub/blob/7b5fbafc4036f61d30e094362a67204cc573cf7a/yarn.lock
from docusaurus.
@andrewgbell it looks like the build-time increase is not related to the 3.1 upgrade, but rather the upgrade of a transitive dependency that has a perf regression.
It would be super helpful for me to be able to see/run that upgrade myself and study the package-lock.json diff.
Can someone share a site / branch that builds faster in 3.0.1, and where I could reproduce the build time regression by upgrading?
Unfortunately the repo isn't (yet) open source, but I can share the package-lock.json's from both runs if any use and potentially the build log files if there's anything particular you need?
@slorber If you need another repo, let me know as I can invite you into our org.
from docusaurus.
Still investigating your site @ravilach, but it looks like there are 2 problems:
- the broken link checker now using node new URL() is much slower (edit: it's not that, it's the matchRoutes calls)
- a transitive dependency is causing longer build times (I suspect related to postcss or css-loader)
Have any of you tried to upgrade without fully regenerating the lockfile, and disabling all the broken link checkers?
yarn upgrade @docusaurus/core@latest @docusaurus/cssnano-preset@latest @docusaurus/plugin-client-redirects@latest @docusaurus/plugin-debug@latest @docusaurus/plugin-google-analytics@latest @docusaurus/plugin-google-gtag@latest @docusaurus/plugin-sitemap@latest @docusaurus/preset-classic@latest @docusaurus/theme-classic@latest @docusaurus/theme-mermaid@latest @docusaurus/theme-search-algolia@latest @docusaurus/module-type-aliases@latest @docusaurus/tsconfig@latest
onBrokenLinks: "ignore"
onBrokenAnchors: "ignore"
from docusaurus.
Still investigating your site @ravilach, but it looks like there are 2 problems:
* the broken link checker now using node `new URL()` is much slower
* a transitive dependency is causing longer build times (I suspect related to postcss or css-loader)
Have any of you tried to upgrade without fully regenerating the lockfile, and disabling all the broken link checkers?
yarn upgrade @docusaurus/core@latest @docusaurus/cssnano-preset@latest @docusaurus/plugin-client-redirects@latest @docusaurus/plugin-debug@latest @docusaurus/plugin-google-analytics@latest @docusaurus/plugin-google-gtag@latest @docusaurus/plugin-sitemap@latest @docusaurus/preset-classic@latest @docusaurus/theme-classic@latest @docusaurus/theme-mermaid@latest @docusaurus/theme-search-algolia@latest @docusaurus/module-type-aliases@latest @docusaurus/tsconfig@latest
* `onBrokenLinks: "ignore"`
* `onBrokenAnchors: "ignore"`
Thanks @slorber, much appreciated!
from docusaurus.
Maybe I could try using the exact same version of Bun you are using, and it would not upgrade? For now I'm unable to troubleshoot this using your repo.
I migrated to pnpm.
from docusaurus.
Still investigating your site @ravilach, but it looks like there are 2 problems:
- the broken link checker now using node new URL() is much slower
- a transitive dependency is causing longer build times (I suspect related to postcss or css-loader)
Have any of you tried to upgrade without fully regenerating the lockfile, and disabling all the broken link checkers?
yarn upgrade @docusaurus/core@latest @docusaurus/cssnano-preset@latest @docusaurus/plugin-client-redirects@latest @docusaurus/plugin-debug@latest @docusaurus/plugin-google-analytics@latest @docusaurus/plugin-google-gtag@latest @docusaurus/plugin-sitemap@latest @docusaurus/preset-classic@latest @docusaurus/theme-classic@latest @docusaurus/theme-mermaid@latest @docusaurus/theme-search-algolia@latest @docusaurus/module-type-aliases@latest @docusaurus/tsconfig@latest
onBrokenLinks: "ignore"
onBrokenAnchors: "ignore"
Hi, I've added:
onBrokenLinks: "ignore",
onBrokenAnchors: "ignore",
onBrokenMarkdownLinks: "throw",
alongside running
npm upgrade @docusaurus/core @docusaurus/cssnano-preset @docusaurus/plugin-client-redirects @docusaurus/plugin-debug @docusaurus/plugin-google-analytics @docusaurus/plugin-google-gtag @docusaurus/plugin-sitemap @docusaurus/preset-classic @docusaurus/theme-classic @docusaurus/theme-mermaid @docusaurus/theme-search-algolia @docusaurus/module-type-aliases @docusaurus/tsconfig
And build time dropped back to the expected (in fact a few minutes quicker, approx 40 mins). I've tried removing
onBrokenAnchors: "ignore",
However, the build time went back up again to over 2 hours.
I've also tried adding these ignores again
onBrokenLinks: "ignore",
onBrokenAnchors: "ignore",
onBrokenMarkdownLinks: "throw",
but upgrading the whole package-lock.json again. As of today, it is about 10% slower than the run above (about 45 mins), which is a huge improvement on where we were last week, so I'm not sure if a dependency has been updated since.
So it looks like you're correct on the two issues, though the broken links and anchors check seems to have a far greater impact.
from docusaurus.
Still investigating your site @ravilach, but it looks like there are 2 problems:
- the broken link checker now using node new URL() is much slower
- a transitive dependency is causing longer build times (I suspect related to postcss or css-loader)
Have any of you tried to upgrade without fully regenerating the lockfile, and disabling all the broken link checkers?
yarn upgrade @docusaurus/core@latest @docusaurus/cssnano-preset@latest @docusaurus/plugin-client-redirects@latest @docusaurus/plugin-debug@latest @docusaurus/plugin-google-analytics@latest @docusaurus/plugin-google-gtag@latest @docusaurus/plugin-sitemap@latest @docusaurus/preset-classic@latest @docusaurus/theme-classic@latest @docusaurus/theme-mermaid@latest @docusaurus/theme-search-algolia@latest @docusaurus/module-type-aliases@latest @docusaurus/tsconfig@latest
onBrokenLinks: "ignore"
onBrokenAnchors: "ignore"
Hi, I've added:
onBrokenLinks: "ignore",
onBrokenAnchors: "ignore",
onBrokenMarkdownLinks: "throw",
alongside running
npm upgrade @docusaurus/core @docusaurus/cssnano-preset @docusaurus/plugin-client-redirects @docusaurus/plugin-debug @docusaurus/plugin-google-analytics @docusaurus/plugin-google-gtag @docusaurus/plugin-sitemap @docusaurus/preset-classic @docusaurus/theme-classic @docusaurus/theme-mermaid @docusaurus/theme-search-algolia @docusaurus/module-type-aliases @docusaurus/tsconfig
And build time dropped back to the expected (in fact a few minutes quicker, approx 40 mins). I've tried removing
onBrokenAnchors: "ignore",
However, the build time went back up again to over 2 hours.
I've also tried adding these ignores again
onBrokenLinks: "ignore",
onBrokenAnchors: "ignore",
onBrokenMarkdownLinks: "throw",
but this time upgrading the whole package-lock.json. As of today, it now runs through at the same speed as above, so I'm not sure if a dependency has been updated since.
So it looks like you're correct, and the broken links and anchors check seems to have a far greater impact.
from docusaurus.
Thanks for reporting @andrewgbell
I've submitted a PR that should optimize things, likely faster than before: #9778
So far it seems to work on @ravilach site.
Could you give it a test by building locally with this modified file?
node_modules/@docusaurus/core/lib/server/brokenLinks.js
"use strict";
/**
 * Copyright (c) Facebook, Inc. and its affiliates.
 *
 * This source code is licensed under the MIT license found in the
 * LICENSE file in the root directory of this source tree.
 */
Object.defineProperty(exports, "__esModule", { value: true });
exports.handleBrokenLinks = void 0;
const tslib_1 = require("tslib");
const lodash_1 = tslib_1.__importDefault(require("lodash"));
const logger_1 = tslib_1.__importDefault(require("@docusaurus/logger"));
const react_router_config_1 = require("react-router-config");
const utils_1 = require("@docusaurus/utils");
const utils_2 = require("./utils");
function matchRoutes(routeConfig, pathname) {
    // @ts-expect-error: React router types RouteConfig with an actual React
    // component, but we load route components with string paths.
    // We don't actually access component here, so it's fine.
    return (0, react_router_config_1.matchRoutes)(routeConfig, pathname);
}
function createBrokenLinksHelper({ collectedLinks, routes, }) {
    const validPathnames = new Set(collectedLinks.keys());
    // Matching against the route array can be expensive
    // If the route is already in the valid pathnames,
    // we can avoid matching against it as an optimization
    const remainingRoutes = routes
        .filter((route) => !validPathnames.has(route.path));
    function isPathnameMatchingAnyRoute(pathname) {
        if (matchRoutes(remainingRoutes, pathname).length > 0) {
            // IMPORTANT: this is an optimization here
            // See https://github.com/facebook/docusaurus/issues/9754
            // Large Docusaurus sites have many routes!
            // We try to minimize calls to a possibly expensive matchRoutes function
            validPathnames.add(pathname);
            return true;
        }
        return false;
    }
    function isPathBrokenLink(linkPath) {
        const pathnames = [linkPath.pathname, decodeURI(linkPath.pathname)];
        if (pathnames.some((p) => validPathnames.has(p))) {
            return false;
        }
        if (pathnames.some(isPathnameMatchingAnyRoute)) {
            return false;
        }
        return true;
    }
    function isAnchorBrokenLink(linkPath) {
        const { pathname, hash } = linkPath;
        // Link has no hash: it can't be a broken anchor link
        if (hash === undefined) {
            return false;
        }
        // Link has empty hash ("#", "/page#"...): we do not report it as broken
        // Empty hashes are used for various weird reasons, by us and other users...
        // See for example: https://github.com/facebook/docusaurus/pull/6003
        if (hash === '') {
            return false;
        }
        const targetPage = collectedLinks.get(pathname) || collectedLinks.get(decodeURI(pathname));
        // link with anchor to a page that does not exist (or did not collect any
        // link/anchor) is considered as a broken anchor
        if (!targetPage) {
            return true;
        }
        // it's a not broken anchor if the anchor exists on the target page
        if (targetPage.anchors.has(hash) ||
            targetPage.anchors.has(decodeURIComponent(hash))) {
            return false;
        }
        return true;
    }
    return {
        collectedLinks,
        isPathBrokenLink,
        isAnchorBrokenLink,
    };
}
function getBrokenLinksForPage({ pagePath, helper, }) {
    const pageData = helper.collectedLinks.get(pagePath);
    const brokenLinks = [];
    pageData.links.forEach((link) => {
        const linkPath = (0, utils_1.parseURLPath)(link, pagePath);
        if (helper.isPathBrokenLink(linkPath)) {
            brokenLinks.push({
                link,
                resolvedLink: (0, utils_1.serializeURLPath)(linkPath),
                anchor: false,
            });
        }
        else if (helper.isAnchorBrokenLink(linkPath)) {
            brokenLinks.push({
                link,
                resolvedLink: (0, utils_1.serializeURLPath)(linkPath),
                anchor: true,
            });
        }
    });
    return brokenLinks;
}
/**
 * The route defs can be recursive, and have a parent match-all route. We don't
 * want to match broken links like /docs/brokenLink against /docs/*. For this
 * reason, we only consider the "final routes" that do not have subroutes.
 * We also need to remove the match-all 404 route
 */
function filterIntermediateRoutes(routesInput) {
    const routesWithout404 = routesInput.filter((route) => route.path !== '*');
    return (0, utils_2.getAllFinalRoutes)(routesWithout404);
}
function getBrokenLinks({ collectedLinks, routes, }) {
    const filteredRoutes = filterIntermediateRoutes(routes);
    const helper = createBrokenLinksHelper({
        collectedLinks,
        routes: filteredRoutes,
    });
    const result = {};
    collectedLinks.forEach((_unused, pagePath) => {
        try {
            result[pagePath] = getBrokenLinksForPage({
                pagePath,
                helper,
            });
        }
        catch (e) {
            throw new Error(`Unable to get broken links for page ${pagePath}.`, {
                cause: e,
            });
        }
    });
    return result;
}
function brokenLinkMessage(brokenLink) {
    const showResolvedLink = brokenLink.link !== brokenLink.resolvedLink;
    return `${brokenLink.link}${showResolvedLink ? ` (resolved as: ${brokenLink.resolvedLink})` : ''}`;
}
function createBrokenLinksMessage(pagePath, brokenLinks) {
    const type = brokenLinks[0]?.anchor === true ? 'anchor' : 'link';
    const anchorMessage = brokenLinks.length > 0
        ? `- Broken ${type} on source page path = ${pagePath}:
-> linking to ${brokenLinks
            .map(brokenLinkMessage)
            .join('\n -> linking to ')}`
        : '';
    return `${anchorMessage}`;
}
function createBrokenAnchorsMessage(brokenAnchors) {
    if (Object.keys(brokenAnchors).length === 0) {
        return undefined;
    }
    return `Docusaurus found broken anchors!
Please check the pages of your site in the list below, and make sure you don't reference any anchor that does not exist.
Note: it's possible to ignore broken anchors with the 'onBrokenAnchors' Docusaurus configuration, and let the build pass.
Exhaustive list of all broken anchors found:
${Object.entries(brokenAnchors)
        .map(([pagePath, brokenLinks]) => createBrokenLinksMessage(pagePath, brokenLinks))
        .join('\n')}
`;
}
function createBrokenPathsMessage(brokenPathsMap) {
    if (Object.keys(brokenPathsMap).length === 0) {
        return undefined;
    }
    /**
     * If there's a broken link appearing very often, it is probably a broken link
     * on the layout. Add an additional message in such case to help user figure
     * this out. See https://github.com/facebook/docusaurus/issues/3567#issuecomment-706973805
     */
    function getLayoutBrokenLinksHelpMessage() {
        const flatList = Object.entries(brokenPathsMap).flatMap(([pagePage, brokenLinks]) => brokenLinks.map((brokenLink) => ({ pagePage, brokenLink })));
        const countedBrokenLinks = lodash_1.default.countBy(flatList, (item) => item.brokenLink.link);
        const FrequencyThreshold = 5; // Is this a good value?
        const frequentLinks = Object.entries(countedBrokenLinks)
            .filter(([, count]) => count >= FrequencyThreshold)
            .map(([link]) => link);
        if (frequentLinks.length === 0) {
            return '';
        }
        return logger_1.default.interpolate `
It looks like some of the broken links we found appear in many pages of your site.
Maybe those broken links appear on all pages through your site layout?
We recommend that you check your theme configuration for such links (particularly, theme navbar and footer).
Frequent broken links are linking to:${frequentLinks}`;
    }
    return `Docusaurus found broken links!
Please check the pages of your site in the list below, and make sure you don't reference any path that does not exist.
Note: it's possible to ignore broken links with the 'onBrokenLinks' Docusaurus configuration, and let the build pass.${getLayoutBrokenLinksHelpMessage()}
Exhaustive list of all broken links found:
${Object.entries(brokenPathsMap)
        .map(([pagePath, brokenPaths]) => createBrokenLinksMessage(pagePath, brokenPaths))
        .join('\n')}
`;
}
function splitBrokenLinks(brokenLinks) {
    const brokenPaths = {};
    const brokenAnchors = {};
    Object.entries(brokenLinks).forEach(([pathname, pageBrokenLinks]) => {
        const [anchorBrokenLinks, pathBrokenLinks] = lodash_1.default.partition(pageBrokenLinks, (link) => link.anchor);
        if (pathBrokenLinks.length > 0) {
            brokenPaths[pathname] = pathBrokenLinks;
        }
        if (anchorBrokenLinks.length > 0) {
            brokenAnchors[pathname] = anchorBrokenLinks;
        }
    });
    return { brokenPaths, brokenAnchors };
}
function reportBrokenLinks({ brokenLinks, onBrokenLinks, onBrokenAnchors, }) {
    // We need to split the broken links reporting in 2 for better granularity
    // This is because we need to report broken path/anchors independently
    // For v3.x retro-compatibility, we can't throw by default for broken anchors
    // TODO Docusaurus v4: make onBrokenAnchors throw by default?
    const { brokenPaths, brokenAnchors } = splitBrokenLinks(brokenLinks);
    const pathErrorMessage = createBrokenPathsMessage(brokenPaths);
    if (pathErrorMessage) {
        logger_1.default.report(onBrokenLinks)(pathErrorMessage);
    }
    const anchorErrorMessage = createBrokenAnchorsMessage(brokenAnchors);
    if (anchorErrorMessage) {
        logger_1.default.report(onBrokenAnchors)(anchorErrorMessage);
    }
}
// Users might use the useBrokenLinks() API in weird unexpected ways
// JS users might call "collectLink(undefined)" for example
// TS users might call "collectAnchor('#hash')" with/without #
// We clean/normalize the collected data to avoid obscure errors being thrown
// We also use optimized data structures for a faster algorithm
function normalizeCollectedLinks(collectedLinks) {
    const result = new Map();
    Object.entries(collectedLinks).forEach(([pathname, pageCollectedData]) => {
        result.set(pathname, {
            links: new Set(pageCollectedData.links.filter(lodash_1.default.isString)),
            anchors: new Set(pageCollectedData.anchors
                .filter(lodash_1.default.isString)
                .map((anchor) => (anchor.startsWith('#') ? anchor.slice(1) : anchor))),
        });
    });
    return result;
}
async function handleBrokenLinks({ collectedLinks, onBrokenLinks, onBrokenAnchors, routes, }) {
    if (onBrokenLinks === 'ignore' && onBrokenAnchors === 'ignore') {
        return;
    }
    const brokenLinks = getBrokenLinks({
        routes,
        collectedLinks: normalizeCollectedLinks(collectedLinks),
    });
    reportBrokenLinks({ brokenLinks, onBrokenLinks, onBrokenAnchors });
}
exports.handleBrokenLinks = handleBrokenLinks;
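The core idea in the file above (check a Set of known pathnames first, and only fall back to the expensive route matcher on a miss, caching positive results) can be illustrated in isolation. This is a simplified sketch, not the Docusaurus code: `createPathChecker` is a hypothetical name, and a plain regex list stands in for react-router-config's matchRoutes.

```javascript
// Simplified illustration of the caching idea; names are hypothetical.
function createPathChecker(knownPathnames, expensiveMatch) {
  const valid = new Set(knownPathnames);
  let expensiveCalls = 0;
  function isValid(pathname) {
    if (valid.has(pathname)) {
      return true; // cheap Set lookup, no route matching needed
    }
    expensiveCalls += 1;
    if (expensiveMatch(pathname)) {
      valid.add(pathname); // remember the result for later links
      return true;
    }
    return false;
  }
  return { isValid, getExpensiveCalls: () => expensiveCalls };
}

// Stand-in for the slow matcher: one dynamic route pattern.
const dynamicRoutes = [/^\/blog\/tags\/[^/]+$/];
const checker = createPathChecker(
  ['/docs/intro', '/docs/install'],
  (pathname) => dynamicRoutes.some((re) => re.test(pathname)),
);

checker.isValid('/docs/intro'); // Set hit
checker.isValid('/blog/tags/release'); // slow path, then cached
checker.isValid('/blog/tags/release'); // Set hit this time
checker.isValid('/docs/missing'); // slow path, not cached
console.log(checker.getExpensiveCalls()); // 2
```

On a large site most links point at pages that were already collected, so the slow path is taken only for the few links that do not.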
from docusaurus.
Thanks @andrewgbell
Don't you see any improvement too? On @ravilach site (that I simplified a bit, just 1 docs plugin instance instead of 5), I see a significant improvement in time to handle broken links and total build time.
3.0
handleBrokenLinks: 3:28.361 (m:ss.mmm)
✨ Done in 636.47s.
3.1 before
handleBrokenLinks: 6:32.570 (m:ss.mmm)
✨ Done in 785.73s.
3.1 after optimizations
handleBrokenLinks: 694.907ms
✨ Done in 361.92s.
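To put the three measurements above on the same scale, the timings can be converted to milliseconds and compared. This is just arithmetic on the numbers quoted above; `toMs` is a hypothetical helper that handles the three formats appearing in these logs ("m:ss.mmm", "ms" and "s").

```javascript
// Convert the duration formats seen in the logs above to milliseconds.
function toMs(text) {
  const t = text.trim();
  if (t.endsWith('ms')) return parseFloat(t);
  if (t.endsWith('s')) return parseFloat(t) * 1000;
  const [minutes, seconds] = t.split(':'); // "m:ss.mmm"
  return (Number(minutes) * 60 + Number(seconds)) * 1000;
}

const checkBefore = toMs('6:32.570'); // 3.1 before
const checkAfter = toMs('694.907ms'); // 3.1 after optimizations
const buildBefore = toMs('785.73s');
const buildAfter = toMs('361.92s');

console.log(`handleBrokenLinks speedup: ${(checkBefore / checkAfter).toFixed(0)}x`);
console.log(`total build speedup: ${(buildBefore / buildAfter).toFixed(2)}x`);
```

The link check itself goes from minutes to under a second, while the total build roughly halves because the rest of the build is unchanged.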
from docusaurus.
Hi @slorber , sorry yes. I'd been comparing 3.1 optimisations with 3.1 ignore broken links so hadn't spotted it.
But looking again we get:
3.0 build time with handleBrokenLinks - 54 mins
3.1 (without fix) build time with handleBrokenLinks - 137 mins
3.1 (with fix) build time with handleBrokenLinks - 41 mins
So very significant! Thanks!
from docusaurus.
Just updated. It's even faster than before!! Thank you so much
from docusaurus.
awesome news @anaclumos
Do you mind sharing numbers? How much faster is it?
from docusaurus.
It used to take around 20 minutes. Now it finishes around 11 minutes.
from docusaurus.