What is the current behavior? No documented way of scrolling

Take a look at https://github.com/get-set-fetch/scraper/tree/main/examples/in…

Is there a way to scroll? (headless-chrome-crawler, open issue)

wemow commented on May 25, 2024

Is there a way to scroll?

from headless-chrome-crawler.

Comments (4)

kulikalov commented on May 25, 2024

Sorry, @yemd, I cannot reach this library's maintainer to get access for publishing updates. I'd recommend building a custom solution with puppeteer instead of using this library.
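
For anyone going the custom puppeteer route, here is a minimal sketch of a scroll helper. It assumes `page` is a puppeteer Page; `scrollUntilStable`, `maxRounds` and `pauseMs` are hypothetical names made up for this example, and the stop condition (page height no longer growing) is only one possible heuristic.

```javascript
// Sketch: keep jumping to the bottom until the page height stops growing.
// `scrollUntilStable` and its options are illustrative names, not a library API.
async function scrollUntilStable(page, { maxRounds = 20, pauseMs = 500 } = {}) {
  let previousHeight = 0;
  for (let round = 0; round < maxRounds; round++) {
    // Runs in the browser: scroll to the bottom and report the document height.
    const height = await page.evaluate(() => {
      window.scrollTo(0, document.documentElement.scrollHeight);
      return document.documentElement.scrollHeight;
    });
    // Same height as last round: nothing new was appended, so stop.
    if (height === previousHeight) return round;
    previousHeight = height;
    // Give lazy-loaded content time to arrive before measuring again.
    await new Promise(resolve => setTimeout(resolve, pauseMs));
  }
  return maxRounds; // gave up: the page was still growing
}

// Usage (assumes puppeteer is installed and `page` came from browser.newPage()):
//   await page.goto(url, { waitUntil: 'networkidle0' });
//   await scrollUntilStable(page);
```

The `maxRounds` cap is there so an endless feed cannot keep the loop running forever.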

michaelpapesch commented on May 25, 2024

This worked for me:

customCrawl: async (page, crawl) => {
    await page.setViewport({
        width: 1200,
        height: 800
    });
    const result = await crawl();

    // Scroll to the bottom in the browser, then wait for late content to settle.
    await page.evaluate(scrollToBottom);
    await page.waitFor(3000);
    return result;
},
...
async function scrollToBottom() {
    await new Promise(resolve => {
        const distance = 100; // should be less than or equal to window.innerHeight
        const delay = 100;
        const timer = setInterval(() => {
            document.scrollingElement.scrollBy(0, distance);
            if (document.scrollingElement.scrollTop + window.innerHeight >= document.scrollingElement.scrollHeight) {
                clearInterval(timer);
                resolve();
            }
        }, delay);
    });
}
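
As a rough guide for picking `distance` and `delay`, the time the loop above needs can be estimated from the page height. This is only a back-of-the-envelope sketch: `estimateScrollMs` is a hypothetical helper, and it assumes the page does not grow while scrolling and that `window.innerHeight` equals the viewport height.

```javascript
// Sketch: estimate how long the interval-based scroll takes on a static page.
// Each tick scrolls `distance` pixels and ticks are `delay` ms apart.
function estimateScrollMs(scrollHeight, viewportHeight, distance, delay) {
  const pixelsToScroll = Math.max(0, scrollHeight - viewportHeight);
  // The timer always fires at least once before the bottom check can pass.
  const ticks = Math.max(1, Math.ceil(pixelsToScroll / distance));
  return ticks * delay;
}

// A 20000px page in an 800px viewport, 100px every 100ms:
console.log(estimateScrollMs(20000, 800, 100, 100)); // 19200
```

So with the values from the snippet, a long page can take around 20 seconds to scroll, which is worth keeping in mind when setting crawler timeouts.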

ThisNameWasTaken commented on May 25, 2024

Here is how I kept scrolling through a list of lazy-loaded products until the crawler reached the bottom of the page. I hope this helps :)

const productCrawler = await Crawler.launch({
  /*... */
});

await productCrawler.queue({
  url: '...',
  retryCount: 1,
  maxDepth: 3,
  depthPriority: false,
  waitUntil: 'networkidle0',
  jQuery: false,
  waitFor: {
    options: {},
    args: [config], // args for selectorOrFunctionOrTimeout
    selectorOrFunctionOrTimeout: function (config) {
      const documentHeight = document.documentElement.scrollHeight;

      window.scrollTo(0, documentHeight);

      // You might want to check if there are any elements still loading (look for spinners, other indicators, or just wait)
      // Return true if you are done scrolling, false otherwise

      return true; 
    },
  },
});

await productCrawler.onIdle();
await productCrawler.close();
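
The function above returns true on the first call, so it scrolls only once. A sketch of a predicate that keeps scrolling until the height stays the same for a few polls could look like this. `scrolledToEnd`, `window.__scrollState` and the numbers used are assumptions for illustration; `polling` and `timeout` are the options puppeteer accepts for function waits.

```javascript
// Sketch: runs in the browser on every poll, so it must not use outer variables.
// Returns true once the document height has been unchanged for
// `stablePollsNeeded` consecutive polls.
function scrolledToEnd(stablePollsNeeded) {
  const height = document.documentElement.scrollHeight;
  window.scrollTo(0, height);

  // Hypothetical bookkeeping kept on window between polls.
  const state = window.__scrollState || (window.__scrollState = { height: 0, stable: 0 });
  state.stable = height === state.height ? state.stable + 1 : 0;
  state.height = height;

  return state.stable >= stablePollsNeeded;
}

// Assumed wiring for the `waitFor` option shown above.
const waitForOptions = {
  selectorOrFunctionOrTimeout: scrolledToEnd,
  options: { polling: 500, timeout: 30000 }, // poll every 500ms, give up after 30s
  args: [3], // require three unchanged polls in a row
};
```

Polling on an interval rather than on every animation frame gives lazy-loaded items time to arrive between checks.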

If that does not work, you can always scroll inside the evaluatePage method:

const productCrawler = await Crawler.launch({
  // ...
  evaluatePage: () => {
    const documentHeight = document.documentElement.scrollHeight;

    window.scrollTo(0, documentHeight);
  },
  // ...
})
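
If the scroll should be followed by data collection in the same step, one option might be an async function that scrolls, pauses, and then reads the DOM. This is a sketch under assumptions: `scrollThenCollect` is a made-up name, `.product` is a placeholder selector, the one-second pause is arbitrary, and it relies on the function being run through puppeteer's page.evaluate, which waits for a returned promise.

```javascript
// Sketch of a function that could be passed as evaluatePage.
// It runs in the browser, so it only uses window and document.
async function scrollThenCollect() {
  window.scrollTo(0, document.documentElement.scrollHeight);

  // Arbitrary pause to let lazy-loaded items render after the scroll.
  await new Promise(resolve => setTimeout(resolve, 1000));

  // '.product' is a placeholder; replace it with the real item selector.
  return Array.from(document.querySelectorAll('.product'))
    .map(el => el.textContent.trim());
}
```

A single scroll like this only triggers one batch of lazy loading, so pages that load in several batches still need one of the looping approaches above.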