Comments (4)

kulikalov commented on May 25, 2024

Sorry, @yemd, I cannot reach this library maintainer to get access to publishing updates. I'd recommend building a custom solution using puppeteer instead of using this library.

from headless-chrome-crawler.

michaelpapesch commented on May 25, 2024

This worked for me:

        customCrawl: async (page, crawl) => {
            await page.setViewport({
                width: 1200,
                height: 800
            });
            const result = await crawl();

            await page.evaluate(scrollToBottom);
            await page.waitFor(3000);
            return result;
        },
...
async function scrollToBottom() {
    await new Promise(resolve => {
        const distance = 100; // should be less than or equal to window.innerHeight
        const delay = 100;
        const timer = setInterval(() => {
            document.scrollingElement.scrollBy(0, distance);
            if (document.scrollingElement.scrollTop + window.innerHeight >= document.scrollingElement.scrollHeight) {
                clearInterval(timer);
                resolve();
            }
        }, delay);
    });
}
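
One caveat with the loop above: on a feed that keeps loading new items, the bottom condition may never hold, so the promise never resolves. A minimal sketch of a bounded variant follows. `scrollToBottomBounded` and its `maxSteps` parameter are hypothetical names, not part of headless-chrome-crawler or Puppeteer.

```javascript
// Sketch only: same scrolling idea as scrollToBottom, but it gives up after
// maxSteps scroll steps (a hypothetical cap) instead of looping forever.
// Runs in the page context, so it relies on the browser's document and window.
async function scrollToBottomBounded(maxSteps = 200, distance = 100, delay = 100) {
  const el = document.scrollingElement;
  for (let step = 0; step < maxSteps; step++) {
    el.scrollBy(0, distance);
    if (el.scrollTop + window.innerHeight >= el.scrollHeight) {
      return true; // reached the bottom
    }
    // Pause so lazy-loaded content has a chance to arrive before the next step.
    await new Promise(resolve => setTimeout(resolve, delay));
  }
  return false; // hit the cap before reaching the bottom
}
```

It can be passed to `page.evaluate` the same way, for example `await page.evaluate(scrollToBottomBounded, 300)`; the extra arguments are forwarded to the function in the page.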

ThisNameWasTaken commented on May 25, 2024

Here is how I kept scrolling through a list of lazy-loaded products until the crawler reached the bottom of the page. I hope this helps :)

const productCrawler = await Crawler.launch({
  /*... */
});

await productCrawler.queue({
  url: '...',
  retryCount: 1,
  maxDepth: 3,
  depthPriority: false,
  waitUntil: 'networkidle0',
  jQuery: false,
  waitFor: {
    options: {},
    args: [config], // args for selectorOrFunctionOrTimeout
    selectorOrFunctionOrTimeout: function (config) {
      const documentHeight = document.documentElement.scrollHeight;

      window.scrollTo(0, documentHeight);

      // You might want to check if there are any elements still loading (look for spinners, other indicators, or just wait)
      // Return true if you are done scrolling, false otherwise

      return true; 
    },
  },
});

await productCrawler.onIdle();
await productCrawler.close();

If that does not work, you can always scroll inside the evaluatePage function:

const productCrawler = await Crawler.launch({
  // ...
  evaluatePage: eval(`() => {
    const documentHeight = document.documentElement.scrollHeight;

    window.scrollTo(0, documentHeight);
  }`),
  // ...
})
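
Note that the `waitFor` predicate in the first snippet returns `true` on its first call, so it scrolls only once. As a rough sketch of the "return true if you are done scrolling" idea from its comments, the predicate below keeps scrolling each time it is polled and reports done only after the document height has stopped growing for `quietMs` milliseconds. The names `scrolledToEnd`, `quietMs` and `window.__scrollState` are made up for illustration, and the second parameter `now` exists only to make the function easy to test.

```javascript
// Sketch only: a predicate meant to be polled in the page until it returns true.
// quietMs (hypothetical) is how long the height must stay unchanged.
function scrolledToEnd(quietMs, now = Date.now()) {
  const height = document.documentElement.scrollHeight;
  // Keep state between polls on a made-up window property.
  const state = window.__scrollState || (window.__scrollState = { height: -1, since: now });
  if (height !== state.height) {
    // First poll, or the page grew: remember the height and restart the quiet timer.
    state.height = height;
    state.since = now;
  }
  // Scroll to the current bottom to trigger any further lazy loading.
  window.scrollTo(0, height);
  // Done once the height has been stable for at least quietMs.
  return now - state.since >= quietMs;
}
```

Assuming the crawler passes `args` through to the predicate as in the snippet above, it could be plugged in as `selectorOrFunctionOrTimeout: scrolledToEnd` with `args: [2000]`. The right quiet period depends on how slowly the site loads its items.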

a1sabau commented on May 25, 2024

Take a look at the get-set-fetch infinite scrolling example. It may prove a viable alternative.
Disclaimer: I'm the repo owner.
