Giter Site home page Giter Site logo

coder-hxl / x-crawl Goto Github PK

View Code? Open in Web Editor NEW
1.2K 13.0 68.0 40.89 MB

Flexible Node.js AI-assisted crawler library

Home Page: https://coder-hxl.github.io/x-crawl/

License: MIT License

JavaScript 1.89% TypeScript 98.11%
crawl crawler nodejs typescript spider flexible puppeteer javascript multifunction chromium

x-crawl's Introduction

x-crawl · npm NPM Downloads GitHub license

English | 简体中文

x-crawl is a flexible Node.js AI-assisted crawler library. Flexible usage and powerful AI assistance functions make crawler work more efficient, intelligent and convenient.

It consists of two parts:

  • Crawler: It consists of a crawler API and various functions that can work normally even without relying on AI.
  • AI: Currently based on the large AI model provided by OpenAI, AI simplifies many tedious operations.

If you find x-crawl helpful, or you like x-crawl, you can give x-crawl repository a like on GitHub A star. Your support is the driving force for our continuous improvement! thank you for your support!

Features

  • 🤖 AI Assistance - Powerful AI assistance function makes crawler work more efficient, intelligent and convenient.
  • 🖋️ Flexible writing - A single crawling API is suitable for multiple configurations, and each configuration method has its own advantages.
  • ⚙️Multiple uses - Supports crawling dynamic pages, static pages, interface data and file data.
  • ⚒️ Control page - Crawling dynamic pages supports automated operations, keyboard input, event operations, etc.
  • 👀 Device Fingerprinting - Zero configuration or custom configuration to avoid fingerprint recognition to identify and track us from different locations.
  • 🔥 Asynchronous Sync - Asynchronous or synchronous crawling mode without switching crawling API.
  • ⏱️ Interval crawling - no interval, fixed interval and random interval, determine whether to crawl with high concurrency.
  • 🔄 Failed Retry - Customize the number of retries to avoid crawling failures due to temporary problems.
  • ➡️ Rotation proxy - Automatic proxy rotation with failed retries, custom error times and HTTP status codes.
  • 🚀 Priority Queue - Based on the priority of a single crawl target, it can be crawled ahead of other targets.
  • 🧾 Crawl information - Controllable crawl information, which will output colored string information in the terminal.
  • 🦾 TypeScript - Own types and implement complete types through generics.

AI assisted crawler

With the rapid development of network technology, website updates have become more frequent, and changes in class names or structures often bring considerable challenges to crawlers that rely on these elements. Against this background, crawlers combined with AI technology have become a powerful weapon to meet this challenge.

First of all, changes in class names or structures after website updates may cause traditional crawler strategies to fail. This is because crawlers often rely on fixed class names or structures to locate and extract the required information. Once these elements change, the crawler may not be able to accurately find the required data, thus affecting the effectiveness and accuracy of data crawling.

However, crawlers combined with AI technology are better able to cope with this change. AI can also understand and parse the semantic information of web pages through natural language processing and other technologies to more accurately extract the required data.

To sum up, crawlers combined with AI technology can better cope with the problem of class name or structure changes after website updates.

Example

The combination of crawler and AI allows the crawler and AI to obtain pictures of high-rated vacation rentals according to our instructions:

import { createCrawl, createCrawlOpenAI } from 'x-crawl'

//Create a crawler application
const crawlApp = createCrawl({
  maxRetry: 3,
  intervalTime: { max: 2000, min: 1000 }
})

//Create AI application
const crawlOpenAIApp = createCrawlOpenAI({
  clientOptions: { apiKey: process.env['OPENAI_API_KEY'] },
  defaultModel: { chatModel: 'gpt-4-turbo-preview' }
})

// crawlPage is used to crawl pages
crawlApp.crawlPage('https://www.airbnb.cn/s/select_homes').then(async (res) => {
  const { page, browser } = res.data

  // Wait for the element to appear on the page and get the HTML
  const targetSelector = '[data-tracking-id="TOP_REVIEWED_LISTINGS"]'
  await page.waitForSelector(targetSelector)
  const highlyHTML = await page.$eval(targetSelector, (el) => el.innerHTML)

  // Let the AI get the image link and de-duplicate it (the more detailed the description, the better)
  const srcResult = await crawlOpenAIApp.parseElements(
    highlyHTML,
    `Get the image link, don't source it inside, and de-duplicate it`
  )

  browser.close()

  // crawlFile is used to crawl file resources
  crawlApp.crawlFile({
    targets: srcResult.elements.map((item) => item.src),
    storeDirs: './upload'
  })
})

Can even send the whole HTML to the AI to help us operate, because the website content is more complex you also need to describe the location to get more accurately, and will consume a lot of Tokens.

Even if the subsequent update of the website causes the class name or structure to change, it can climb to the data normally, because we no longer rely on the fixed class name or structure to locate and extract the required information, but let the AI understand and parse the semantic information of the web page, so as to extract the required data more efficiently, intelligently and conveniently.

Procedure:

Pictures of highly rated vacation rentals climbed to:

See the HTML that the AI needs to process

For ease of viewing, it is formatted here

<div data-pageslot="true" class="c1yo0219 dir dir-ltr">
  <div class="c121p4jg dir dir-ltr" data-reactroot="">
    <div
      aria-describedby="carousel-description"
      aria-labelledby="carousel-label"
      class="s7q4c1d rd7fm2t dir dir-ltr"
      role="group"
    >
      <div class="hztl681 dir dir-ltr" id="carousel-label">
        <div class="htgr43m dir dir-ltr">
          <div class=" dir dir-ltr">
            <h2 class="h1436ahv dir dir-ltr">威奇托的高评分度假屋</h2>
            <p class="bqwnmiz swd4c9o dir dir-ltr">
              这些房源在位置、干净卫生等方面获得房客的一致好评。
            </p>
          </div>
        </div>
      </div>
      <div class="dbldy2s dir dir-ltr" id="carousel-description">
        显示 12 项中的 4 项
      </div>
      <div class="c18vjgz6 dir dir-ltr">
        <div class="cfexzqx dir dir-ltr">
          <span class="a8jt5op dir dir-ltr" aria-live="polite"
            >第 1 页,共 3 页</span
          >
          <div aria-hidden="false" class="_1uytoza">
            <span class="a8jt5op dir dir-ltr">第 1 页,共 3 页</span>
            <div dir="ltr">
              <div aria-hidden="true" class="_1l1vk8w">1 / 3</div>
            </div>
            <div class="_jro6t0">
              <button
                aria-label="上一张"
                type="button"
                class="l1ovpqvx c1e0qvzg dir dir-ltr"
              >
                <span class="ifnd39z dir dir-ltr"
                  ><span class="_krjbj">_</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;fill:none;height:12px;width:12px;stroke:currentColor;stroke-width:4;overflow:visible"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill="none"
                      d="M20 28 8.7 16.7a1 1 0 0 1 0-1.4L20 4"
                    ></path></svg
                ></span></button
              ><span class="_pog3hg"></span
              ><button
                aria-label="下一张"
                type="button"
                class="l1ovpqvx c1e0qvzg dir dir-ltr"
              >
                <span class="ifnd39z dir dir-ltr"
                  ><span class="_krjbj">_</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;fill:none;height:12px;width:12px;stroke:currentColor;stroke-width:4;overflow:visible"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill="none"
                      d="m12 4 11.3 11.3a1 1 0 0 1 0 1.4L12 28"
                    ></path></svg
                ></span>
              </button>
            </div>
          </div>
        </div>
      </div>
      <div class="s1yvqyx7 dir dir-ltr">
        <div class="c14whb16 dir dir-ltr">
          <div aria-hidden="false" class=" dir dir-ltr">
            <a
              aria-label="农家乐 | Mulvane"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=45937791&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-45937791/original/c67d32ed-21eb-4066-8cef-650dcd45bada.jpeg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-45937791/original/c67d32ed-21eb-4066-8cef-650dcd45bada.jpeg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-45937791/original/c67d32ed-21eb-4066-8cef-650dcd45bada.jpeg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-45937791/original/c67d32ed-21eb-4066-8cef-650dcd45bada.jpeg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-45937791/original/c67d32ed-21eb-4066-8cef-650dcd45bada.jpeg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-45937791/original/c67d32ed-21eb-4066-8cef-650dcd45bada.jpeg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/miso/Hosting-45937791/original/c67d32ed-21eb-4066-8cef-650dcd45bada.jpeg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  农家乐 | Mulvane
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.98 分(满分 5 分),共 168 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.98 (168)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  带私人热水浴缸和早餐的小木屋度假屋3
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Stay in one of our three private cabins, each equipped with
                  its own two person hot tub on the back deck for you to enjoy
                  under the stars. You will also have breakfast delivered to
                  your cabin at the time you choose in the morning(s). Each
                  cabin offers a pillow-top Queen bed, mini-fridge, microwave,
                  coffee maker, thermostat-controlled gas fireplace, A/C unit,
                  shower, cable TV, and a DVD player. All of that situated on
                  24+ beautiful acres complete with a pond and walking paths
                  through the woods.
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">8月8日至15日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥1,170</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥1,170</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="false" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="Loft | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=29753115&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/52d375d3-5e54-444b-8186-15e61a592d9a.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/52d375d3-5e54-444b-8186-15e61a592d9a.jpg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/52d375d3-5e54-444b-8186-15e61a592d9a.jpg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/52d375d3-5e54-444b-8186-15e61a592d9a.jpg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/52d375d3-5e54-444b-8186-15e61a592d9a.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/52d375d3-5e54-444b-8186-15e61a592d9a.jpg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/52d375d3-5e54-444b-8186-15e61a592d9a.jpg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  Loft | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.9 分(满分 5 分),共 710 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.9 (710)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  Wichita市中心现代、明亮的乐趣阁楼
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Welcome to the heart of downtown Wichita, KS, affectionately
                  referred to as the ICT by locals. Parking right out front, on
                  Douglas Ave in front of the private entrance. A gorgeous airy
                  loft with 10’ ceilings and a huge skylight tunnel. Two
                  bedrooms with full bathrooms, fully stocked kitchen with all
                  appliances. Full sized washer and dryer...all supplies
                  provided. Enjoy the private rooftop patio, hop on the free
                  trolly service, The Q, or walk to a plethora of restaurants,
                  bars or attractions.
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">10月31日至11月7日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥862</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥862</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="false" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="民居 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=26764727&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/4ce87a7c-cbce-4e6e-97ea-38840518e1c4.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/4ce87a7c-cbce-4e6e-97ea-38840518e1c4.jpg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/4ce87a7c-cbce-4e6e-97ea-38840518e1c4.jpg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/4ce87a7c-cbce-4e6e-97ea-38840518e1c4.jpg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/4ce87a7c-cbce-4e6e-97ea-38840518e1c4.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/4ce87a7c-cbce-4e6e-97ea-38840518e1c4.jpg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/4ce87a7c-cbce-4e6e-97ea-38840518e1c4.jpg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  民居 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.92 分(满分 5 分),共 250 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.92 (250)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  整套带复古飞机主题的小房子
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  With a nod to Wichita's rich aviation history and focus on
                  care and quality in every detail; enjoy our fun and
                  interesting little house! Wichita has it all: shopping, zoo,
                  bike path with bike rentals, breweries, coffee shops, amazing
                  restaurants, music and sport venues, public parks, food
                  trucks, museums, and thriving businesses - we think you will
                  find our space executive ready, vacation prepared, and
                  comfortable! We hope you enjoy this amazing city and perfect
                  little home as much as we do!
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">1月18日至25日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥709</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥709</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="false" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="民居 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=545779178592886885&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/3f6b8ce1-df9b-4624-94e0-b63ec54b7fe4.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/3f6b8ce1-df9b-4624-94e0-b63ec54b7fe4.jpg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/3f6b8ce1-df9b-4624-94e0-b63ec54b7fe4.jpg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/3f6b8ce1-df9b-4624-94e0-b63ec54b7fe4.jpg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/3f6b8ce1-df9b-4624-94e0-b63ec54b7fe4.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/3f6b8ce1-df9b-4624-94e0-b63ec54b7fe4.jpg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/3f6b8ce1-df9b-4624-94e0-b63ec54b7fe4.jpg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  民居 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.89 分(满分 5 分),共 292 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.89 (292)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  舒适的双★卧室★步行至河畔
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  We offer Full kitchen with essentials, ideal for families who
                  like to cook at home and like to save!\n*Special note, Home is
                  available for booking 7/15-8/4, 2024. Must book for entire
                  time for great discount! Message me if interested. \n\n5
                  minutes from I-35 highway, 6 min to WSU, 12 minutes from ICT
                  Airport, five minutes to Interest Bank Arena! \n\nThis
                  beautiful home is within walking distance of Arkansas River,
                  trails, parks. \nWalk and see amazing wildlife on trail right
                  by Arkansas River.
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">9月18日至25日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥546</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥546</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="true" class=" dir dir-ltr">
            <a
              aria-label="民居 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=661881998531696630&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              tabindex="-1"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-661881998531696630/original/c7f7769f-e56c-4d55-8e74-06fdaf3e048d.jpeg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-661881998531696630/original/c7f7769f-e56c-4d55-8e74-06fdaf3e048d.jpeg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-661881998531696630/original/c7f7769f-e56c-4d55-8e74-06fdaf3e048d.jpeg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-661881998531696630/original/c7f7769f-e56c-4d55-8e74-06fdaf3e048d.jpeg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-661881998531696630/original/c7f7769f-e56c-4d55-8e74-06fdaf3e048d.jpeg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-661881998531696630/original/c7f7769f-e56c-4d55-8e74-06fdaf3e048d.jpeg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/miso/Hosting-661881998531696630/original/c7f7769f-e56c-4d55-8e74-06fdaf3e048d.jpeg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  民居 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.88 分(满分 5 分),共 128 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.88 (128)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  历史悠久的德拉诺区休闲度假胜地
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Enjoy this spacious, relaxing home, centrally located, 5
                  minutes from downtown, 8 minutes from the ICT
                  airport.\n\nGuests will enjoy the main floor of this two story
                  house-turned-duplex, which includes two bedrooms with queen
                  beds, one full bath off the master bedroom, cozy living room,
                  and dining area with kitchenette.\n\nEnjoy the relaxing front
                  porch with a swing and comfortable chairs in this historic
                  neighborhood.\n\nThere are two dogs in the upstairs apartment
                  that may make minimal noise.
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">5月26日至6月2日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥743</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥743</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="true" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="乡村小屋 | Clearwater"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=50620715&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              tabindex="-1"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-50620715/original/650ba8af-3f77-41ce-8c93-0cf502a8656d.jpeg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-50620715/original/650ba8af-3f77-41ce-8c93-0cf502a8656d.jpeg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-50620715/original/650ba8af-3f77-41ce-8c93-0cf502a8656d.jpeg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-50620715/original/650ba8af-3f77-41ce-8c93-0cf502a8656d.jpeg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-50620715/original/650ba8af-3f77-41ce-8c93-0cf502a8656d.jpeg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-50620715/original/650ba8af-3f77-41ce-8c93-0cf502a8656d.jpeg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/miso/Hosting-50620715/original/650ba8af-3f77-41ce-8c93-0cf502a8656d.jpeg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  乡村小屋 | Clearwater
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.99 分(满分 5 分),共 137 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.99 (137)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  全新装修乡村小屋乡村度假屋
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Clearview Cottage is a quiet country home just 13 miles from
                  Eisenhower Airport and 20 minutes from downtown Wichita. This
                  fully renovated home has one bedroom and one bathroom and is
                  ideal for romantic getaways and business travelers. Outdoor
                  spaces include a large front porch to watch the sunset and
                  explore the stars at night. Located on a working farm, you
                  will experience the sights and sounds of rural life and
                  perhaps find some farm fresh eggs to enjoy!
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">1月19日至26日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥535</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥535</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="true" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="联排别墅 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=40405228&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              tabindex="-1"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/b899a44f-e5dd-4ee8-9116-13a5c79fb3d6.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/b899a44f-e5dd-4ee8-9116-13a5c79fb3d6.jpg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/b899a44f-e5dd-4ee8-9116-13a5c79fb3d6.jpg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/b899a44f-e5dd-4ee8-9116-13a5c79fb3d6.jpg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/b899a44f-e5dd-4ee8-9116-13a5c79fb3d6.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/b899a44f-e5dd-4ee8-9116-13a5c79fb3d6.jpg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/b899a44f-e5dd-4ee8-9116-13a5c79fb3d6.jpg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  联排别墅 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.96 分(满分 5 分),共 132 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.96 (132)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  Boho Bliss in College Hill by Indigo Moon Homes
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Introducing the newest Indigo Moon Property! This charming
                  twin home is professionally designed and staged by Indigo Moon
                  Homes and is walking distance to all of College Hill's most
                  popular eateries, shopping, and bars. Wesley Med Center and
                  WSU are both a short drive away and the complimentary Q-line
                  trolley is two blocks away. From fine linens and furnishings
                  to a fully equipped kitchen, we strive to make your stay
                  enjoyable!
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">5月19日至26日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥645</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥645</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="true" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="民居 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=51309116&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              tabindex="-1"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/a2820abe-20bc-4898-a0ee-17f3c974158b.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/a2820abe-20bc-4898-a0ee-17f3c974158b.jpg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/a2820abe-20bc-4898-a0ee-17f3c974158b.jpg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/a2820abe-20bc-4898-a0ee-17f3c974158b.jpg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/a2820abe-20bc-4898-a0ee-17f3c974158b.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/a2820abe-20bc-4898-a0ee-17f3c974158b.jpg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/a2820abe-20bc-4898-a0ee-17f3c974158b.jpg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  民居 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.93 分(满分 5 分),共 256 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.93 (256)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  舒适房源,距离市中心5分钟车程
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Enjoy a curated, comfort oriented experience at this
                  centrally-located Wichita home. Minutes from Friends and
                  Newman University, less than 10 minutes from the airport and
                  downtown. \n\nYou’ll love the spacious kitchen access,
                  comfortable living space, and beautiful bedrooms with queen
                  size beds and black out curtains for optimal
                  rest.\n\nContactless check in. You’ll receive a custom check
                  in code the day of your arrival.
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">10月14日至21日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥1,068</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥1,068</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="true" class=" dir dir-ltr">
            <a
              aria-label="客房 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=21364903&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              tabindex="-1"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/1f55a7c1-021f-4eb5-8e35-6473e16d7fef.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/1f55a7c1-021f-4eb5-8e35-6473e16d7fef.jpg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/1f55a7c1-021f-4eb5-8e35-6473e16d7fef.jpg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/1f55a7c1-021f-4eb5-8e35-6473e16d7fef.jpg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/1f55a7c1-021f-4eb5-8e35-6473e16d7fef.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/1f55a7c1-021f-4eb5-8e35-6473e16d7fef.jpg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/1f55a7c1-021f-4eb5-8e35-6473e16d7fef.jpg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  客房 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.99 分(满分 5 分),共 100 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.99 (100)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  树屋
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Quaint ‘treehouse’, guesthouse over a garage. Includes 3/4
                  bath with walk-in shower, washer and dryer, kitchen/living
                  space, and deck that overlooks the backyard. Walking distance
                  to all of Delano, Riverfront Stadium and Keeper of the
                  Plains.\nLess than a half mile from Wichita’s best slice of
                  Pizza at Picasso’s, best dive bar The Shamrock with live music
                  on weekends, and many more local treasures.
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">6月25日至7月2日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥683</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥683</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="true" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="袖珍小屋 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=24893823&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              tabindex="-1"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/5205dac7-dd2a-4f91-8027-a4c0e52b4fae.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/5205dac7-dd2a-4f91-8027-a4c0e52b4fae.jpg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/5205dac7-dd2a-4f91-8027-a4c0e52b4fae.jpg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/5205dac7-dd2a-4f91-8027-a4c0e52b4fae.jpg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/5205dac7-dd2a-4f91-8027-a4c0e52b4fae.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/5205dac7-dd2a-4f91-8027-a4c0e52b4fae.jpg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/5205dac7-dd2a-4f91-8027-a4c0e52b4fae.jpg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd movpp6a f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">超赞房东</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  超赞房东
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  袖珍小屋 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.87 分(满分 5 分),共 800 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.87 (800)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  小巷小房子:距离老城区5个街区!
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Our "Little House on the Alley" is a relaxing escape from the
                  hotel scene or sharing a room in someone's home. The little
                  house is all yours! At only 320 square feet you can move
                  around from room to room quite easily, but at the same time it
                  has everything you need for a weekend getaway or a short term
                  stay. And the best part? You are just 5 blocks to Old Town
                  Entertainment District!
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">8月1日至8日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥589</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥589</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="true" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="民居 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=792178978933830608&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              tabindex="-1"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-792178978933830608/original/75a7613c-e435-45fb-9db4-e4163921254b.jpeg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-792178978933830608/original/75a7613c-e435-45fb-9db4-e4163921254b.jpeg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-792178978933830608/original/75a7613c-e435-45fb-9db4-e4163921254b.jpeg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-792178978933830608/original/75a7613c-e435-45fb-9db4-e4163921254b.jpeg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/miso/Hosting-792178978933830608/original/75a7613c-e435-45fb-9db4-e4163921254b.jpeg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/miso/Hosting-792178978933830608/original/75a7613c-e435-45fb-9db4-e4163921254b.jpeg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/miso/Hosting-792178978933830608/original/75a7613c-e435-45fb-9db4-e4163921254b.jpeg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  民居 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.88 分(满分 5 分),共 102 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.88 (102)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  位于威奇塔( Wichita )中心的舒适房源
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Beautiful cozy home in the heart of Wichita in a quiet
                  neighborhood. This home is moments away from Wesley Hospital,
                  popular restaurants, bars, and grocery stores. This home is
                  fully furnished with washer and dryer and all appliances,
                  equipped with the fastest fiber internet. Pet friendly. Fenced
                  backyard for the 4 legged children. Our goal is to accommodate
                  our guest to the fullest.
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">3月16日至23日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥1,094</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥1,094</span>
                </div>
              </div></a
            >
          </div>
          <div aria-hidden="true" class="sjvtr3d dir dir-ltr">
            <a
              aria-label="民居 | 威奇托(Wichita)"
              class="cbkobxh dir dir-ltr"
              href="/s/homes?dynamic_product_ids%5B%5D=32114722&amp;omni_page_id=36021&amp;place_id=ChIJLRh_0mrbuocRPj3TdL_VlpM"
              target="_blank"
              rel="noreferrer"
              data-nosnippet="true"
              tabindex="-1"
              ><picture class="p1h9elqm dir dir-ltr"
                ><source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/5e755fa0-74a5-400c-b33d-427e56f84330.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/5e755fa0-74a5-400c-b33d-427e56f84330.jpg?im_w=720 2x
                  "
                  media="(min-width: 744px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/5e755fa0-74a5-400c-b33d-427e56f84330.jpg?im_w=480 1x,
                    https://z1.muscache.cn/im/pictures/5e755fa0-74a5-400c-b33d-427e56f84330.jpg?im_w=960 2x
                  "
                  media="(min-width: 375px)" />
                <source
                  class="s1yu4uxa dir dir-ltr"
                  srcset="
                    https://z1.muscache.cn/im/pictures/5e755fa0-74a5-400c-b33d-427e56f84330.jpg?im_w=320 1x,
                    https://z1.muscache.cn/im/pictures/5e755fa0-74a5-400c-b33d-427e56f84330.jpg?im_w=720 2x
                  " />
                <img
                  src="https://z1.muscache.cn/im/pictures/5e755fa0-74a5-400c-b33d-427e56f84330.jpg?aki_policy=large"
                  loading="lazy"
                  alt=""
                  class="iotvkpj dir dir-ltr"
              /></picture>
              <div class="b14un2cd g12fuxtb f9iqyua dir dir-ltr">
                <span class="a8jt5op dir dir-ltr">房客推荐</span>
                <div aria-hidden="true" class="t1qa5xaj dir dir-ltr">
                  房客推荐
                </div>
              </div>
              <div class="cbheav2 dir dir-ltr">
                <h3
                  class="t1jojoys n1nue62c dir dir-ltr"
                  data-testid="listing-card-title"
                >
                  民居 | 威奇托(Wichita)
                </h3>
                <span class="r4a59j5 dir dir-ltr"
                  ><span class="a8jt5op dir dir-ltr"
                    >平均评分 4.99 分(满分 5 分),共 153 条评价</span
                  ><svg
                    xmlns="http://www.w3.org/2000/svg"
                    viewBox="0 0 32 32"
                    style="display:block;height:12px;width:12px;fill:currentColor"
                    aria-hidden="true"
                    role="presentation"
                    focusable="false"
                  >
                    <path
                      fill-rule="evenodd"
                      d="m15.1 1.58-4.13 8.88-9.86 1.27a1 1 0 0 0-.54 1.74l7.3 6.57-1.97 9.85a1 1 0 0 0 1.48 1.06l8.62-5 8.63 5a1 1 0 0 0 1.48-1.06l-1.97-9.85 7.3-6.57a1 1 0 0 0-.55-1.73l-9.86-1.28-4.12-8.88a1 1 0 0 0-1.82 0z"
                    ></path></svg
                  ><span aria-hidden="true" class="ru0q88m dir dir-ltr"
                    >4.99 (153)</span
                  ></span
                >
                <p
                  class="n1nue62c f1m3zhlw t6mzqp7 dir dir-ltr"
                  data-testid="listing-card-name"
                >
                  ICT Farmhouse-Charming Two Bedroom Home
                </p>
                <p class="s4ddf4l n1nue62c f1m3zhlw dir dir-ltr">
                  Welcome to the ICT Farmhouse! (ICT=Wichita in Airline Lingo)
                  This charming, newly remodeled two bedroom home is located on
                  the western edge of historic Delano Neighborhood. Its
                  proximity to the Downtown District, Century II Convention
                  Center &amp; many wedding venues makes it the perfect place to
                  stay for business or pleasure!
                </p>
                <p class="n1nue62c f1m3zhlw dir dir-ltr">4月28日至5月5日</p>
                <div class="cyjufhz p8bhhzl f1m3zhlw dir dir-ltr">
                  <span aria-hidden="true" class="piy2wzv dir dir-ltr"
                    >¥670</span
                  ><span aria-hidden="true">&nbsp;\x3C!-- -->/晚</span
                  ><span class="s14ffc1j dir dir-ltr">每晚 ¥670</span>
                </div>
              </div></a
            >
          </div>
        </div>
      </div>
    </div>
  </div>
</div>
View the srcResult (img url) returned by AI after parsing the HTML according to our instructions
{
  "elements": [
    {
      "src": "https://z1.muscache.cn/im/pictures/miso/Hosting-45937791/original/c67d32ed-21eb-4066-8cef-650dcd45bada.jpeg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/df3493cf-39b2-46cc-9e85-7ef186980f25.jpg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/52d375d3-5e54-444b-8186-15e61a592d9a.jpg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/4ce87a7c-cbce-4e6e-97ea-38840518e1c4.jpg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/miso/Hosting-661881998531696630/original/c7f7769f-e56c-4d55-8e74-06fdaf3e048d.jpeg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/miso/Hosting-50620715/original/650ba8af-3f77-41ce-8c93-0cf502a8656d.jpeg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/b899a44f-e5dd-4ee8-9116-13a5c79fb3d6.jpg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/a2820abe-20bc-4898-a0ee-17f3c974158b.jpg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/1f55a7c1-021f-4eb5-8e35-6473e16d7fef.jpg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/5205dac7-dd2a-4f91-8027-a4c0e52b4fae.jpg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/miso/Hosting-792178978933830608/original/75a7613c-e435-45fb-9db4-e4163921254b.jpeg?im_w=720"
    },
    {
      "src": "https://z1.muscache.cn/im/pictures/bafaacfa-1644-4a3b-9165-bcd831924cc6.jpg?im_w=720"
    }
  ],
  "type": "multiple"
}

warning: x-crawl is for legal use only. Any illegal activity using this tool is prohibited. Please be sure to comply with the robots.txt file regulations of the target website. This example is only used to demonstrate the use of x-crawl and is not targeted at a specific website.

Document

x-crawl latest version documentation:

English | 简体中文

x-crawl v9 documentation:

English | 简体中文

x-crawl's People

Contributors

coder-hxl avatar dependabot[bot] avatar imgbotapp avatar renovate[bot] avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

x-crawl's Issues

crawlPage setting proxy option does not work

Bug expectation

crawlPage 设置 proxy 选项无效
环境: window10 操作系统, node 16,npm v8.19
我使用 clash 作为代理,在不开系统代理的情况下,crawlData工作良好,但 crawlPage 的 proxy 配置无效,只走系统代理

代码如下

import xCrawl from 'x-crawl'

// Application instance configuration
const testXCrawl = xCrawl({
  proxy: {
    urls: [
      'https://127.0.0.1:7890'
    ],
    switchByErrorCount: 1,
    switchByHttpStatus: [401, 403]
  }
})

// Advanced configuration
testXCrawl
  .crawlPage({
    targets: [
      'https://www.google.com',
      {
        url: 'https://www.google.com',
        proxy: { urls: ['http://127.0.0.1:7890'] }
      }
    ],
    maxRetry: 3,
    proxy: {
      urls: [
        'http://127.0.0.1:7890',
        'http://127.0.0.1:7890'
      ],
      switchByErrorCount: 1,
      switchByHttpStatus: [401, 403]
    }
  })
  .then((res) => {})

考虑到 switchByHttpStatus 可能会影响
我修改了 x-crawl index.mjs 第 270 行,加了下列代码
if (status === null) { result = true; }

Error string

no error

x-crawl version

7.0.1

Node version

16.18

Package manager

npm

Package manager version

8.19

建议 crawlFile 的选项参数可支持字符串或数组

这个特性解决了什么问题?

从某个接口批量下载数据是很常见的需求, 为方便给每次下载设置个性化的参数, 希望crawlFile的所有配置参数能支持传入数组. 当传入的是字符串常量时, 则统一按此常量执行, 当传入的是数组是则按索引取相应的值.

建议在其他api的设计中也考虑这种形式, 以便更加灵活地操作使用。

提议的 API 是什么样的?

const CODES = ['NTES', 'BIDU', 'JD'];
await myXCrawl.crawlFile({
  url: CODES.map((d) => `https://query1.finance.yahoo.com/v7/finance/download/${d}`),
  storeDir: './public/data/financial', // 常量时表示全部下载都是保存在这个目录, 但也允许支持数组
  fileName: CODES,  // 支持数组
  extension: '.csv',  // 所有文件的扩展名都是csv , 但也应允许支持数组
});

crawData 请求结果问题

Bug 预期

crawlData 请求数据,请求配置:

 {
  url: 'https://api.github.com/repos/xxx/xxx/releases',
  method: 'GET',
  headers: {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36',
    'X-GitHub-Api-Version': '2022-11-28',
    Accept: 'application/json',
    Authorization: 'Bearer tokenxxx'
  },
  params: { per_page: 1, page: 1 },
  data: {},
  timeout: 60000,
  maxRetry: 0,
  proxy: ''
}

返回错误:

{
  id: 1,
  isSuccess: false,
  maxRetry: 0,
  retryCount: 0,
  proxyDetails: [],
  crawlErrorQueue: [
    TypeError: The "chunk" argument must be of type string or an instance of Buffer or Uint8Array. Received an instance of Object
        ... {
      code: 'ERR_INVALID_ARG_TYPE'
    }
  ],
  data: null
}

查看源码发现

res.on('data', (chunk) => container.push(chunk))

这里对数据返回结果的处理,对于字符串格式数据不适用,字符串格式数据拼接处理即可。

最小可重复的例子

const crawler = Crawl();
const options = {
      'url': 'https://api.github.com/repos/xxx/xxx/releases',
      'method': 'GET',
      'headers': {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36',
        'X-GitHub-Api-Version': '2022-11-28',
        'Accept': 'application/json',
  'Authorization': 'Bearer tokenxxx'
      },
      'params': {
        'per_page': 1,
        'page': 1
      },
      'data': {},
      'timeout': 30000,
      'maxRetry': 0
    };
const res = await crawler.crawlData(options);

报错信息

TypeError: The "chunk" argument must be of type string or an instance of Buffer or Uint8Array. Received an instance of Object

x-crawl 版本

7.1.2

Node 版本

v18.6.0

包管理器

pnpm

包管理器版本

8.6.5

windows正常的功能在Linux无结果无报错

Bug 预期

希望能够正常返回数据

最小可重复的例子

我window上能正常使用的功能,在Linux上获取不到结果,也不报错,日志根据提示是完成的,我是通过fastify起api,debian 11 

fastify.post('/api/screenshoot', async function handler(request, reply) {
    const { url } = request.body
    if (!url) {
        return reply.send({ code: 0, msg: "url is required" })
    }
    try {
        const buffer = await screenshoot({ url })
        // 在发送非常规数据时。一定一定要指定响应数据类型
        reply.type("image/jpeg")
        return buffer
    } catch (error) {

    }
})

// screenshoot model
const res = await x.crawlPage({
            url,
            maxRetry: 10,
            viewport: { width: 1920, height: 1080 },
            // 为此次的目标统一设置指纹
            fingerprints: [
                // 设备指纹 1
                {
                    maxWidth: 1024,
                    maxHeight: 800,
                    platform: 'Windows',
                    mobile: 'random',
                    userAgent: {
                        value:
                            'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36',
                        versions: [
                            {
                                name: 'Chrome',
                                // 浏览器版本
                                maxMajorVersion: 112,
                                minMajorVersion: 100,
                                maxMinorVersion: 20,
                                maxPatchVersion: 5000
                            },
                            {
                                name: 'Safari',
                                maxMajorVersion: 537,
                                minMajorVersion: 500,
                                maxMinorVersion: 36,
                                maxPatchVersion: 5000
                            }
                        ]
                    }
                },
                // 设备指纹 2
                {
                    platform: 'Windows',
                    mobile: 'random',
                    userAgent: {
                        value:
                            'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 Edg/91.0.864.59',
                        versions: [
                            {
                                name: 'Chrome',
                                maxMajorVersion: 91,
                                minMajorVersion: 88,
                                maxMinorVersion: 10,
                                maxPatchVersion: 5615
                            },
                            { name: 'Safari', maxMinorVersion: 36, maxPatchVersion: 2333 },
                            { name: 'Edg', maxMinorVersion: 10, maxPatchVersion: 864 }
                        ]
                    }
                },
                // 设备指纹 3
                {
                    platform: 'Windows',
                    mobile: 'random',
                    userAgent: {
                        value:
                            'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0',
                        versions: [
                            {
                                name: 'Firefox',
                                maxMajorVersion: 47,
                                minMajorVersion: 43,
                                maxMinorVersion: 10,
                                maxPatchVersion: 5000
                            }
                        ]
                    }
                }
            ]
        })

        const { browser, page } = res.data
        // // Get a screenshot of the rendered page
        const buffer = await page.screenshot({ path: `../uploads/${host}_${Date.now()}.png` })
        console.log('Screen capture is complete')

        if (buffer) {
            page.close()
        }
        // close brower
        // browser.close()

        return buffer

报错信息

无报错,无结果返回

x-crawl 版本

latest

Node 版本

20.9.0

包管理器

pnpm

包管理器版本

latest

xCrawl.crawlFile 函数不能完美的兼容linux

Bug 预期

https://github.com/dext7r/puppeteer/blob/master/src/utils/bilibili.ts#L45

image

在window下是可以直接新建這個目錄并写入的。
但是在linux报错如下

Run pnpm task

> @dext7r/[email protected] task /home/runner/work/puppeteer/puppeteer
> npx ts-node ./src/serve

Start crawling - name: page, mode: async, total: 1 
Id: 1 - Crawl does not need to sleep, send immediately
Crawl the final result:
  Success - total: 1, ids: [ 1 ]
    Error - total: 0, ids: [  ]
Start crawling - name: file, mode: async, total: 9 
Id: 1 - Crawl does not need to sleep, send immediately
Id: 2 - Crawl does not need to sleep, send immediately
Id: 3 - Crawl does not need to sleep, send immediately
Id: 4 - Crawl does not need to sleep, send immediately
Id: 5 - Crawl does not need to sleep, send immediately
Id: [6](https://github.com/dext7r/puppeteer/actions/runs/4793322343/jobs/8525664251#step:6:7) - Crawl does not need to sleep, send immediately
Id: [7](https://github.com/dext7r/puppeteer/actions/runs/4793322343/jobs/8525664251#step:6:8) - Crawl does not need to sleep, send immediately
Id: [8](https://github.com/dext7r/puppeteer/actions/runs/4793322343/jobs/8525664251#step:6:9) - Crawl does not need to sleep, send immediately
Id: [9](https://github.com/dext7r/puppeteer/actions/runs/4793322343/jobs/8525664251#step:6:10) - Crawl does not need to sleep, send immediately
Error: ENOENT: no such file or directory, mkdir
    at Object.mkdirSync (node:fs:1396:3)
    at /home/runner/work/puppeteer/puppeteer/node_modules/.pnpm/[email protected][email protected]/node_modules/x-crawl/dist/index.js:46:[10](https://github.com/dext7r/puppeteer/actions/runs/4793322343/jobs/8525664251#step:6:11)
    at Array.reduce (<anonymous>)
    at mkdirDirSync (/home/runner/work/puppeteer/puppeteer/node_modules/.pnpm/[email protected][email protected]/node_modules/x-crawl/dist/index.js:43:[12](https://github.com/dext7r/puppeteer/actions/runs/4793322343/jobs/8525664251#step:6:13))
    at fileSingleResultHandle (/home/runner/work/puppeteer/puppeteer/node_modules/.pnpm/[email protected][email protected]/node_modules/x-crawl/dist/index.js:951:7)
    at /home/runner/work/puppeteer/puppeteer/node_modules/.pnpm/[email protected][email protected]/node_modules/x-crawl/dist/index.js:[13](https://github.com/dext7r/puppeteer/actions/runs/4793322343/jobs/8525664251#step:6:14)5:[21](https://github.com/dext7r/puppeteer/actions/runs/4793322343/jobs/8525664251#step:6:22)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Promise.all (index 3) {
  errno: -2,
  syscall: 'mkdir',
  code: 'ENOENT'
}
Error: The operation was canceled.

最小可重复的例子

import xCrawl from 'x-crawl';
import path from 'path';
import fs from 'fs/promises';

const bilibiliXCrawl = xCrawl({ mode: 'async' });
export function getCurrentDate(separator: string = '-') {


  // 调用 crawlFile API 爬取图片
  await bilibiliXCrawl.crawlFile({
    targets: [...urls],
    // storeDir: `./bilibili/${getCurrentDate('/')}`,
    storeDir,
  });
  // 关闭页面
  page.close();

  // 关闭浏览器
  browser.close();
}

报错信息

Error: ENOENT: no such file or directory, mkdir

x-crawl 版本

"x-crawl": "^6.0.1"

Node 版本

18

包管理器

pnpm

包管理器版本

7.13.2

请问点击事件如何做呢

这个特性解决了什么问题?

比如我想爬一下油管的直播,但是保存的屏幕截图,有一个按钮,但是没法点击。有没有可以点击的操作

提议的 API 是什么样的?

click函数

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

github-actions
.github/workflows/dependency-review.yml
  • actions/checkout v4
  • actions/dependency-review-action v4
.github/workflows/deploy.yml
  • actions/checkout v4
  • pnpm/action-setup v3
  • actions/setup-node v4
  • actions/configure-pages v5
  • actions/upload-pages-artifact v3
  • actions/deploy-pages v4
.github/workflows/greetings.yml
  • actions/first-interaction v1
npm
docs/package.json
  • vitepress ^1.0.2
package.json
  • chalk 5.3.0
  • https-proxy-agent ^7.0.4
  • openai ^4.33.0
  • ora ^8.0.1
  • puppeteer 22.5.0
  • @babel/core ^7.24.0
  • @babel/preset-env ^7.24.0
  • @rollup/plugin-babel ^6.0.4
  • @rollup/plugin-run ^3.0.2
  • @rollup/plugin-terser ^0.4.4
  • @types/node ^20.11.28
  • @typescript-eslint/eslint-plugin ^7.2.0
  • @typescript-eslint/parser ^7.2.0
  • @vitest/coverage-v8 ^1.4.0
  • @vitest/ui ^1.4.0
  • eslint ^9.0.0
  • prettier ^3.2.5
  • rollup ^4.13.0
  • rollup-plugin-typescript2 ^0.36.0
  • typescript 5.4.4
  • vitest ^1.4.0
  • node >=18.0.0
publish/package.json
  • chalk 5.3.0
  • https-proxy-agent ^7.0.4
  • openai ^4.33.0
  • ora ^8.0.1
  • puppeteer 22.5.0
  • node >=18.0.0

  • Check this box to trigger a request for Renovate to run again on this repository

关于 `crawlFile` API 设计的想法建议

这个特性解决了什么问题?

作者好,感谢提供这样的一个库,非常好用!但我觉得代码中的类型声明部分存在一些冗余和不够灵活的地方,可能会给用户在使用函数时带来一些限制和不便。以crawlFile为例:

1、url 明显是必需的参数,建议将其作为单独的参数更直观和灵活。
2、将 CrawlFileDetailTargetConfig 和 CrawlFileAdvancedConfig 合并为一个更简洁的 CrawlFileConfig 类型,减少重复定义。
3、支持传入单个 URL 或多个 URL,以及为每个 URL 设置单独的存储目录、文件名和扩展名。

我尝试给出ts的类型作为参考,以便清楚地表达我的建议,希望这个库可以越来越火。

还有个小建议, 可以把crawFile改名成fetchFile, 因为 "fetch" 一词表示从网络获取数据的常见操作,结合 "file" 可以清晰地表达其作用。

提议的 API 是什么样的?

export function crawlFile(
  url: string | string[],
  config?: CrawlFileConfig,
  callback?: (result: CrawlFileSingleResult | CrawlFileSingleResult[]) => void
): Promise<CrawlFileSingleResult | CrawlFileSingleResult[]>

export interface CrawlFileConfig extends CrawlCommonConfig {
  outputDir?: string | string[] | null
  extension?: string | string[] | null
  fileName?: string | string[] | null
  intervalTime?: IntervalTime
  fingerprints?: DetailTargetFingerprintCommon[]
  headers?: AnyObject
  onCrawlItemComplete?: (result: CrawlFileSingleResult) => void
  onBeforeSaveItemFile?: (info: {
    id: number
    fileName: string
    filePath: string
    data: Buffer
  }) => Promise<Buffer>
}

crawlData配置问题

Bug 预期

在我使用crawlData爬取接口的时候,配置headers的Content-Type为application/x-www-form-urlencoded,被替换成application/json

最小可重复的例子

const targets = {
    url: "https://glxy.mot.gov.cn/company/getCompanyAptitude.do",
    method: "POST",
    headers: {
      "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
    },
    data: Qs.stringify({ type: 2 }),
  };
  const res = await myXCrawl.crawlData(targets);

报错信息

无报错

x-crawl 版本

7.1.0

Node 版本

18.16.0

包管理器

npm

包管理器版本

9.5.1

请问有跳转的文件下载如何处理

感谢提供这么好用的工具,正在学习中,在使用过程中,有些网站并不提供固定的文件下载地址,而是一个页面后,再跳转(302或返回一段js)直到实际的下载地址,这个地址也是临时的(所以没法事先抓取),请问这种情况如何处理,感谢~

crawlData中data参数为string

Bug 预期

在我传入的参数data为String类型时,isDataUndefine的判断会把data再包一层引号,这样会导致接口参数对不上

最小可重复的例子

const post = {
    url: "https://glxy.mot.gov.cn/company/getCompanyAptitude.do",
    method: "post",
    headers: {
      "Content-Type": "application/x-www-form-urlencoded",
    },
    data: "page=1",
  };
  let res = await myXCrawl.crawlData(post);

报错信息

无报错

x-crawl 版本

7.1.1

Node 版本

18.16.0

包管理器

npm

包管理器版本

9.5.1

创建实例时候 报错 TypeError: (0 , x_crawl_1.default) is not a function

Bug 预期

执行爬虫实例化时
const myXCrawl = xCrawl();

报错

TypeError: (0 , x_crawl_1.default) is not a function

最小可重复的例子

import xCrawl from 'x-crawl';

  async crawl(path: string) {
    // 2.创建一个爬虫实例
    const myXCrawl = xCrawl();
    myXCrawl.crawlPage(path).then(res => {
      const { browser, page } = res.data;
      console.log(page);
      // 关闭浏览器
      browser.close();
    });
  }

报错信息

ERROR 389 TypeError: (0 , x_crawl_1.default) is not a function

x-crawl 版本

8.3.1

Node 版本

16.19.1

包管理器

npm

包管理器版本

9.6.1

crawlData 的请求参数传递有问题

Bug 预期

crawlData 请求数据,使用 CrawlDataDetailTargetConfigparams 传递请求参数,发现实际请求并未携带参数。

查看源码发现 此处params 参数处理为 search,然后在 此处 传递给 https.request() 方法进行请求。

查看 Node.js 文档,http.request() 的解释 HTTP #httprequestoptions-callback | Node.js v18.16.0 Documentation,请求参数应该是拼接在 path 后。

最小可重复的例子

const pageSize = 30;
    const sort = 'time';
    const rating = 'all';
    const filter = 'all';
    const mode = 'list';
    const url = `https://${type}.${this.BASE_URL}/people/${uid}/${status}`;
    const request: CrawlDataDetailTargetConfig = {
      url,
      method: "GET",
      params: {
        start: (page - 1) * pageSize,
        sort,
        rating,
        filter,
        mode,
      },
      data: {},
      timeout: 30000,
      maxRetry: 0
    };

    const crawler = Crawl({
      enableRandomFingerprint: true
    });
    const res = await crawler.crawlData(request);

报错信息

无报错

x-crawl 版本

7.0.0

Node 版本

18.15.0

包管理器

npm

包管理器版本

8.3.1

这个是不能在centos服务器上使用吗?安装依赖的时候,无头浏览器一直安装不成功

Bug 预期

安装依赖会有一下报错
20230510111428
跳过报错后,运行不起来
微信图片_20230510134126

最小可重复的例子

npm i x-crawl

报错信息

ERROR: Failed to set up Chrome r113.0.5672.63! Set "PUPPETEER_SKIP_DOWNLOAD" env variable to skip download. npm ERR! Error: Download failed: server returned code 404. URL: https://npm.taobao.org/mirrors/113.0.5672.63/linux64/chrome-linux64.zip

x-crawl 版本

7.0.1

Node 版本

16.18.1

包管理器

npm

包管理器版本

8.19.2

crawlPage 爬取多个 link 时, 返回结果是数组, 但是不知道每个结果对应的原始 url

这个特性解决了什么问题?

避免重复爬取某个 page, 看下面这个例子

const ress = await myXCrawl.crawlPage({
    targets: ['https://docs.aave.com/portal/'],
    viewport: { width: 1920, height: 1080 },
    intervalTime: { max: 2000, min: 1000 },
    maxRetry: 3,
});

for (const res of ress) {
    const { browser, page } = res.data;
    console.log(page.url());
    await page.waitForNetworkIdle();
    const content = await page.content();
    console.log(page.url());
}

https://docs.aave.com/portal/ 会跳转到 https://docs.aave.com/hub/
通过 page.url() 获取的是调整后的 link, 这样我就无法根据 link 来判断一个page是否已经被爬取过了

提议的 API 是什么样的?

返回结果提供原始的 link

pnpm 安装依赖报错

Bug 预期

mac系统
报错信息

node_modules/.pnpm/[email protected]/node_modules/puppeteer: Running postinstall script, failed in 5.2s
.../node_modules/puppeteer postinstall$ node install.js
│ ERROR: Failed to set up Chromium r1108766! Set "PUPPETEER_SKIP_DOWNLOAD" env variable to skip download.
│ Error: read ECONNRESET
│     at TLSWrap.onStreamRead (node:internal/stream_base_commons:217:20) {
│   errno: -54,
│   code: 'ECONNRESET',
│   syscall: 'read'
│ }
└─ Failed in 5.2s

最小可重复的例子

pnpm i xxx

报错信息

node_modules/.pnpm/[email protected]/node_modules/puppeteer: Running postinstall script, failed in 5.2s .../node_modules/puppeteer postinstall$ node install.js │ ERROR: Failed to set up Chromium r1108766! Set "PUPPETEER_SKIP_DOWNLOAD" env variable to skip download. │ Error: read ECONNRESET │ at TLSWrap.onStreamRead (node:internal/stream_base_commons:217:20) { │ errno: -54, │ code: 'ECONNRESET', │ syscall: 'read' │ } └─ Failed in 5.2s

x-crawl 版本

最新

Node 版本

v18.6.0

包管理器

pnpm

包管理器版本

7.14.2

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.