apify / fingerprint-suite

Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

License: Apache License 2.0

JavaScript 21.09% TypeScript 77.07% HTML 1.17% Shell 0.48% Dockerfile 0.20%
fingerprinting playwright puppeteer scraping typescript

fingerprint-suite's Introduction

fingerprint-suite is a handcrafted assembly of tools for browser fingerprint generation and injection. Today's websites are increasingly using fingerprinting to track users and identify them. With the help of fingerprint-suite you can generate and inject browser fingerprints into your browser, allowing you to fly your scrapers under the radar.

Would you like to work with us on our fingerprinting tools or similar projects? We are hiring!

Overview

fingerprint-suite is a modular toolkit for browser fingerprint generation and injection. It consists of the following npm packages, which you can use separately, or together:

  • header-generator: generates configurable, realistic HTTP headers
  • fingerprint-generator: generates realistic browser fingerprints, affecting the HTTP headers and browser JS APIs
  • fingerprint-injector: injects browser fingerprints into your Playwright or Puppeteer managed browser instance
  • generative-bayesian-network: our fast implementation of a Bayesian generative network used to generate realistic browser fingerprints

Quick start

The following example shows how to use the fingerprinting tools to camouflage your Playwright-managed Chromium instance.

import { chromium } from 'playwright';
import { newInjectedContext } from 'fingerprint-injector';

(async () => {
    const browser = await chromium.launch({ headless: false });
    const context = await newInjectedContext(
        browser,
        {
            // Constraints for the generated fingerprint (optional)
            fingerprintOptions: {
                devices: ['mobile'],
                operatingSystems: ['ios'],
            },
            // Playwright's newContext() options (optional, random example for illustration)
            newContextOptions: {
                geolocation: {
                    latitude: 51.50853,
                    longitude: -0.12574,
                }
            }
        },
    );

    const page = await context.newPage();
    // ... your code using `page` here
})();

Here is the same example using Puppeteer:

import puppeteer from 'puppeteer';
import { newInjectedPage } from 'fingerprint-injector';

(async () => {
    const browser = await puppeteer.launch({ headless: false });
    const page = await newInjectedPage(
        browser,
        {
            // constraints for the generated fingerprint
            fingerprintOptions: {
                devices: ['mobile'],
                operatingSystems: ['ios'],
            },
        },
    );

    // ... your code using `page` here
    await page.goto('https://example.com');
})();

Performance

As antibot fingerprinting services keep improving, we benchmark fingerprint-suite against some of the industry-leading ones. The following report shows how the latest build of fingerprint-suite compares with other popular fingerprinting tools.

Fingerprinting Benchmark Report

Support

If you find any bug or issue with any of the fingerprinting tools, please submit an issue on GitHub. For questions, you can ask on Stack Overflow or contact [email protected]

Contributing

Your code contributions are welcome and you'll be praised for eternity! If you have any ideas for improvements, either submit an issue or create a pull request. For contribution guidelines and the code of conduct, see CONTRIBUTING.md.

License

This project is licensed under the Apache License 2.0 - see the LICENSE.md file for details.

fingerprint-suite's People

Contributors

b4nan, barjin, fnesveda, honzajavorek, jacobmoyle, marcplouhinec, petrpatek, renovate[bot], szmarczak, vladfrangu

fingerprint-suite's Issues

Automatic model generation/updates

Currently, @Equidem generates the models on his machine from the collected data. We should do this automatically every month so that each release ships smoothly with updated data.

The end game should look like this:

  1. Actor generates the models
  2. Actor creates a PR with the new models
  3. If there are no conflicts, the PR is auto-merged
  4. Beta is released
  5. Latest is released manually after checking that everything worked well - once we gain confidence in the process, we can automate this too.

Internationalization API is not injected with locale

Describe the bug
The ECMAScript Internationalization API is not affected by the injected fingerprint's locale.

To Reproduce
See @mnmkng 's gist.

Expected behavior
The Intl methods should return values consistent with the injected locale.
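
A minimal sketch of how the mismatch could be checked, assuming it runs inside the injected page (or any JavaScript runtime). The helper name `compareLocales` is hypothetical and not part of fingerprint-suite; it only compares the locale a fingerprint claims with the one the Intl API resolves by default.

```javascript
// Hypothetical helper (not part of fingerprint-suite): compares the locale a
// fingerprint claims with the locale the Intl API resolves by default.
function compareLocales(
    expectedLocale,
    resolvedLocale = new Intl.DateTimeFormat().resolvedOptions().locale,
) {
    // Canonicalize both tags so that 'en-us' and 'en-US' compare as equal.
    const [expected] = Intl.getCanonicalLocales(expectedLocale);
    const [resolved] = Intl.getCanonicalLocales(resolvedLocale);
    return { expected, resolved, consistent: expected === resolved };
}

// In a page, the expected value would typically be `navigator.language`.
const localeReport = compareLocales('cs-CZ');
console.log(localeReport);
```

If `consistent` is false while `navigator.language` reports the injected locale, the Intl API is still exposing the real one.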

System information:

  • OS: [Linux Mint 20.2 Cinnamon]
  • Node.js version [16.15.1]

Additional context
Originally brought up by @mnmkng in Slack.

Show results masking fingerprint when test in pixelscan

Describe the bug
When I use Fingerprint Injector with the fingerprint below and test it on pixelscan.net, it shows "Very likely you are masking your fingerprint."

image

To Reproduce
Use these options with FingerprintGenerator and test on pixelscan.net:

    const pluginOptions = {
        launchOptions: {
            headless: false,
            channel: 'chrome',
        },
    };

    const playwrightPlugin = new PlaywrightPlugin(playwright.chromium, pluginOptions);
    const fingerprintGenerator = new FingerprintGenerator({
        devices: ['desktop'],
        browsers: [{ name: 'chrome', minVersion: 100, maxVersion: 100 }],
        operatingSystems: ['macos'],
    });

Expected behavior
The pixelscan.net test passes.

image

System information:
I use FingerprintGenerator. This is the generated fingerprint:

{
  fingerprint: {
    screen: {
      availHeight: 875,
      availWidth: 1440,
      pixelDepth: 24,
      height: 900,
      width: 1440,
      availTop: 25,
      availLeft: 0,
      colorDepth: 24,
      innerHeight: 0,
      outerHeight: 858,
      outerWidth: 1284,
      innerWidth: 0,
      screenX: 0,
      pageXOffset: 0,
      pageYOffset: 0,
      devicePixelRatio: 2,
      clientWidth: 0,
      clientHeight: 0,
      hasHDR: false
    },
    navigator: {
      userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',
      userAgentData: [Object],
      language: 'en-US',
      languages: [Array],
      platform: 'MacIntel',
      deviceMemory: 8,
      hardwareConcurrency: 8,
      maxTouchPoints: 0,
      product: 'Gecko',
      productSub: '20030107',
      vendor: 'Google Inc.',
      vendorSub: null,
      doNotTrack: null,
      appCodeName: 'Mozilla',
      appName: 'Netscape',
      appVersion: '5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',
      oscpu: null,
      extraProperties: [Object],
      webdriver: false
    },
    audioCodecs: {
      ogg: 'probably',
      mp3: 'probably',
      wav: 'probably',
      m4a: 'maybe',
      aac: 'probably'
    },
    videoCodecs: { ogg: 'probably', h264: 'probably', webm: 'probably' },
    pluginsData: { plugins: [Array], mimeTypes: [Array] },
    battery: {
      charging: false,
      chargingTime: null,
      dischargingTime: 36660,
      level: 0.88
    },
    videoCard: {
      vendor: 'Google Inc. (ATI Technologies Inc.)',
      renderer: 'ANGLE (ATI Technologies Inc., AMD Radeon Pro 455 OpenGL Engine, OpenGL 4.1)'
    },
    multimediaDevices: { speakers: [Array], micros: [Array], webcams: [Array] },
    fonts: []
  },
  headers: {
    'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="100", "Google Chrome";v="100"',
    'sec-ch-ua-mobile': '?0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36',
    accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'same-site',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US'
  }
}

Additional context
I tested other configurations, but the result was the same.

Thanks, team.

Fingerprint generator v2

Fingerprint generator v2 will have a new fingerprint structure with 20+ attributes and browser features.

Error: Could not override property: oscpu on [object Navigator]

Describe the bug
I'm getting the following error on page loads in some websites:
VM277:49 Could not override property: oscpu on [object Navigator]. Reason: Cannot read properties of undefined (reading 'get')

Related to the following code from utils.js:

/**
 * @param instance Instance to override.
 * @param overrideObj New instance values.
 */
// eslint-disable-next-line no-unused-vars
function overrideInstancePrototype(instance, overrideObj) {
  Object.keys(overrideObj).forEach((key) => {
    if (!(overrideObj[key] === null)) {
      try {
        overrideGetterWithProxy(Object.getPrototypeOf(instance), key, makeHandler().getterValue(overrideObj[key]));
      }
      catch (e) {
        console.error(`Could not override property: ${key} on ${instance}. Reason: ${e.message} `);
      }
    }
  });
}
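
The error message suggests that the property descriptor for `oscpu` is undefined on the prototype: `oscpu` is a Firefox-only property, so a Chromium `Navigator` prototype has no getter to wrap. The sketch below illustrates one possible guard under that assumption. `FakeNavigator` and `overrideExistingGetters` are hypothetical names, and the sketch replaces getters with `Object.defineProperty` rather than the Proxy-based `overrideGetterWithProxy` used in utils.js.

```javascript
// Stand-in for a Chromium Navigator: it has a `platform` getter but no `oscpu`.
class FakeNavigator {
    get platform() {
        return 'MacIntel';
    }
}

// Hypothetical guard: override only properties that already exist as getters
// on the prototype, and report the ones that were skipped.
function overrideExistingGetters(instance, overrideObj) {
    const proto = Object.getPrototypeOf(instance);
    const skipped = [];
    for (const [key, value] of Object.entries(overrideObj)) {
        if (value === null) continue; // same null check as the original function
        const descriptor = Object.getOwnPropertyDescriptor(proto, key);
        if (!descriptor || typeof descriptor.get !== 'function') {
            skipped.push(key); // nothing to wrap, so do not touch it
            continue;
        }
        Object.defineProperty(proto, key, { ...descriptor, get: () => value });
    }
    return skipped;
}

const fakeNavigator = new FakeNavigator();
const skippedKeys = overrideExistingGetters(fakeNavigator, {
    platform: 'Win32',
    oscpu: 'Windows NT 10.0; Win64; x64',
    vendorSub: null,
});
console.log(fakeNavigator.platform, skippedKeys);
```

Skipping the property also avoids adding `oscpu` to a browser that would never expose it, which could itself be a detectable inconsistency.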

To Reproduce
Load https://allpeople.com/.

Expected behavior
No console errors.

System information:

  • OS: MacOS
  • Node.js version v16.14.0
  • Puppeteer v15.5.0
  • Latest fingerprint-generator & fingerprint-injector libs

Getting exception after upgrading to latest version

Getting this exception after upgrading to latest version:

TypeError: Cannot convert undefined or null to object
    at Function.keys (<anonymous>)
    at BayesianNode.sample (.../node_modules/generative-bayesian-network/bayesian-node.js:79:39)
    at BayesianNetwork.generateSample (.../node_modules/generative-bayesian-network/bayesian-network.js:41:42)
    at FingerprintGenerator.getFingerprint (.../node_modules/fingerprint-generator/fingerprint-generator.js:35:62)
    ...

To Reproduce
Upgrade fingerprint-generator from v2.0.1 to v2.0.2

Expected behavior
No exception in patch upgrade

System information:

  • OS: MacOS
  • Node.js version: v16.14.0

Code change in overrideCodecs method is causing bot detection

Code change in overrideCodecs method from fingerprint-injector v2 is causing bot detection.
Using fingerprint-injector v2 with the code from v1 works just fine.

To Reproduce
Load nuwber.com

Expected behavior
Website should load, but instead it blocks the session with anti-bot error.

System information:

  • OS: MacOS
  • Node.js v16.14.0

Additional context
Here's the affected code (v2 on the left):

image

No typescript definitions

The distributed package on npm doesn't include any .d.ts files, so TypeScript typings are unavailable. Could these be added? I could also look into submitting a PR if that's preferred.

Plugin fingerprint

Describe the feature
Every time I generate a new fingerprint, amiunique.org shows the same plugins list. It would be better if the list varied, to help fly under the radar.

Fingerprint generator: does not generate iOS fingerprint

Describe the bug
I get this error:

Uncaught Error: No headers based on this input can be generated. Please relax or change some of the requirements you specified.

when I try to generate a fingerprint with mobile + ios + safari input. I also tried mobile + ios only and mobile + safari only.
The error occurs every time.

To Reproduce

const { FingerprintGenerator } = require('fingerprint-generator');

const fpGenerator = new FingerprintGenerator();

const fingerprint = fpGenerator.getFingerprint({
  devices: ['mobile'],
  browsers: ['safari'],
  operatingSystems: ['ios'],
});

Expected behavior
A valid fingerprint instead of an error.

System information:

  • Lib version: 2.1.1
  • OS: MacOS 13.0 (arm64)
  • Node.js version v16.17.1

Is there any solution?

Generate appropriate headers for HTTP/1 and HTTP/2

Currently it's generated like this:

https://github.com/apify/got-scraping/blob/b9127634e2a426e3a9ee609207db0ba01d5621a5/src/hooks/browser-headers.ts#L60-L69

so it generates 2 different devices for 2 different protocols. header-generator should expose a function that would first pick a device and then generate headers for both protocols (for that specific device).

Also sometimes it generates connection: close for HTTP/2 which is invalid.

Because HTTP/2 is mostly used and HTTP/3 is getting more and more popular I think we should consider removing devices that are HTTP/1-only.
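
A rough sketch of the proposed "pick a device first, then derive both header sets" idea. The function names are hypothetical and this is not header-generator's API; the list of connection-specific fields that HTTP/2 forbids follows RFC 9113. The HTTP/1 variant keeps the input unchanged, since realistic per-browser header casing is out of scope here.

```javascript
// Connection-specific header fields that must not be sent over HTTP/2 (RFC 9113).
const HTTP2_FORBIDDEN_HEADERS = new Set([
    'connection',
    'keep-alive',
    'proxy-connection',
    'transfer-encoding',
    'upgrade',
]);

// Lowercases header names and drops the fields that are invalid in HTTP/2.
function toHttp2Headers(headers) {
    const result = {};
    for (const [name, value] of Object.entries(headers)) {
        const lower = name.toLowerCase();
        if (HTTP2_FORBIDDEN_HEADERS.has(lower)) continue;
        result[lower] = value;
    }
    return result;
}

// Hypothetical entry point: one device's headers in, both protocol variants out.
function headersForBothProtocols(deviceHeaders) {
    return {
        http1: { ...deviceHeaders },
        http2: toHttp2Headers(deviceHeaders),
    };
}

const bothProtocols = headersForBothProtocols({
    'User-Agent': 'Mozilla/5.0 (example)',
    'Connection': 'close',
    'Upgrade-Insecure-Requests': '1',
});
console.log(bothProtocols);
```

Because both variants come from the same input object, they always describe the same device, and `connection: close` can no longer leak into the HTTP/2 set.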

Periodical auto deploys.

It would be nice to have a GitHub Action that releases the package with refreshed data every month.

Platform distribution in generative networks

Recently, I have come across complaints (ty @B4nan, @AndreyBykov ) about suspicious screen dimensions in the fingerprint-injected browsers.
Those complaints were mainly about generating vertical screen dimensions for desktop devices. While there are real-life situations when a desktop computer can have a tall screen (using a vertical display, for example), it really shouldn't be frequent. The following experiments might hint at more serious problems with the way the generative networks work.

The following charts show how the distribution differs between training data and generated fingerprints.

Platform distribution (not a problem anymore, just check out how well it works)

The culprit was the devices: ['desktop', 'mobile'] setting: the generator generates only desktop fingerprints by default.

pie
  title Platform distribution in collected data
  "Win32" : 0.4627
  "MacIntel" : 0.2874
  "Linux x86_64" : 0.1413
  "iPhone": 0.0428
  "Linux armv8l": 0.0327
  "Linux aarch64": 0.0247
  "Linux armv7l": 0.006
  "iPad": 0.0014
  "Linux armv81": 0.0005
  "Linux": 0.0002
  "OpenBSD i386": 0.0001
  "PlayStation 4": 0.0001
  "Windows": 0.0001
pie
  title Platform distribution in generated data
  "Win32": 4609
  "MacIntel": 2918
  "Linux x86_64": 1339
  "iPhone": 451
  "Linux armv8l": 337
  "Linux aarch64": 254
  "Linux armv7l": 72
  "Linux armv81": 4
  "iPad": 13
  "Linux": 2
  "Windows": 1

Note how the generated distribution is skewed towards desktop platforms. This leads to a problem while trying to generate mobile platforms with "mobile" OSs - fg.getFingerprint({operatingSystems: ['android','ios'] }) ends with Error: No headers based on this input can be generated.
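
One way to put numbers on the comparison between the two charts above is to normalize the generated counts into shares and subtract the collected shares. This is only an illustrative sketch using the figures from the charts; the helper names are made up for this example.

```javascript
// Platform shares in the collected data (copied from the first chart).
const collectedShares = {
    'Win32': 0.4627, 'MacIntel': 0.2874, 'Linux x86_64': 0.1413,
    'iPhone': 0.0428, 'Linux armv8l': 0.0327, 'Linux aarch64': 0.0247,
    'Linux armv7l': 0.006, 'iPad': 0.0014, 'Linux armv81': 0.0005,
    'Linux': 0.0002, 'OpenBSD i386': 0.0001, 'PlayStation 4': 0.0001,
    'Windows': 0.0001,
};

// Platform counts in the generated data (copied from the second chart).
const generatedCounts = {
    'Win32': 4609, 'MacIntel': 2918, 'Linux x86_64': 1339,
    'iPhone': 451, 'Linux armv8l': 337, 'Linux aarch64': 254,
    'Linux armv7l': 72, 'Linux armv81': 4, 'iPad': 13,
    'Linux': 2, 'Windows': 1,
};

// Converts absolute counts into shares that sum to 1.
function toShares(counts) {
    const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
    return Object.fromEntries(
        Object.entries(counts).map(([platform, n]) => [platform, n / total]),
    );
}

// Returns actual minus expected share for every platform seen in either set.
function shareDifferences(expected, actual) {
    const platforms = new Set([...Object.keys(expected), ...Object.keys(actual)]);
    const differences = {};
    for (const platform of platforms) {
        differences[platform] = (actual[platform] ?? 0) - (expected[platform] ?? 0);
    }
    return differences;
}

const generatedShares = toShares(generatedCounts);
const platformDifferences = shareDifferences(collectedShares, generatedShares);

// Print the platforms whose share moved the most, in either direction.
const largestShifts = Object.entries(platformDifferences)
    .sort(([, a], [, b]) => Math.abs(b) - Math.abs(a))
    .slice(0, 3);
for (const [platform, diff] of largestShifts) {
    console.log(platform, diff.toFixed(4));
}
```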

Vertical screen distribution (no real problem here anymore either, just some cool charts :) )

A vertical screen was detected in 11.77 % of collected samples.

pie
  title Platform distribution - vertical screen (in collected data)
  "iPhone": 0.36363636363636365
  "Linux armv8l": 0.27442650807136787
  "Linux aarch64": 0.20815632965165676
  "MacIntel": 0.055225148683092605
  "Linux armv7l": 0.05097706032285471
  "Linux x86_64": 0.025488530161427356
  "iPad": 0.0118946474086661
  "Win32": 0.005097706032285472
  "Linux armv81": 0.004248088360237893
  "Linux": 0.0008496176720475786

A vertical screen was detected in 11.98 % of generated data.

pie
  title Platform distribution - vertical screen (in generated data)
  "iPhone": 451
  "Linux armv8l": 331
  "Linux aarch64": 251
  "Linux armv7l": 70
  "MacIntel": 81
  "Linux armv81": 4
  "Win32": 6
  "Linux x86_64": 27
  "iPad": 13
  "Linux": 2

While the results of this experiment don't show any specific problem, problems do appear here and there (see the example in the comments). Without having looked much into the Bayesian network internals, I wonder whether there might be a problem with the fingerprint preprocessing.

CC @petrpatek @Equidem do you guys have any idea why this might be happening?

Edit: I didn't know how the generator works, the examples show no real problems now :)

Canvas and WebGL

Describe the feature
There are a lot of different fingerprints out there, but Canvas and WebGL fingerprints are not covered. For example, if you check at iphey.com, you will find the same tokens in both cases.

So it would be great to add them to fill the whole list of fingerprints... Or are there any other fingerprints that aren't handled?

Desktop devices sometimes get a too-small screen resolution

When I need fingerprints for desktop devices only, the generator sometimes produces a screen resolution that is too small, for example 800x600px, and the website then renders the page as it would for a tablet. Could you please add an option to set a minimum resolution?

Example of generated output:

{
   "fingerprint":{
      "screen":{
         "availHeight":800,
         "availWidth":600,
         "pixelDepth":24,
         "height":800,
         "width":600
      },
      "webGl":{
         "vendor":"Google Inc.",
         "renderer":"Google SwiftShader"
      },
      "audioCodecs":{
         "ogg":"probably",
         "mp3":"probably",
         "wav":"probably",
         "m4a":"",
         "aac":""
      },
      "videoCodecs":{
         "ogg":"probably",
         "h264":"",
         "webm":"probably"
      },
      "pluginsData":{
         
      },
      "navigator":{
         "cookieEnabled":true,
         "doNotTrack":"1",
         "language":"en-US",
         "languages":[
            "en-US"
         ],
         "platform":"Linux x86_64",
         "deviceMemory":8,
         "hardwareConcurrency":8,
         "productSub":"20030107",
         "vendor":"Google Inc.",
         "maxTouchPoints":0
      },
      "batteryData":{
         "level":0.25,
         "chargingTime":322,
         "dischargingTime":null
      },
      "userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.67 Safari/537.36"
   }
}
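
Until such an option exists, one possible workaround is to regenerate until the screen is large enough. The sketch below is an assumption-laden illustration: `generateWithMinScreen` is a hypothetical helper, and `generate` stands for any function returning an object shaped like the generator output (with `fingerprint.screen.width` and `fingerprint.screen.height`), for example a wrapper around `getFingerprint()`.

```javascript
// Hypothetical helper: calls `generate` until the screen meets the minimum
// size, or gives up after `maxAttempts` tries.
function generateWithMinScreen(generate, { minWidth, minHeight }, maxAttempts = 50) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const candidate = generate();
        const { width, height } = candidate.fingerprint.screen;
        if (width >= minWidth && height >= minHeight) return candidate;
    }
    throw new Error(
        `No fingerprint with a screen of at least ${minWidth}x${minHeight} after ${maxAttempts} attempts`,
    );
}

// Stub generator for illustration only: it cycles through two fixed screens.
const sampleScreens = [
    { width: 600, height: 800 },
    { width: 1920, height: 1080 },
];
let nextScreenIndex = 0;
const stubGenerate = () => ({
    fingerprint: { screen: sampleScreens[nextScreenIndex++ % sampleScreens.length] },
});

const picked = generateWithMinScreen(stubGenerate, { minWidth: 1024, minHeight: 768 });
console.log(picked.fingerprint.screen);
```

Note that rejecting samples this way changes the distribution of the remaining attributes slightly, because small screens correlate with particular devices.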

Header generator v2

There are not going to be any changes to the headers structure. There are only new features improving the dev experience, which are prepared. We also need to accommodate an updated dataset.

mimeTypes and plugins spoof in Firefox.

Describe the bug
fingerprint-injector does not spoof plugins in Firefox.

To Reproduce
Use fingerprint-injector with its default settings.

Expected behavior
Spoof plugins and mimeTypes

System information:

  • OS: MAC
  • Node.js: 16

Spoof detection

Describe the bug

  1. BotD appears to detect OS spoofing: when I set operatingSystems: ['windows'] while running the browser on my Mac, it detects the spoof.

  2. The Cloudflare JS Challenge detects the browser when userAgent is set in browser.newContext, whether or not the OS is spoofed.

Proof
image

System information:

  • OS: MacOS
  • Node.js version: 18

Creepjs detects it...

Screenshot_40

Hi, CreepJS can detect the real information using workers. Is there any solution for that?

Can not login to google

Describe the bug
Something is wrong with the codecs, and I cannot log in to Google with fingerprint-injector.

To Reproduce

const { chromium } = require('playwright-core');
const { FingerprintGenerator } = require('fingerprint-generator');
const { FingerprintInjector } = require('fingerprint-injector');

(async () => {
  const fingerprintGenerator = new FingerprintGenerator();
  const fingerprintInjector = new FingerprintInjector();

  const fingerprint = fingerprintGenerator.getFingerprint({
    devices: ['desktop'],
    browsers: [{ name: "firefox", minVersion: 98 }, { name: 'chrome', minVersion: 98 }],
    locales: ["en-US"],
  });

  const ctx = await chromium.launchPersistentContext('./profile/', {
    headless: false,
    ignoreHTTPSErrors: true,
    viewport: null,
    userAgent: fingerprint.fingerprint.navigator.userAgent,
    locale: fingerprint.fingerprint.navigator.language,
  });

  await fingerprintInjector.attachFingerprintToPlaywright(ctx, fingerprint);

  const page = ctx.pages()[0];
  page.setDefaultNavigationTimeout(0);

  const navigationPromise = page.waitForNavigation({ waitUntil: "domcontentloaded", });
  await page.goto('https://accounts.google.com/signin/v2/identifier?flowName=GlifWebSignIn&flowEntry=ServiceLogin');
  await navigationPromise;
  await page.waitForSelector('input[type="email"]');
  await page.type('input[type="email"]', "your.email");
  await page.click("#identifierNext", { delay: 31 });
  await page.waitForSelector('input[type="password"]', { visible: true });
  await page.type('input[type="password"]', "your.password");
  await page.waitForSelector("#passwordNext", { visible: true });
  await page.click("#passwordNext");
  await navigationPromise;
})();

Expected behavior
Logging in to Google succeeds. It works with the extra stealth plugin.

System information:

  • OS: Win10x64
  • Node.js version: 16

Additional context
3
photo_2022-08-12_11-31-04
photo_2022-08-12_11-31-11

Checked with https://bot.sannysoft.com

In a vanilla Chrome browser it looks like this:
image

What about browser fonts?

I've seen multiple examples of fingerprinting based on the fonts installed in the browser.

Can this be handled by injecting fonts while injecting headers?

Less verbosity

An option to disable the default console logging, or an option to lower the verbosity, would be great.

For example, the message "INFO FingerprintInjector: Successfully initialized." printed on instantiation.

Fingerprint injector v2

Fingerprint injector v2 must accommodate changes to the fingerprint structure as well as find out ways to inject the new attributes in an undetectable way.

Electron.js support

Describe the feature
Please add electron.js support, so that it could have different fingerprints also.

Motivation
Sometimes electron.js is more convenient than Chromium + Puppeteer, especially taking into account that it is 1 installation instead of 2 tools.

Please update the internal puppeteer version..

image

The fingerprint-injector package declares that it requires Puppeteer up to version 15.1. The latest Puppeteer version is now 16.2, I think. Is there any plan to update it? The Puppeteer version mismatch also sometimes causes serious conflicts.

Can this suite hide navigationStart?

I found that some bot detectors read window.performance.timing.navigationStart to find out which browser contexts were created at the same time.

Please help make this property stealthy.
