daijro / browserforge

🎭 Intelligent browser header & fingerprint generator

Home Page: https://pypi.org/project/browserforge

License: Apache License 2.0

Python 100.00%
browsers fingerprinting headers networking playwright python requests scraping

browserforge's Introduction

BrowserForge


What is it?

BrowserForge is a browser header and fingerprint generator that mimics the frequency of different browsers, operating systems, and devices found in the wild.

It is a reimplementation of Apify's fingerprint-suite in Python.

Features

  • Uses a Bayesian generative network to mimic actual web traffic
  • Extremely fast runtime (0.1-0.2 milliseconds)
  • Easy and simple for humans to use
  • Extensive customization options for browsers, operating systems, devices, locales, and HTTP version
  • Written with type safety
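
To make the first feature concrete, here is a toy sketch of sampling from a two-node Bayesian network using only the standard library. The probabilities, the table names, and the `sample_profile` helper are made up for illustration; BrowserForge's real network comes from its model files and covers many more attributes.

```python
import random
from typing import Dict, Optional, Sequence

# Illustrative numbers only; NOT taken from BrowserForge's model files.
BROWSER_PRIOR: Dict[str, float] = {"chrome": 0.65, "safari": 0.20, "firefox": 0.15}
OS_GIVEN_BROWSER: Dict[str, Dict[str, float]] = {
    "chrome": {"windows": 0.7, "macos": 0.2, "linux": 0.1},
    "safari": {"macos": 1.0},
    "firefox": {"windows": 0.6, "linux": 0.3, "macos": 0.1},
}


def _allowed(dist: Dict[str, float], keep: Optional[Sequence[str]]) -> Dict[str, float]:
    """Drop entries excluded by a constraint; None means 'no constraint'."""
    return {k: w for k, w in dist.items() if keep is None or k in keep}


def sample_profile(
    rng: random.Random,
    browsers: Optional[Sequence[str]] = None,
    oses: Optional[Sequence[str]] = None,
) -> Dict[str, str]:
    """Sample (browser, os) from the toy network, conditioned on the constraints."""
    # P(browser | constraints) is proportional to P(browser) * P(allowed os | browser)
    weights: Dict[str, float] = {}
    for name, prior in _allowed(BROWSER_PRIOR, browsers).items():
        mass = sum(_allowed(OS_GIVEN_BROWSER[name], oses).values())
        if mass > 0:
            weights[name] = prior * mass
    if not weights:
        raise ValueError("no browser/OS combination satisfies the constraints")
    browser = rng.choices(list(weights), weights=list(weights.values()))[0]
    os_dist = _allowed(OS_GIVEN_BROWSER[browser], oses)
    os_name = rng.choices(list(os_dist), weights=list(os_dist.values()))[0]
    return {"browser": browser, "os": os_name}


rng = random.Random(0)
print(sample_profile(rng, oses=("linux",)))
```

Conditioning the browser choice on the OS constraint first avoids picking a browser (Safari here) that has no compatible OS left.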

Installation

pip install browserforge[all]
python -m browserforge update

The [all] extra will include optional libraries like orjson.

Use python -m browserforge update to fetch necessary model files. If the command is not run, files will be downloaded on the first import.


Usage

Generating Headers

Simple usage

>>> from browserforge.headers import HeaderGenerator
>>> headers = HeaderGenerator()
>>> headers.generate()
{'sec-ch-ua': '"Chromium";v="122", "Not(A:Brand";v="24", "Google Chrome";v="122"', 'Sec-Ch-Ua-Mobile': '?0', 'Sec-Ch-Ua-Platform': '"Windows"', 'Upgrade-Insecure-Requests': '1', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7', 'Sec-Fetch-Site': '?1', 'Sec-Fetch-Mode': 'same-site', 'Sec-Fetch-User': 'document', 'Sec-Fetch-Dest': 'navigate', 'Accept-Encoding': 'gzip, deflate, br, zstd', 'Accept-Language': 'en-US;q=1.0'}
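
The returned value is a plain dict, so it can be inspected with ordinary Python. As a small sketch, the helpers below (hypothetical names, not part of BrowserForge) check that the Chrome major version in `User-Agent` agrees with the `Chromium` entry in `sec-ch-ua`, using values copied from the output above:

```python
import re
from typing import Dict, Mapping, Optional


def chrome_major_from_ua(user_agent: str) -> Optional[int]:
    """Extract the Chrome major version from a User-Agent string, if present."""
    match = re.search(r"Chrome/(\d+)\.", user_agent)
    return int(match.group(1)) if match else None


def client_hint_versions(sec_ch_ua: str) -> Dict[str, int]:
    """Parse '"Brand";v="N", ...' into {brand: N}."""
    return {brand: int(ver) for brand, ver in re.findall(r'"([^"]+)";v="(\d+)"', sec_ch_ua)}


def ua_matches_client_hints(headers: Mapping[str, str]) -> bool:
    """True if the UA's Chrome major version equals the Chromium client-hint version."""
    lowered = {key.lower(): value for key, value in headers.items()}
    major = chrome_major_from_ua(lowered.get("user-agent", ""))
    hints = client_hint_versions(lowered.get("sec-ch-ua", ""))
    return major is not None and hints.get("Chromium") == major


sample = {
    "sec-ch-ua": '"Chromium";v="122", "Not(A:Brand";v="24", "Google Chrome";v="122"',
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
}
print(ua_matches_client_hints(sample))  # True
```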

Using with requests

Headers can be added to a session in requests (or similar libraries) by assigning them to the headers attribute:

import requests
session = requests.Session()
# Set the session headers
session.headers = headers.generate()
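
As one example of a "similar library", the same dict can be handed to the standard library's `urllib.request`. This is a minimal sketch: `build_request` is a hypothetical helper, and the `generated` dict is a stand-in for `HeaderGenerator().generate()` with values copied from the sample output above. No network request is made.

```python
import urllib.request
from typing import Mapping


def build_request(url: str, generated: Mapping[str, str]) -> urllib.request.Request:
    """Attach a generated header dict to a urllib Request (no I/O happens here)."""
    return urllib.request.Request(url, headers=dict(generated))


# Stand-in for HeaderGenerator().generate()
generated = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
    "Accept-Language": "en-US;q=1.0",
}
request = build_request("https://example.com/", generated)

# urllib stores header names via str.capitalize(), so look them up in that form.
print(request.get_header("Accept-language"))  # en-US;q=1.0
```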

Parameters for HeaderGenerator

Parameters:
    browser (Union[ListOrString, Iterable[Browser]], optional): Browser(s) or Browser object(s).
    os (ListOrString, optional): Operating system(s) to generate headers for.
    device (ListOrString, optional): Device(s) to generate the headers for.
    locale (ListOrString, optional): List of at most 10 languages for the Accept-Language header. Default is 'en-US'.
    http_version (Literal[1, 2], optional): HTTP version to be used to generate headers. Defaults to 2.
    strict (bool, optional): Throws an error if it cannot generate headers based on the input. Defaults to False.

Parameters for HeaderGenerator.generate

Generates headers using the default options and their possible overrides.

Parameters:
    browser (Optional[Iterable[Union[str, Browser]]], optional): Browser(s) to generate the headers for.
    os (Optional[ListOrString], optional): Operating system(s) to generate the headers for.
    device (Optional[ListOrString], optional): Device(s) to generate the headers for.
    locale (Optional[ListOrString], optional): Language(s) to include in the Accept-Language header.
    http_version (Optional[Literal[1, 2]], optional): HTTP version to be used to generate headers.
    user_agent (Optional[ListOrString], optional): User-Agent(s) to use.
    request_dependent_headers (Optional[Dict[str, str]], optional): Known values of request-dependent headers.
    strict (Optional[bool], optional): If true, throws an error if it cannot generate headers based on the input.

Constraining headers

Single constraint

Set constraints for browsers by passing the optional strings below:

headers = HeaderGenerator(
    browser='chrome',
    os='windows',
    device='desktop',
    locale='en-US',
    http_version=2
)

Multiple constraints

Set multiple constraints to select from. Options are selected based on their actual frequency in the wild:

headers = HeaderGenerator(
    browser=('chrome', 'firefox', 'safari', 'edge'),
    os=('windows', 'macos', 'linux', 'android', 'ios'),
    device=('desktop', 'mobile'),
    locale=('en-US', 'en', 'de'),
    http_version=2
)
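
For intuition about what a multi-value `locale` turns into, here is a rough sketch of building an `Accept-Language` value from an ordered list of locales. The `accept_language` helper and the 0.1 step between q-values are assumptions for illustration; only the `en-US;q=1.0` format and the 10-locale limit come from this README.

```python
from typing import Sequence, Union


def accept_language(locales: Union[str, Sequence[str]]) -> str:
    """Join locales into an Accept-Language value with descending q-weights."""
    if isinstance(locales, str):
        locales = [locales]
    if not 1 <= len(locales) <= 10:
        raise ValueError("expected between 1 and 10 locales")
    # Assumed weighting: the first locale gets q=1.0, each later one 0.1 less.
    return ", ".join(
        f"{locale};q={(10 - index) / 10:.1f}" for index, locale in enumerate(locales)
    )


print(accept_language("en-US"))                # en-US;q=1.0
print(accept_language(("en-US", "en", "de")))  # en-US;q=1.0, en;q=0.9, de;q=0.8
```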

Browser specifications

Set specifications for browsers, including version ranges and HTTP version:

from browserforge.headers import Browser

browsers = [
    Browser(name='chrome', min_version=100, max_version=110),
    Browser(name='firefox', max_version=80, http_version=1),
    Browser(name='edge', min_version=95),
]
headers = HeaderGenerator(browser=browsers)

Note that all constraints passed into the HeaderGenerator constructor can be overridden by passing them into the generate method.

Generate headers given User-Agent

Headers can be generated given an existing user agent:

>>> headers.generate(user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36')

Select from multiple User-Agents based on their frequency in the wild:

>>> headers.generate(user_agent=(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0'
))

Generating Fingerprints

Simple usage

Initialize FingerprintGenerator:

from browserforge.fingerprints import FingerprintGenerator
fingerprints = FingerprintGenerator()
fingerprints.generate()

Parameters for FingerprintGenerator

Parameters:
    screen (Screen, optional): Screen constraints for the generated fingerprint.
    strict (bool, optional): Whether to raise an exception if the constraints are too strict. Default is False.
    mock_webrtc (bool, optional): Whether to mock WebRTC when injecting the fingerprint. Default is False.
    slim (bool, optional): Disables performance-heavy evasions when injecting the fingerprint. Default is False.
    **header_kwargs: Header generation options for HeaderGenerator

Parameters for FingerprintGenerator.generate

Generates a fingerprint and a matching set of ordered headers using a combination of the default options specified in the constructor and their possible overrides provided here.

Parameters:
    screen (Screen, optional): Screen constraints for the generated fingerprint.
    strict (bool, optional): Whether to raise an exception if the constraints are too strict.
    mock_webrtc (bool, optional): Whether to mock WebRTC when injecting the fingerprint. Default is False.
    slim (bool, optional): Disables performance-heavy evasions when injecting the fingerprint. Default is False.
    **header_kwargs: Additional header generation options for HeaderGenerator.generate

Example response

Fingerprint(screen=ScreenFingerprint(availHeight=784,
                                     availWidth=1440,
                                     availTop=25,
                                     availLeft=0,
                                     colorDepth=30,
                                     height=900,
                                     pixelDepth=30,
                                     width=1440,
                                     devicePixelRatio=2,
                                     pageXOffset=0,
                                     pageYOffset=0,
                                     innerHeight=0,
                                     outerHeight=718,
                                     outerWidth=1440,
                                     innerWidth=0,
                                     screenX=0,
                                     clientWidth=0,
                                     clientHeight=19,
                                     hasHDR=True),
            navigator=NavigatorFingerprint(userAgent='Mozilla/5.0 (Macintosh; '
                                                     'Intel Mac OS X 10_15_7) '
                                                     'AppleWebKit/537.36 '
                                                     '(KHTML, like Gecko) '
                                                     'Chrome/121.0.0.0 '
                                                     'Safari/537.36',
                                           userAgentData={'architecture': 'arm',
                                                          'bitness': '64',
                                                          'brands': [{'brand': 'Not '
                                                                               'A(Brand',
                                                                      'version': '99'},
                                                                     {'brand': 'Google '
                                                                               'Chrome',
                                                                      'version': '121'},
                                                                     {'brand': 'Chromium',
                                                                      'version': '121'}],
                                                          'fullVersionList': [{'brand': 'Not '
                                                                                        'A(Brand',
                                                                               'version': '99.0.0.0'},
                                                                              {'brand': 'Google '
                                                                                        'Chrome',
                                                                               'version': '121.0.6167.160'},
                                                                              {'brand': 'Chromium',
                                                                               'version': '121.0.6167.160'}],
                                                          'mobile': False,
                                                          'model': '',
                                                          'platform': 'macOS',
                                                          'platformVersion': '13.6.1',
                                                          'uaFullVersion': '121.0.6167.160'},
                                           doNotTrack=None,
                                           appCodeName='Mozilla',
                                           appName='Netscape',
                                           appVersion='5.0 (Macintosh; Intel '
                                                      'Mac OS X 10_15_7) '
                                                      'AppleWebKit/537.36 '
                                                      '(KHTML, like Gecko) '
                                                      'Chrome/121.0.0.0 '
                                                      'Safari/537.36',
                                           oscpu=None,
                                           webdriver=False,
                                           language='en-US',
                                           languages=['en-US'],
                                           platform='MacIntel',
                                           deviceMemory=8,
                                           hardwareConcurrency=10,
                                           product='Gecko',
                                           productSub='20030107',
                                           vendor='Google Inc.',
                                           vendorSub=None,
                                           maxTouchPoints=0,
                                           extraProperties={'globalPrivacyControl': None,
                                                            'installedApps': [],
                                                            'isBluetoothSupported': False,
                                                            'pdfViewerEnabled': True,
                                                            'vendorFlavors': ['chrome']}),
            headers={'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
                     'Accept-Encoding': 'gzip, deflate, br',
                     'Accept-Language': 'en-US;q=1.0',
                     'Sec-Fetch-Dest': 'navigate',
                     'Sec-Fetch-Mode': 'same-site',
                     'Sec-Fetch-Site': '?1',
                     'Sec-Fetch-User': 'document',
                     'Upgrade-Insecure-Requests': '1',
                     'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X '
                                   '10_15_7) AppleWebKit/537.36 (KHTML, like '
                                   'Gecko) Chrome/121.0.0.0 Safari/537.36',
                     'sec-ch-ua': '"Not A(Brand";v="99", "Google '
                                  'Chrome";v="121", "Chromium";v="121"',
                     'sec-ch-ua-mobile': '?0',
                     'sec-ch-ua-platform': '"macOS"'},
            videoCodecs={'h264': 'probably', 'ogg': '', 'webm': 'probably'},
            audioCodecs={'aac': 'probably',
                         'm4a': 'maybe',
                         'mp3': 'probably',
                         'ogg': 'probably',
                         'wav': 'probably'},
            pluginsData={'mimeTypes': ['Portable Document '
                                       'Format~~application/pdf~~pdf',
                                       'Portable Document '
                                       'Format~~text/pdf~~pdf'],
                         'plugins': [{'description': 'Portable Document Format',
                                      'filename': 'internal-pdf-viewer',
                                      'mimeTypes': [{'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'PDF '
                                                                      'Viewer',
                                                     'suffixes': 'pdf',
                                                     'type': 'application/pdf'},
                                                    {'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'PDF '
                                                                      'Viewer',
                                                     'suffixes': 'pdf',
                                                     'type': 'text/pdf'}],
                                      'name': 'PDF Viewer'},
                                     {'description': 'Portable Document Format',
                                      'filename': 'internal-pdf-viewer',
                                      'mimeTypes': [{'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'Chrome '
                                                                      'PDF '
                                                                      'Viewer',
                                                     'suffixes': 'pdf',
                                                     'type': 'application/pdf'},
                                                    {'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'Chrome '
                                                                      'PDF '
                                                                      'Viewer',
                                                     'suffixes': 'pdf',
                                                     'type': 'text/pdf'}],
                                      'name': 'Chrome PDF Viewer'},
                                     {'description': 'Portable Document Format',
                                      'filename': 'internal-pdf-viewer',
                                      'mimeTypes': [{'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'Chromium '
                                                                      'PDF '
                                                                      'Viewer',
                                                     'suffixes': 'pdf',
                                                     'type': 'application/pdf'},
                                                    {'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'Chromium '
                                                                      'PDF '
                                                                      'Viewer',
                                                     'suffixes': 'pdf',
                                                     'type': 'text/pdf'}],
                                      'name': 'Chromium PDF Viewer'},
                                     {'description': 'Portable Document Format',
                                      'filename': 'internal-pdf-viewer',
                                      'mimeTypes': [{'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'Microsoft '
                                                                      'Edge '
                                                                      'PDF '
                                                                      'Viewer',
                                                     'suffixes': 'pdf',
                                                     'type': 'application/pdf'},
                                                    {'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'Microsoft '
                                                                      'Edge '
                                                                      'PDF '
                                                                      'Viewer',
                                                     'suffixes': 'pdf',
                                                     'type': 'text/pdf'}],
                                      'name': 'Microsoft Edge PDF Viewer'},
                                     {'description': 'Portable Document Format',
                                      'filename': 'internal-pdf-viewer',
                                      'mimeTypes': [{'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'WebKit '
                                                                      'built-in '
                                                                      'PDF',
                                                     'suffixes': 'pdf',
                                                     'type': 'application/pdf'},
                                                    {'description': 'Portable '
                                                                    'Document '
                                                                    'Format',
                                                     'enabledPlugin': 'WebKit '
                                                                      'built-in '
                                                                      'PDF',
                                                     'suffixes': 'pdf',
                                                     'type': 'text/pdf'}],
                                      'name': 'WebKit built-in PDF'}]},
            battery={'charging': False,
                     'chargingTime': None,
                     'dischargingTime': 29940,
                     'level': 0.98},
            videoCard=VideoCard(renderer='ANGLE (Apple, ANGLE Metal Renderer: '
                                         'Apple M2 Pro, Unspecified Version)',
                                vendor='Google Inc. (Apple)'),
            multimediaDevices={'micros': [{'deviceId': '',
                                           'groupId': '',
                                           'kind': 'audioinput',
                                           'label': ''}],
                               'speakers': [{'deviceId': '',
                                             'groupId': '',
                                             'kind': 'audiooutput',
                                             'label': ''}],
                               'webcams': [{'deviceId': '',
                                            'groupId': '',
                                            'kind': 'videoinput',
                                            'label': ''}]},
            fonts=['Arial Unicode MS', 'Gill Sans', 'Helvetica Neue', 'Menlo'],
            mockWebRTC=False,
            slim=False)

Constraining fingerprints

Screen width/height

Constrain the minimum/maximum screen width and height:

from browserforge.fingerprints import Screen

screen = Screen(
    min_width=100,
    max_width=1280,
    min_height=400,
    max_height=720,
)

fingerprints = FingerprintGenerator(screen=screen)

Note: Not all bounds need to be defined.
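
To illustrate what "not all bounds need to be defined" means in practice, here is a small stand-alone sketch. `ScreenBounds` is a hypothetical stand-in written for this example, not `browserforge.fingerprints.Screen`; it only shows how unset bounds can be treated as "no limit".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScreenBounds:
    """Hypothetical stand-in for a screen constraint; every bound is optional."""

    min_width: Optional[int] = None
    max_width: Optional[int] = None
    min_height: Optional[int] = None
    max_height: Optional[int] = None

    def allows(self, width: int, height: int) -> bool:
        """True if the size satisfies every bound that is set; unset bounds are ignored."""
        return all((
            self.min_width is None or width >= self.min_width,
            self.max_width is None or width <= self.max_width,
            self.min_height is None or height >= self.min_height,
            self.max_height is None or height <= self.max_height,
        ))


bounds = ScreenBounds(max_width=1280, max_height=720)  # only upper bounds set
print(bounds.allows(1280, 720))  # True
print(bounds.allows(1440, 900))  # False
```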

Browser specifications

FingerprintGenerator and FingerprintGenerator.generate inherit the same parameters from HeaderGenerator.

Because of this, user agents, browser specifications, device types, and operating system constraints can also be passed into FingerprintGenerator.generate.

Here is a usage example:

fingerprints.generate(browser='chrome', os='windows')

Injecting Fingerprints

BrowserForge is fully compatible with your existing Playwright and Pyppeteer code. You only have to change your context/page initialization.

Playwright

Async API:

# Import the AsyncNewContext injector
from browserforge.injectors.playwright import AsyncNewContext
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as playwright:
        browser = await playwright.chromium.launch()
        # Create a new async context with the injected fingerprint
        # (`fingerprint` is the result of FingerprintGenerator().generate())
        context = await AsyncNewContext(browser, fingerprint=fingerprint)
        page = await context.new_page()
        ...

Replace await browser.new_context with await AsyncNewContext in your existing Playwright code.

Parameters for AsyncNewContext
Injects an async_api Playwright context with a Fingerprint.

Parameters:
    browser (Browser): The browser to create the context in
    fingerprint (Optional[Fingerprint]): The fingerprint to inject. If None, one will be generated
    fingerprint_options (Optional[Dict]): Options for the Fingerprint generator if `fingerprint` is not passed
    **new_context_options: Other options for the new context

Sync API:

# Import the NewContext injector
from browserforge.injectors.playwright import NewContext
from playwright.sync_api import sync_playwright

def main():
    with sync_playwright() as playwright:
        browser = playwright.chromium.launch()
        # Create a new context with the injected fingerprint
        # (`fingerprint` is the result of FingerprintGenerator().generate())
        context = NewContext(browser, fingerprint=fingerprint)
        page = context.new_page()
        ...

Replace browser.new_context with NewContext in your existing Playwright code.

Parameters for NewContext
Injects a sync_api Playwright context with a Fingerprint.

Parameters:
    browser (Browser): The browser to create the context in
    fingerprint (Optional[Fingerprint]): The fingerprint to inject. If None, one will be generated
    fingerprint_options (Optional[Dict]): Options for the Fingerprint generator if `fingerprint` is not passed
    **new_context_options: Other options for the new context

Undetected-Playwright

Undetected-Playwright is also supported in the browserforge.injectors.undetected_playwright package. The usage is the same as the Playwright injector.

Pyppeteer

# Import the NewPage injector
from browserforge.injectors.pyppeteer import NewPage
from pyppeteer import launch

async def test():
    browser = await launch()
    # Create a new page with the injected fingerprint
    page = await NewPage(browser, fingerprint=fingerprint)
    ...

Replace browser.newPage with NewPage in your existing Pyppeteer code.

Parameters for NewPage
Injects a Pyppeteer browser object with a Fingerprint.

Parameters:
    browser (Browser): The browser to create the context in
    fingerprint (Optional[Fingerprint]): The fingerprint to inject. If None, one will be generated
    fingerprint_options (Optional[Dict]): Options for the Fingerprint generator if `fingerprint` is not passed

Uninstall

To fully remove all files, run the following commands:

python -m browserforge remove
pip uninstall browserforge

browserforge's People

Contributors

daijro, deedy5

browserforge's Issues

ImportError: cannot import name 'TypeAlias' from 'typing' (python 3.8)

Problem

Python 3.8 raises:

/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/browserforge/headers/generator.py:3: in <module>
    from typing import Any, Dict, Iterable, List, Literal, Optional, Tuple, TypeAlias, Union
E   ImportError: cannot import name 'TypeAlias' from 'typing' (/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/typing.py)

Solution

https://docs.python.org/3/library/typing.html#typing.TypeAlias
New in version 3.10.
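
One common way to keep such an import working on Python 3.8 and 3.9 is a guarded import that falls back to the `typing_extensions` backport. This is a sketch of that pattern, not the fix the project applied; the `ListOrString` definition below is an assumed shape, and the library's actual alias may differ.

```python
from typing import List, Tuple, Union

try:
    from typing import TypeAlias  # available from Python 3.10
except ImportError:
    # Third-party backport; must be installed separately on 3.8/3.9.
    from typing_extensions import TypeAlias

# Assumed shape of the alias, for illustration only.
ListOrString: TypeAlias = Union[Tuple[str, ...], List[str], str]
```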

Can you remove Rich?

Hello.

Can you remove Rich from dependencies?
Rich is only used once, to print a Panel; the same can be done with click.

The problem is that Rich increases the size of dependencies after installation (together with Pygments it takes 10 megabytes on disk).

Download only related files

Problem

When importing

from browserforge.headers import HeaderGenerator

the code checks if the required files are downloaded

DownloadIfNotExists()

and downloads the files needed not only for HeaderGenerator, but also for FingerprintGenerator
def DownloadIfNotExists() -> None:

In this case, the file needed only for FingerprintGenerator is also downloaded, but will not be used, and still takes up more than 10 megabytes of disk space.

Suggestion:

When importing HeaderGenerator, download only the files that HeaderGenerator needs.
Likewise, when importing FingerprintGenerator, download only the files it needs.
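
A rough sketch of what the suggestion could look like: each component declares its own files, and only the missing ones for that component are reported for download. The file names, the `REQUIRED_FILES` table, and `missing_files` are all hypothetical and are not BrowserForge's actual layout.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Hypothetical file names, for illustration only.
REQUIRED_FILES: Dict[str, List[str]] = {
    "headers": ["headers-model.bin"],
    "fingerprints": ["fingerprints-model.bin"],
}


def missing_files(component: str, data_dir: Path) -> List[Path]:
    """Return the files `component` needs that are not yet present in `data_dir`."""
    return [
        data_dir / name
        for name in REQUIRED_FILES[component]
        if not (data_dir / name).exists()
    ]


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    before = [p.name for p in missing_files("headers", data_dir)]
    (data_dir / "headers-model.bin").touch()  # pretend the download happened
    after = missing_files("headers", data_dir)
    still_missing = [p.name for p in missing_files("fingerprints", data_dir)]

print(before)         # ['headers-model.bin']
print(after)          # []
print(still_missing)  # ['fingerprints-model.bin']
```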

Making a separate browserforge-injector package

I'm considering separating the browserforge.injectors submodule into an independently updated browserforge-injector library on PyPI in order to:

  1. Make the injector submodules and dependencies optional to those who do not use BrowserForge for controlled browsing
  2. Focus on more advanced integration of Playwright/Pyppeteer fingerprinting workarounds that may be out of scope of the base browserforge package
  3. Make the base browserforge package lighter

Does anyone have any thoughts on this? Let me know!
