Giter Site home page Giter Site logo

yoyonel / vhcalc Goto Github PK

View Code? Open in Web Editor NEW
1.0 3.0 0.0 4.79 MB

It's a client-side library that implements a custom algorithm for extracting video hashes (fingerprints) from any video source.

Home Page: https://yoyonel.github.io/vhcalc/

License: MIT License

Dockerfile 2.16% Python 97.84%

vhcalc's Introduction

PRs Welcome Conventional Commits Code style: black Github Actions

PyPI Package latest release PyPI Package download count (per month) Supported versions

Docker Image Size (latest semver)

Coverage Status

vhcalc

It's a client-side library that implements a custom algorithm for extracting video hashes (fingerprints) from any video source.

Getting Started

Prerequisites

Usage

$ vhcalc --help
Usage: vhcalc [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.
imghash*                   Compute image hashes from and to binaries stream
                           (by default: stdin/out)
export-imghash-from-media  extracting and exporting binary video hashes
                           (fingerprints) from any video source

imghash default entrypoint

$ vhcalc imghash --help
Usage: vhcalc imghash [OPTIONS] [INPUT_STREAM] [OUTPUT_STREAM]

  Simple form of the application: Input filepath > image hashes (to stdout by
  default)

Options:
  --image-hashing-method [AverageHashing|PerceptualHashing|PerceptualHashing_Simple|DifferenceHashing|WaveletHashing]
                                  [default: PerceptualHashing]
  --decompress
  --from-url URL
  --help                          Show this message and exit.

Media into binary images hashes (fingerprints)

# using pipes for input/output streams
$ cat tests/data/big_buck_bunny_trailer_480p.mkv | \
  # input: binary data from video/media (readable by ffmpeg)
  vhcalc imghash --image-hashing-method PerceptualHashing | \
  # output: binary representation of images hashes (to stdout)
  hexdump | tail -n 8
The frame size for reading (32, 32) is different from the source frame size (854, 480).
*
0001900 93f5 6d91 926a 6585 d2f5 6d91 926a 6585
0001910 92f5 6d91 9a6a 6585 92f5 6c91 9b6e 6485
0001920 92d5 6d91 9a7a 6585 92d5 6591 9a6e e585
0001930 d2d5 6591 9a6a e585 d2d5 2c91 9b6e e485
0001940 d2d5 6d91 926e e581 d2d5 6d91 d26e e181
0001950 d6d5 b581 c26e f181 d5d5 d52a d528 d528
0001960

Media to hexadecimal representation of images hashes (fingerprints)

$ cat tests/data/big_buck_bunny_trailer_480p.mkv | \
    # input: binary data from video/media (readable by ffmpeg)
    vhcalc imghash --image-hashing-method PerceptualHashing | \
    # output/input: binary images hashes (through pipe stream)
    vhcalc imghash --decompress | \
    # output: string hexadecimal representation of images hashes (to stdout)
    fold -w 16 | tail -n 8
The frame size for reading (32, 32) is different from the source frame size (854, 480).
d592916d7a9a8565
d59291656e9a85e5
d5d291656a9a85e5
d5d2912c6e9b85e4
d5d2916d6e9281e5
d5d2916d6ed281e1
d5d681b56ec281f1
d5d52ad528d528d5%

From URL

  # launching (in background) a webserver deserving `tests/data` files
$ python3 -m http.server -d tests/data & \
  # input: url to tests/data video serving by http server
  vhcalc --from-url http://0.0.0.0:8000/big_buck_bunny_trailer_480p.mkv | \
  # output/input: binary images hashes (through pipe stream)
  vhcalc --decompress | \
  # output: string hexadecimal representation of images hashes (to stdout)
  fold -w 16 | tail -n 8; \
  # killing the http server launch at the beginning
  ps -ef | grep http.server | grep tests/data | grep -v grep | awk '{print $2}' | xargs kill
[1] 2217597
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
127.0.0.1 - - [12/Sep/2024 16:36:44] "GET /big_buck_bunny_trailer_480p.mkv HTTP/1.1" 200 -
The frame size for reading (32, 32) is different from the source frame size (854, 480).
d592916d7a9a8565
d59291656e9a85e5
d5d291656a9a85e5
d5d2912c6e9b85e4
d5d2916d6e9281e5
d5d2916d6ed281e1
d5d681b56ec281f1
d5d52ad528d528d5%
[1]  + 2217597 terminated  python3 -m http.server -d tests/data
From Youtube video

Requirement: yt-dlp - A feature-rich command-line audio/video downloader

  # input: Youtube url grab by yt-dlp tool
$ vhcalc --from-url $(yt-dlp --youtube-skip-dash-manifest -g https://www.youtube.com/watch?v=W6QOj6vWmoQ | head -n 1) | \
  # output/input: binary images hashes (through pipe stream)
  vhcalc --decompress | \
  # output: string hexadecimal representation of images hashes (to stdout)
  fold -w 16 | tail -n 8;
The frame size for reading (32, 32) is different from the source frame size (606, 1080).
8adaf123841eb379
8adaf123841eb379
8283bb72f46ce871
8283bb72f46ce871
8283bb72f46ce871
8283bb72f46ce871
8283bb72f46ce871
8283bb72f46ce871%  

Docker

Docker hub: yoyonel/vhcalc

$ docker run -it yoyonel/vhcalc:latest --help
Usage: vhcalc [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.
imghash*                   Compute image hashes from and to binaries stream
                           (by default: stdin/out)
export-imghash-from-media  extracting and exporting binary video hashes
                           (fingerprints) from any video source
# with '-i' docker run option, we can use (host) pipes from stdin to stdout (for example)
$ cat tests/data/big_buck_bunny_trailer_480p.mkv | docker run -i --rm yoyonel/vhcalc:latest | md5sum
The frame size for reading (32, 32) is different from the source frame size (854, 480).
bf5c7468df01d78862c847596de92ff3  -

# using HTTP server and url input
$ python3 -m http.server -d tests/data & \
  # input: url to tests/data video serving by http server
  docker run -i --network host yoyonel/vhcalc --from-url http://0.0.0.0:8000/big_buck_bunny_trailer_480p.mkv | \
  # output/input: binary images hashes (through pipe stream)
  docker run -i yoyonel/vhcalc --decompress | \
  # output: string hexadecimal representation of images hashes (to stdout)
  fold -w 16 | tail -n 8; \
  # killing the http server launch at the beginning
  ps -ef | grep http.server | grep tests/data | grep -v grep | awk '{print $2}' | xargs kill
[1] 1929365
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
127.0.0.1 - - [13/Sep/2024 14:59:04] "GET /big_buck_bunny_trailer_480p.mkv HTTP/1.1" 200 -
The frame size for reading (32, 32) is different from the source frame size (854, 480).
d592916d7a9a8565
d59291656e9a85e5
d5d291656a9a85e5
d5d2912c6e9b85e4
d5d2916d6e9281e5
d5d2916d6ed281e1
d5d681b56ec281f1
d5d52ad528d528d5%

Contributing

See Contributing

Authors

Lionel Atty [email protected]

Created from Lee-W/cookiecutter-python-template version 1.1.2

vhcalc's People

Contributors

yoyonel avatar github-actions[bot] avatar snyk-bot avatar

Stargazers

cgnerds avatar

Watchers

James Cloos avatar  avatar  avatar

vhcalc's Issues

[Feature Request] Support Youtube video input

Description

$ vhcalc --from-url $(yt-dlp --youtube-skip-dash-manifest -g https://www.youtube.com/shorts/KZi7ukiHkao | head -n 1) | \
   vhcalc --decompress | \
   fold -w 16 | tail -n 8
The frame size for reading (32, 32) is different from the source frame size (606, 1080).
8adaf123841eb379
8adaf123841eb379
8283bb72f46ce871
8283bb72f46ce871
8283bb72f46ce871
8283bb72f46ce871
8283bb72f46ce871
8283bb72f46ce871%

Possible Solution

Additional context

Related Issue

[Feature Request] Upgrade Python version

Description

Support more python version interpreters: 3.10, 3.11, 3.12 maybe 3.13

Possible Solution

Using tox (in local dev/tests) and matrix target in (Github Actions) CI.

Additional context

Related Issue

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.