sharp-phash's Introduction

sharp-phash

Sharp-based implementation of the perceptual hash (phash) algorithm described there.

Installation

yarn add sharp sharp-phash
# or
npm i sharp sharp-phash

You must install sharp yourself.

How to use

"use strict";

const fs = require("fs");
const Promise = require("bluebird");

const assert = require("assert");

const phash = require("sharp-phash");
const dist = require("sharp-phash/distance");

const img1 = fs.readFileSync("./Lenna.png");
const img2 = fs.readFileSync("./Lenna.jpg");
const img3 = fs.readFileSync("./Lenna-sepia.jpg");

Promise.all([phash(img1), phash(img2), phash(img3)]).then(
  ([hash1, hash2, hash3]) => {
    // each returned hash is a 64-character string containing only 0s and 1s
    assert(dist(hash1, hash2) < 5);
    assert(dist(hash2, hash3) < 5);
    assert(dist(hash3, hash1) < 5);
  }
);
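
The `dist` call above compares two hash strings. As a minimal sketch, assuming the distance is a plain Hamming distance over the bit strings (the function name `hammingDistance` is hypothetical and not part of the package), the comparison could look like this:

```javascript
// Hypothetical stand-in for `dist`: counts the positions at which two
// equal-length bit strings differ (the Hamming distance).
function hammingDistance(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("hashes must have the same length");
  }
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) count++;
  }
  return count;
}

// Two made-up 8-bit strings that differ in their last two positions.
console.log(hammingDistance("10110010", "10110001")); // 2
```

A small distance means the two images are perceptually close; the threshold of 5 in the example above is the one used by the README.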

sharp-phash's People

Contributors

btd, catsmiaow, hoonoh, pubkey

sharp-phash's Issues

TypeScript typings

👋 @btd! I just came across your library, and it looks great! WDYT of adding TypeScript typings so that users can avoid errors like this in TS imports:

Could not find a declaration file for module 'sharp-phash'. '/path/tp/project/node_modules/.pnpm/[email protected][email protected]/node_modules/sharp-phash/index.js' implicitly has an 'any' type.
  Try `npm i --save-dev @types/sharp-phash` if it exists or add a new declaration (.d.ts) file containing `declare module 'sharp-phash';`ts(7016)

In the meantime, this serves as a workaround:

// @ts-expect-error -- upstream types do not exist: https://github.com/btd/sharp-phash/issues/14
import phash from "sharp-phash";
// @ts-expect-error -- upstream types do not exist: https://github.com/btd/sharp-phash/issues/14
import phashDistance from "sharp-phash/distance";

Rewrite to support different image libs

I had problems with different sharp versions (which were solved recently in 1.1.0, thanks). While solving this on my side, I ended up splitting the problem into:

  • Read and resize image - this could be done by sharp, jimp or canvas
  • Calculate the dct2d - this might be useful for other things if the inverse function is supported
  • Calculate phash - the original task
  • Encoding of the result

I did some rewrite and came up with https://github.com/xemle/sharp-phash/tree/rewrite-phash

This drops the sharp dependency and leaves the resizing of the image up to the caller. This is the tricky part of the rewrite for a lib called sharp-phash ¯\_(ツ)_/¯

The dct2d function was rewritten so that its variable names match those used in the reference at https://www.mathworks.com/help/images/ref/dct2.html, and an idct2d function was added based on https://www.mathworks.com/help/images/ref/idct2.html
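
For illustration, here is a minimal, unoptimized sketch of a forward and inverse 2D DCT that follows the dct2/idct2 definitions (matrices `A` and `B`, indices `m`, `n`, `p`, `q`, and the scaling factor alpha). It is not the code from the rewrite branch, and the helper name `alpha` is an assumption:

```javascript
// Scaling factor from the dct2 definition: sqrt(1/size) for index 0,
// sqrt(2/size) for every other index.
function alpha(index, size) {
  return index === 0 ? Math.sqrt(1 / size) : Math.sqrt(2 / size);
}

// Cosine basis term shared by the forward and inverse transforms.
function basis(sample, freq, size) {
  return Math.cos((Math.PI * (2 * sample + 1) * freq) / (2 * size));
}

// Forward 2D DCT-II of an M x N matrix A (array of rows). Naive O(M^2 N^2).
function dct2d(A) {
  const M = A.length;
  const N = A[0].length;
  const B = [];
  for (let p = 0; p < M; p++) {
    const row = [];
    for (let q = 0; q < N; q++) {
      let sum = 0;
      for (let m = 0; m < M; m++) {
        for (let n = 0; n < N; n++) {
          sum += A[m][n] * basis(m, p, M) * basis(n, q, N);
        }
      }
      row.push(alpha(p, M) * alpha(q, N) * sum);
    }
    B.push(row);
  }
  return B;
}

// Inverse 2D DCT: reconstructs A from the coefficient matrix B.
function idct2d(B) {
  const M = B.length;
  const N = B[0].length;
  const A = [];
  for (let m = 0; m < M; m++) {
    const row = [];
    for (let n = 0; n < N; n++) {
      let sum = 0;
      for (let p = 0; p < M; p++) {
        for (let q = 0; q < N; q++) {
          sum += alpha(p, M) * alpha(q, N) * B[p][q] *
            basis(m, p, M) * basis(n, q, N);
        }
      }
      row.push(sum);
    }
    A.push(row);
  }
  return A;
}
```

Applying `idct2d(dct2d(A))` should return `A` up to floating-point rounding, which is a convenient sanity check for any faster implementation.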

Old browsers do not support 64-bit integers because they lack BigInt support. The hash result therefore consists of 32-bit high and low values, with the converters toBin, toHex and toDec (requires BigInt support).
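
A minimal sketch of this representation, assuming the hash is held as an object `{ high, low }` of two unsigned 32-bit integers. The converter names come from the description above, but their signatures and the `fromBin` helper are assumptions. In this sketch only `toDec` relies on BigInt:

```javascript
// Hypothetical helper: split a 64-character bit string into two unsigned
// 32-bit integers. Each half fits exactly in a regular JavaScript number.
function fromBin(bits) {
  if (!/^[01]{64}$/.test(bits)) {
    throw new TypeError("expected a string of 64 binary digits");
  }
  return {
    high: parseInt(bits.slice(0, 32), 2),
    low: parseInt(bits.slice(32), 2),
  };
}

// 64-character binary string.
function toBin({ high, low }) {
  return high.toString(2).padStart(32, "0") + low.toString(2).padStart(32, "0");
}

// 16-character hexadecimal string.
function toHex({ high, low }) {
  return high.toString(16).padStart(8, "0") + low.toString(16).padStart(8, "0");
}

// Decimal string of the full 64-bit value; needs BigInt.
function toDec({ high, low }) {
  return ((BigInt(high) << 32n) | BigInt(low)).toString(10);
}

const hash = fromBin("0".repeat(63) + "1");
console.log(toHex(hash)); // 0000000000000001
```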

Since I store and transfer the hash result via JSON documents, the result is encoded as a 16-character hex string to save space. The comparison is done character by character using a 16x16 lookup table. Further, since one major use case is finding similar images, the difference algorithm supports a maximum bit difference so that further comparison can be skipped if the images differ too much.
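
The lookup-table comparison described above could be sketched roughly as follows. The names `BIT_DIFF` and `hexDistance`, and the choice to return `Infinity` on an early exit, are assumptions made for this example rather than the API of the rewrite branch:

```javascript
// 16x16 table: BIT_DIFF[a][b] is the number of bits that differ between
// the hex digits a and b (the popcount of a XOR b).
const BIT_DIFF = Array.from({ length: 16 }, (_, a) =>
  Array.from({ length: 16 }, (_, b) => {
    let x = a ^ b;
    let count = 0;
    while (x) {
      count += x & 1;
      x >>= 1;
    }
    return count;
  })
);

// Bit difference between two equal-length hex strings. Stops early and
// returns Infinity as soon as the running total exceeds maxDiff.
function hexDistance(a, b, maxDiff = 64) {
  if (a.length !== b.length) {
    throw new RangeError("hashes must have the same length");
  }
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff += BIT_DIFF[parseInt(a[i], 16)][parseInt(b[i], 16)];
    if (diff > maxDiff) return Infinity;
  }
  return diff;
}

console.log(hexDistance("f0", "0f")); // 8
```

When scanning a large collection for near-duplicates, passing a small `maxDiff` lets most comparisons stop after the first few characters.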

RFC

Configurable SAMPLE_SIZE / LOW_SIZE

👋 again @btd! I'm curious if these params could be made configurable:

const SAMPLE_SIZE = 32;

const LOW_SIZE = 8;

This is an open question: technically, they could be passed as a third param of the phash function (after the sharp options), but I'm not sure there is any value in this. I am still experimenting with perceptual hashes, so I am trying to understand whether changing the sample sizes can improve the result.

Thanks a lot for putting this library together!
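
To make the question concrete, here is a rough sketch of a hash function with both sizes as options. It works on an already resized greyscale pixel array instead of calling sharp, and the function name `hashPixels`, the option names, and the details of the thresholding (average of the low-frequency DCT block, skipping the first row and column) are assumptions for illustration, not the library's confirmed behavior:

```javascript
// Hypothetical helper: hashes a square greyscale image given as a flat,
// row-major array of sampleSize * sampleSize pixel values.
function hashPixels(pixels, { sampleSize = 32, lowSize = 8 } = {}) {
  if (pixels.length !== sampleSize * sampleSize) {
    throw new RangeError("pixels must contain sampleSize * sampleSize values");
  }
  if (lowSize + 1 > sampleSize) {
    throw new RangeError("lowSize must be smaller than sampleSize");
  }

  // cos[k][i] = cos(pi * (2i + 1) * k / (2 * sampleSize)) for k = 0..lowSize.
  const cos = [];
  for (let k = 0; k <= lowSize; k++) {
    cos.push(
      Array.from({ length: sampleSize }, (_, i) =>
        Math.cos((Math.PI * (2 * i + 1) * k) / (2 * sampleSize))
      )
    );
  }

  // Low-frequency DCT-II coefficients for frequencies 1..lowSize in both
  // directions. The constant scaling factor is omitted because it does not
  // change how each coefficient compares to the average.
  const coeffs = [];
  for (let u = 1; u <= lowSize; u++) {
    for (let v = 1; v <= lowSize; v++) {
      let sum = 0;
      for (let y = 0; y < sampleSize; y++) {
        for (let x = 0; x < sampleSize; x++) {
          sum += pixels[y * sampleSize + x] * cos[u][y] * cos[v][x];
        }
      }
      coeffs.push(sum);
    }
  }

  // One bit per coefficient: 1 if above the average, 0 otherwise.
  const avg = coeffs.reduce((a, b) => a + b, 0) / coeffs.length;
  return coeffs.map((c) => (c > avg ? "1" : "0")).join("");
}
```

With the defaults this yields a 64-character string; a `lowSize` of 4 would yield 16 characters, so hashes produced with different settings are not comparable with each other.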

"Input buffer contains unsupported image format" error on AWS Graviton with 64-bit ARM Neoverse

Hi,

This lib works perfectly well on a classic AWS processor, but when I ran my code on AWS Graviton (64-bit ARM Neoverse) I got this exception:

Input buffer contains unsupported image format

source code where I'm guessing it's not working

npx envinfo --binaries --system

npx envinfo --binaries --system
npx: installed 1 in 3.953s

  System:
    OS: Linux 4.18 CentOS Linux 8
    CPU: (2) arm64 unknown
    Memory: 2.01 GB / 3.83 GB
    Container: Yes
    Shell: 4.4.20 - /bin/bash
  Binaries:
    Node: 14.17.6 - /usr/bin/node
    Yarn: 1.22.5 - /usr/bin/yarn
    npm: 6.14.15 - /usr/bin/npm

npm ls sharp

xxx@xxxx /home/centos/xxxx/xxx
└── [email protected]  extraneous

npm ERR! extraneous: [email protected] /home/centos/xxxx/api/node_modules/sharp

Questions / Notes

  • Do you have any clue about what's going on?
  • Is there any chance of running sharp-phash on AWS Graviton with 64-bit ARM Neoverse?
  • I also opened an issue on the sharp repo: lovell/sharp#2904

Thank you for your time

Warn log for using ignoreAspectRatio() in phash()

When using your lib, I get a warning log telling me that ignoreAspectRatio() is deprecated.

DeprecationWarning: ignoreAspectRatio() is deprecated, use resize({ fit: "fill" }) instead

The solution is to implement your phash() like so:

...
sharp(image)
  .greyscale()
  .resize(SAMPLE_SIZE, SAMPLE_SIZE, { fit: "fill" })
  .rotate()
...

Is "rotate()" necessary?

Current code:

  const data = await sharp(image)
    .greyscale()
    .resize(SAMPLE_SIZE, SAMPLE_SIZE, { fit: "fill" })
    .rotate()
    .raw()
    .toBuffer();

I wonder if .rotate() can be removed for speedup?
