This project forked from neatdecisions/detwinner

Detwinner - duplicate file finder for the Linux desktop

Home Page: https://neatdecisions.com/products/detwinner-linux/

License: GNU General Public License v3.0

Shell 0.06% Roff 0.09% Meson 1.83% C++ 98.02%

Detwinner

Detwinner is a tool for the Linux desktop that finds and removes duplicate files and similar images.

Main functionalities

As the description suggests, Detwinner can search for duplicates in two modes:

  1. Exact duplicates.
  2. Similar images.

The mode is selected from the toolbar in Detwinner's main window. Each mode can be configured by clicking the small settings icon next to it. The settings include:

  • restrictions on the file size;
  • including/excluding certain file attributes in the search;
  • regular expressions to match file paths.
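As a rough illustration of how such settings might be applied to a candidate file, here is a minimal C++17 sketch. The `SearchSettings` struct, its field names, and the `accepts` function are hypothetical and are not taken from Detwinner's sources. The sketch also assumes that a path must match at least one of the configured expressions; the actual tool may interpret them differently. File attributes are left out.

```cpp
#include <cstdint>
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Hypothetical settings holder; names are illustrative only.
struct SearchSettings {
  std::optional<std::uintmax_t> minSize;  // lower size bound in bytes, if set
  std::optional<std::uintmax_t> maxSize;  // upper size bound in bytes, if set
  std::vector<std::regex> pathPatterns;   // empty means "accept any path"
};

// Returns true if a file with the given path and size passes the filters.
bool accepts(const SearchSettings& s, const std::string& path,
             std::uintmax_t size) {
  if (s.minSize && size < *s.minSize) return false;
  if (s.maxSize && size > *s.maxSize) return false;
  if (s.pathPatterns.empty()) return true;
  for (const auto& re : s.pathPatterns) {
    // Assumption: one matching expression is enough to accept the path.
    if (std::regex_search(path, re)) return true;
  }
  return false;
}
```

With `minSize` set to 1024 and a single pattern `\.jpe?g$`, a 2048-byte `/home/user/a.jpg` would pass, while a PNG file or a 100-byte JPEG would be rejected.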

Exact duplicates

In this mode Detwinner first groups files by size, and then splits each group further by applying MurmurHash to the file contents.
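The two-stage idea can be sketched as follows. This is an illustration under stated assumptions, not Detwinner's implementation: files are modelled as in-memory strings through a hypothetical `FileEntry` struct, the hash is the 32-bit x86 variant of MurmurHash3 with seed 0 (the README does not say which MurmurHash variant or seed is used), and the block reads assume a little-endian host.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a file on disk.
struct FileEntry {
  std::string path;
  std::string content;
};

static std::uint32_t scramble(std::uint32_t k) {
  k *= 0xcc9e2d51u;
  k = (k << 15) | (k >> 17);
  k *= 0x1b873593u;
  return k;
}

// MurmurHash3, x86 32-bit variant (assumed; see the note above).
std::uint32_t murmur3_32(const std::string& data, std::uint32_t seed = 0) {
  const auto* p = reinterpret_cast<const unsigned char*>(data.data());
  const std::size_t len = data.size();
  std::uint32_t h = seed;
  for (std::size_t i = 0; i + 4 <= len; i += 4) {
    std::uint32_t k;
    std::memcpy(&k, p + i, 4);  // little-endian host assumed
    h ^= scramble(k);
    h = (h << 13) | (h >> 19);
    h = h * 5u + 0xe6546b64u;
  }
  // Mix in the 0 to 3 trailing bytes.
  const std::size_t rest = len & 3u;
  const unsigned char* t = p + (len - rest);
  std::uint32_t tail = 0;
  for (std::size_t i = rest; i > 0; --i) tail = (tail << 8) | t[i - 1];
  h ^= scramble(tail);
  // Finalization.
  h ^= static_cast<std::uint32_t>(len);
  h ^= h >> 16;
  h *= 0x85ebca6bu;
  h ^= h >> 13;
  h *= 0xc2b2ae35u;
  h ^= h >> 16;
  return h;
}

// Stage 1: bucket by size. Stage 2: hash only buckets with 2+ files.
std::vector<std::vector<std::string>> findExactDuplicates(
    const std::vector<FileEntry>& files) {
  std::map<std::size_t, std::vector<const FileEntry*>> bySize;
  for (const auto& f : files) bySize[f.content.size()].push_back(&f);

  std::vector<std::vector<std::string>> groups;
  for (const auto& sizeBucket : bySize) {
    if (sizeBucket.second.size() < 2) continue;  // unique size, skip hashing
    std::map<std::uint32_t, std::vector<std::string>> byHash;
    for (const FileEntry* f : sizeBucket.second)
      byHash[murmur3_32(f->content)].push_back(f->path);
    for (auto& hashBucket : byHash)
      if (hashBucket.second.size() > 1) groups.push_back(hashBucket.second);
  }
  return groups;
}
```

Grouping by size first means that files with a unique size are never hashed at all. Because a 32-bit hash can collide, a sketch like this would need a byte-for-byte comparison before anything is deleted.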

Similar images

This mode introduces a couple of new settings:

  • similarity level - how similar two images must be in order to be included in the results (a value from 1 to 100);
  • a setting which indicates whether the images should be compared as-is, without rotating them to find the best matching position.

Briefly, the algorithm can be described as follows:

  1. Split each image into 4 sections and compute 4 histograms (Y, U, V and intensity) for each of them.
  2. Apply a hierarchical clustering algorithm to the images, using the Hassanat distance between their respective histograms as the distance function.
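The second step can be sketched as below. The Hassanat distance follows its published per-bin definition. Everything else is an assumption made for illustration: each image is reduced to a single flat `Histogram` vector, the clustering is a naive single-linkage agglomerative scheme, and `threshold` is a hypothetical cut-off (the README does not say which linkage Detwinner uses or how the similarity level maps to a distance).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Histogram = std::vector<double>;

// Hassanat distance: sum over bins of a bounded term in [0, 1).
double hassanatDistance(const Histogram& a, const Histogram& b) {
  assert(a.size() == b.size());
  double total = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double lo = std::min(a[i], b[i]);
    const double hi = std::max(a[i], b[i]);
    // If the minimum is negative, both values are shifted by |lo|.
    const double shift = lo < 0.0 ? -lo : 0.0;
    total += 1.0 - (1.0 + lo + shift) / (1.0 + hi + shift);
  }
  return total;
}

// Naive single-linkage agglomerative clustering (assumed linkage).
// Merges the two closest clusters until the smallest distance between
// any two clusters exceeds `threshold`. Returns clusters of item indices.
std::vector<std::vector<std::size_t>> clusterHistograms(
    const std::vector<Histogram>& items, double threshold) {
  std::vector<std::vector<std::size_t>> clusters;
  for (std::size_t i = 0; i < items.size(); ++i) clusters.push_back({i});

  auto linkage = [&](const std::vector<std::size_t>& x,
                     const std::vector<std::size_t>& y) {
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t i : x)
      for (std::size_t j : y)
        best = std::min(best, hassanatDistance(items[i], items[j]));
    return best;
  };

  while (clusters.size() > 1) {
    double best = std::numeric_limits<double>::infinity();
    std::size_t bi = 0, bj = 0;
    for (std::size_t i = 0; i < clusters.size(); ++i)
      for (std::size_t j = i + 1; j < clusters.size(); ++j) {
        const double d = linkage(clusters[i], clusters[j]);
        if (d < best) { best = d; bi = i; bj = j; }
      }
    if (best > threshold) break;  // nothing close enough is left to merge
    clusters[bi].insert(clusters[bi].end(), clusters[bj].begin(),
                        clusters[bj].end());
    clusters.erase(clusters.begin() + static_cast<std::ptrdiff_t>(bj));
  }
  return clusters;
}
```

Each bin contributes less than 1 to the distance, so a single bin with a very large difference cannot dominate the total. That property is a common reason to prefer this measure for histogram comparison.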

Processing results

The results of the search are presented in a window where duplicate files are organized in groups. The files can be previewed in the bottom pane. One preview shows the file marked with a lock indicator, the other shows the currently selected file. The file locked for preview can be changed by clicking on the lock icon.

Files to delete can be selected manually, with the smart select button in the toolbar (which applies the selection to all groups), or through the selection menu invoked by right-clicking on a duplicate group.

The selected files can be deleted permanently, moved to trash (not available in the Flatpak installation) or moved to a backup folder. If the last option is chosen, the full folder structure of the original files is recreated inside the selected folder.
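Recreating the folder structure under a backup folder can be sketched with `std::filesystem`. This requires C++17, which is newer than the C++14 minimum stated for the project, so it is an illustration rather than the project's code. The function names `backupDestination` and `moveToBackup` are hypothetical.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Maps an absolute original path to its location under the backup root,
// e.g. /home/user/a.jpg with root /backup gives /backup/home/user/a.jpg.
fs::path backupDestination(const fs::path& backupRoot,
                           const fs::path& original) {
  // relative_path() drops the root ("/"), keeping the directory chain.
  return backupRoot / original.relative_path();
}

// Moves a file into the backup tree, creating parent folders as needed.
// Throws fs::filesystem_error on failure; fs::rename may fail, for
// example, when source and destination are on different filesystems.
void moveToBackup(const fs::path& backupRoot, const fs::path& original) {
  const fs::path dest = backupDestination(backupRoot, fs::absolute(original));
  fs::create_directories(dest.parent_path());
  fs::rename(original, dest);
}
```

Keeping the path computation separate from the move makes it possible to check the destination without touching the disk.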

Compiling

A recent C++ compiler (supporting at least C++14) is required, together with gtkmm 3.22 or later. The build is handled by Meson.

The ./configure.sh script creates two folders (Debug and Release) with the corresponding build configurations. To trigger the build, run ninja in one of the folders. Running ninja test executes the unit tests as well.

Acknowledgements

The nice frog images used in the unit tests are part of GraphicsMagick.

Contributors

albanobattistella, neatdecisions, vistaus
