julianchia / adp

App to quickly find and delete duplicated pictures/images on a desktop/laptop.

License: Apache License 2.0

Python 100.00%
photos python3 concurrent-programming duplicate-finder multiprocessing multithreading pictures tkinter-gui tkinter-python image-detection tkinter-application

ADP

ADP ("ANY DUPLICATED PICTURES", also written adp) is a simple-to-use application that lets you quickly find and delete duplicated picture files on a desktop or laptop. It is fast because it combines concurrent programming with a fast picture-recognition capability. To use it, click the Folder button to select the folder you want to search, then click the Find button to search that folder and all its subfolders for duplicated pictures. Once the search completes, click the duplicated pictures you want to delete, then click the Delete button to delete them.

User notes:

  1. 'Original' denotes the earliest version of a duplicate picture and is colour-coded in green. 'Copies' denotes its later versions and is colour-coded in blue. Clicking the Originals or Copies button toggles quick selection/deselection of those pictures.

  2. The maximum 'Duplicates Group' number is always one less than the number of 'Original' pictures.

  3. To manually select/deselect multiple picture files, press the Shift key on your keyboard, then click the first and last file paths with your mouse pointer. You can also select/deselect a single picture file by clicking its file path or thumbnail image.

  4. ADP defaults to using a pool of your CPU's logical cores (i.e. a process pool, cfe=process) rather than a pool of CPU threads (i.e. a thread pool, cfe=thread) to find pictures and their duplicates. This is one of the main reasons why ADP is fast. You can visualise its performance by running ADP from a terminal and, if needed, watching a system monitor or task manager tool, as shown below:

    CPU process pool in action

  5. The diagram below illustrates how much faster ADP can be when it uses many logical cores rather than 1 logical core to find pictures and their duplicates on an NVMe M.2 solid-state drive (SSD). The speed-up is most obvious when large quantities of large (i.e. high-resolution) pictures and their duplicates are processed. Consequently, ADP defaults to using all logical cores of a CPU.

    Performances

  6. If your pictures are on a traditional hard disk drive (HDD), it is recommended that you transfer them to an SSD before using ADP to search for duplicates. An HDD is much slower than an SSD, so it will negate most of the performance gained from using a process or thread pool. Moreover, ADP is set to time out if a search for pictures or picture duplicates exceeds 10 minutes.
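
The ideas in notes 1, 2 and 4 can be sketched with the standard library alone. This README does not describe ADP's picture-recognition method, so the sketch below swaps in a plain SHA-256 of each file's bytes, which only catches byte-identical files. It also uses the file modification time as a stand-in for "earliest version". The names `IMAGE_SUFFIXES`, `file_digest` and `find_duplicates` are hypothetical and are not part of the adp module.

```python
import hashlib
import os
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from pathlib import Path

# Assumption: a small, illustrative set of picture file extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def file_digest(path):
    """Return (path, SHA-256 hex digest of the file's bytes)."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large pictures are not loaded whole.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
    return path, hasher.hexdigest()


def find_duplicates(folder, cfe="process"):
    """Group byte-identical pictures under `folder` and its subfolders.

    Each group is sorted oldest first, so group[0] plays the role of the
    'Original' and group[1:] the 'Copies'.
    """
    paths = [
        str(p)
        for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    # Mirror the cfe=process / cfe=thread choice described in note 4.
    pool_cls = ProcessPoolExecutor if cfe == "process" else ThreadPoolExecutor
    groups = defaultdict(list)
    with pool_cls() as pool:
        for path, digest in pool.map(file_digest, paths):
            groups[digest].append(path)
    # Keep only digests seen more than once.
    # Assumption: modification time approximates "earliest version".
    return [
        sorted(group, key=os.path.getmtime)
        for group in groups.values()
        if len(group) > 1
    ]
```

Calling `find_duplicates("path/to/pictures")` returns a list of groups. Numbering them with `enumerate` starts at 0, which is consistent with note 2: the highest group number is one less than the number of originals. When the process pool is used from a script, call the function under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module.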

Sponsor This App

If you like this app, please Buy Me A Coffee.

How To Install and Run ADP?

Option 1:

  • To install ADP on any x86_64 Linux distro, simply download the adp_ver011-3.10.14-x86_64.AppImage file and give it execution permission.
  • To run it, either click the adp_ver011-3.10.14-x86_64.AppImage file with your mouse pointer or run the command ./adp_ver011-3.10.14-x86_64.AppImage in a terminal.

Option 2:

For x86_64 Linux, Mac and Windows operating systems with Python 3.10 or above installed, you can install ADP by following these steps:

  1. Clone this repository onto your computer.

    1. You can download a zipped version of ADP by pressing the Code button on this webpage and then extracting the adp folder onto your computer.

    2. Alternatively, you can issue this command in your terminal:

      git clone https://github.com/JulianChia/ADP.git
      
  2. Next, go to your cloned local ADP directory, e.g.:

    cd path/to/your/cloned/ADP/directory
    
  3. Install the cloned adp module, its virtualenv and dependencies (NumPy v1.26.4 and Pillow v10.2.0) using this pipenv command:

    pipenv sync
    

    Note: This command ensures that the above-mentioned packages reside only in your user account and do not affect or depend on any other package(s) already installed on your system. The pipenv package must first be installed on your system. If it isn't, you can install pipenv by following these instructions.

  4. To run ADP, issue the following terminal command (with or without its optional arguments) from the adp folder:

    $ pipenv run python3 -m adp  [-m or --mode {g,t,f}]    # Run in 'gallery', 'table' or 'find' mode. Default is 'gallery'.
                                 [-l or --layout {h,v}]    # Use a 'horizontal' or 'vertical' GUI layout. Default for 'gallery' and 'table' modes is 'horizontal'; 'find' mode allows only the 'vertical' layout.
                                 [-c or --cfe {p,t}]       # Use a CPU 'process' or 'thread' pool for execution. Default is 'process'.
                                 [-h]                      # Get help.
    
    Examples:
    $ pipenv run python3 -m adp           # default options: adp -m g -l h -c p 
    $ pipenv run python3 -m adp -m t      # in 'table' mode
    $ pipenv run python3 -m adp -m f      # in 'find' mode
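
For readers who want to see how options like these are typically declared, here is a minimal `argparse` sketch that mirrors the flags listed above. It is a hypothetical reconstruction, not ADP's actual parser. In particular, rejecting `-l h` in 'find' mode is an assumption, since this README only says that 'find' mode allows the vertical layout. The names `build_parser` and `parse_options` are illustrative.

```python
import argparse


def build_parser():
    """Declare the three options documented above (hypothetical parser)."""
    parser = argparse.ArgumentParser(
        prog="adp", description="Find and delete duplicated pictures."
    )
    parser.add_argument(
        "-m", "--mode", choices=["g", "t", "f"], default="g",
        help="run in 'gallery', 'table' or 'find' mode (default: g)",
    )
    parser.add_argument(
        "-l", "--layout", choices=["h", "v"], default=None,
        help="use a 'horizontal' or 'vertical' GUI layout",
    )
    parser.add_argument(
        "-c", "--cfe", choices=["p", "t"], default="p",
        help="use a CPU 'process' or 'thread' pool (default: p)",
    )
    return parser


def parse_options(argv=None):
    """Parse argv and apply the mode-dependent layout default."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode == "f":
        # Assumption: an explicit horizontal layout is an error in 'find' mode.
        if args.layout == "h":
            parser.error("'find' mode allows only the vertical layout")
        args.layout = "v"
    elif args.layout is None:
        # 'gallery' and 'table' modes default to the horizontal layout.
        args.layout = "h"
    return args
```

Leaving the layout default as `None` lets the code tell "not given" apart from an explicit `-l h`, so the default can depend on the chosen mode.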
    

Operating Systems (OS):

  • Linux (tested on Ubuntu 22.04.4, Linux 6.5.0-26-generic, x86_64)
  • MacOS (tested on Catalina 10.15, Darwin 19.6.0, x86_64)
  • Windows (not tested but should work; please alert me of any issue.)

Software:

  • Required: Python >=3.10.
  • Dependencies: NumPy 1.26.4, Pillow 10.2.0 and Tk 8.6.

Notes To Python Programmers:

  1. You can use the adp module in ADP as a library. To access its classes and/or functions in your Python script, copy the adp module to your project and use the import statement:

    from adp import (classes and/or functions)
    

    Accessible classes:

    Widgets:       ADP, ADPFind, ADPGallery, ADPTable, About, AutoScrollbar, DonutCharts, DupGroup, Find, Findings, Gallery, Progressbarwithblank, Table, VerticalScrollFrame
    For picture:   RasterImage
    For internet:  HyperlinkManager
    

    Accessible functions:

    For widgets:      customise_ttk_widgets_style, filesize, get_geometry_values, get_thumbnail, get_thumbnail_c, get_thumbnails_concurrently_with_queue, pop_kwargs, sort_pictures_by_creation_time, str_geometry_values, string_pixel_size, stylename_elements_options, timings
    Find subfolders:  fast_scandir
    Find pictures:    dataklass, get_filepaths_in, get_image, get_rasterimages_in_one_folder_concurrently, list_scandir_images, scandir_images, scandir_images_concurrently
    Find duplicates:  detect_duplicates_concurrently, detect_duplicates_serially
    For terminal:     main, percent_complete, show_logo_in_terminal
    

    Please refer to the source code for their details.

  2. Python script highlights:

    1. Algorithm to search out pictures and their duplicates quickly.
    2. A paging system to view searched results in tkinter widgets with the mousewheel without overflowing memory and with minimal lag.
    3. Stable integration of Python's threading.Thread, concurrent.futures.ProcessPoolExecutor and concurrent.futures.ThreadPoolExecutor objects with tkinter's main event loop.
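
One common way to combine a `concurrent.futures` executor with tkinter's main event loop is to poll the future with `widget.after` instead of blocking on it. The sketch below illustrates that general pattern only; it is not taken from ADP's source, and `poll_future`, `slow_square` and `demo` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def poll_future(schedule, future, on_done, interval_ms=100):
    """Check `future` without blocking and reschedule until it finishes.

    `schedule(ms, callback)` is expected to behave like tkinter's
    `widget.after`, so the check always runs on the GUI thread.
    """
    if future.done():
        on_done(future.result())
    else:
        schedule(
            interval_ms,
            lambda: poll_future(schedule, future, on_done, interval_ms),
        )


def slow_square(n):
    """Stand-in for a long-running search task."""
    time.sleep(0.2)
    return n * n


def demo():
    """Small Tk window that stays responsive while the worker runs."""
    import tkinter as tk  # imported here so the helpers work without a display

    root = tk.Tk()
    label = tk.Label(root, text="Working...")
    label.pack(padx=20, pady=20)
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(slow_square, 12)

    def show(result):
        # Runs on the GUI thread, so updating the widget is safe here.
        label.config(text=f"Result: {result}")
        pool.shutdown(wait=False)

    poll_future(root.after, future, show)
    root.mainloop()
```

Calling `demo()` on a machine with a display opens the window. The design choice is that worker threads or processes never touch widgets; only the callback scheduled on the event loop does.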
