distant-viewing / dvt
Distant Viewing Toolkit for the Analysis of Visual Culture
Home Page: https://distantviewing.org
License: GNU General Public License v2.0
Describe the bug
DiffAnnotator and CutAggregator do not work with black-and-white films. I apologize if this stems from a lack of user knowledge rather than a bug or missing feature.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Shots to be detected. This works perfectly fine with color film, but not at all for any black and white movie to which I have access.
Describe the bug
When running the included sample Colab notebook, I hit an AttributeError: 'Node' object has no attribute 'output_masks' while attempting to initialize the VideoCsvPipeline.
To Reproduce
from dvt.pipeline.csv import VideoCsvPipeline
from os.path import join
import pandas as pd
from IPython.display import Image
VideoCsvPipeline(finput="video-clip.mp4", dirout="dvt-output-csv").run()
(As a side note, the notebook passes include_images=True to the VideoCsvPipeline constructor, but it seems that kwarg has been removed?)
Expected behavior
I'm not sure what should happen yet! I'm just kicking the tires and seeing what follows :)
Additional context
I'm happy to look into this if that's helpful!
It shows "The file you have requested does not exist."
Basically, because HSV is not derived from a color-appearance model the way CIE-Lab is, but (unlike CIE-Lab) defines more meaningful axes, I wanted to build a custom annotator that turns the input RGB (it is RGB, right? Not OpenCV's BGR?) into CIE-LCH (Luminance, Chroma, Hue), a perceptually grounded counterpart to HSV. (Instead of a histogram I calculate the median value for each channel, but that is not important here.) OpenCV, unfortunately, does not provide a conversion to CIE-LCH, but scikit-image does. In principle everything works out, BUT for a 110-minute movie such as Il Divo, dvt with this custom annotator requires 7h30m! Considering that the demo project, on my computer, requires just 30-45 minutes to annotate everything it can (the color annotator and the dominant-colors annotator included), I wondered if I got something completely wrong.
My annotator looks like this:
from numpy import median
from skimage.color import rgb2lab, lab2lch

from dvt.annotate.core import FrameAnnotator  # adjust import to your dvt version


class LchAnnotator(FrameAnnotator):
    name = 'lch'

    def annotate(self, batch):
        img = batch.get_batch()
        fnames = batch.get_frame_names()
        luminance = list()
        chroma = list()
        hue = list()
        for i in img:
            # RGB -> CIE-Lab -> CIE-LCH, then the median of each channel
            img_lab = rgb2lab(i)
            img_lch = lab2lch(img_lab)
            lch = img_lch.reshape(-1, 3)
            luminance.append(median(lch[:, 0]))
            chroma.append(median(lch[:, 1]))
            hue.append(median(lch[:, 2]))
        output = {'frame': fnames,
                  'luminance': luminance,
                  'chroma': chroma,
                  'hue': hue}
        return output
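One workaround I have not tried yet: since the per-channel median should be fairly stable under moderate downsampling, each frame could be shrunk before the per-pixel colour-space conversion. A rough, untested sketch (frame_median_lch is just an illustrative name):

import cv2
import numpy as np
from skimage.color import rgb2lab, lab2lch

def frame_median_lch(frame, scale=0.25):
    # Shrink the frame first; rgb2lab/lab2lch run in float64 over
    # every pixel, so this cuts the work by roughly scale**2.
    small = cv2.resize(frame, (0, 0), fx=scale, fy=scale,
                       interpolation=cv2.INTER_AREA)
    lch = lab2lch(rgb2lab(small)).reshape(-1, 3)
    return np.median(lch, axis=0)  # (luminance, chroma, hue)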
Hi!
One of the annotators I would like to add to dvt is based on the optical flow between subsequent frames (with cv2's calcOpticalFlowFarneback). Yet I'm not sure how to do this nicely, because for each batch it wouldn't be possible to calculate the optical flow for the last frame in the batch.
Additionally, in our project we segment a video into shots first, before analysing it, and then extract optical flow only for frames within the same shot, as the optical flow between two frames from different shots is not very meaningful. I guess this can be solved during aggregation, where the frame before the cut is ignored (if necessary), but it would result in a decent amount of wasted computation for videos with many cuts.
Did you have any thoughts on what the 'right' way would be to deal with this kind of annotator within the dvt pipeline? I took a look at the DiffAnnotator, but that appears to be very 'batch oriented', which isn't always ideal, I would think. A rough sketch of what I have in mind follows below.
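For reference, this is roughly the per-batch computation (a sketch only; the missing flow between the last frame of one batch and the first frame of the next is exactly the open question):

import cv2
import numpy as np

def batch_flow_magnitude(frames):
    # Mean Farneback flow magnitude between consecutive frames;
    # n frames yield n - 1 values, so the final frame of each
    # batch gets no flow value within that batch.
    values = []
    prev = cv2.cvtColor(frames[0], cv2.COLOR_RGB2GRAY)
    for frame in frames[1:]:
        nxt = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
        flow = cv2.calcOpticalFlowFarneback(
            prev, nxt, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        values.append(float(np.linalg.norm(flow, axis=2).mean()))
        prev = nxt
    return values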
This issue is part of my review in openjournals/joss-reviews#1800.
In various forms of documentation, I find the consistent claim that dvt is licensed under GPL v2. I found this in the following places, but there may be more.
The license that is actually included in the source code is the GPL v3: https://github.com/distant-viewing/dvt/blob/master/LICENSE.
Regarding the shield in the README, this could update automatically with the information available on PyPI if you were to use the following image URL (documentation and options here):
https://img.shields.io/pypi/l/dvt
Note that this reflects the information in the setup.py rather than the information in the actual license.
The version on PyPI is still 1.0.0, although the GitHub repo shows 1.0.1. If the most recent commits (the PR 36 merge) are considered good, maybe even do a 1.0.2 release?
Disclaimer: I'm not using dvt, but someone on discuss.python.org was asking about an install problem and I noticed they were getting dvt from GitHub instead of PyPI (still using pip, but with a GitHub URL). I was hoping to suggest to them that they could just pip install dvt.
Thank you.
This issue is part of my review in openjournals/joss-reviews#1800. However, in this issue I'm only providing suggestions for minor improvements to the documentation. I consider all of these suggestions optional; the review criteria can also be met without adopting them.
The documentation describes installing dvt in an Anaconda environment. I tried installing the package in a regular "bare" virtualenv instead and this worked as documented, too. So you could add a clarifying note that Anaconda is not strictly required. A regular virtualenv, or in fact any Python 3.7 environment with pip and a compiler toolchain to back it, should work.
At first I entered the example code into a .py file, assuming that I would encounter a way to invoke the code from the command line later on. Instead, I found that the third code block already contained example output, so I was supposed to enter the code into an interactive Python console instead (this worked fine in my "bare" virtualenv, too). It would be helpful to mention this before introducing the example code. Another option is to include the >>> prompt in the example displays, following the pydoc/doctest convention. You already did this in the DataExtraction API documentation. Doing both might be best.
My output differs from the published example output in the diff, demo_anno and demo_agg examples. In diff and demo_agg the differences were all less than 1, and I'm generally not so impressed by differences in floating point numbers, but I found the differences in the demo_anno output quite apparent, because I don't expect different machines to disagree about pixel values in a video file. Has the example video somehow been edited after the tutorials were written? Published example output here, my output below.

dextra.get_data()['demo_anno'].head()
   frame  red  green  blue
0      0    6      1     0
1      1    6      1     0
2      2    6      1     0
3      3    6      1     0
4      4    4      2     0
The tutorial asks users to run the command
python3 -m http.server –directory dvt-output-data
which has an N-dash Unicode character instead of two regular dashes before the directory option. This will fail if users try to copy-paste the command. I don't know Sphinx very well, but I suspect this can be fixed by convincing Sphinx that the following line is code:
Line 19 in c833dc8
In the VisualInput abstract base class (documented here and code referenced below), the next_batch method strongly reminds me of the special __next__ method that is used in Python's iterator convention (documented here). Adopting this convention would permit more declarative "for batch in my_video" loops, with my_video an instance of VisualInput; see the sketch after the code reference.
Lines 14 to 55 in c833dc8
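To illustrate, a thin adapter along the following lines would suffice (a sketch only; I'm assuming, hypothetically, that next_batch returns None once the input is exhausted, which may not match the actual implementation):

class IterableInput:
    # Sketch: adapt a VisualInput to Python's iterator protocol.
    # Hypothetical assumption: next_batch() returns None when the
    # input is exhausted; the real termination condition may differ.
    def __init__(self, visual_input):
        self.visual_input = visual_input

    def __iter__(self):
        return self

    def __next__(self):
        batch = self.visual_input.next_batch()
        if batch is None:
            raise StopIteration
        return batch

# This would then permit:
#     for batch in IterableInput(my_video):
#         ...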
In the dvt.utils.sub_image documentation, the notion of a three-dimensional image deserves a bit more explanation. What is the third dimension? Color channels?
The dvt.annotate.cielab.CIElabAnnotator.num_buckets documentation is very confusing, because the documented default bucket size of 16 does not meet the documented requirement of being divisible by 256. Is it the other way round (should the bucket size be a divisor of 256, i.e., one of the first seven powers of 2), or does the requirement perhaps apply only when the bucket size exceeds some particular threshold?
For dvt.annotate.diff.DiffAnnotator.size, I presume the intention is "(Size of one side of the square) (used for down sampling [sic] the image)", but I first read it as "(Size of one side of (the square used for down sampling [sic] the image))", in which case it is ambiguous whether this square represents the target image size after resampling or a resampling window size (e.g. for a convolution). Perhaps it would be clearer to write "Side length of the square that the image will be downsampled to for the pixel-by-pixel comparison".
… ImgAnnotator.
The documentation of dvt.annotate.obj.ObjectAnnotator has multiple occurrences of the word "face" that should probably read "object".
This issue is part of my review in openjournals/joss-reviews#1800.
One of the requirements on the paper is to describe how your package compares to other commonly-used packages. Currently, the second paragraph of the summary mentions the existence of numerous standalone algorithm implementations but “limited options” for building end-to-end pipelines. It may be valuable to name these limited options explicitly and discuss their relative advantages and disadvantages.
Generally speaking, the documentation is very well written. As the software also focuses on layperson users, some parts of the documentation should be expanded.
I'm currently reviewing the JOSS submission of dvt (openjournals/joss-reviews#1800). One of the review criteria (per the JOSS documentation) is that "the full list of paper authors seems appropriate and complete".
While I trust that the author list is correct, I am currently unable to objectively confirm this from the information available to me on GitHub:
Could you please clarify why @nolauren is included in the author list and why @Nanne is not? Please let me emphasize that I trust that there is a good reason; I'm just asking because this reason is currently invisible to me and I need it in order to confirm one of the review criteria. Thanks in advance.
I used pip to install dvt and got a deprecation warning:
DEPRECATION: dvt is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at pypa/pip#8559
The pyproject.toml approach is newer to me; see https://peps.python.org/pep-0621/ for some background and documentation.
I also know from looking into this for my other projects that you may be able to use pdm to import your existing setup.py and create a pyproject file. A minimal sketch is below.
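As a rough sketch of what a minimal PEP 621 file might look like (the metadata values below are placeholders, not taken from the actual setup.py):

[build-system]
requires = ["setuptools>=61", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "dvt"
version = "1.0.1"  # placeholder; should mirror setup.py
description = "Distant Viewing Toolkit for the Analysis of Visual Culture"
requires-python = ">=3.7"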
This issue is part of my review in openjournals/joss-reviews#1800.
One of the review requirements is that there is a documented way to verify that the software works correctly. While there is a test suite, I was not able to find any instructions on how to run it. From searching and trial and error, it appears that these are the steps needed in order to run all tests with coverage and without errors after cloning the repository, although I still got a lot of warnings:
pip install .[tests,optional]
cd tests
pytest --cov=dvt
If installing from PyPI instead of cloning, substitute "dvt" for "." in the first command.
Please include these instructions (or similar) in the Readme. It is also worth mentioning in the Readme that running all tests takes quite long; on my machine, it took nearly 15 minutes.
To simulate two different usage scenarios, I tried to install the software on two platforms:
Hi there!
Great work on this, this is excellent.
Reporting an error that I did manage to fix on my own but thought you should be aware of.
Heads up: I'm running dvt as a git submodule which I then install into my virtual environment (which I suspect could be linked to the problem).
I was trying to run dvt.AnnoShotBreaks(), and dvt attempted to download this file: https://github.com/distant-viewing/dvt/releases/download/0.0.1/dvt_detect_shots.pt (I'm assuming an ML model?)
I was getting the following error:
Traceback (most recent call last):
File "/Users/jacob/Documents/Git Repos/plozevet-archive/Scripts/DVT-Tests/breakpoint-test.py", line 23, in <module>
process(params)
File "/Users/jacob/Documents/Git Repos/plozevet-archive/Scripts/DVT-Tests/breakpoint-test.py", line 19, in process
anno_breaks = dvt.AnnoShotBreaks()
File "/Users/jacob/Documents/Git Repos/plozevet-archive/venv/lib/python3.10/site-packages/dvt/shots.py", line 26, in __init__
model_path = _download_file("dvt_detect_shots.pt")
File "/Users/jacob/Documents/Git Repos/plozevet-archive/venv/lib/python3.10/site-packages/dvt/utils.py", line 135, in _download_file
download_url_to_file(url, cached_file, hash_prefix, progress=True)
File "/Users/jacob/Documents/Git Repos/plozevet-archive/venv/lib/python3.10/site-packages/torch/hub.py", line 625, in download_url_to_file
f = tempfile.NamedTemporaryFile(delete=False, dir=dst_dir)
File "/usr/local/Cellar/[email protected]/3.10.12/Frameworks/Python.framework/Versions/3.10/lib/python3.10/tempfile.py", line 559, in NamedTemporaryFile
file = _io.open(dir, mode, buffering=buffering,
File "/usr/local/Cellar/[email protected]/3.10.12/Frameworks/Python.framework/Versions/3.10/lib/python3.10/tempfile.py", line 556, in opener
fd, name = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
File "/usr/local/Cellar/[email protected]/3.10.12/Frameworks/Python.framework/Versions/3.10/lib/python3.10/tempfile.py", line 256, in _mkstemp_inner
fd = _os.open(file, flags, 0o600)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/jacob/.cache/torch/hub/checkpoints/tmps3k5bm1g'
In the end I went and manually created the path /torch/hub/checkpoints/. Sounds like a problem that could be easily resolved with a Path(dir).mkdir(parents=True)? (See the sketch below.) Or maybe the issue is my submodule/venv setup.
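Something along these lines inside dvt's _download_file, just before the download call, is what I had in mind (a sketch only; I haven't checked exactly where it would best fit in the surrounding code):

import os

# Make sure the cache directory exists before download_url_to_file
# tries to create its temporary file inside it.
os.makedirs(os.path.dirname(cached_file), exist_ok=True)
download_url_to_file(url, cached_file, hash_prefix, progress=True)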
Anyway, all working my end and having fun with the toolkit!
Very much appreciate the Colab notebook, as I find the documentation a bit hard to parse given my low skill level. With regard to working with video, it would be good to have code snippets demonstrating the other annotators on video, as in examples 9-12 using images.
Hiya,
Just trying to follow the minimal example. I've got a new Python 3.7 environment (made with conda); I pip install dvt, then give python3 -m dvt video-viz target.mp4 a whirl:
ModuleNotFoundError: No module named 'tensorflow.keras.layers.experimental.preprocessing'
and then
Keras requires TensorFlow 2.2 or higher.
But if I install TensorFlow 2.2, I get dvt 0.3.3 requires tensorflow==1.15.2, but you have tensorflow 2.2.0 which is incompatible, so I'm assuming that I've got something borked on my machine.
When following the tutorial and after launching the local webserver with the provided mp4 file, further navigation becomes impossible. The arrow icons are inactive as well as the scroll bar.
Checked with Firefox 71, Opera 60.0.3255.109, and Chrome 79.
I'm trying to run the face annotation code example in the readme and getting errors. I think I have everything installed correctly. I've tried in Google Colab, jupyter, and python console, so I don't think it's my environment.
Here's the error from running dvt.AnnoFaces() in a Jupyter notebook:
Downloading: "https://github.com/distant-viewing/dvt/releases/download/0.0.1/dvt_detect_face.pt" to /Users/rkoeser/.cache/torch/hub/checkpoints/dvt_detect_face.pt
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[3], line 1
----> 1 anno_face = dvt.AnnoFaces()
File ~/workarea/env/dvt/lib/python3.9/site-packages/dvt/face.py:32, in AnnoFaces.__init__(self, model_path_mtcnn, model_path_vggface)
29 def __init__(self, model_path_mtcnn=None, model_path_vggface=None):
31 if not model_path_mtcnn:
---> 32 model_path_mtcnn = _download_file("dvt_detect_face.pt")
34 if not model_path_vggface:
35 model_path_vggface = _download_file("dvt_face_embed.pt")
File ~/workarea/env/dvt/lib/python3.9/site-packages/dvt/utils.py:131, in _download_file(url, basename)
129 r = HASH_REGEX.search(filename) # r is Optional[Match[str]]
130 hash_prefix = r.group(1) if r else None
--> 131 download_url_to_file(url, cached_file, hash_prefix, progress=True)
133 return cached_file
File ~/workarea/env/dvt/lib/python3.9/site-packages/torch/hub.py:611, in download_url_to_file(url, dst, hash_prefix, progress)
609 dst = os.path.expanduser(dst)
610 dst_dir = os.path.dirname(dst)
--> 611 f = tempfile.NamedTemporaryFile(delete=False, dir=dst_dir)
613 try:
614 if hash_prefix is not None:
File ~/.pyenv/versions/3.9.13/lib/python3.9/tempfile.py:545, in NamedTemporaryFile(mode, buffering, encoding, newline, suffix, prefix, dir, delete, errors)
542 if _os.name == 'nt' and delete:
543 flags |= _os.O_TEMPORARY
--> 545 (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
546 try:
547 file = _io.open(fd, mode, buffering=buffering,
548 newline=newline, encoding=encoding, errors=errors)
File ~/.pyenv/versions/3.9.13/lib/python3.9/tempfile.py:255, in _mkstemp_inner(dir, pre, suf, flags, output_type)
253 _sys.audit("tempfile.mkstemp", file)
254 try:
--> 255 fd = _os.open(file, flags, 0o600)
256 except FileExistsError:
257 continue # try again
FileNotFoundError: [Errno 2] No such file or directory: '/Users/rkoeser/.cache/torch/hub/checkpoints/tmp3fu4p_3n'
I can't tell where it's going wrong, but hopefully this is meaningful to you. I tested manually and the .pt file does download.
Is there a reason these files are separate downloads? Would it make sense to include them as part of the python package?
Describe the bug
utils._which_frames does not filter frames as one would expect with respect to the freq attribute. It applies freq only within the current batch, not across batches for the entire movie. This leads to inconsistent steps in the output data.
To Reproduce
dextra = DataExtraction(FrameInput(input_path="./VID-20191006-WA0014.mp4")) # default bsize=256
dextra.run_annotators([LchAnnotatorRay(freq=10)], max_batch=2) #my custom annotator, using utils._which_frames function
which gives me data for the frames: [0, 10, 20, ..., 240, 250, 256, 266, 276, ..., 500, 510]
Note the wrong frame step after the switch from the first batch to the second. This makes sense when looking at the implementation of utils._which_frames:
Line 218 in f4e5f89
Regardless of the starting point of the current batch, the selection of frames always starts at 0, so the first selected frame in the second batch is frame 256.
Expected behavior
The correct frame sequence in the above example should be [0, 10, 20, ..., 240, 250, 260, 270, ..., 500, 510].
The first frame of the second batch should be 260, which corresponds to the within-batch index 4 (not 0).
Additional context
I developed a fix for which I would open a pull request if you agree with my expected behaviour. It works like this: the offset of the first selected frame within a batch is freq minus the remainder of bnum * bsize divided by freq when that remainder is nonzero, and 0 otherwise (so it is always 0 for the first batch, where bnum is 0):
if frames is None:
    # return list(range(0, batch.bsize, freq))  # previous behaviour
    first_frame = 0
    _, rest = divmod(batch.bnum * batch.bsize, freq)
    if rest != 0:
        first_frame = freq - rest
    return list(range(first_frame, batch.bsize, freq))
This works well for me so far.
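A quick standalone sanity check of the offset formula (bsize=256, freq=10, two batches, matching the example above):

def first_frame(bnum, bsize=256, freq=10):
    # Offset of the first selected frame within batch number bnum.
    _, rest = divmod(bnum * bsize, freq)
    return freq - rest if rest != 0 else 0

for bnum in range(2):
    offsets = range(first_frame(bnum), 256, 10)
    print([bnum * 256 + o for o in offsets])
# batch 0 -> [0, 10, ..., 250]
# batch 1 -> [260, 270, ..., 510]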