vidcontrol

⚠️ This package is still under development and is not yet ready for production use. ⚠️

vidcontrol is a Python package for managing multi-webcam video capture. It is designed to be cross-platform and easy to use. Under the hood, we use imageio for video capture, which itself uses ffmpeg. This package offers various utility functions for listing and selecting available webcams, and for capturing video from multiple webcams simultaneously. We also provide an easy-to-use interface for resolving image resolutions and frame rates.

This package has been developed in conjunction with ctrlability together with the Prototype Fund.

| Platform | Status |
| --- | --- |
| macOS | Full support. Tested and verified on macOS 12.0+ |
| Linux | Not yet, but planned |
| Windows | Full support. Tested and verified on Windows 11 |

Getting Started

Prerequisites

This package requires Python 3.8 or higher. We recommend using a virtual environment for development. You can create a virtual environment using venv or conda.

Additionally, you will need to install ffmpeg on your system. On macOS, you can install ffmpeg using Homebrew:

brew install ffmpeg

Installation

You can install the latest version of the package directly from this repository:

pip install git+https://github.com/inmotion-health/vidcontrol.git

Or install from source:

git clone https://github.com/inmotion-health/vidcontrol.git
cd vidcontrol
pip install .

Basic Usage

This package is built around two main classes: VideoManager and VideoSource. The VideoManager manages all available webcams and creates VideoSource instances. A VideoSource captures video from a single webcam. The following example shows how to use these classes to capture video from a single webcam:

from vidcontrol import VideoManager

manager = VideoManager()

# List available webcams; the indices are the ones get_video_source accepts
cameras = manager.list_available_cameras()
print(cameras)

# Set the height of the video capture
manager.set_preferred_height(480)

# Get a video source and start capturing
source = manager.get_video_source(0)
for frame in source:
    # Do something with the frame
    pass

To capture video from a webcam, you first need to request a VideoSource from the VideoManager. You can do this by calling get_video_source and passing the index of the webcam you want to use. The index is the same as the index returned by list_available_cameras. You can also pass a preferred_height and preferred_fps to get_video_source to set the height and frame rate of the video capture. If you do not pass these parameters, the VideoManager will use the default values.

The VideoManager then tries to open the webcam at the requested height and frame rate. If the webcam does not support that combination, the VideoManager can fall back to the next best resolution and frame rate. This fallback is disabled by default; enable it by setting next_best to True when configuring the VideoManager.
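The fallback described above can be pictured with a small, self-contained sketch. It is not vidcontrol's implementation: the `pick_mode` helper, the `(width, height, fps)` tuples, and the "closest height first, then closest frame rate" ordering are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation of a capture mode: (width, height, fps).
Mode = Tuple[int, int, int]


def pick_mode(modes: List[Mode], preferred_height: int, preferred_fps: int,
              next_best: bool = False) -> Mode:
    """Return an exact match, or the closest mode when next_best is True."""
    exact = [m for m in modes if m[1] == preferred_height and m[2] == preferred_fps]
    if exact:
        return exact[0]
    if not next_best or not modes:
        # Fallback disabled (the documented default) or nothing to fall back to.
        raise ValueError(
            f"no mode with height={preferred_height} and fps={preferred_fps}"
        )
    # Assumed ordering: closest height wins, frame rate breaks ties.
    return min(
        modes,
        key=lambda m: (abs(m[1] - preferred_height), abs(m[2] - preferred_fps)),
    )


modes = [(640, 480, 30), (1280, 720, 30), (1920, 1080, 15)]
print(pick_mode(modes, 480, 30))                    # exact match
print(pick_mode(modes, 1000, 30, next_best=True))   # closest available mode
```

With the fallback disabled, a request that no mode satisfies raises an error instead of silently changing the resolution, which matches the documented default of next_best being False.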

If the VideoSource for a webcam is requested multiple times, the VideoManager returns the same VideoSource instance. You can therefore request the source wherever you need it, and every part of your code reads from the same capture instead of opening the webcam again.
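A minimal sketch of this "one instance per webcam" behaviour, assuming a simple dictionary cache keyed by camera index. The `SourceCache` class and its `factory` argument are hypothetical names, not part of vidcontrol.

```python
class SourceCache:
    """Hand out one shared object per camera index (illustrative only)."""

    def __init__(self, factory):
        self._factory = factory  # hypothetical callable that opens a camera
        self._sources = {}

    def get(self, index):
        # Create the source on first request, then reuse it.
        if index not in self._sources:
            self._sources[index] = self._factory(index)
        return self._sources[index]


cache = SourceCache(factory=lambda index: object())
print(cache.get(0) is cache.get(0))  # same index, same instance
print(cache.get(0) is cache.get(1))  # different index, different instance
```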

For more detailed examples, please see the examples folder.

Configuration

vidcontrol supports various options for configuring the video capture. You can set these options either when creating a new VideoManager or when creating a new VideoSource. The following table lists all available options:

| Parameter | Description | Default Value |
| --- | --- | --- |
| VideoManager | | |
| preferred_height | The preferred height of the video capture, in pixels. | 480 |
| preferred_fps | The preferred frame rate of the video capture, in frames per second. | 30 |
| next_best | Whether to use the next best resolution if the preferred resolution fails. | False |
| VideoSource | | |
| color_format | The color format of the video capture. Other options: bgr | rgb |
| mirror_frame | Whether to mirror the frame vertically. | True |
| flip_frame_horizontal | Whether to flip the frame horizontally. | False |

You can set these options in two ways: pass them as a dictionary to set_config on the VideoManager or VideoSource, or call the matching setter, which is named set_<option>. See the basic usage or pass config example to see how to use some of these options.
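The relationship between a configuration dictionary and the set_<option> setters can be sketched as follows. This is an assumption about how such an interface could be wired up, not vidcontrol's code: `Configurable` and `DemoSource` are hypothetical classes, and only the option names come from the table above.

```python
class Configurable:
    """Route each dictionary entry to the matching set_<option> method."""

    def set_config(self, config: dict) -> None:
        for option, value in config.items():
            setter = getattr(self, f"set_{option}", None)
            if setter is None:
                raise KeyError(f"unknown option: {option}")
            setter(value)


class DemoSource(Configurable):
    """Stand-in for a video source with two of the documented options."""

    def __init__(self):
        self.color_format = "rgb"
        self.mirror_frame = True

    def set_color_format(self, value: str) -> None:
        if value not in ("rgb", "bgr"):
            raise ValueError(f"unsupported color format: {value}")
        self.color_format = value

    def set_mirror_frame(self, value: bool) -> None:
        self.mirror_frame = bool(value)


source = DemoSource()
source.set_config({"color_format": "bgr", "mirror_frame": False})
print(source.color_format, source.mirror_frame)
```

Routing the dictionary through the setters keeps validation in one place, so both ways of configuring an object behave the same.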

Contributing and Issues

We welcome contributions and feedback. Please use GitHub issues to report bugs, discuss features, or ask questions. If you would like to contribute to the project, feel free to open a pull request. Please make sure to follow the PEP8 style guide.

We would especially appreciate help with:

  • Testing on different platforms and versions
  • Testing with different webcams

When contributing, please use conventional commits for your commit messages. This makes it easier to generate a changelog automatically. See the Conventional Commits specification for more information.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

vidcontrol's Issues

Make webcam detection on macOS more robust

Currently, we run the FFmpeg command and parse its command-line output to get the list of available webcams and resolutions. This feels like a hack and is unlikely to be reliable in the long term.

Instead, we could aim for a more permanent solution, such as interfacing with FFmpeg directly from C++ or talking to AVFoundation.
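To illustrate why parsing command-line output is fragile, here is a minimal sketch of that kind of parser. It is not the project's code. The `SAMPLE` text is hand-written to resemble what FFmpeg's avfoundation device listing typically prints, and the addresses and device names in it are made up.

```python
import re

# Matches lines such as "[AVFoundation indev @ 0x7f8] [0] FaceTime HD Camera".
DEVICE_LINE = re.compile(r"\[AVFoundation[^\]]*\]\s+\[(\d+)\]\s+(.+)")


def parse_video_devices(stderr_text: str) -> dict:
    """Map device index to device name for the video section of the listing."""
    devices = {}
    in_video_section = False
    for line in stderr_text.splitlines():
        if "AVFoundation video devices" in line:
            in_video_section = True
            continue
        if "AVFoundation audio devices" in line:
            in_video_section = False
            continue
        match = DEVICE_LINE.search(line)
        if in_video_section and match:
            devices[int(match.group(1))] = match.group(2).strip()
    return devices


# Illustrative sample only, not captured from a real run.
SAMPLE = """\
[AVFoundation indev @ 0x7f8] AVFoundation video devices:
[AVFoundation indev @ 0x7f8] [0] FaceTime HD Camera
[AVFoundation indev @ 0x7f8] [1] USB Camera
[AVFoundation indev @ 0x7f8] AVFoundation audio devices:
[AVFoundation indev @ 0x7f8] [0] Built-in Microphone
"""

print(parse_video_devices(SAMPLE))
```

Any change to the section headers or line prefix in a future FFmpeg release would silently break such a parser, which is the reliability concern raised in this issue.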

Test on multiple macOS & Windows Versions

We need to verify that this tool works not only on the current OS versions but also on older ones. Can we verify this, ideally through automated testing?

Versions we should test:

  • macOS 10.15 and upwards
  • Windows 8 and upwards

Error when using a specific older webcam

With an older webcam model, we intermittently get pixel format errors. We need to investigate this further and find a proper solution, because hard-coding a pixel format seems like a hack.

File "/usr/local/lib/python3.11/site-packages/imageio/v2.py", line 293, in get_reader
    return image_file.legacy_get_reader(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/imageio/core/legacy_plugin_wrapper.py", line 116, in legacy_get_reader
    return self._format.get_reader(self._request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/imageio/core/format.py", line 221, in get_reader
    return self.Reader(self, request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/imageio/core/format.py", line 312, in __init__
    self._open(**self.request.kwargs.copy())
  File "/usr/local/lib/python3.11/site-packages/imageio/plugins/ffmpeg.py", line 343, in _open
    self._initialize()
  File "/usr/local/lib/python3.11/site-packages/imageio/plugins/ffmpeg.py", line 486, in _initialize
    raise IndexError(
IndexError: No (working) camera at <video0>.

Could not load meta information
=== stderr ===

ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
  built with Apple clang version 11.0.0 (clang-1100.0.33.8)
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-appkit --enable-avfoundation --enable-coreimage --enable-audiotoolbox
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
[avfoundation @ 0x7fea9d808200] Selected pixel format (yuv420p) is not supported by the input device.
[avfoundation @ 0x7fea9d808200] Supported pixel formats:
[avfoundation @ 0x7fea9d808200]   uyvy422
[avfoundation @ 0x7fea9d808200]   yuyv422
[avfoundation @ 0x7fea9d808200]   nv12
[avfoundation @ 0x7fea9d808200]   0rgb
[avfoundation @ 0x7fea9d808200]   bgr0

What webcam model does this fail on? Which version of FFmpeg are we using?
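One alternative to hard-coding a pixel format is to read the formats the device reports and choose from them. The sketch below only parses the stderr lines shown in this report. The function names and the preference order are assumptions, not part of vidcontrol or imageio.

```python
import re

# Lines of interest look like "[avfoundation @ 0x7fea9d808200]   uyvy422".
PREFIX = re.compile(r"^\[avfoundation @ 0x[0-9a-f]+\]\s+(.*)$")

# Excerpt copied from the stderr output in this report.
SAMPLE_STDERR = """\
[avfoundation @ 0x7fea9d808200] Selected pixel format (yuv420p) is not supported by the input device.
[avfoundation @ 0x7fea9d808200] Supported pixel formats:
[avfoundation @ 0x7fea9d808200]   uyvy422
[avfoundation @ 0x7fea9d808200]   yuyv422
[avfoundation @ 0x7fea9d808200]   nv12
[avfoundation @ 0x7fea9d808200]   0rgb
[avfoundation @ 0x7fea9d808200]   bgr0
"""


def supported_pixel_formats(stderr_text: str) -> list:
    """Collect the entries listed after 'Supported pixel formats:'."""
    formats, collecting = [], False
    for line in stderr_text.splitlines():
        match = PREFIX.match(line.strip())
        if not match:
            collecting = False
            continue
        message = match.group(1).strip()
        if message.startswith("Supported pixel formats"):
            collecting = True
        elif collecting:
            formats.append(message)
    return formats


def choose_pixel_format(formats, preference=("uyvy422", "yuyv422", "nv12")):
    """Return the first preferred format the device supports, else None."""
    for candidate in preference:
        if candidate in formats:
            return candidate
    return None


formats = supported_pixel_formats(SAMPLE_STDERR)
print(formats)
print(choose_pixel_format(formats))
```

The chosen value could then be passed on a retry, for example through FFmpeg's `-pixel_format` input option for avfoundation. Whether that resolves the intermittent failure on this webcam is untested.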

Make resolution and FPS detection more robust

We currently detect the resolution and frame rate for the stream automatically from a preferred height. This works, but we need to make it more robust and handle more edge cases:

  • What do we do when the preferred height does not exist? We could pick the closest available height and emit a warning.
  • We should also be able to set a preferred frame rate.

Crash when trying to open FaceTime HD Camera with closed lid

When iterating over all available cameras, the built-in MacBook camera is detected even when the lid is closed, but vidcontrol crashes when we try to open it. How can we handle this more gracefully?

One idea is to treat the camera as nonexistent and quietly remove it from our list instead of raising an error. But what happens if the lid is opened while vidcontrol is in use? We would then have no access to the camera.
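One possible middle ground, sketched below under stated assumptions, is to keep such a camera in the list but mark it as unavailable, so it can be probed again later (for example after the lid is opened). `probe_cameras` and the `open_camera` callable are hypothetical. The only detail taken from this repository's reports is that a failing open surfaces as an IndexError from imageio.

```python
def probe_cameras(indices, open_camera):
    """Return {index: usable} without letting one failing camera abort the scan.

    open_camera is a hypothetical callable that opens a camera and returns a
    reader object, or raises IndexError/OSError when the camera cannot be used.
    """
    status = {}
    for index in indices:
        try:
            reader = open_camera(index)
        except (IndexError, OSError):
            # Keep the entry, flagged as unavailable, so it can be re-probed.
            status[index] = False
            continue
        close = getattr(reader, "close", None)
        if callable(close):
            close()  # release the device right after the probe
        status[index] = True
    return status


def fake_open(index):
    """Stand-in opener: camera 0 behaves like a built-in camera behind a closed lid."""
    if index == 0:
        raise IndexError("No (working) camera at <video0>.")
    return object()


print(probe_cameras([0, 1], fake_open))
```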

Issue with multithreaded access

When concurrently executing threads access frames from the same video source, we get race conditions. To fix this, we need to implement a locking mechanism.
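A minimal sketch of such a locking mechanism, assuming frames are pulled from a single shared iterator. `LockedSource` and `next_frame` are hypothetical names, and a generator of integers stands in for real camera frames.

```python
import threading


class LockedSource:
    """Serialize access to a shared frame iterator with a mutex."""

    def __init__(self, frames):
        self._frames = iter(frames)
        self._lock = threading.Lock()

    def next_frame(self):
        # Only one thread at a time may advance the underlying iterator.
        with self._lock:
            return next(self._frames, None)


def fake_frames(count):
    """Stand-in for a camera: yields frame numbers instead of images."""
    for number in range(count):
        yield number


source = LockedSource(fake_frames(1000))
seen = []


def worker():
    while True:
        frame = source.next_frame()
        if frame is None:
            break
        seen.append(frame)


threads = [threading.Thread(target=worker) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(seen), len(set(seen)))
```

With the lock in place, each frame is handed to exactly one thread. If every consumer should instead see the same latest frame, a different design is needed, such as one reader thread publishing frames to all consumers.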
