Giter Site home page Giter Site logo

copycat's Introduction

copycat - Your dumb local video collection
==========================================

Do you tend to watch the same favourite videos over and over again?
Did your youtube user experience turn painful lately, due to
perennial, aggressive embedded advertisements? Do you mind Big Brother
watching you each time you watch a movie? Maybe you have children who
like cartoons, but you don't want them accidentally stumbling into
other content (and did I mention the obnoxious ads before nearly all
videos)?

The purpose of copycat is to provide a locally hosted video service
accessible from all your devices. It runs perfectly on your home
server or NAS. The videos are sourced from youtube or any other online
video sharing site supported by youtube-dl. Clients are able to browse
and play downloaded videos, without re-downloading them from youtube
on each replay. This is valuable from a user perspective (re-gained
privacy and no advertising inserts ever)!

To give you an idea of how copycat looks and behaves, take a look at
this overview of the web interface:
https://github.com/tomszilagyi/copycat/wiki/The-copycat-web-interface

Features
--------

- Drive youtube-dl to download individual videos or playlists of
  videos from any site supported by youtube-dl.

- Videos are downloaded in a format suitable for playback on all
  modern HTML5-supporting browsers, both mobile and desktop.

- Videos are shown with their auto-generated thumbnail, title, source
  site (youtube, vimeo, etc.), and link to original.

- Interactive narrowing search based on video titles, using multiple
  search keywords with instantly updating results. If you like Emacs
  Helm (or anything similar), you will love this.

- Ability to instantly play individual videos or enqueue them into a
  playlist for sequential playing (preserves player settings, e.g.,
  volume and full-size, across videos). Add all search results to a
  playlist with a single click.  Playlists are ephemeral and
  browser-local.

- Seamless editing of video titles for organization (search relies on
  these, so be strategic about them if you plan to have many videos).
  Just click on the title and edit in place. It will be saved back to
  the copycat server when the focus is moved elsewhere.

- Ability to trash videos (remove them, freeing up precious storage
  space, but retaining an audit trail that makes it trivial to
  re-download them later).

- Simple, boring, dumb technology, both front and back. Less than 2000
  lines of total implementation code.

- Completely self-hosted.

- MIT license.


Setup
-----

For local development purposes, just run the appropriately named
script (./run.sh) and direct your browser to http://127.0.0.1:8799.
For prerequisites you need installed beforehand, best to look at the
Dockerfile (this is for Debian but should be directly applicable if
you use another distro). The run.sh script will also verify that all
required executables are present and error out if anything is missing.

If you actually want to watch videos, you need to front copycat's
webserver with a reverse proxy and arrange for the static media files
to be served directly by a real webserver. This is essential to
support range-based GETs (for prompt playback and the ability to jump
around in anything but the smallest videos), caching, and for general
performance and robustness.

If you use nginx, a suitable bare-minimum config for doing this is
below. Note the location block for /data/: this ensures that media
files stored within the data/ subdirectory will be served directly by
nginx and not go through the copycat webserver.  You might also want
to look at a similar config in docker.nginx.site for inspiration.

If you use some other webserver, you need to create an equivalent
configuration on your own. Alternatively, see below to learn how to run
copycat+nginx from a docker image.

--- /etc/nginx/sites-enabled/video ---
server {
    listen 80;

    server_name video.lan;

    access_log /var/log/nginx/video-access.log;

    location / {

        proxy_set_header    Host $host;
        proxy_set_header    X-Real-IP $remote_addr;
        proxy_set_header    X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header    X-Forwarded-Proto $scheme;

        proxy_pass          http://127.0.0.1:8799/;
        proxy_read_timeout  30;
        proxy_cache         off;
        proxy_request_buffering off;
        proxy_buffering     off;
        proxy_buffer_size   4k;

        expires             off;
        add_header 'Cache-Control' 'no-store, no-cache, must-revalidate, proxy-revalidate, max-age=0';
    }

    location /data/ {
        root /path/to/copycat;
        autoindex on;
        autoindex_exact_size off;
    }
}
--- END SNIPPET ---


Run with docker
---------------

You can build a docker image containing a copycat instance plus an
nginx frontend serving static media files and acting as reverse proxy
(as recommended above).

To build the image, run this command from the copycat repository root
(a.k.a. this directory):

    docker build --tag=copycat:latest .

This will take some time and emit a voluminous amount of output, but
if everything went well, it will end similar to this:

    Successfully built xxxxxxxxxxxx
    Successfully tagged copycat:latest

Having built the image, all that is left is launching an instance, which
will create a running docker container. Use a command similar to this:

    docker run -p 0.0.0.0:8000:80 -v </path/to/data>:/copycat/data copycat

I will not go into details on how to setup a daemonized instance with
resource limits, auto-restarts and all that -- the docker manuals will
serve you well. Just note that the example shown above will make the
server listen on port 8000, accessible from the whole network. If you
would like to restrict connections to within the same host only (e.g.,
because you run this behind yet another reverse proxy on the same
host) then change 0.0.0.0 to 127.0.0.1.

Note: within the docker image, both nginx and copycat will run under
the effective user account of www-data. Please ensure that the path to
the mounted data storage </path/to/data> has its permissions set so
that user www-data has write access to it. This volume will hold all
persistent data (the database and media files), so that you can freely
stop/restart the docker container, or even rebuild it after making
modifications to the sources, and still have all your data intact next
time you launch the site.


Adding videos
-------------

Simply invoke http://<your-copycat>/add/<youtube-video-url>
e.g.: http://video.lan/add/https://www.youtube.com/watch?v=u1kZ9zYr7kk
(yes, it looks hairy, but you can actually do that, and this API
recognizes the fact that it is easiest to copy the complete URL).

A simple but convenient way to do this while avoiding laborious manual
copy-pasting of URLs is to add a bookmark to the bookmarks toolbar of
your browser with the following destination (carefully copy this
verbatim into the field for Target or Destination URL):

    javascript:location.href="http://<your-copycat>/add/"+encodeURIComponent(location.href)

Save the bookmark with a name such as "copycat this" or similar. Then,
whenever you are watching a youtube video that you would like to
cache, just press the "copycat this" button in the bookmarks toolbar,
sit back, and wait. This also works for playlists (copycat will
download all the videos in the playlist).


Caveats and Disclaimers
-----------------------

This is not designed to be a proper web application that scales to a
very large amount of videos and to several independent users. This is
the kind of project that you could do in a single weekend if you were
so inclined (I was not, but the amount of work put into it, taken
together, is not much more than that). In short: it is a small, simple
solution that works under limited circumstances.

There is nothing akin to a user account, so if several people want
disjunct sets of videos then maybe best to set up multiple personal
instances.

Please take full responsibility for the security of your own computing
environment. DO NOT UNDER ANY CIRCUMSTANCES expose a 'naked' copycat
to the Internet without appropriate protection / reverse proxying in
front of it. I would strongly recommend adding mandatory SSL
termination and user authorization (customarily provided by an nginx
reverse proxy) if you wish to make your instance available from
anywhere outside your trusted home network.

The copycat backend is only used on Linux and completely untested
anywhere else. In principle there should be nothing Linux-specific
about it, so it should work fine on other UNIX-like systems such as
BSDs (including Mac OS X). Common utilities (awk, bash, etc.) might
differ in versions or flavours though, so YMMV. The frontend has been
tested with Firefox on Linux and Chrome on Android. As usual, THERE IS
NO WARRANTY and THERE IS NO SUPPORT.


Implementation notes
--------------------

The copycat backend runs on a minimal, embedded webserver implemented
in bash (yes, you read that right). Dynamically generated site content
(everything besides static stuff like js, css, thumbnail and video
files) is generated by a mixture of bash and awk.

The main database of videos consists of a single CSV-like file called
videos.dat, with an added twist: fields are separated by NUL / '\0'
instead of the conventional comma or semicolon. Records are still
separated by newlines ('\n'). This makes free-form content easier to
handle without relying on excessive field quotation.

Currently, the site is generated on-the-fly on each HTTP request, but
this could easily be changed to a static site regenerated on each
change (e.g., new video) for much higher performance.

copycat's People

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.