ebu / ebu-tt-live-toolkit

Toolkit for supporting the EBU-TT Live specification

Home Page: http://ebu.github.io/ebu-tt-live-toolkit/

License: BSD 3-Clause "New" or "Revised" License

Makefile 0.14% Python 63.72% Batchfile 0.09% HTML 1.84% CSS 0.20% Gherkin 17.12% JavaScript 16.89%
ebu-tt python subtitles captions subtitling captioning live broadcast video

ebu-tt-live-toolkit's Introduction

ebu-tt-live-toolkit

This is the repository for the interoperability kit of EBU-TT Live.

The kit is envisaged to contain a set of components for generating, testing and distributing subtitle documents in EBU-TT Part 3 format.

This is an open source project. Anyone is welcome to contribute to the development of the components. Please see the wiki for the list of required components, guidelines and release plan.

The project home page is at http://ebu.github.io/ebu-tt-live-toolkit/ and links to the pre-built documentation.

We have a Slack team called ebu-tt-lit for day to day communications, questions etc. Please join up!

If you would like to contribute or join the Slack team, please contact [email protected] or [email protected]

Preparing the build environment

Make sure you have Python 2.7+ and Python virtual environment capability.

If not, you can install virtualenv system-wide from your operating system's package repository or with pip:

sudo pip install virtualenv

After that creating a virtual environment should be as simple as:

virtualenv env

Let's activate it (source makes sure the current shell executes the script and assumes the environment variables that the activation script sets):

source ./env/bin/activate

To build the project you will also need node.js. Please read the instructions for your system here.

Once you have created and activated the Python virtual environment and installed node.js, you can build the package by typing make, provided you have GNU build tooling on your system.

make

Alternatively:

pip install -r requirements.txt
python setup.py develop

pyxbgen --binding-root=./ebu_tt_live/bindings -m __init__ --schema-root=./ebu_tt_live/xsd/ -r -u ebutt_all.xsd

npm install nunjucks
node_modules/nunjucks/bin/precompile ebu_tt_live/ui/user_input_producer/template/user_input_producer_template.xml > ebu_tt_live/ui/user_input_producer/template/user_input_producer_template.js

After this you should be able to launch the command line tools that this Python package provides, e.g.:

ebu-dummy-encoder

Windows users

Windows is not the best friend of Makefiles, so there is a make.bat file for those who would like to develop on Windows. This assumes that Python 2.7 and virtualenv are installed and on the PATH. To build the project you will also need node.js. Please read the instructions for your system here. Then run:

make

This will make sure a virtual environment is created and activated and installs all the tools into it.

After that the following command should work:

ebu-dummy-encoder

The Schema definitions XSD

The schema definitions are embedded in the Python library in the xsd1.1 subfolder. The root schema document is called ebutt_live.xsd.

The Python library

The library uses XSD schemas from the xsd1.1 subdirectory. The bindings keep the validation sane and PyXB makes sure that updates work as expected. Should the schema be modified, the bindings can be regenerated and will respect the changes.

Scripts

There are several scripts that emulate different components (nodes) in the infrastructure. They can be executed individually or in combination by running ebu-run. Assuming the Makefile worked, the package is installed in a virtual environment and that environment is active, the components are available by running the ebu-run script and passing it a configuration file. There are several example configuration files in examples/config. For the complete list of scripts see docs/build/html/scripts_and_their_functions.html.

Below is a list of some of the key components.

The simple producer is the beginning of the data pipeline. It generates EBU-TT-Live documents in a timed manner. In the repository root there is a test.html file that can be used for manual testing of the producer in any websocket capable browser. To run it use ebu-run:

ebu-run --admin.conf ebu_tt_live/examples/config/simple_producer.json

The simple consumer connects to the producer or later on in the pipeline, assuming there are more components inserted.

ebu-run --admin.conf ebu_tt_live/examples/config/simple_consumer.json

The User Input producer is a web page with a user interface that allows you to send subtitle documents and view the output of a downstream node. For complete documentation see docs/build/html/user_input_producer.html.

To run a configuration of components, use a configuration file with multiple nodes defined. For example, this will create 3 nodes: a distributor that listens to the UIP and two consumers that subscribe to the distributor:

ebu-run --admin.conf ebu_tt_live/examples/config/user_input_producer_dist_consumers.json

Documentation

Go straight to the pre-built documentation for the current master branch.

The documentation framework uses the popular Sphinx documentation generator and its autodoc plugins, giving developers the flexibility to interleave hand-written documentation with the documentation that autodoc generates.

Prerequisite: Graphviz

To display the images in the documentation, you need to have Graphviz installed and make sure the dot executable is on the PATH. For example, for users of homebrew:

brew install graphviz

Generating documentation

Documentation can be generated from the sources in the docs/source directory. After installing the packages in requirements.txt (which the make command does automatically), documentation can be generated in one of the following three ways:

1 Calling setuptools

python setup.py build_sphinx

2 Running make in the docs directory, where separate makefiles and a make.bat file give a variety of options.

cd docs
make html

3 Calling the sphinx-build command line script that comes with sphinx. WARNING: Platform-dependent path-separators.

sphinx-build -b html docs/source/ docs/build/html

Previewing the documentation

After Sphinx finishes successfully, the generated documentation can be viewed by opening docs/build/html/index.html in any web browser.

Tests

The test framework is described in CONTRIBUTING.md

How to contribute

Please refer to CONTRIBUTING.md

ebu-tt-live-toolkit's People

Contributors

eyallavi, frans-ebu, kozmaz87, malikbeytrison, nigelmegitt, skhameed86

ebu-tt-live-toolkit's Issues

COMPONENT: Simple Part 3 consumer (Semantic validation)

As a developer/playout service provider, I want to validate an EBU-TT Live sequence against the specification so that I know if it conforms to the specification.

Input: Single part 3 doc or sequence of part 3 documents
Output: display of text with begin and end times and validation message(s)

It requires the following steps to be completed:

  • validation framework for semantic rules #72
  • live semantic validation #73
  • timing calculation (consumer)
    • copy across timing constraints from BBC/kozma/consumerlogic branch #88
    • calculate document computed begin and end times #89
    • calculate resolved activation begin and end times #109
    • document begin and end times should be in that order on the timeline according to R16
    • write tests for correct resolution of activation time calculation
  • produce output
    • generate output requirements - see #87
    • write output

(this list is not yet complete)

COMPONENT: Teletext-to-part 3

As a broadcaster, I want to convert subtitles in teletext format to EBU-TT Live documents so that I can use legacy systems.

Input: Teletext (data feed/VBI/VANC)
Output: Sequence of EBU-TT Live documents

TimecountTimingType regex is wrong

Spotted this while working on my understanding of the bindings and how to add smpte <-> timedelta conversion.

Python's regex alternation tries the alternatives from left to right and stops at the first one that matches, meaning that if you have for example [0-9]+(h|m|s|ms):

  • 9h is parsed normally
  • 9m also
  • 9s also
  • However, 9ms is parsed as 9m and the s is forgotten

This is really problematic because it also allows 9mh for example and extracts 9m, however 9mh is not permitted in this case.

To solve the problem, add a $ at the end of the regex, so for timecountTimingtype:

  • [0-9]+(\.[0-9]+)?(h|m|s|ms)$ in the xsd
  • (?P<numerator>[0-9]+(?:\\.[0-9]+)?)(?P<unit>h|m|s|ms)$ in bindings
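
The behaviour described in this issue can be reproduced with Python's re module. The following is a minimal sketch, assuming the pattern is applied with re.match, which anchors only at the start of the string. The names UNANCHORED, ANCHORED and parse_unit are illustrative and are not taken from the bindings.

```python
import re

# Pattern as described in the issue, without an end anchor.
UNANCHORED = re.compile(r'(?P<numerator>[0-9]+(?:\.[0-9]+)?)(?P<unit>h|m|s|ms)')
# Same pattern with the proposed `$` end anchor.
ANCHORED = re.compile(r'(?P<numerator>[0-9]+(?:\.[0-9]+)?)(?P<unit>h|m|s|ms)$')


def parse_unit(pattern, text):
    """Return the unit captured by `pattern`, or None if `text` does not match."""
    match = pattern.match(text)
    return match.group('unit') if match else None


print(parse_unit(UNANCHORED, '9ms'))  # the alternation stops at "m"
print(parse_unit(UNANCHORED, '9mh'))  # accepted, although "9mh" is not a valid value
print(parse_unit(ANCHORED, '9ms'))    # the anchor forces the whole string to match
print(parse_unit(ANCHORED, '9mh'))    # rejected
```

With the anchor in place the regex engine backtracks past the shorter alternatives until one of them reaches the end of the string, so 9ms is read with the unit ms and 9mh no longer matches.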

COMPONENT: Delay node

As a playout service provider, I want to add a delay to the sequence so that I can synchronise subtitle with audio.

Input: a sequence of part 3 documents.
Output: a delayed sequence of part 3 documents.

  • implement a fixed delay node #231
  • implement a variable delay node #232
  • write tests for delay node #233

Badge-per-branch for Travis CI?

Currently the Travis CI badge referred to in the readme.md points to the Master branch.

This means the badge may show incorrect status on other branches (unless the readme.md is modified).

Some people have provided a pre-built hook to get around this, but it seems to be a bit fragile: http://stackoverflow.com/questions/18673694/referencing-current-branch-in-github-readme-md

Also check out: http://stackoverflow.com/questions/19810386/showing-travis-build-status-in-github-repo

COMPONENT: Fixed input producer

As a developer/tester, I want a sequence of EBU-TT Live documents generated automatically so that I can develop/test an implementation against it without the need to input text.

Input: none.
Output: a sequence of Part 3 documents.

COMPONENT: Downstream validator

As a developer/playout service provider, I want to check that an EBU-TT Live document conforms to downstream requirements so that I know it will be successfully processed.

Input: a single part 3 document; metadata instructions.
Output: validation message.

COMPONENT: SDI carrier

As a playout service provider, I want to carry EBU-TT Live documents over HD SDI so that I can pass EBU-TT Live around with video in a broadcast environment.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents in HD SDI

COMPONENT: User input producer

As a developer/tester, I want a sequence of part 3 documents generated from text input so that I can develop/test an implementation against it.

Input: Text file provided by user.
Output: Sequence of EBU-TT Live documents.

Timebase semantic testing

Test semantic validation on timebase rules :

Start point of a temporal interval associated with a tt:body element.
If the timebase is "smpte" the type shall be ebuttdt:smpteTimingType .
If the timebase is "media" the type shall be ebuttdt:mediaTimingType .
If the timebase is "media" the time expression should be the offset from a syncbase of "00:00:00.0".
If the timebase is "clock" the type shall be ebuttdt:clockTimingType .
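
The type rules listed above can be expressed as a simple lookup. This is only a sketch of the rule table: the mapping comes from the list, while the function name check_timing_type and its string arguments are hypothetical and are not part of the toolkit's validation code.

```python
# Mapping taken from the rules listed above. The function and its
# arguments are hypothetical, not the toolkit's API.
EXPECTED_TIMING_TYPE = {
    'smpte': 'ebuttdt:smpteTimingType',
    'media': 'ebuttdt:mediaTimingType',
    'clock': 'ebuttdt:clockTimingType',
}


def check_timing_type(timebase, actual_type):
    """Raise ValueError if actual_type is not the type required for timebase."""
    try:
        expected = EXPECTED_TIMING_TYPE[timebase]
    except KeyError:
        raise ValueError('unknown timebase: %r' % (timebase,))
    if actual_type != expected:
        raise ValueError(
            'timebase %r requires %s, got %s' % (timebase, expected, actual_type)
        )
    return True
```

A test could then assert that each timebase accepts its own type and rejects the other two.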

CI e-mail notifications

Add CI build notifications to team (members).

Note that it seems committers get info on their commits already anyhow:

By default, email notifications are sent to the committer and the commit author, if they are members of the repository (that is, they have push or admin permissions for public repositories, or if they have pull, push or admin permissions for private repositories).

Options for wider notifications include:

COMPONENT: Simple noise introducer

As a developer/tester, I want to consume 'noisy' sequences of part 3 documents so that I can test my implementation against a known set of scenarios.

Input: a sequence of part 3 documents; noise options.
Output: a sequence of modified part 3 documents.

Slack notifications granularity

Check if we all agree with changing the notifications to only show:

  • Failures
  • Changes (includes the first build after fails)

Responses:

  • Frans OK
  • Zoltan OK
  • Eyal
  • Nigel
  • Gil

COMPONENT: Reference clock

As a developer/tester, I want an external reference clock so that I can ensure documents are processed correctly.
Input: none.
Output: UTC date-time value.

COMPONENT: Archiver

As a broadcaster/access services provider, I want to archive EBU-TT Live documents so that I can reuse and distribute them after live transmission as a single Part 1 document.

Input: a sequence of EBU-TT-Live documents.
Output: a single EBU-TT part 1 document.

COMPONENT: RTP carrier

As a playout service provider, I want to carry EBU-TT Live documents using the RTP protocol so that I can use EBU-TT Live over IP networks.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents for interfacing with RFC 3550.

COMPONENT: Distributor node

As a playout service provider, I want to distribute a sequence so that it can be processed by multiple consumers.

Input: a sequence of part 3 documents
Output: a sequence of part 3 documents available to multiple destinations

COMPONENT: XSD

As a developer/playout service provider, I want to validate EBU-TT Live documents so that I know if they are valid XML.

Input: a single part 3 document.
Output: validation message.

COMPONENT: Handover node

As a subtitle provider/broadcaster, I want to combine documents from alternating respeakers/stenographers into a single EBU-TT Live sequence.

Input: 2 or more part 3 sequences; handover options.
Output: a single part 3 sequence.

Subtasks:

  • Implement core functionality #363
  • Create configurator #368
  • Document handover node #364
  • Unittest handover node #373
  • BDD testing for handover node #397 (duplicate of #311)
  • UIP modifications to support handover use-case #374

COMPONENT: XSD ttm:agent element defined as string, should be a complexType

The ttm:agent element is defined in metadata.xsd as xs:string, but it should be as in TTML1 §12.1.5:

<ttm:agent
  type = (person|character|group|organization|other)
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: ttm:name*, ttm:actor?
</ttm:agent>

where

<ttm:name
  type = (full|family|given|alias|other)
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: #PCDATA
</ttm:name>

and

<ttm:actor
  agent = IDREF
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: EMPTY
</ttm:actor>

This issue is copied from the BBC repo: bbc#8

Differentiate empty and missing template variable

For example, if a missing variable defaults to test_var = "?None?" and we have the template:

{% if test_var != "?None?" %}
    <a_tag>{{ test_var }}</a_tag>
{% endif %}

So if test_var = '' the tag will be present but will contain an empty string.
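
The same sentinel logic can be sketched in plain Python, independent of any template engine. The function render_tag is hypothetical and only mirrors the template shown in this issue.

```python
MISSING = "?None?"  # sentinel value from the example in this issue


def render_tag(test_var=MISSING):
    """Return the <a_tag> fragment, or '' when the variable is missing."""
    if test_var != MISSING:
        return "<a_tag>{}</a_tag>".format(test_var)
    return ""


print(repr(render_tag()))       # missing variable: no tag at all
print(repr(render_tag("")))     # empty variable: tag present but empty
print(repr(render_tag("abc")))  # normal value
```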

graphviz dependency not documented

I think graphviz needs to be installed in order to see the figures in the documentation correctly, but this is not clear from the README.md.

(Building sphinx gives a warning it cannot find the dot executable if the user has not installed it).

I see two options:

  • install graphviz automatically (not easy to do platform independently?)
  • document in README.md that the user needs to install graphviz and add the location of the dot executable to the PATH

COMPONENT: Switcher node

As a playout service provider, I want to switch between multiple sequences so that I can choose which sequence is output.

Input: multiple part 3 sequences; switching options.
Output: a single part 3 sequence.

COMPONENT: EBU-TT-D encoder

As a playout service provider, I want to convert EBU-TT-Live documents to EBU-TT-D documents so that I can distribute them to the end device

Input: sequence of part 3 documents
Output: sequence of EBU-TT-D documents

Subtasks:

  • Add EBU-TT-D XSD 1.1 to repository and update it if necessary #177
  • Create bindings (move EBU-TT-3 current bindings to a sensible place) #176
  • Conversion initiation logic and validation #174
  • Create EBU-TT-D conversion classes #170

Investigate CI plug-ins

As @kozmaz87 suggested, two Jenkins plug-ins would be useful:

  • The junit results plugin, which makes navigation of the test suite results easy and adds retrospective of the last builds, so you can see how the test metrics changed in the last X builds
  • the cobertura coverage plugin which does the same just with coverage

Timedelta <-> SMPTE conversion

Conversion between XML time formats and timedelta values is done in this file. This setup allows us to do the conversions during pyxb binding loop.

Conversion for SMPTE is a bit more complicated than conversion for clock and media times. Indeed for SMPTE we need to access values of some attributes of the <tt> element, which is not easily done through pyxb at the stage of the binding where the conversion happens.
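
To illustrate why SMPTE needs extra context, here is a minimal sketch of a conversion that takes the frame rate as an explicit argument. It assumes non-drop-frame timecode in hh:mm:ss:ff form and an integer frame rate. In a real document that value would have to be read from attributes on the <tt> element (such as ttp:frameRate). The function smpte_to_timedelta is hypothetical and is not the toolkit's implementation.

```python
from datetime import timedelta


def smpte_to_timedelta(timecode, frame_rate):
    """Convert a non-drop-frame 'hh:mm:ss:ff' timecode to a timedelta.

    frame_rate is passed in explicitly because the timecode string alone
    does not say how long one frame lasts.
    """
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(':'))
    if not 0 <= frames < frame_rate:
        raise ValueError(
            'frame count %d out of range for %d fps' % (frames, frame_rate)
        )
    return timedelta(
        hours=hours,
        minutes=minutes,
        seconds=seconds + float(frames) / frame_rate,
    )
```

Clock and media times need no such parameter, which is why they can be converted from the string value alone.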

COMPONENT: WebSocket carrier

As a playout service provider, I want to carry EBU-TT Live documents using the WebSocket protocol so that I can use EBU-TT Live over TCP.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents for RFC 6455.

Document manifest format

We need to document the manifest format. E.g.

  • in the code
  • in the wiki
  • in the header of the manifest file?

related to: #59

tests for smpte timebase semantic checks

Test that when timebase is set to smpte, documents are validated/rejected depending on the format of the time. Also test that the presence and correctness of all needed parameters are checked.

Unit-Test documents comparison (ComparableMixin)

Documents use the mixin ComparableMixin (defined in the project root in file utils.py). This mixin allows for correct and easy comparison of documents :

  • two documents with the same sequenceIdentifier will be compared using their sequenceNumber :
document1 < document2 if document1.sequenceNumber < document2.sequenceNumber
  • If the documents do not have the same sequenceIdentifier, there is no comparison possible and an error is raised.
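
The comparison rules above can be sketched as follows. This is not the ComparableMixin from utils.py: SketchDocument, its attribute names and the choice of ValueError are assumptions made for illustration, and functools.total_ordering is used here to derive the remaining comparison operators.

```python
import functools


@functools.total_ordering
class SketchDocument(object):
    """Hypothetical stand-in for a document, not the toolkit's class."""

    def __init__(self, sequence_identifier, sequence_number):
        self.sequence_identifier = sequence_identifier
        self.sequence_number = sequence_number

    def _check_comparable(self, other):
        # Documents from different sequences have no defined order.
        if self.sequence_identifier != other.sequence_identifier:
            raise ValueError('documents belong to different sequences')

    def __eq__(self, other):
        self._check_comparable(other)
        return self.sequence_number == other.sequence_number

    def __lt__(self, other):
        self._check_comparable(other)
        return self.sequence_number < other.sequence_number
```

Tests for such a class would cover three cases: ordering within one sequence, equality of sequence numbers, and the error raised across sequences.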

With this issue I want to address the fact that this was not tested yet, so I implemented tests to ensure that document comparison works as intended.

COMPONENT: Part 1 cued

As a playout service provider, I want to consume a cued EBU-TT Live sequence from an EBU-TT Part 1 document.

Input: a single EBU-TT Part 1 document; cueing options.
Output: a sequence of Part 3 documents released according to the cueing options.

Basic testing setup

  • Find a way to handle xml files (with templates for example)
  • Setup a basic test infrastructure for bdd
  • Write some tests

Timeformats in documents

Time formats in documents are a bit confusing :

<tt:body tt:begin="63016289ms" tt:dur="00:00:01">

Can we choose the format used to convert timedeltas more precisely? For example here, having begin in hh:mm:ss.ms and dur in xxxxxms format would be more logical.

Raised from discussion during 13/07/2016 call.
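
As a rough illustration of the two output formats discussed in this issue, the sketch below formats a timedelta either as a clock-style string or as a millisecond time count. Both function names are hypothetical, and the sketch assumes non-negative durations. It does not reflect how the bindings currently choose a format.

```python
from datetime import timedelta


def _total_milliseconds(delta):
    """Whole milliseconds in a non-negative timedelta, using integer arithmetic."""
    return (delta.days * 86400 + delta.seconds) * 1000 + delta.microseconds // 1000


def to_clock_string(delta):
    """Format a non-negative timedelta as hh:mm:ss.mmm."""
    hours, remainder = divmod(_total_milliseconds(delta), 3600000)
    minutes, remainder = divmod(remainder, 60000)
    seconds, millis = divmod(remainder, 1000)
    return '%02d:%02d:%02d.%03d' % (hours, minutes, seconds, millis)


def to_millisecond_string(delta):
    """Format a non-negative timedelta as a millisecond time count."""
    return '%dms' % _total_milliseconds(delta)


begin = timedelta(milliseconds=63016289)  # the begin value from the example above
dur = timedelta(seconds=1)                # the dur value from the example above
print(to_clock_string(begin))      # → 17:30:16.289
print(to_millisecond_string(dur))  # → 1000ms
```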

COMPONENT: Complex noise introducer

As a developer/tester, I want to control the level of 'noise' and complexity in a stream of part 3 documents so that I can test my implementation against different scenarios.

Input: a sequence of part 3 documents; options for introducing complexity into the stream.
Output: a sequence of modified part 3 documents.

COMPONENT: Simple Part 3 consumer (XML)

As a developer/tester I want a simple view of the text and times in a sequence of EBU-TT Live documents so that I can verify that they will be processed correctly.

Input: Sequence of part 3 documents; 'validate only' option.
Output: display of text with begin and end times and validation message(s)
