lu-zero / libvpx

Local libvpx changes (POWER8 Altivec/VSX support)

License: BSD 3-Clause "New" or "Revised" License

C 74.83% Makefile 1.51% Perl 0.37% Shell 1.30% C++ 10.19% Python 0.14% Assembly 6.54% Objective-C 4.25% Perl 6 0.86%
libvpx powerpc vsx altivec

libvpx's Introduction

README - 24 January 2018

Welcome to the WebM VP8/VP9 Codec SDK!

COMPILING THE APPLICATIONS/LIBRARIES:
  The build system used is similar to autotools. Building generally consists of
  "configuring" with your desired build options, then using GNU make to build
  the application.

  1. Prerequisites

    * All x86 targets require the Yasm[1] assembler be installed[2].
    * All Windows builds require that Cygwin[3] be installed.
    * Building the documentation requires Doxygen[4]. If you do not
      have this package, the install-docs option will be disabled.
    * Downloading the data for the unit tests requires curl[5] and sha1sum.
      sha1sum is provided via the GNU coreutils, installed by default on
      many *nix platforms, as well as MinGW and Cygwin. If coreutils is not
      available, a compatible version of sha1sum can be built from
      source[6]. These requirements are optional if not running the unit
      tests.

    [1]: http://www.tortall.net/projects/yasm
    [2]: For Visual Studio the base yasm binary (not vsyasm) should be in the
         PATH for Visual Studio. For VS2017 it is sufficient to rename
         yasm-<version>-<arch>.exe to yasm.exe and place it in:
         Program Files (x86)/Microsoft Visual Studio/2017/<level>/Common7/Tools/
    [3]: http://www.cygwin.com
    [4]: http://www.doxygen.org
    [5]: http://curl.haxx.se
    [6]: http://www.microbrew.org/tools/md5sha1sum/

  2. Out-of-tree builds
  Out-of-tree builds are a supported method of building the application. For
  an out-of-tree build, the source tree is kept separate from the object
  files produced during compilation. For instance:

    $ mkdir build
    $ cd build
    $ ../libvpx/configure <options>
    $ make

  3. Configuration options
  The 'configure' script supports a number of options. The --help option can be
  used to get a list of supported options:
    $ ../libvpx/configure --help

  4. Compiler analyzers
  Compilers have added sanitizers which instrument binaries with information
  about address calculation, memory usage, threading, undefined behavior, and
  other common errors. To simplify building libvpx with some of these features
  use tools/set_analyzer_env.sh before running configure. It will set the
  compiler and necessary flags for building as well as environment variables
  read by the analyzer when testing the binaries.
    $ source ../libvpx/tools/set_analyzer_env.sh address

  5. Cross development
  For cross development, the most notable option is the --target option. The
  most up-to-date list of supported targets can be found at the bottom of the
  --help output of the configure script. As of this writing, the list of
  available targets is:

    arm64-android-gcc
    arm64-darwin-gcc
    arm64-linux-gcc
    armv7-android-gcc
    armv7-darwin-gcc
    armv7-linux-rvct
    armv7-linux-gcc
    armv7-none-rvct
    armv7-win32-vs11
    armv7-win32-vs12
    armv7-win32-vs14
    armv7-win32-vs15
    armv7s-darwin-gcc
    armv8-linux-gcc
    mips32-linux-gcc
    mips64-linux-gcc
    ppc64le-linux-gcc
    sparc-solaris-gcc
    x86-android-gcc
    x86-darwin8-gcc
    x86-darwin8-icc
    x86-darwin9-gcc
    x86-darwin9-icc
    x86-darwin10-gcc
    x86-darwin11-gcc
    x86-darwin12-gcc
    x86-darwin13-gcc
    x86-darwin14-gcc
    x86-darwin15-gcc
    x86-darwin16-gcc
    x86-iphonesimulator-gcc
    x86-linux-gcc
    x86-linux-icc
    x86-os2-gcc
    x86-solaris-gcc
    x86-win32-gcc
    x86-win32-vs10
    x86-win32-vs11
    x86-win32-vs12
    x86-win32-vs14
    x86-win32-vs15
    x86_64-android-gcc
    x86_64-darwin9-gcc
    x86_64-darwin10-gcc
    x86_64-darwin11-gcc
    x86_64-darwin12-gcc
    x86_64-darwin13-gcc
    x86_64-darwin14-gcc
    x86_64-darwin15-gcc
    x86_64-darwin16-gcc
    x86_64-iphonesimulator-gcc
    x86_64-linux-gcc
    x86_64-linux-icc
    x86_64-solaris-gcc
    x86_64-win64-gcc
    x86_64-win64-vs10
    x86_64-win64-vs11
    x86_64-win64-vs12
    x86_64-win64-vs14
    x86_64-win64-vs15
    generic-gnu

  The generic-gnu target, in conjunction with the CROSS environment variable,
  can be used to cross compile architectures that aren't explicitly listed, if
  the toolchain is a cross GNU (gcc/binutils) toolchain. Other POSIX toolchains
  will likely work as well. For instance, to build using the mipsel-linux-uclibc
  toolchain, the following command could be used (note, POSIX SH syntax, adapt
  to your shell as necessary):

    $ CROSS=mipsel-linux-uclibc- ../libvpx/configure

  In addition, the executables to be invoked can be overridden by specifying the
  environment variables: CC, AR, LD, AS, STRIP, NM. Additional flags can be
  passed to these executables with CFLAGS, LDFLAGS, and ASFLAGS.

  6. Configuration errors
  If the configuration step fails, the first step is to look in the error log.
  This defaults to config.log. This should give a good indication of what went
  wrong. If not, contact us for support.

VP8/VP9 TEST VECTORS:
  The test vectors can be downloaded and verified using the build system after
  running configure. To specify an alternate directory the
  LIBVPX_TEST_DATA_PATH environment variable can be used.

  $ ./configure --enable-unit-tests
  $ LIBVPX_TEST_DATA_PATH=../libvpx-test-data make testdata

CODE STYLE:
  The coding style used by this project is enforced with clang-format using the
  configuration contained in the .clang-format file in the root of the
  repository.

  Before pushing changes for review you can format your code with:
  # Apply clang-format to modified .c, .h and .cc files
  $ clang-format -i --style=file \
    $(git diff --name-only --diff-filter=ACMR '*.[hc]' '*.cc')

  Check the .clang-format file for the version used to generate it if there is
  any difference between your local formatting and the review system.

  See also: http://clang.llvm.org/docs/ClangFormat.html

SUPPORT
  This library is an open source project supported by its community. Please
  email [email protected] for help.

libvpx's People

Contributors

aconverse, agrange, debargha, dmitriykovalev, fritzk, imgmips1, jamesaberry, jeromejj, jimbankoski, jingninghan, jkoleszar, jzern, kaustubhimg, komh, linfengz, lu-zero, luctrudeau, marco99zz, mstorsjo, paulwilkins, pbos, pengchongjin, punksu, rbultje, stefanholmer, tomfinegan, vigneshvg, yaowuxu, yinshiyou, zoeviper

libvpx's Issues

Implement the idct

Functions to implement for this task:

  • vpx_idct4x4_16_add_vsx
  • vpx_idct8x8_64_add_vsx
  • vpx_idct16x16_256_add_vsx
  • vpx_idct32x32_1024_add_vsx

VSX Version of vpx_sad8x8

Implement a VSX version of vpx_sad8x8

Each function must:

  • Pass the SADTest suite
  • Add a speed test to the SADTest suite (disabled by default)
  • Report performance in commit msg (compared to C version)
    • Must show significant speedup over C version
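
For orientation, here is a minimal scalar sketch of what an 8x8 SAD (sum of absolute differences) computes. It is an illustration rather than the project's code: the name sad8x8_ref is hypothetical, and the argument order is only assumed to resemble the libvpx C reference. A VSX version would be expected to return the same value for the same inputs.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: sum of absolute differences over an
 * 8x8 block. src and ref point at the top-left pixel of each block;
 * strides are the distance between rows, in pixels. */
unsigned int sad8x8_ref(const uint8_t *src, int src_stride,
                        const uint8_t *ref, int ref_stride) {
  unsigned int sad = 0;
  for (int y = 0; y < 8; ++y) {
    for (int x = 0; x < 8; ++x) {
      /* Both operands promote to int, so the difference cannot wrap. */
      sad += (unsigned int)abs(src[x] - ref[x]);
    }
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```

Each row is 8 bytes, so a vector implementation has to decide how to fill a 16-byte register (for example by combining two rows); that choice is left open here.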

Solaris 11.3 SPARC build error

Hi, I can't compile on Solaris 11.3 SPARC. Here are the errors:

In file included from ./vp8/common/threading.h:195:0,
from ./vp8/encoder/onyx_int.h:24,
from vp8/vp8_cx_iface.c:20:
./vpx_util/vpx_atomics.h:62:2: error: #error Unsupported architecture!
./vpx_util/vpx_atomics.h: In function ‘vpx_atomic_store_release’:
./vpx_util/vpx_atomics.h:87:3: warning: implicit declaration of function ‘vpx_atomic_memory_barrier’
gmake[1]: *** [vp8/vp8_cx_iface.c.o] Error 1
gmake: *** [.DEFAULT] Error 2

VSX Version of vpx_sub_pixel_variance

Implement a VSX version of :

  • vpx_sub_pixel_variance8x8
  • vpx_sub_pixel_variance16x16
  • vpx_sub_pixel_variance32x32
  • vpx_sub_pixel_variance64x64

Each function must:

  • Pass the VpxSubpelVarianceTest suite
  • Add a speed test to the VpxSubpelVarianceTest suite (disabled by default)
  • Report performance in commit msg (compared to C version)
    • Must show significant speedup over C version
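
As a rough scalar illustration of the computation, the sketch below interpolates the source block with a two-pass bilinear filter and then takes the variance of the difference against the reference. Several details are assumptions, not taken from this page: offsets are in eighths of a pixel (0..7), the two weights sum to 128, and the function name and argument order are hypothetical.

```c
#include <stdint.h>

#define SPV_N 8 /* block width and height for this sketch */

/* Blend two pixels; offset is assumed to be in eighths of a pixel (0..7).
 * The weights (128 - 16*offset, 16*offset) sum to 128, hence the >> 7. */
static uint8_t spv_bilinear(int p0, int p1, int offset) {
  return (uint8_t)((p0 * (128 - 16 * offset) + p1 * 16 * offset + 64) >> 7);
}

/* Hypothetical 8x8 sub-pixel variance. src must be readable for 9 rows
 * of 9 pixels, because each pass looks one pixel right or one row down
 * (even when the offset is 0 and that pixel's weight is 0). */
uint32_t sub_pixel_variance8x8_sketch(const uint8_t *src, int src_stride,
                                      int xoffset, int yoffset,
                                      const uint8_t *ref, int ref_stride,
                                      uint32_t *sse) {
  uint8_t tmp[(SPV_N + 1) * SPV_N];
  int sum = 0;
  uint32_t sq = 0;

  /* Pass 1: horizontal filter, producing one extra row for pass 2. */
  for (int y = 0; y <= SPV_N; ++y)
    for (int x = 0; x < SPV_N; ++x)
      tmp[y * SPV_N + x] = spv_bilinear(src[y * src_stride + x],
                                        src[y * src_stride + x + 1], xoffset);

  /* Pass 2: vertical filter, fused with the sum / sum-of-squares loop. */
  for (int y = 0; y < SPV_N; ++y) {
    for (int x = 0; x < SPV_N; ++x) {
      const int pred = spv_bilinear(tmp[y * SPV_N + x],
                                    tmp[(y + 1) * SPV_N + x], yoffset);
      const int d = pred - ref[y * ref_stride + x];
      sum += d;
      sq += (uint32_t)(d * d);
    }
  }
  *sse = sq;
  /* variance = SSE - (sum of differences)^2 / pixel count */
  return sq - (uint32_t)(((int64_t)sum * sum) / (SPV_N * SPV_N));
}
```

A constant difference between prediction and reference yields a non-zero SSE but zero variance, which is a quick sanity check for any vector port.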

Implement the convolutions

The filter bank can be used as is and loaded into vectors on demand. It should also be possible to convert it to an array of vectors to avoid a round trip later.

Some functions of the family can have specific implementations adding some constraints on blocksize and filter offsets. Some can be efficiently implemented only for some blocksizes.
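
To make the shape of the work concrete, here is a minimal scalar sketch of a horizontal 8-tap convolution over a block. It is not the project's implementation: the names are hypothetical, and it assumes 8 signed 16-bit taps per kernel that sum to 128 (a 7-bit fixed-point scale). Under that assumption one kernel occupies exactly 128 bits, which is why a filter bank row maps naturally onto a single vector register.

```c
#include <stdint.h>

#define SKETCH_TAPS 8
#define SKETCH_FILTER_BITS 7 /* assumption: taps sum to 1 << 7 = 128 */

static uint8_t sketch_clip_u8(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical horizontal pass: apply one 8-tap kernel to a w x h block.
 * Each source row must be readable 3 pixels to the left and 4 pixels to
 * the right of the block. */
void convolve_horiz_sketch(const uint8_t *src, int src_stride,
                           uint8_t *dst, int dst_stride,
                           const int16_t kernel[SKETCH_TAPS], int w, int h) {
  src -= SKETCH_TAPS / 2 - 1; /* centre the kernel on the output pixel */
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      int sum = 0;
      for (int k = 0; k < SKETCH_TAPS; ++k) sum += src[x + k] * kernel[k];
      /* Round to nearest, drop the fixed-point scale, clamp to 8 bits.
       * (Relies on arithmetic right shift for negative sums.) */
      dst[x] = sketch_clip_u8((sum + (1 << (SKETCH_FILTER_BITS - 1))) >>
                              SKETCH_FILTER_BITS);
    }
    src += src_stride;
    dst += dst_stride;
  }
}
```

With the kernel {0, 0, 0, 128, 0, 0, 0, 0} the output equals the input, which makes a convenient first check for a vector version. Specialised variants for fixed block sizes would mainly remove the w and h loop bounds.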

Speed up VSX convolution code

The vpx_convolve8_vsx function is the most time-consuming function of libVPX on POWER. On POWER8, 24% of the runtime is spent in vpx_convolve8_vsx, while on POWER9 that value increases to 30%. Optimizing this function further will have a considerable impact on libVPX encoding speed on POWER.

This is the optimal place to optimize libVPX on POWER in order to maximize results. Doubling the speed of vpx_convolve8_vsx will reduce encoding time by 10 to 15%.
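
The estimate follows from simple proportional reasoning (Amdahl's law): if a function accounts for a share p of the runtime and becomes s times faster, the total runtime shrinks by p * (1 - 1/s). The helper below is a hypothetical illustration of that arithmetic, not project code.

```c
/* Fraction of total runtime saved when a function that accounts for
 * `share` of the runtime (0.0 to 1.0) becomes `speedup` times faster. */
double runtime_saved(double share, double speedup) {
  return share * (1.0 - 1.0 / speedup);
}
```

With the shares quoted above, runtime_saved(0.24, 2.0) is 0.12 and runtime_saved(0.30, 2.0) is 0.15, which is in line with the 10 to 15% figure.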

This includes the following functions:

  • convolve
  • convolve_horiz
  • convolve_line_h
  • convolve_vert
  • convolve_line_v

Testing:

  • Must pass the ConvolveTestSuite suite
  • Refactor ConvolveTestSuite to use the AbstractBench
  • Report performance in commit msg (compared to C version)
  • Show significant speedup over C version

VSX version of vpx_quantize_fp

Implement a VSX version of :

  • vpx_quantize_fp
  • vpx_quantize_fp_32x32

Each function must:

  • Pass the VP9QuantizeTest suite
  • Add a speed test to the VP9QuantizeTest suite (disabled by default)
  • Report performance in commit msg (compared to C version)
    • Must show significant speedup over C version

VSX Version of vpx_fdct32x32_rd

Implement a VSX version of vpx_fdct32x32_rd

Each function must:

  • Pass the Trans32x32Test suite
  • Add a speed test to the Trans32x32Test suite (disabled by default)
  • Report performance in commit msg (compared to C version)
    • Must show significant speedup over C version

Speed Up SADNxNx4D

More than 15% of the encoding time of libVPX on POWER is spent in the SADNxNx4D functions.

     %  Function
10.63%  vpx_sad16x16x4d_vsx
 3.60%  vpx_sad32x32x4d_vsx
 3.22%  vpx_sad64x64x4d_vsx
 1.12%  vpx_sad8x8x4d_c

Current VSX SAD implementations can be further optimized for considerable performance improvements. Doubling the speed of the SADNxNx4D functions would reduce encoding time by 5 to 8%.

This includes the following functions:

  • vpx_sad16x16x4d_vsx
  • vpx_sad32x32x4d_vsx
  • vpx_sad64x64x4d_vsx
  • vpx_sad8x8x4d_vsx
  • PROCESS16_4D
  • SAD8_4D
  • SAD16_4D
  • SAD32_4D
  • SAD64_4D

Testing:

  • Must pass the SADx4Test suite
  • Refactor SADx4Test to use the AbstractBench
  • Report performance in commit msg (compared to C version)
  • Show significant speedup over C version
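
For readers unfamiliar with the x4d variants, the scalar sketch below shows the idea as understood here: one source block is compared against four candidate reference blocks in a single call, so each source pixel is loaded once and reused four times. The name and the exact parameter types are hypothetical and only assumed to resemble the libvpx interface.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar 16x16 SAD against four reference blocks at once.
 * sad[i] receives the SAD between src and ref[i]. */
void sad16x16x4d_sketch(const uint8_t *src, int src_stride,
                        const uint8_t *const ref[4], int ref_stride,
                        uint32_t sad[4]) {
  sad[0] = sad[1] = sad[2] = sad[3] = 0;
  for (int y = 0; y < 16; ++y) {
    for (int x = 0; x < 16; ++x) {
      /* Load the source pixel once, reuse it for all four candidates. */
      const int s = src[y * src_stride + x];
      for (int i = 0; i < 4; ++i) {
        sad[i] += (uint32_t)abs(s - ref[i][y * ref_stride + x]);
      }
    }
  }
}
```

In a vector version, a 16-pixel row fills one 128-bit register exactly, so the same reuse would apply per row rather than per pixel.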

Support clang

Clang does not behave like GCC and seems to lack some instructions:

  • Add the clang support in configure (e.g. add -maltivec)
  • Write the function replacements and use them only for clang
  • Benchmark and test on clang as well.

VSX version of vpx_subtract_block

Implement a VSX version of vpx_subtract_block.

  • It must pass the VP9SubtractBlockTest suite
  • Add Speed Test to the VP9SubtractBlockTest suite (Disabled by default)
  • Report performance in commit msg (compared to C version)
    • Must show significant speedup over C version
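
As a point of reference, block subtraction computes the per-pixel residual between a source block and its prediction, widening 8-bit pixels to signed 16-bit differences. The scalar sketch below uses a hypothetical name, and its parameter order is only assumed to resemble the libvpx C version.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar residual computation: diff = src - pred over a
 * rows x cols block. Differences lie in [-255, 255], so they are stored
 * as int16_t. */
void subtract_block_sketch(int rows, int cols,
                           int16_t *diff, ptrdiff_t diff_stride,
                           const uint8_t *src, ptrdiff_t src_stride,
                           const uint8_t *pred, ptrdiff_t pred_stride) {
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      diff[c] = (int16_t)(src[c] - pred[c]);
    }
    diff += diff_stride;
    src += src_stride;
    pred += pred_stride;
  }
}
```

A vector version has to widen each 16-byte load into two registers of eight 16-bit lanes before subtracting, which is where most of the design choices lie.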
