minecake147e / shamisen

Cross-Platform Audio Library for .NET 7 and later.

License: MIT License

C# 100.00% PowerShell 0.01%
dotnet dotnet-core csharp csharp-library audio audio-library audio-streaming audio-player dotnetcore cross-platform

shamisen's Introduction


Shamisen - .NET Audio Library

A Cross-Platform Audio Library for .NET 8.

Usage of Shamisen

  • Fast Digital Signal Processing
  • Abstraction Layer for Audio I/O

Currently implemented features

Audio I/O and bindings

Managed backends using existing library

Name (Backend) | Author (Backend) | License (Binding) | Windows10 Desktop | Windows10 UWP | Android | Linux | iOS | Mac OSX
OpenTK (OpenAL) | OpenTK | MIT License | | | | | |
AudioGraph on .NET 5 or later | Microsoft | MIT License | | | | | |
NAudio (WaveOut, ASIO, DirectSound, WASAPI) | NAudio | MIT License | | | | | |
Xamarin.Android (AudioTrack) | Xamarin | MIT License | | | | | |

❓: Not Tested or needs more information
✅: Tested
❎: Impossible

Digital Signal Processing (Cross-Platform)

Fast and smooth Up-sampling using Catmull-Rom Spline

  • Utilizes System.Numerics.Vectors and System.Runtime.Intrinsics for resampling calculation.
  • Uses Direct/Wrapped caching for Catmull-Rom spline coefficients.
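
As a rough illustration of the interpolation itself, here is a minimal scalar sketch of uniform Catmull-Rom up-sampling by an integer ratio. It is not Shamisen's SplineResampler: the names `CatmullRomSketch`, `Interpolate`, and `Upsample` are hypothetical, edge samples are simply repeated (an assumption), and no coefficient caching or SIMD is shown.

```csharp
using System;

public static class CatmullRomSketch
{
    // Uniform Catmull-Rom interpolation between p1 and p2 for t in [0, 1].
    // Returns p1 at t = 0 and p2 at t = 1.
    public static float Interpolate(float p0, float p1, float p2, float p3, float t)
    {
        float t2 = t * t;
        float t3 = t2 * t;
        return 0.5f * (2f * p1
            + (p2 - p0) * t
            + (2f * p0 - 5f * p1 + 4f * p2 - p3) * t2
            + (3f * p1 - p0 - 3f * p2 + p3) * t3);
    }

    // Up-samples by an integer ratio. Samples outside the source are replaced
    // by the nearest edge sample (an assumption made for this sketch).
    public static float[] Upsample(ReadOnlySpan<float> source, int ratio)
    {
        if (ratio < 1) throw new ArgumentOutOfRangeException(nameof(ratio));
        if (source.Length == 0) return Array.Empty<float>();
        int last = source.Length - 1;
        var result = new float[last * ratio + 1];
        for (int i = 0; i < last; i++)
        {
            float p0 = source[Math.Max(i - 1, 0)];
            float p1 = source[i];
            float p2 = source[i + 1];
            float p3 = source[Math.Min(i + 2, last)];
            for (int k = 0; k < ratio; k++)
            {
                result[i * ratio + k] = Interpolate(p0, p1, p2, p3, k / (float)ratio);
            }
        }
        result[last * ratio] = source[last];
        return result;
    }
}
```

Because the four weights depend only on `t`, and `t` takes only `ratio` distinct values for an integer ratio, the weights can be precomputed once per ratio, which is presumably what the coefficient caching above exploits.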

Benchmarks on .NET 5, Intel Core i7 4790

Highly Optimized Cooley–Tukey FFT algorithm for single-precision 1D complex data.

  • Utilizes System.Runtime.Intrinsics.X86.Avx and System.Runtime.Intrinsics.X86.Avx2 for FFT calculation.
  • Forward transformation rescales result by 1/N.
  • The memory consumption is O(N) even though the FFT is done in-place.
  • Does not need initialization for certain fixed size by default.
    • Fixed-size variant is also implemented and is slightly faster.
  • The size of the data must be a power of two. Otherwise, the data is implicitly resized to the largest power of two less than the original size.
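
The power-of-two rule can be sketched as follows. This is only an illustration: `FftSizeSketch`, `FloorPowerOfTwo`, and `EffectiveSpan` are hypothetical names, and keeping the leading elements when the data is resized is an assumption, not something the README states.

```csharp
using System;
using System.Numerics;

public static class FftSizeSketch
{
    // Largest power of two that does not exceed length (length must be >= 1).
    public static int FloorPowerOfTwo(int length)
    {
        if (length < 1) throw new ArgumentOutOfRangeException(nameof(length));
        return 1 << BitOperations.Log2((uint)length);
    }

    // The part of the data that a power-of-two FFT would actually process,
    // assuming the leading elements are the ones kept.
    public static Span<T> EffectiveSpan<T>(Span<T> data)
        => data.IsEmpty ? data : data.Slice(0, FloorPowerOfTwo(data.Length));
}
```

For example, a buffer of 1000 samples would be processed as 512 samples, so callers that care about the tail should pad to 1024 themselves.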

Fast conversion between PCM sample formats

To \ From | IEEE 754 Binary32 (float) | 32bit Linear PCM (Q0.31) | 24bit Linear PCM (Q0.23) | 16bit Linear PCM (Q0.15) | 8bit LPCM (Excess-128) | G.711 μ-Law | G.711 A-Law
IEEE 754 Binary32 (float) | ✖️ | ✅* | | | | |
32bit Linear PCM (Q0.31) | ✅* | ✖️ | ☑️ | ☑️ | ☑️ | ☑️ | ☑️
24bit Linear PCM (Q0.23) | ✅* | ☑️* | ✖️ | ☑️ | ☑️ | ☑️ | ☑️
16bit Linear PCM (Q0.15) | ✅* | ☑️* | ☑️* | ✖️ | ☑️ | ☑️ | ☑️
8bit LPCM (Excess-128) | ✅* | ☑️* | ☑️* | ☑️* | ✖️ | ☑️* | ☑️*
G.711 μ-Law | ✅* | ☑️* | ☑️* | ☑️* | ☑️* | ✖️ | ☑️*
G.711 A-Law | ✅* | ☑️* | ☑️* | ☑️* | ☑️* | ☑️* | ✖️

Legends:
✅: Shamisen has optimized implementation of direct conversion.
☑️: Shamisen can handle the conversion by chaining 2 or more converters. Can be partially optimized. Depending on the combination, noise due to quantization error may occur.
✔: Shamisen has a simple implementation of direct conversion.
⭕: Shamisen can handle the conversion by chaining 2 or more converters. Both converters are implemented in a simple way. Depending on the combination, noise due to quantization error may occur.
❎: Shamisen has no support for conversion.
✖️: No conversion needed(same format).
* : Such conversion can cause noise due to quantization errors.
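
To make the quantization note concrete, here is a minimal scalar sketch of a few of these conversions. The class and method names are hypothetical, and the scaling convention (full scale = 32768, round to nearest, then clamp) is an assumption; Shamisen's optimized converters may use a different convention.

```csharp
using System;

public static class PcmConversionSketch
{
    // float in [-1, 1] -> 16bit Linear PCM (Q0.15), rounded and clamped.
    public static short FloatToInt16(float sample)
        => (short)Math.Clamp(MathF.Round(sample * 32768f), -32768f, 32767f);

    // 16bit Linear PCM (Q0.15) -> float in [-1, 1).
    public static float Int16ToFloat(short sample) => sample / 32768f;

    // 8bit LPCM (Excess-128): the stored byte is the signed value plus 128.
    public static float Excess128ToFloat(byte sample) => (sample - 128) / 128f;
}
```

Round-tripping 0.3f through `FloatToInt16` and `Int16ToFloat` yields 9830 / 32768, which is close to but not equal to 0.3f. That difference is the quantization error marked with * in the table.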

Optimized BiQuad Filters that support some filter types

  • Uses Vector2 and Vector3 for filter calculations in each channel.
  • Unrolls the channel loop for monaural and <5ch filter calculations.
  • For some special cases, it utilizes SSEx.x and AVX(2) intrinsics for the calculation.
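
For readers unfamiliar with the structure, the following is a minimal monaural sketch of a biquad in transposed direct form II that keeps its two state values in a Vector2. It only illustrates the idea of packing filter state into System.Numerics vectors; `BiQuadSketch` is a hypothetical type, the coefficient layout is an assumption, and it is not Shamisen's BiQuadFilter.

```csharp
using System;
using System.Numerics;

public sealed class BiQuadSketch
{
    private readonly float b0;
    private readonly Vector2 b12; // (b1, b2)
    private readonly Vector2 a12; // (a1, a2), normalized so that a0 == 1
    private Vector2 state;        // (z1, z2)

    public BiQuadSketch(float b0, float b1, float b2, float a1, float a2)
    {
        this.b0 = b0;
        b12 = new Vector2(b1, b2);
        a12 = new Vector2(a1, a2);
    }

    // Filters the buffer in place:
    //   y  = b0 * x + z1
    //   z1 = z2 + b1 * x - a1 * y
    //   z2 =      b2 * x - a2 * y
    public void Process(Span<float> buffer)
    {
        Vector2 s = state;
        for (int i = 0; i < buffer.Length; i++)
        {
            float x = buffer[i];
            float y = b0 * x + s.X;
            s = new Vector2(s.Y, 0f) + b12 * x - a12 * y;
            buffer[i] = y;
        }
        state = s;
    }
}
```

With coefficients (1, 1, 0, 0, 0) the filter reduces to y[n] = x[n] + x[n-1], which is a convenient sanity check.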

Other Features

  • FastFill for some types that fills quickly using Vector<T>.
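
A fill of this kind could look roughly like the sketch below, which reinterprets the destination as a span of Vector<float> and finishes the remainder with scalar stores. `FillSketch.FastFill` is a hypothetical stand-in with an assumed signature, not Shamisen's actual method.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

public static class FillSketch
{
    public static void FastFill(Span<float> destination, float value)
    {
        // View as many whole vectors as fit into the destination.
        Span<Vector<float>> vectors = MemoryMarshal.Cast<float, Vector<float>>(destination);
        var filled = new Vector<float>(value);
        for (int i = 0; i < vectors.Length; i++)
        {
            vectors[i] = filled;
        }
        // Fill the leftover elements that do not make up a whole vector.
        for (int i = vectors.Length * Vector<float>.Count; i < destination.Length; i++)
        {
            destination[i] = value;
        }
    }
}
```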

File Formats and Codecs

Cross-Platform

Container Name | Typical File Extensions | Implemented Codec | Library containing Decoder/Encoder | License | Decoding | Encoding
Waveform, RF64 | .wav | Linear PCM, IEEE 754 Floating-Point PCM, A-Law, μ-law | Shamisen | MIT License | | (RF64 as default)
FLAC | .flac | FLAC | Shamisen.Codecs.Flac | MIT License | | ❎(Planned)

Legends:
✅: Supported by Shamisen itself
✔: Supported by another library and its wrapper for Shamisen
❎: Not supported by Shamisen without any custom integration

Platform-Dependent

  • Any formats supported by platform-dependent binding libraries

Dependencies and system requirements

  • Currently, Unity IS NOT SUPPORTED AT ALL!
  • Requires DivideSharp for frequently appearing divide-by-number-of-channels operations.
  • Most processing in this library runs entirely on a SINGLE core.
    • Because Span<T> is a stack-only type that cannot be shared across threads.

Features planned or under development

Audio I/O and bindings

Native backends

✅: Possible
❓: Needs more information
❎: Impossible

Name of Backend | Author of Backend | License (binding) | Target Platform | Status
Oboe | Google | MIT License | Android >10 | Planned

Managed backends

✅: Possible
❓: Needs more information
❎: Impossible

Name of Backend | Author of Backend | License (binding) | Target Platforms | Status
Silk.NET OpenAL | .NET Foundation Silk.NET Team | MIT License | Cross Platform(OpenAL) | Gathering Information
Silk.NET XAudio | .NET Foundation Silk.NET Team | MIT License | Windows-Like(XAudio) | Planned
Xamarin.iOS | Microsoft | MIT License | iOS | Planned

File Formats and Codecs

Cross-Platform

✅: Shamisen will have Managed Implementation of decoder/encoder itself
⭕: Shamisen will have Managed Wrapper for another library
❎: Not included in plan currently

Container Name | Typical File Extensions | Target Codec | Planned Library containing Decoder/Encoder | Planned Library License | Decoding | Encoding | Status
FLAC | .flac | FLAC | Shamisen.Codecs.Flac | MIT License | | | Implemented Decoder
Opus | .opus | Opus | Shamisen.Codecs.Opus | MIT License | | | Planned
Ogg | .ogg | Vorbis | Shamisen.Codecs.Ogg | MIT License | | | Planned

shamisen's People

Contributors

dependabot-preview[bot], dependabot[bot], minecake147e

Forkers

bangush

shamisen's Issues

.NET 5

In order to improve performance of Shamisen, we have to move Shamisen from .NET Standard 2.0 to .NET 5.

  • .NET 5 Release

Major changes

  • Change target framework of Core library Shamisen from netstandard2.0 to net5.0;netcoreapp3.1;netstandard2.1;netstandard2.0 => a7e2741
  • Adopt new APIs almost EVERYWHERE
    • x86/64
      • SSE
      • SSE2
      • SSE3
      • SSE4.x
      • AVX
      • AVX2
      • Bmi1
      • Bmi2
      • Fma
      • Lzcnt
      • Popcnt
    • ARM
      • AdvSimd
      • ArmBase
      • Crc32
      • Dp
      • Rdm
    • Cross Platform
      • System.Numerics.BitOperations
      • System.MathF
      • System.Half
      • System.HashCode
      • System.Math.Tau
      • System.Math.FusedMultiplyAdd
      • System.MathF.FusedMultiplyAdd

Refactor Utility Features and pack them independently

Shamisen's SIMD-optimized utility functions have grown considerably.
There may be demand for these functions outside the Shamisen.Core ecosystem.

Tasks

  • Create new Class Library Project Shamisen.Utils
  • Migrate all utility functions that are mainly in the namespace Shamisen.Utils to the library Shamisen.Utils
  • Separate tests and benchmarks for Shamisen.Utils
  • Publish a NuGet package for Shamisen.Utils

CoreAudio API Wrapper for Windows 10

Background

Currently, output for Windows Desktop Apps relies on NAudio and CSCore. Both are promising, but it'll be nice to have our own wrapper for Windows CoreAudio APIs.

Tasks

  • Gather information about COM Interoperation in .NET Core
  • Define API Candidates
  • Create a new library Project
  • Implement new APIs

Oboe Backend for Android

Background and motivation

Currently, audio output for Android relies on AudioTrack. Of course it's quite promising, but it'll be nice to have our own Oboe binding.

Tasks

  • Gather information about ClangSharp and Oboe
  • Make some API candidates
  • Implement the API

Idea: Faster Calculation of SinusoidSource

I have to fix the currently extremely slow SinusoidSource, which ruined the load tests for SplineResampler.
I have to seek THE FASTEST IN THE WHOLE WORLD FOR ALL PLATFORMS in order to spread MonoAudio all over the world, so the current speed DEFINITELY ABSOLUTELY SUCKS.
So I came up with the ideas below:

  • Real part of the complex number rotation: removed due to precision and stability issues

  • Simple oscillation simulation (e.g. a spring): removed due to precision and stability issues

  • Oscillate a square wave at twice the target frequency and resample it: rejected because it might be noisy

  • Devirtualization => a9f33ec

  • Inlining => a9f33ec

  • Faster phase shifts => a9f33ec

SplineResampler - possible further improvements

Background and motivation

Currently, code size of SplineResampler is skyrocketing while it's not fast enough to handle non-integer-ratio conversion very quickly.

Possible improvements

  • Reorder Catmull-Rom coefficients so that the branch will be out of dot-product loop.
    • Coefficients for CachedDirect
      • Coefficients access order will be like sawtooth wave.
    • Coefficients for CachedWrappedOdd
      • Coefficients access order will be like triangular wave.
    • Coefficients for CachedWrappedEven
      • Coefficients access order will be like triangular wave.
  • Increase the maximum size of coefficient memos for wrapped strategies.
    • CachedDirect can miss the L1 cache if we increase the maximum memo size because of its sawtooth-wave-like access.
  • Use delegates instead of switch statements for all combinations of channels, conversion ratio, and hardware.
    • Unify the parameter signatures by introducing a new struct type.
  • Optimization for the conversion ratio 44.1kHz->48kHz.
    • Conversion-ratio-specific optimizations already exist for double-rate and quadruple-rate stereo/monaural conversion.
  • Utilize Expression Tree for non-trivial conversion ratio like 44.1kHz->96kHz. Currently impossible because Expression Tree supports neither ref locals nor pointers.
    • Vectorize using X86 and Arm Hardware Intrinsics if available.
    • Vectorize using System.Numerics.Vectors if no intrinsic available.
    • Unroll dot-product loops.

Implement a brand new Pipeline System

Backgrounds

  • We need to implement a more efficient system for signal processing.
    • High throughput for BGM and ambient SFX with new architecture
    • Low Latency for SFX with existing architecture

Idea and Progress

  • Add interface IPipelineComponent<TSample, TFormat>.
    • async Task UpdateAsync() to swap buffer and process next block asynchronously.
  • Add class PipelineInput<TSample, TFormat> that translates IReadableAudioSource<TSample, TFormat> or IAsynchronouslyReadableAudioSource<TSample, TFormat> to be read as IPipelineComponent<TSample, TFormat>.
  • Add interface IPipelineOutput<TSample, TFormat> that can be read as IReadableAudioSource<TSample, TFormat>.
  • Add class AudioPipe<TSample, TFormat> that connects two IPipelineComponent<TSample, TFormat>.
  • Add struct AudioBuffer<TSample, TFormat> that holds single processing block.

More Codecs and Containers

Background and Motivation

In the Shamisen project, we are trying to make the library more practical by implementing features that existing libraries already have.
One of those libraries, CSCore, supports many more codecs than Shamisen currently does, so it would be good for Shamisen to support more codecs.

Rough Roadmap

Container Format

Container Contents Current <v1.0 v1.0 Future
Ogg FLAC, Opus, Vorbis
Matroska Various Formats

Contained Audio

Codec Container Current <v1.0 v1.0 Future
FLAC Native FLAC, Ogg, Matroska D(Native Only) DE(All)
Opus Ogg,Matroska D E
Vorbis Ogg D(libvorbis) E(libvorbis)

Tasks

  • FLAC Codec
    • Decoder
      • Managed one => #116
    • Encoder
      • Managed one
  • Opus Codec
    • Decoder
      • Managed one
    • Encoder
      • Managed one
  • Vorbis Codec
    • Decoder
      • Through libvorbis
    • Encoder
      • Through libvorbis

Split `Shamisen` into smaller parts and publish NuGet packages

Background and motivation

Currently, Shamisen is getting larger and larger, and the added features and their optimizations are causing code bloat.
Some users don't need a fully-managed FLAC decoder, while others don't need any I/O.
So I can split Shamisen into smaller parts and bundle them into several NuGet packages if needed.

Possible new library structure

  • Shamisen NuGet package
    • Shamisen.Core NuGet package / library
      • Interfaces and DSP features
      • WAV decoder and encoder
    • Shamisen.IO NuGet package / library
      • IO-related interfaces
    • Shamisen.Codecs.Flac NuGet package / library
      • FLAC decoder and encoder
    • Shamisen.Pipeline NuGet package / library

Tasks

  • Split library into smaller parts
  • Release alpha version of each library to NuGet

`SplineResampler` needs to be fixed on All Platform

Backgrounds

  • Acoustic glitch on Xperia XZ3
    • Output without resampling has no glitch
    • Nothing similar was observed on Windows (i7-4790), but it could appear on Windows due to the same typo
      • It didn't appear because the frame border is always aligned with the 440Hz test signal

Progress

Find what's going wrong

  • Dump output sample to file
    • Zeros found on block starts(Why???)
  • Analyze how glitches appear in Debug mode
    • Found a typo (the value should be 0 but was 1)

Fix it

Mixing Reform

Backgrounds

  • SimpleMixer was not simple enough to be named after its features.
    • Supporting addition/removal of audio sources makes it complex

Idea and Progress

  • SimpleMixer Overhaul: Make SimpleMixer as fast and straightforward as possible
    • Mixes ONLY 2 SOURCES with 1:1 customizable ratio of volume
      • Scale & mix can be done quickly
    • Drop adding/removing/replacing support
  • Add net5.0 library project Shamisen.Game
    • Implement SFX System (on another Issue)
  • #12
  • Implement AdvancedMixer
    • Implement Data flow Management to maximize the output performance for real-time audio mixing and playback.
      • Set the Buffering Strategy for every input to calculate output
        • Aggressive Pre-Buffering for BGM from storage or network stream
        • Realtime for input from Sound Effect mixer(like SFX system introduced above) for minimal latency
      • Make whole mixing process multi-threaded
        • Pre-Read() mixing for BGM using Pairwise Summation
        • On-Read() mixing for SE Mixers(which is also called in parallel)
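
The pairwise summation mentioned for BGM mixing can be sketched as a recursive tree of element-wise additions. This is only an illustration under stated assumptions: `PairwiseMixSketch.Mix` is a hypothetical helper, all sources are assumed to have the same length, and no threading or buffering strategy is shown.

```csharp
using System;

public static class PairwiseMixSketch
{
    // Mixes sources[start .. start + count) by summing halves recursively,
    // which keeps rounding error lower than a single running sum.
    public static float[] Mix(float[][] sources, int start, int count)
    {
        if (count <= 0) throw new ArgumentOutOfRangeException(nameof(count));
        if (count == 1) return (float[])sources[start].Clone();
        int half = count / 2;
        float[] left = Mix(sources, start, half);
        float[] right = Mix(sources, start + half, count - half);
        for (int i = 0; i < left.Length; i++)
        {
            left[i] += right[i];
        }
        return left;
    }
}
```

The two recursive calls are independent of each other, so they are natural candidates for running in parallel when the whole mixing process becomes multi-threaded.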

Advanced High-Quality Resampler

Background and motivation

Current SplineResampler has no optimization for down-sampling at all, relying on slow BiQuad LPF.
We need another resampler for down-sampling.

Ideas

  • Apply SIMD-friendly LPF
    • Wavelets like CDF 9/7 Wavelet
    • Further optimization of BiQuad filter might also be needed
  • Dynamic resolution scaling
    • Decimate or interpolate samples down or up to the maximum power-of-two multiple of source frequency less than target frequency, by wavelets.
  • Use existing SplineResampler for final interpolation

Feature Wishlist on 1.0 Release

This issue tracks ideas for new features and improvements for the first release of Shamisen.

New features

Architecture

  • Add an interface for Recording => 732f381
  • Add an interface for Audio Device and Audio Device Enumeration => 010435a

Primitives

  • Improve implementation of Shamisen.OffsetSByte => 4464b73
    • Implement IComparable<OffsetSByte> => 4464b73
    • Implement IEquatable<OffsetSByte> => 4464b73
    • Add MaxValue and MinValue static readonly field => 4464b73
    • Implement Parse, TryParse, and ToString => 49693e4
  • Add some types for fixed-point arithmetic
    • Shamisen.Fixed8O (based on Shamisen.OffsetSByte)
    • Shamisen.Fixed16 => 49693e4
    • Shamisen.Fixed32

Codec

  • Add an interface for representing Codecs => 64252eb
  • Implement a fully-managed WAVE decoder. => 3227c35
    • RF64 Support
    • Sample formats
      • Linear PCM
        • OffsetSByte
        • short(Int16)
        • Int24
        • int(Int32)
        • long(Int64)
      • IEEE 754 Float
        • float(Single)
        • double(Double)
      • Other formats
        • A-law
        • μ-Law
  • Implement a fully-managed WAVE encoder. => fc16613
    • RF64 Support
    • Sample formats
      • Linear PCM
        • OffsetSByte
        • Int16
        • Int24
        • Int32
        • Int64
      • IEEE 754 Float
        • float(Single)
        • double(Double)

Output

  • Implement outputs for managed back-ends below:
    • AudioTrack(Xamarin.Android) => 9593125
    • CSCore(Windows) Deprecated
    • NAudio(Windows)
    • OpenTK(XPlat)
    • MonoGame DynamicSoundEffectInstance(XPlat) => b72d79e
  • Implement outputs for native back-ends below:(low priority)
    • WASAPI(Windows) => #58
    • ASIO(XPlat)
    • #120
    • ALSA(Linux)
      Unfortunately, I have no iOS devices and no Mac, so the outputs for them will come in a later release.

Signal Processing

  • Implement a multi-threaded audio buffer. => a92ab23
  • Implement a simple Mixer.
  • Implement a simple Attenuator(Amplifier). => 8432fe0
  • Implement AdvancedMixer => #11

Utilities

  • Add FastAdd that calculates an element-wise summation for Span<float>. => 63014dd
  • Add FastScalarMultiply for Span<float>. => 3165784
  • Add FastMix that adds scaled samples to a buffer, for Span<float>. => adf6e4a
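
As an illustration of what such a utility does, here is a minimal sketch of a scale-and-add over Span<float> using Vector<float>. The name `MixUtilsSketch.FastMix` and its signature are assumptions made for this example; the real implementations live in the commits referenced above.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

public static class MixUtilsSketch
{
    // buffer[i] += samples[i] * scale, over the shorter of the two spans.
    public static void FastMix(Span<float> buffer, ReadOnlySpan<float> samples, float scale)
    {
        int length = Math.Min(buffer.Length, samples.Length);
        buffer = buffer.Slice(0, length);
        samples = samples.Slice(0, length);

        // Process whole vectors first.
        Span<Vector<float>> dst = MemoryMarshal.Cast<float, Vector<float>>(buffer);
        ReadOnlySpan<Vector<float>> src = MemoryMarshal.Cast<float, Vector<float>>(samples);
        var vScale = new Vector<float>(scale);
        for (int i = 0; i < dst.Length; i++)
        {
            dst[i] += src[i] * vScale;
        }

        // Then the scalar remainder.
        for (int i = dst.Length * Vector<float>.Count; i < length; i++)
        {
            buffer[i] += samples[i] * scale;
        }
    }
}
```

FastAdd and FastScalarMultiply follow the same vector-then-remainder pattern with a different operation in the loop body.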

Tests

  • Tests for several existing features
    • Attenuator => 4be8ecd
    • BiQuadFilter => 4be8ecd
    • SimpleMixer after overhaul

Miscellaneous

  • Documentation
  • Logo => 51e7620
  • NuGet Package

And more...

Modifications of existing features

Overhaul

  • Mixing Reform #11

Modification

  • SplineResampler
    • Resampling Quality
      • Make it work correctly again => a9f33ec
    • Performance improvement
      • Convert increment and multiply and modulo into add and modulo => 2ed171b
  • BiQuadFilter
    • Optimization
      • Localization for Format.Channels, Parameter.B, and Parameter.A => 30c3bf7
      • Loop unrolling depending on Channels
      • Improvement of Cache-Friendliness around internalStates
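
The SplineResampler item about replacing "increment and multiply and modulo" with "add and modulo" can be read as the following strength reduction. This is a guess at the intent rather than the code from 2ed171b: `PhaseStepSketch` and its methods are hypothetical, and the 147:160 ratio in the usage note is the reduced form of 44.1kHz:48kHz.

```csharp
using System;

public static class PhaseStepSketch
{
    // Phase of output sample i computed from scratch: (i * step) % divisor.
    public static int[] ByMultiply(int step, int divisor, int count)
    {
        var phases = new int[count];
        for (int i = 0; i < count; i++)
        {
            phases[i] = (int)((long)i * step % divisor);
        }
        return phases;
    }

    // Same sequence, carrying the previous phase forward with one add and one modulo.
    public static int[] ByAdd(int step, int divisor, int count)
    {
        var phases = new int[count];
        int phase = 0;
        for (int i = 0; i < count; i++)
        {
            phases[i] = phase;
            phase = (phase + step) % divisor;
        }
        return phases;
    }
}
```

For example, `ByAdd(147, 160, n)` and `ByMultiply(147, 160, n)` produce the same phases, but the former removes the per-sample multiplication from the loop.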

Rename "MonoAudio" => "Shamisen"

Background

As the development of this library has progressed, .NET 5 has emerged and Mono is becoming a thing of the past.
We are developing this library in the pursuit of performance. .NET 5 and .NET Core 3.1 were always beating Mono by a wide margin in recent benchmarks, so the current name "MonoAudio" no longer fits this library.
WE'LL NEVER SACRIFICE PERFORMANCE TO SOLVE THIS PROBLEM, so we are renaming MonoAudio.

Tasks

  • Gather rename candidates
  • Decide new name of the library
  • Rename repository
  • Rename VS2019 Solution
  • Rename all VS2019 Projects
  • Rename all directories
  • Refactor all source code
  • Refactor documentation and README.md
  • Change Logo and create NuGet Icon

The rule of new name

  • Must not collide EXACTLY (case INSENSITIVE) with other GitHub repositories that have more than 100 stars.
  • Must be a valid C# identifier.
  • Each word must not be very long (<15 characters in total).
  • Named after musical instruments(like Oboe).
    • Not to be a percussion instrument.
    • Optionally be a traditional Japanese musical instrument.

Renaming Candidates

  • Shakuhachi
  • Shinobue
  • Shamisen
  • Kokyu
  • Kagurabue

IAudioSource Reform

WIP

Background

Currently, MonoAudio has an issue mainly around Length and Position.
Ex: DummySource and SilenceSource have infinite length.
SinusoidSource also has infinite length, but, it implements Position which causes exceptions.

Proposed Changes

  • Add interface ISkipSupport
  • Add interface ISeekSupport
  • Add some properties in IAudioSource:
ulong? Length { get; }
ulong? TotalLength { get; }
ulong? Position { get; }
ISkipSupport? SkipSupport { get; }
ISeekSupport? SeekSupport { get; }

Drop support for .NET 6 and earlier

Background and motivation

.NET 7 has been released.
.NET Core 3.1 and earlier are being deprecated.
Shamisen is really struggling to keep writing many #if directives to keep performance superior while keeping cross-version compatibility.
So I decided to drop support for earlier versions of .NET.

Tasks

WIP

`IDataSource<TSample>` and nullable reference types

Motivation

IDataSource<TSample> does not support a nullable Position property, and it unnecessarily forces implementers to provide ReadAsync(Memory<TSample>), which has very little chance of being implemented properly.
It is also not as elegant as IAudioSource<TSample> in dealing with source-dependent implementation of ISkippableDataSource<TSample>.

Tasks

  • Add the interfaces below:
    • IReadSupport<TSample> => 39cf50d
    • IAsyncReadSupport<TSample> => 39cf50d
  • Add the properties below in IDataSource<TSample> => 39cf50d
    • ulong? Length { get; }
    • ulong? TotalLength { get; }
    • ulong? Position { get; } instead of current ulong Position { get; }
    • ISkipSupport? SkipSupport { get; }
    • ISeekSupport? SeekSupport { get; }
  • Replace ReadResult Read(Span<TSample> destination) with IReadSupport<TSample>? ReadSupport {get;} in IDataSource<TSample> => 39cf50d
  • Replace ValueTask<ReadResult> ReadAsync(Memory<TSample> destination) with IAsyncReadSupport<TSample>? AsyncReadSupport {get;} in IDataSource<TSample> => 39cf50d
  • Add the IReadableDataSource<TSample> for existing uses of IDataSource<TSample> => 39cf50d
  • Make IReadableAudioSource<TSample, TFormat> implement IReadSupport<TSample> => 39cf50d
  • Make IAsyncReadableAudioSource<TSample, TFormat> implement IAsyncReadSupport<TSample> => 39cf50d
