v3 rewrite · neko · OPEN · 18 comments

m1k1o commented on July 17, 2024
v3 rewrite


Comments (18)

m1k1o commented on July 17, 2024

I already started and did the first step, where I upgraded demodesk/neko to Vue 3. The next step would be to remove Vue 3 as a dependency of the core module and use Vue only in the test client to speed up testing.


ehfd commented on July 17, 2024

Wayland Support

This would require gstreamer1.0-pipewire or kmsgrab. The latter requires setcap cap_sys_admin+ep and thus might not be practical. Weylus and RustDesk use gstreamer1.0-pipewire through xdg-desktop-portal, so this may be preferable.

Implementations are readily available in https://github.com/H-M-H/Weylus, https://github.com/pavlobu/deskreen, and of course https://github.com/LizardByte/Sunshine.
More references:
https://github.com/any1/wayvnc
https://github.com/bbusse/swayvnc-firefox
https://github.com/boppreh/keyboard
https://github.com/boppreh/mouse
https://github.com/maliit/keyboard

Another alternative is https://github.com/games-on-whales/gst-wayland-display by @ABeltramo and @Drakulix, which uses nested compositors. However, this would likely not work in conjunction with real monitors.

PipeWire as a hub for all media (Wayland video capture, V4L2 webcam stream dispensation, and PulseAudio drop-in replacement)

Switch the containers to PipeWire and directly accept PipeWire audio (for either X11 or Wayland) and screen capture (for Wayland), as well as PulseAudio (a stop-gap solution would be pipewire-pulse). gstreamer1.0-pipewire exists for this.

Moreover, an interesting capability PipeWire has is its potential to replace v4l2loopback with pipewire-v4l2.
It uses LD_PRELOAD (much like the Selkies uinput joystick interposer) to intercept V4L2 calls and route clients to ingest streams from PipeWire, so webcams and other video media can use this pathway.
This is very interesting because it works better with containers and Kubernetes and does not require the loopback kernel device.

It's possible to compile the PipeWire GStreamer plugin together with GStreamer if it is included as a subproject in meson.build. Otherwise, use the pipewire-debian PPA.
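
As a concrete illustration, a minimal go-gst sketch of such a capture pipeline (the go-gst bindings are discussed further below; this assumes the gstreamer1.0-pipewire plugin is installed, and PipeWire node selection is environment-specific and omitted):

```go
package main

import (
	"github.com/go-gst/go-gst/gst"
)

func main() {
	gst.Init(nil)

	// pipewiresrc is provided by gstreamer1.0-pipewire. Selecting the right
	// node (e.g. the xdg-desktop-portal screencast stream) is left out here.
	pipeline, err := gst.NewPipelineFromString(
		"pipewiresrc ! videoconvert ! x264enc tune=zerolatency ! appsink name=video")
	if err != nil {
		panic(err)
	}
	if err := pipeline.SetState(gst.StatePlaying); err != nil {
		panic(err)
	}
	select {} // a real server would run a GLib main loop and watch the bus
}
```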


ehfd commented on July 17, 2024

QUIC / WebTransport + WebCodecs & WebAudio

For QUIC, it should (obviously) be HTTP/3 WebTransport rather than a custom QUIC protocol, to be compatible with web browsers. Even for native clients, HTTP/3 WebTransport would have no disadvantages.

Note that https://developer.mozilla.org/en-US/docs/Web/API/WebTransport/WebTransport#servercertificatehashes needs to be generated for self-signed certificates at every session start, as the maximum validity of such certificates is 14 days.
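
A minimal Go sketch of that session-start step (the function name is mine): mint a self-signed ECDSA certificate valid for exactly 14 days and hash its DER encoding, which the client then passes in serverCertificateHashes.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"math/big"
	"time"
)

// sessionCertHash generates a self-signed ECDSA certificate valid for 14 days
// (the maximum serverCertificateHashes allows) and returns the SHA-256 digest
// of its DER encoding for the client's WebTransport constructor.
func sessionCertHash() (hash [32]byte, certDER []byte, key *ecdsa.PrivateKey, err error) {
	key, err = ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return
	}
	tmpl := x509.Certificate{
		SerialNumber: big.NewInt(time.Now().UnixNano()),
		Subject:      pkix.Name{CommonName: "neko-session"},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(14 * 24 * time.Hour),
		DNSNames:     []string{"localhost"},
	}
	certDER, err = x509.CreateCertificate(rand.Reader, &tmpl, &tmpl, &key.PublicKey, key)
	if err != nil {
		return
	}
	hash = sha256.Sum256(certDER)
	return
}
```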

It is worth noting that because WebTransport still lacks several capabilities (including reverse proxy support), the technology might need more time to mature.
w3c/webtransport#525

WebCodecs should be used for decoding in any protocol other than WebRTC, falling back to WebAssembly + MSE libraries or other methods where WebCodecs is unavailable (since it's pretty new).

It is important to understand that WebSockets should generally not be mixed with HTTP/3 or WebTransport, to obtain the maximum benefit from HTTP/3. The reason WebSockets were required in WebRTC was to exchange signaling information; HTTP/3 can, and should, be used for that purpose as well.
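
A hedged sketch of signaling served over HTTP/3 WebTransport with github.com/quic-go/webtransport-go (linked below); the /signal path and the handler body are assumptions:

```go
package main

import (
	"net/http"

	"github.com/quic-go/quic-go/http3"
	"github.com/quic-go/webtransport-go"
)

// handleSignaling would accept a bidirectional stream on the session and
// exchange setup messages (e.g. SDP/ICE for a WebRTC fallback) over it.
func handleSignaling(s *webtransport.Session) {}

func main() {
	server := &webtransport.Server{H3: http3.Server{Addr: ":4433"}}
	http.HandleFunc("/signal", func(w http.ResponseWriter, r *http.Request) {
		session, err := server.Upgrade(w, r)
		if err != nil {
			w.WriteHeader(http.StatusInternalServerError)
			return
		}
		go handleSignaling(session)
	})
	// The certificate must be trusted by the browser, or pinned at session
	// start via serverCertificateHashes as noted above.
	server.ListenAndServeTLS("cert.pem", "key.pem")
}
```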

MediaStream/getUserMedia should still be used for WebRTC. DataChannels cannot transport video or audio efficiently.

All methods for video processing in browsers:
https://webrtchacks.com/video-frame-processing-on-the-web-webassembly-webgpu-webgl-webcodecs-webnn-and-webtransport/
https://rxdb.info/articles/websockets-sse-polling-webrtc-webtransport.html

https://www.youtube.com/watch?v=0RvosCplkCc
https://github.com/colinmarc/magic-mirror
https://github.com/mira-screen-share/sharer
https://github.com/netrisdotme/netris
https://quic.video/
https://developer.mozilla.org/en-US/docs/Web/API/WebTransport/WebTransport
https://www.w3.org/2021/03/media-production-workshop/talks/chris-cunningham-webcodecs-videoencoderconfig.html
https://docs.libp2p.io/concepts/transports/webtransport/
https://github.com/quic-go/webtransport-go

https://developer.mozilla.org/en-US/docs/Web/API/WebCodecs_API
https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API

WebSockets for both media and signaling

Also worth noting is that some restricted, firewalled environments need to use WebSockets (with the upgrade from HTTP/1.1) for both media and signaling. This might be worth offering as one of the options.

For WebSockets, WebCodecs, then falling back to WebAssembly + MSE might work for video and audio decoding.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Protocol_upgrade_mechanism#upgrading_to_a_websocket_connection
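
On the server side, the media path can be as simple as one binary WebSocket message per encoded frame; a hedged Go sketch (assuming github.com/gorilla/websocket; any container/timestamp framing is left to the application):

```go
package media

import "github.com/gorilla/websocket"

// streamOverWebSocket pushes each encoded frame as one binary WebSocket
// message; the client would decode them with WebCodecs (or a WASM fallback).
func streamOverWebSocket(conn *websocket.Conn, frames <-chan []byte) error {
	for frame := range frames {
		if err := conn.WriteMessage(websocket.BinaryMessage, frame); err != nil {
			return err
		}
	}
	return nil
}
```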

RTWebSocket is a pretty interesting option for the WebSocket portion of this project.

https://github.com/zenomt/rtwebsocket
https://github.com/zenomt/rtmfp-cpp

https://www.rfc-editor.org/rfc/rfc7016.html

Return flow association (bi-directional information):
https://www.rfc-editor.org/rfc/rfc7016.html#section-2.3.11.1.2

https://www.rfc-editor.org/rfc/rfc7425.html

Return flow association (bi-directional information):
https://www.rfc-editor.org/rfc/rfc7425.html#section-5.3.5

Along with WebTransport, the possible reason to use this is to show frames as soon as they arrive instead of going through internal jitter buffers that WebRTC offers limited control over (though the WebRTC approach of separate audio and video streams should still display frames as fast as possible).

Two things to keep in mind about RTWebSocket when looking at the above links:

1. RTWebSocket uses the same primary abstraction as RTMFP: unidirectional, ordered, message-oriented "flows" named by arbitrary binary metadata. Each flow has an independent priority/precedence (which you can change at any time), each message has an arbitrary transmission deadline (which you can change at any time), and "return flow association" generalizes bidirectional communication into arbitrarily complex trees of unidirectional flows.
2. To fully realize the "Real-Time" benefits of RTWebSocket (or RTMFP), you need to avail yourself of the real-time facilities of priority/precedence and transmission deadlines.

Note that you can revise a message's transmission deadline after it's been queued (for example, you can revise the transmission deadlines of previously-queued video messages when you get a new keyframe).

The link to Section 5.3.5 of RFC 7425 above is to an illustration of a bidirectional flow tree with multiple levels and many more than two flows.

RFC 7016 describes RTMFP (a UDP-based transport protocol suspiciously similar to, but significantly predating, QUIC). RFC 7425 describes how to send RTMP video, audio, data, and RPC messages over RTMFP. The same method can be used to send RTMP messages over RTWebSocket, and the https://github.com/zenomt/rtwebsocket and https://github.com/zenomt/rtmfp-cpp repos include demonstrations of doing that.
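
To make the flow abstraction concrete, here is a conceptual Go sketch of these primitives (a model for illustration only, not the rtwebsocket API):

```go
package flows

import "time"

// Message and Flow model RTWebSocket's primary abstraction as described above.
type Message struct {
	Data     []byte
	Deadline time.Time // revisable at any time until actually transmitted
}

type Flow struct {
	Metadata   []byte // arbitrary binary name
	Precedence int    // relative priority, changeable at any time
	Queue      []*Message
}

// ExpireBefore shows the keyframe trick: when a new keyframe is queued, pull
// forward the deadlines of older queued video messages so the sender drops
// them instead of transmitting stale frames.
func (f *Flow) ExpireBefore(now time.Time) {
	for _, m := range f.Queue {
		if m.Deadline.After(now) {
			m.Deadline = now
		}
	}
}
```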

WebRTC Improvements

Investigate WHIP and WHEP, allowing unlimited users through WebRTC:
https://bloggeek.me/whip-whep-webrtc-live-streaming/
https://gstreamer.freedesktop.org/documentation/rswebrtc/whipclientsink.html?gi-language=c
https://gstreamer.freedesktop.org/documentation/rswebrtc/whipserversrc.html?gi-language=c
https://gstreamer.freedesktop.org/documentation/webrtchttp/whepsrc.html?gi-language=c
However, WHIP and WHEP don't offer DataChannels, thus requiring a separate channel, or restricting their use to view-only users.
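
For illustration, an untested pipeline-string sketch of WHIP publishing (whipclientsink from gst-plugins-rs accepts raw media and handles encoding plus WHIP signaling itself; the property path follows its documentation and the endpoint URL is a placeholder):

```go
package pipelines

// whipPipeline is a gst-launch-style sketch, not a tested configuration.
const whipPipeline = "videotestsrc is-live=true ! videoconvert ! " +
	"whipclientsink signaller::whip-endpoint=https://example.com/whip"
```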

Chunked DataChannel for clipboard and other large information: https://groups.google.com/g/discuss-webrtc/c/f3dfmu3oh00
WebRTC TURN REST API usage: #370
Investigate MultiOpus: node-webrtc/node-webrtc#603 (comment)

Miscellaneous

Developers must understand that all protocols and APIs are bound by web browser specifications and standards. Luckily, we have reached an era where most of the APIs required for this project's objective are supplied by web browsers. In some cases, look for an external JavaScript or WebAssembly implementation.

DASH is available through a JavaScript fallback.
https://caniuse.com/mpeg-dash
https://github.com/Dash-Industry-Forum/dash.js

RTSP does not seem to be supported directly in web browsers; it is more a protocol for capturing from another source accessible to the server than a transport protocol on the web.
https://stackoverflow.com/questions/2245040/how-can-i-display-an-rtsp-video-stream-in-a-web-page

The same goes for QUIC; in a web browser, QUIC is only available through WebTransport.

MQTT is also redundant; all MQTT web client libraries use WebSockets to transport MQTT requests.


ehfd commented on July 17, 2024

Anyone familiar with any of the concepts described above is encouraged to discuss and contribute.

This is definitely a feasible project. GeForce Now, Xbox Cloud, and Reemo all did it in the web browser through WebRTC. Now it's time for something open source and not too restrictive (more permissive than GPL).


ehfd commented on July 17, 2024

[Image: stim.io start screen]
This is the start screen for https://stim.io; Neko v3, I think, could be defined as a goal that combines these two concepts.


ehfd commented on July 17, 2024

https://github.com/go-gst/go-gst

Go is now a first-class citizen in GStreamer. Its compiled-language nature, its ability to call C libraries, its legibility compared to Rust, and the existence of great web protocol libraries such as Pion will hopefully work out well.

This library will support dynamic property and capsfilter updates, as the C, Rust, and Python bindings do.
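
For example, a hedged go-gst sketch of such runtime updates (the element names "encoder" and "vcaps" are assumptions about the pipeline):

```go
package stream

import "github.com/go-gst/go-gst/gst"

// retune updates an encoder property and a capsfilter while the pipeline runs.
func retune(pipeline *gst.Pipeline) {
	if enc, err := pipeline.GetElementByName("encoder"); err == nil {
		_ = enc.SetProperty("bitrate", uint(4096)) // e.g. x264enc, in kbit/s
	}
	if vcaps, err := pipeline.GetElementByName("vcaps"); err == nil {
		_ = vcaps.SetProperty("caps", gst.NewCapsFromString(
			"video/x-raw,width=1920,height=1080,framerate=60/1"))
	}
}
```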


ehfd commented on July 17, 2024

I've heard from @totaam that using WebSockets for the stream, not only signaling (or any other TCP-based way of doing it), works until there's packet loss. When there's packet loss, there will be visible defects in the stream.


totaam commented on July 17, 2024

When there's packet loss, there would be visible defects in the stream.

It can happen if partial decoding is implemented, but that's very hard to get right.
More likely, you will just get delayed frames, stuttering, and bandwidth issues.
Generally, to get visible defects, you need a transport layer that tolerates packet loss; TCP and WebSockets do not.


ehfd commented on July 17, 2024

I included "delayed frames, stuttering and bandwidth issues" in visible (I think I meant perceivable) defects.
But anyway, yes, it's not about artifacts in the video; I agree.


ehfd commented on July 17, 2024

GPU Acceleration

Zero-copy buffers are very important for shaving off the last bits of latency.

#291 (SW: openh264enc, x265enc, rav1enc, AOM av1enc / NVIDIA nvcudah264enc, nvcudah265enc, nvav1enc - not implemented yet / VA-API vah264enc, vah265enc, vaav1enc)

Performance optimizations: https://git.dec05eba.com/gpu-screen-recorder-gtk/about/ (I encourage talking to the maintainer)

NVIDIA

NVIDIA Capture API (NvFBC) zero-copy framebuffer encoding for NVIDIA GPUs (may lead to great X11 performance improvements): https://github.com/CERIT-SC/gstreamer-nvimagesrc

GStreamer >= 1.22 supports nvcudah264enc and nvcudah265enc. These enable zero-copy encoding pipelines, unlike the earlier nvh264enc and similar plugins. AV1 encoders may be supported later on.
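
A hedged, untested pipeline sketch of that zero-copy path (frames are uploaded to CUDA memory once and stay there through conversion and encoding; element names from the nvcodec plugin):

```go
package pipelines

// nvencPipeline is a gst-launch-style sketch, not a tested configuration.
const nvencPipeline = "ximagesrc use-damage=false ! videoconvert ! " +
	"cudaupload ! cudaconvert ! nvcudah264enc ! h264parse ! appsink name=video"
```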

Jetson (aarch64)

https://docs.nvidia.com/jetson/archives/r35.2.1/DeveloperGuide/text/SD/Multimedia/AcceleratedGstreamer.html

VA-API (AMD and Intel)

More notes: https://github.com/selkies-project/selkies-gstreamer/blob/332001b5314c352c06f1adb4af3a67b6b7b31fb1/src/selkies_gstreamer/gstwebrtc_app.py#L311

The above links to a successful VA-API pipeline, albeit one that has been quite redundant for me so far. vaapih264enc is deprecated, and the new plugin everyone is focusing on (since GStreamer 1.22) is vah264enc.
qsvh264enc using Intel Quick Sync is also an interesting option, and while amfh264enc is not supported on Linux, it may be revived with the new Vulkan video encoding APIs.
https://www.khronos.org/blog/khronos-finalizes-vulkan-video-extensions-for-accelerated-h.264-and-h.265-encode

I do not guarantee everything will work for all GPUs, libva versions, VA drivers, etc., but I feel it's much better than the deprecated plugins, including working well on iHD drivers.

Some more pointers for GPU acceleration and high-performance streaming:
selkies-project/selkies-gstreamer#34
https://github.com/colinmarc/magic-mirror
https://github.com/hgaiser/moonshine
https://github.com/cea-sec/sanzu

Hardware-accelerated JPEG encoding

NVIDIA and VA-API both provide hardware-accelerated JPEG encoding, and it is supported by most modern GPUs.
This may provide an alternative protocol that goes back to the VNC methodologies, except hardware-accelerated. JPEG encodes each region independently, which may prove useful for partial screen refreshes.

https://gstreamer.freedesktop.org/documentation/nvcodec/nvjpegenc.html?gi-language=c
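
A hedged, untested sketch (whole frames shown; a VNC-style protocol would crop changed regions before the encoder):

```go
package pipelines

// jpegPipeline is a gst-launch-style sketch, not a tested configuration.
const jpegPipeline = "ximagesrc use-damage=false ! videoconvert ! " +
	"nvjpegenc ! appsink name=tiles"
```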

Miscellaneous

Note that x264 requires screen dimensions to be even numbers. It wouldn't hurt to default to that always.
selkies-project/selkies-gstreamer#124
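
A trivial Go helper for that default (the name is mine):

```go
package display

// evenResolution rounds requested dimensions down to even values, since x264
// with 4:2:0 chroma subsampling rejects odd widths and heights.
func evenResolution(w, h int) (int, int) {
	return w &^ 1, h &^ 1
}
```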

GStreamer examples: https://gist.github.com/hum4n0id/2760d987a5a4b68c24256edd9db6b42b

GStreamer Portable Build

GStreamer may be built statically when using C, Rust, or Go (not Python). Because the most prominent performance and encoding optimizations are in the latest production releases, the most recent releases must be used. Neko could then be deployed regardless of environment, standalone, without containers.
https://blogs.igalia.com/scerveau/discover-gstreamer-full-static-mode/
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3406

Even if we don't do a static build, it could be useful to make shared-library builds of GStreamer that are compatible with all active distros in terms of ABI and glibc (separate GPL and LGPL builds). musl libc is also a viable option, especially if there is no Python component.

Using AppImage is also a way to make the resulting application portable, as Sunshine and RustDesk do.

Conda (https://conda-forge.org/news/2023/07/12/end-of-life-for-centos-6/), frequently used in science and technology, maintains a portable compiler toolchain and package ecosystem (based on CentOS 7) and packages recent GStreamer versions. It is especially useful when Python components exist.


ehfd commented on July 17, 2024

Low-latency graphical streaming (including 3D graphics development, as in Teradici/NICE DCV, and game development) and relative cursors

#339 (Reduce latency by eliminating A/V sync for WebRTC or QUIC)

Host Encoder and WebRTC Web Browser Decoder Settings (eliminate all client-side latency)

selkies-project/selkies-gstreamer#34 (comment)
selkies-project/selkies-gstreamer#78
https://multi.app/blog/making-illegible-slow-webrtc-screenshare-legible-and-fast
https://www.rtcbits.com/2023/05/webrtc-header-extensions.html

#344 (Relative cursors in X11 and Wayland, pointer lock)
selkies-project/selkies-gstreamer#107 (Keyboard Lock and Pointer Lock are important, as is the press-and-hold-Esc-to-exit-fullscreen capability.)

#364 (Unicode Keysyms)

selkies-project/selkies-gstreamer#22 (Are there touch keyboards for mobile users?)

selkies-project/selkies-gstreamer#25 (URL authentication and JSON Web Token authentication?)

selkies-project/selkies-gstreamer#98:
(Handle (1) local scaling: fit the client interface to the server screen without changing the server display resolution; (2) remote scaling: change the server display resolution to fit the server screen to the client interface; and (3) cursor-movement scaling, so the cursor isn't faster or slower than the user intended; a sketch of (3) follows this list)

selkies-project/selkies-gstreamer#102
selkies-project/selkies-gstreamer#99
(Always validate cursor behavior with a touchpad, but this shouldn't be a big issue since most of our fixes come from Neko itself)

selkies-project/selkies-gstreamer#110 (HiDPI management: https://wiki.archlinux.org/title/HiDPI / https://linuxreviews.org/KDE_Plasma#Workaround_For_Bugs)
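
As referenced in the scaling item above, a hedged sketch of cursor-movement scaling (names are mine):

```go
package input

// scaleCursorDelta scales client-reported cursor deltas by the
// server-to-client size ratio so on-screen speed matches user intent.
func scaleCursorDelta(dx, dy, serverW, clientW, serverH, clientH float64) (float64, float64) {
	return dx * (serverW / clientW), dy * (serverH / clientH)
}
```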

And of course, a heap of information in https://docs.lizardbyte.dev/projects/sunshine/en/latest/index.html

More reference:
https://web.archive.org/web/20211003152946/https://parsec.app/blog/game-streaming-tech-in-the-browser-with-parsec-5b70d0f359bc
https://github.com/parsec-cloud/parsec-sdk/blob/master/sdk/parsec.h
https://github.com/evshiron/parsec-web-client
https://github.com/robterrell/web-client
https://github.com/pod-arcade
https://github.com/giongto35/cloud-game
https://github.com/giongto35/cloud-morph

Gamepads/Joysticks and Wayland Input

From the web browser's perspective, the interface could use whatever the Gamepad API exposes.
https://developer.mozilla.org/en-US/docs/Web/API/Gamepad_API
This is implemented in https://github.com/selkies-project/selkies-gstreamer/blob/main/src/selkies_gstreamer/gamepad.py and https://github.com/selkies-project/selkies-gstreamer/blob/main/addons/gst-web/src/gamepad.js.

But on the server, it is not trivial to use arbitrary input devices in an unprivileged Docker or Kubernetes container, because /dev/uinput is required in most cases. For instance, Sunshine requires /dev/uinput in order to run.

However, a number of workarounds are available.

https://github.com/selkies-project/selkies-gstreamer/tree/main/addons/js-interposer
This is an LD_PRELOAD method to redirect input calls without using /dev/uinput.

Additional approaches:
https://github.com/JohnCMcDonough/virtual-gamepad
https://github.com/games-on-whales/wolf/tree/stable/src/fake-udev
https://github.com/Steam-Headless/dumb-udev

What's intriguing about this approach is that it may also pave the way to replacing /dev/uinput in Wayland for cursor and keyboard control. X11 included its own interface for input, but Wayland no longer has that, replacing it with /dev/uinput. This may allow for a vendor-neutral method of processing input in Wayland.

Touchscreen and Stylus

https://github.com/H-M-H/Weylus
https://github.com/pavlobu/deskreen


ehfd commented on July 17, 2024

Multi-user GPU Sharing

https://dev.to/douglasmakey/a-simple-example-of-using-unix-domain-socket-in-kubernetes-1fga
Looks like it does work now.

  • From Discord a few months ago

The issue with this is that it cannot be shared between different Kubernetes pods; it only works across multiple containers within the same pod. This means that GPU sharing with a single X server is a bit harder.

An alternative would be to use X11 over TCP instead of UNIX sockets.

VirtualGL through GLX, or using Wayland, would also be an alternative that makes things smoother.

Multi-architecture Environments

Support arm64 (aarch64), arm/v7 (armhf), and ppc64le - for people happening to use the Sierra or Summit supercomputers, perhaps?

"supported": [
      "linux/amd64",
      "linux/arm64",
      "linux/riscv64",
      "linux/ppc64le",
      "linux/s390x",
      "linux/386",
      "linux/mips64le",
      "linux/mips64",
      "linux/arm/v7",
      "linux/arm/v6"
]

These can be built with QEMU using docker buildx build --platform=linux/arm64,linux/ppc64le.

Multiarch paths must not be hardcoded.


ehfd commented on July 17, 2024

The client should be split into a library TypeScript component that does not use Vue.js or any framework and can be imported by any project. It should be as easy to integrate into a custom project as embedding a video player. Similar to how demodesk/neko-client is built, but without Vue.js.

https://vitejs.dev/guide/#trying-vite-online might be a good complement to this concept. And I think React (for backward compatibility) or Svelte (for lightness and development speed) as the default interface both have their merits compared to Vue (the whole reason the project needs to be rewritten).


ehfd commented on July 17, 2024

We were using Vue 2 ourselves, so Vue 3, I guess, wouldn't hurt.

Reference performance:
TodoMVC-JavaScript-ES6-Webpack-Complex-DOM   574.40 ± 214.15 (37.3%) ms
TodoMVC-WebComponents                        152.79 ±  21.32 (14.0%) ms
TodoMVC-React-Complex-DOM                    238.28 ±  19.36  (8.1%) ms
TodoMVC-React-Redux                          248.94 ±  31.37 (12.6%) ms
TodoMVC-Backbone                             185.90 ±  19.09 (10.3%) ms
TodoMVC-Angular-Complex-DOM                  207.88 ±  11.95  (5.7%) ms
TodoMVC-Vue                                  174.76 ±  10.65  (6.1%) ms
TodoMVC-jQuery                              1336.29 ± 122.84  (9.2%) ms
TodoMVC-Preact-Complex-DOM                    89.70 ±   6.47  (7.2%) ms
TodoMVC-Svelte-Complex-DOM                    85.70 ±   5.29  (6.2%) ms
TodoMVC-Lit-Complex-DOM                      122.30 ±   9.74  (8.0%) ms


ABeltramo commented on July 17, 2024

I can add a bit of additional context to the virtual input part, since I've moved my implementation from Wolf into a reusable standalone library, games-on-whales/inputtino.

fake-udev is not a replacement for uinput or anything similar to the LD_PRELOAD method; it's a way to "containerize" udev so that we can implement hot-plug for multiple containers without exposing the host devices, achieving proper isolation. I wrote an in-depth rationale on how it works here.

On a separate note, I've recently managed to implement gyro, acceleration, touchpad, and force feedback for a virtual PS5 gamepad using uhid, because unfortunately uinput is not enough. I wrote an explanation of why this is required here.
I'd be happy to discuss this further, but I'm afraid that in order to fully (and properly) support joypads, there's no way around this.
The good news is that uhid can easily be used to achieve a complete USB-over-IP solution and allow clients to expose any kind of device remotely: using the WebHID API, it's possible to access the raw HID data from the browser. For example, I've used DualSense Explorer to debug the low-level details of my implementation.


ehfd commented on July 17, 2024

@ABeltramo Thank you!

Would it be possible to investigate emulating the uhid device without the /dev/uhid host device?

Otherwise, this could be an optional feature, enabled or disabled based on user preference.


ABeltramo commented on July 17, 2024

@ABeltramo Thank you!

Would it be possible to investigate emulating the uhid device without the /dev/uhid host device?

I don't exclude that it might be possible, but I think it'll be fairly hard to achieve: the created devices are picked up directly by the Linux kernel drivers, just like when you plug in a USB cable.
If exposing this is a serious concern, I'd suggest using a proxy container (or a process straight on the host) that is only in charge of being the middleman between the remote client and the local virtual devices. Wolf does this: it creates the virtual devices and then, roughly:

docker exec <unprivileged_container> mknod <virtual_device> && fake-udev

So that the end unprivileged container doesn't have access to uinput, udev, or even /dev/input; those are managed "externally" by a more privileged container (or process).

This can be further locked down by running the process inside the unprivileged container as a low-privileged user and letting the external controller exec commands as root (or a higher-privileged user) inside that container.

I'm not a security expert, but I think this can be a fairly secure approach.


ehfd commented on July 17, 2024

https://github.com/wanjohiryan/netris

A cloud gaming platform using WebTransport and Media over QUIC, developed by @wanjohiryan with input from @kixelated.

