jitsi / gsoc-ideas

Google Summer of Code ideas

gsoc-ideas's Introduction

Jitsi Desktop

Jitsi Desktop is a free, open-source audio/video and chat communicator that supports protocols such as SIP, XMPP/Jabber and IRC, and offers many other useful features.

Please do not confuse this project with Jitsi Meet, the online video conferencing solution with a free instance at https://meet.jit.si.

Support

Jitsi Desktop is the predecessor of Jitsi Meet. While some components are still used in projects such as Jigasi, the project is no longer actively developed. Improvements, bug fixes and builds depend entirely on community contributions.

Installation

Releases

Windows and macOS

Download the installers from GitHub releases.

Debian/Ubuntu

An APT repository is available at https://nexus.ingo.ch/jitsi-desktop/. Note the trailing slash at the end of the distro-name. This is required since the repository has no components.

deb https://nexus.ingo.ch/jitsi-desktop-unstable/ <distro>/

RPM Distros

Sorry, there are currently no rpm packages available.

Snapshots

Snapshot or pre-release builds are also available in additional repositories.

Windows and macOS: See https://github.com/jitsi/jitsi/releases
Debian/Ubuntu: https://nexus.ingo.ch/jitsi-desktop-unstable/

Helpful Resources

Contributing

Please, read the contribution guidelines before opening a new issue or pull request.

gsoc-ideas's Issues

Had new idea that I think is worth sharing

Jitsi

Hi there, I just wanted to brainstorm something that I think is overlooked by many products such as Google Meet, and that I think could be one of the features that make Jitsi special.

If this idea doesn't make sense, please feel free to ignore it.

The idea

You know how a presenter can use a pointer while sharing their screen in Google Meet? Participants, however, cannot point at the part of the screen they want to direct attention to. They have to describe it in words every time, which is very challenging, especially in a coding presentation. As a solution, the presenter could grant a participant access to use a pointer from their end.

I am not really sure if it is a good idea, so please let me know what you think 😊, and I will explain more.
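One way to picture the proposal is a small permission check in front of the pointer events that participants send. The sketch below is only an illustration: `PointerSession`, `PointerMessage` and `toPixels` are hypothetical names, not part of Jitsi Meet, and it assumes pointer positions are sent as fractions of the shared screen so that they work at any viewer resolution.

```typescript
// Hypothetical message shape; not an existing Jitsi Meet type.
interface PointerMessage {
    from: string; // participant id of the sender
    x: number;    // horizontal position as a fraction of the shared screen (0..1)
    y: number;    // vertical position as a fraction of the shared screen (0..1)
}

class PointerSession {
    private presenter: string;
    private allowed: string[] = [];

    constructor(presenter: string) {
        this.presenter = presenter;
    }

    // Only the presenter may grant pointer access to a participant.
    grant(by: string, to: string): boolean {
        if (by !== this.presenter) {
            return false;
        }
        if (this.allowed.indexOf(to) === -1) {
            this.allowed.push(to);
        }
        return true;
    }

    // Only the presenter may revoke pointer access.
    revoke(by: string, from: string): boolean {
        if (by !== this.presenter) {
            return false;
        }
        this.allowed = this.allowed.filter((id) => id !== from);
        return true;
    }

    // Returns the message to broadcast, or null if it must be dropped.
    accept(msg: PointerMessage): PointerMessage | null {
        const permitted =
            msg.from === this.presenter || this.allowed.indexOf(msg.from) !== -1;
        const inRange = msg.x >= 0 && msg.x <= 1 && msg.y >= 0 && msg.y <= 1;
        return permitted && inRange ? msg : null;
    }
}

// Convert a normalised pointer position to pixels on one viewer's screen.
function toPixels(
    msg: PointerMessage,
    width: number,
    height: number
): { left: number; top: number } {
    return { left: Math.round(msg.x * width), top: Math.round(msg.y * height) };
}
```

With this shape, a participant's pointer is ignored until the presenter calls `grant`, and each viewer maps the same message to its own screen size with `toPixels`.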

Speech-to-text GSoC project: discussion

@nikvaessen, I was studying the backend implementation, Jigasi, and found a heading called "Vosk Configuration". I read about it and found out that Vosk is an open-source speech recognition system. So in this project, will we have to find a different open-source model than Vosk? If yes, in which respects is Vosk lacking? Knowing this would help in finding the right open-source model, since my search turned up many others, such as DeepSpeech by Mozilla and OpenSeq2Seq by NVIDIA.

Speech to text feature in GSoC 2022

Hi everyone,
Hope you’re all doing great!

My name is Shahryar Soltanpour, and I’m studying for an MSc in computer science at the University of Calgary. I’m very excited to join the Jitsi community. I want to contribute to the speech-to-text feature of Jigasi and I’m preparing my proposal for it. I would be more than happy to hear any advice from you about getting started, or any features and ideas that you want me to include in my proposal.

Thanks in advance

Speech-to-text with publicly available deep learning models

Hi!
I have read through the idea document for "Speech-to-text with publicly available deep learning models". However, one thing I feel is missing: will the inference servers have a GPU available to them? In other words, will Jigasi run on a GPU-enabled instance?

The choice of model complexity depends on the resources available. Although we can run these heavy, high-performance models on CPU instances, we might not get the low latency the process requires. So deciding the instance type early can help with the choice of models.
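The latency concern can be made concrete with the real-time factor (RTF), the ratio of processing time to audio duration. The sketch below is illustrative only: the interface, the function names and the default threshold are assumptions, and the timings are placeholders, not measurements of any model.

```typescript
// One timed transcription of a known-length audio clip.
interface TranscriptionRun {
    audioSeconds: number;      // duration of the audio that was transcribed
    processingSeconds: number; // wall-clock time the model needed
}

// RTF = processing time / audio duration.
function realTimeFactor(run: TranscriptionRun): number {
    if (run.audioSeconds <= 0) {
        throw new Error("audioSeconds must be positive");
    }
    return run.processingSeconds / run.audioSeconds;
}

// A model can only keep up with live audio if its RTF stays at or below
// the chosen limit. The default of 1.0 is an assumed threshold.
function keepsUpWithLiveAudio(run: TranscriptionRun, maxRtf: number = 1.0): boolean {
    return realTimeFactor(run) <= maxRtf;
}

// Placeholder timings for illustration, not real benchmark results.
const slowRun: TranscriptionRun = { audioSeconds: 10, processingSeconds: 25 };
const fastRun: TranscriptionRun = { audioSeconds: 10, processingSeconds: 5 };
```

An RTF below 1 means the model transcribes faster than the audio arrives. Measuring it for each candidate model on the intended instance type is one way to compare CPU and GPU hosting.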

Questions about language support for the speech-to-text project implementation

Hi @nikvaessen, what languages is Jitsi looking to add support for? Is the focus going to be on English or on multilingual support? On a side note, I was browsing through the Wav2Vec2 documentation on Hugging Face and saw that the model sometimes predicts acoustically accurate but grammatically incorrect words or sentences. What is the expectation with regard to handling those cases?

Some examples of what I'm referring to taken from HuggingFace's blog post

How do I post a new idea to the ideas list?

Hello there,
I have a new feature idea in mind but am not sure how to post it to the GSoC 2022 ideas list. Should I open a pull request for it directly, or should I simply post it here in the Issues section? Your guidance would be of great help.

Change `prometheus-stats` to `prometheus-stats.md`

Since the file currently has no extension, it is displayed as raw Markdown source and is hard to read. Adding the .md file extension will solve this.

GSOC -22, Speech to Text (ISSUE)

Hi Jitsi team,
I have used the Artyom speech-to-text API in the past, and it could be integrated well into this project. I am also attaching a video demo of a working prototype.

Why I am using Artyom

It's completely free
Its accuracy is higher than that of the free model.
It processes the data and even rectifies any pronunciation errors with its ML model.
It converts the speech to text in real-time providing the user a lag-free & low latency experience.

Working prototype:

https://drive.google.com/file/d/15lmaRUiFTYdGFd-yCguBwRnsAr0I9J2d/view?usp=sharing

Fix a typo in 2022 directory multiple files

In the cast-meeting.md, ios-pip-window.md, ml-audio-enhancemants.md, mobile-video-effects.md, react-native-sdk.md, spatial-audio.md and template.md files, there is a small typo: ooutcomes should be changed to outcomes.

"Prometheus scraping and Grafana dashboards"

Should we consider utilizing the existing systemli/prometheus-jitsi-meet-exporter in our project instead of developing a new one from scratch?

cc @sawall @aaronkvanmeerten @saghul

Speech to Text

Hey... I was just exploring the Speech to Text project idea for GSoC 2022 and came across the JavaScript Web Speech API, which offers speech recognition. It seems to be quite accurate and real-time, but the privacy policy is not very clear. However, it states somewhere:

Chrome currently takes the audio and sends it to Google's servers to perform the transcription.

Just wanted to know your views on this and whether it suits your requirements. I also tested the DeepSpeech open-source library on a local setup; the results are good enough for English but not very convincing for other regional languages. Do you consider it a viable solution? Looking forward to suggestions and feedback, @nikvaessen.

Regards
Rishabh

Hi, I want to contribute to the Spatial Audio idea for GSoC.

I want to contribute to the Spatial Audio project mentioned in the ideas list. How can I get in touch with the listed mentors, and is there a Discord or Slack channel for it?

Speech to Text Feature.

Hello @saghul @nikvaessen!

Recently came across Jitsi while researching GSOC projects. Love the platform! Wanted to contribute to the real time speech to text feature. The options considered:

DeepSpeech2
Facebook's HuBERT (with no LM, it still gave fast, good-quality output)
Improvements to Vosk.

To narrow these options down, it would help to know which specific areas need improvement.

Regards!

Can anyone Review my GSOC proposal for speech to text.

I have created my proposal. Please give me feedback, @nikvaessen @saghul.
Birat Datta GSOC.pdf

Feature : Design Enhancement in the Jitsi Website.

Fixing the SEO, UI design and front-end web development of Jitsi

The audience of Jitsi largely consists of teachers, educators and students. At the heart of Jitsi are Jitsi Videobridge and Jitsi Meet, which let you hold conferences on the internet, while other projects in the community enable features such as audio, dial-in, recording, and simulcasting. One of the most requested improvements is the ability to engage a large audience, since competing products on the market offer much the same functionality as Jitsi.

The current website works well but is missing some features and has a lot of room for improvement. The main objective of this project proposal is to refresh the website and add functionality to it for a better user interface and user experience.

There have been instances of people migrating to platforms other than Jitsi because of the poor user interface, which results in a bad user experience on the website itself before they reach the actual product features. Through this project, I would like to understand exactly what users expect from a secure video conferencing environment and implement that in Jitsi. I’ll try to finish as many UI designs and their code implementations as I can during my Google Summer of Code coding period and work on the rest afterwards as people keep suggesting better ones.

Although I am done with the proposal and mockups for the new design, I would love to know whether this can be considered as a GSoC proposal.

Draft Proposal for Speaker Queue Feature - GSoC 2024

Hello, to whom it may concern.

Note: Before proceeding, I would like to clarify that I'm unsure whether creating this GitHub issue is the appropriate way to communicate my interest in applying to work on the Speaker Queue feature. Alternatively, I could send an email. Nevertheless, I've decided to provide the proposal here for review. I will also apply through the GSoC website.

Proposal for GSoC 2024: Jitsi Meet - Speaker Queue Feature

Abstract

The proposed project aims to implement a Speaker Queue feature for Jitsi Meet, enhancing the user experience by providing a structured way to manage speaking turns during meetings. Leveraging my expertise in frontend development with JavaScript and Vue, I will focus on developing the necessary user interface components and integrating them seamlessly with the existing Jitsi Meet system.

Overview

The Speaker Queue feature will allow participants to raise their hand when they wish to speak, organizing them into a queue based on the order of requests. Moderators will have the ability to manage the queue, ensuring smooth communication flow within meetings.
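The behaviour described in this overview can be sketched as a small first-come, first-served queue. This is a minimal illustration under assumptions: `SpeakerQueue` and its method names are hypothetical and do not come from the Jitsi Meet codebase, and permission checks for moderator actions are left out.

```typescript
// Hypothetical in-memory model of the speaker queue; not Jitsi Meet code.
class SpeakerQueue {
    private queue: string[] = [];

    // Raising a hand appends the participant unless they are already queued.
    // Returns the participant's 1-based position in the queue.
    raiseHand(participantId: string): number {
        if (this.queue.indexOf(participantId) === -1) {
            this.queue.push(participantId);
        }
        return this.queue.indexOf(participantId) + 1;
    }

    // Lowering a hand removes the participant from the queue.
    lowerHand(participantId: string): void {
        this.queue = this.queue.filter((id) => id !== participantId);
    }

    // Moderator action: give the floor to the next participant in line.
    next(): string | undefined {
        return this.queue.shift();
    }

    // Moderator action: move a queued participant to the front.
    promote(participantId: string): void {
        if (this.queue.indexOf(participantId) === -1) {
            return;
        }
        this.queue = [participantId].concat(
            this.queue.filter((id) => id !== participantId)
        );
    }

    // Snapshot of the current order, for rendering in the UI.
    list(): string[] {
        return this.queue.slice();
    }
}
```

Keeping the order in one place like this means the UI components only need to render `list()` and call the other methods in response to participant and moderator actions.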

Objectives

Develop robust user interface components using JavaScript and React to support the Speaker Queue feature.
Integrate the frontend components with the existing Jitsi Meet system, ensuring seamless communication with backend services.
Implement moderator controls for managing the speaker queue and participant requests.
Ensure responsiveness and accessibility across different devices and screen sizes.

Deliverables

User interface components for displaying the speaker queue and moderator controls.
Integration of the frontend components with the existing Jitsi Meet system.
Implementation of participant actions, such as raising a hand, and making them join the speaker queue (this will be transparent to the end-user).
Comprehensive documentation for users and developers on how to use the speaker queue feature.

Timeline

Community Bonding Period: May 1 - May 30

Familiarize with the Jitsi Meet codebase and development environment.
Discuss project details with mentors and finalize implementation plan.

Phase 1: User Interface Development (June 1 - June 30)

Develop user interface components for displaying the speaker queue and moderator controls.
Ensure responsiveness and accessibility across different devices and screen sizes.

Phase 2: Integration with Jitsi Meet (July 1 - July 15)

Integrate the frontend components with the Jitsi Meet system, ensuring seamless communication.
Implement protocols for data exchange between frontend components and backend services.

Phase 3: Moderator Controls (July 16 - July 31)

Implement moderator controls for managing the speaker queue and participant requests.
Test the functionality and gather feedback for improvements.

Phase 4: Testing and Optimization (August 1 - August 15)

Conduct thorough testing to identify and fix bugs or performance issues.
Optimize the user interface components for scalability and performance.

Final Week (August 16 - August 23)

Finalize any remaining tasks and address feedback.
Prepare project presentation and demo for final evaluation.

Skills Required

Proficiency in JavaScript and React for frontend development.
Experience with web development and user interface design.
Strong problem-solving and communication skills.

Mentorship

Mentors for this project will include experienced developers from the Jitsi Meet team, such as Saúl Ibarra Corretgé, Calin Chitu, and Hristo Terezov. Regular communication and feedback sessions will be scheduled if needed to ensure smooth progress throughout the project.

Conclusion

The implementation of the Speaker Queue feature will significantly enhance the functionality of Jitsi Meet, providing users with a structured way to manage speaking turns during meetings. Leveraging my expertise in frontend development, I am confident in delivering a high-quality feature that meets the needs of both users and moderators.

ml_audio_enhancements

Hello,
I was looking at the ML audio enhancement idea for Jitsi Meet. I just wanted to know what sort of outcomes are expected in a proposal for this project.
Thanks!

ML audio enhancements project in GSoC' 2022 (Thread)

Hello @saghul & @hristoterezov, I'm Saurabh, a pre-final-year BTech student in India with good experience in ML and development. I'm interested in applying to GSoC for the ML audio enhancements project. I have started exploring the structure and flow of the Jitsi codebase. I would like to have your suggestions before starting on my proposal for this project.

Thanks

JITSI: Speech-to-text with publicly available deep learning models

Hi @nikvaessen, this is Bhargav B R. I have submitted my proposal for the Jitsi: speech-to-text project.

I have tested various publicly available open-source speech-to-text models, such as Mozilla’s DeepSpeech, Coqui STT, Facebook’s flashlight, SpeechBrain and Kaldi, comparing the word error rate and inference speed of each. My GSoC proposal also includes the transcription results obtained by running these models against sample audio.
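For readers unfamiliar with the metric, word error rate (WER) is commonly computed as the word-level edit distance between a reference transcript and the model's output, divided by the number of words in the reference. The sketch below is a generic implementation of that definition, not the code used in the proposal; the function name and the whitespace tokenisation are assumptions.

```typescript
// WER = (substitutions + deletions + insertions) / number of reference words,
// where the edit counts come from a word-level Levenshtein distance.
function wordErrorRate(reference: string, hypothesis: string): number {
    // Assumed tokenisation: split on whitespace, no case or punctuation handling.
    const ref = reference.trim().split(/\s+/).filter((w) => w.length > 0);
    const hyp = hypothesis.trim().split(/\s+/).filter((w) => w.length > 0);
    if (ref.length === 0) {
        throw new Error("reference must contain at least one word");
    }

    // prev[j] holds the distance between the reference words processed so far
    // and the first j hypothesis words.
    let prev: number[] = [];
    for (let j = 0; j <= hyp.length; j++) {
        prev.push(j);
    }

    for (let i = 1; i <= ref.length; i++) {
        const curr: number[] = [i];
        for (let j = 1; j <= hyp.length; j++) {
            const substitution = prev[j - 1] + (ref[i - 1] === hyp[j - 1] ? 0 : 1);
            const deletion = prev[j] + 1;      // reference word missing from output
            const insertion = curr[j - 1] + 1; // extra word in output
            curr.push(Math.min(substitution, deletion, insertion));
        }
        prev = curr;
    }

    return prev[hyp.length] / ref.length;
}
```

For example, `wordErrorRate("a b", "a x")` returns 0.5, since one of the two reference words was substituted. In practice, transcripts are usually normalised (lowercased, punctuation stripped) before scoring so that formatting differences are not counted as errors.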

For communication between Jigasi and the web server hosting the open-source model, I have designed a sample client-server application that considers both a WebSocket approach and a REST service architecture, exposing the open-source speech-to-text model as a service that returns transcription results to the client.

I went through the existing Jigasi implementation of speech-to-text conversion. I am very much interested in contributing to this project, making it a completely self-hosted solution for speech transcription. Please review my proposal.
Thank you.

Setting up JVB

@saghul I’m trying to set up Jitsi Videobridge (JVB) and I’m encountering some errors. Could you please take a look at them?

Multiple Recording Storage Providers - GSoC

Hello everyone! This is Jayanth, a second-year undergraduate from the Indian Institute of Technology, Madras. I am a software developer with strong expertise in JavaScript-based web development. I was exploring the projects put up for this year’s GSoC program, and the ‘Multiple Recording Storage Providers’ project by Jitsi caught my interest. I would like to know who the potential mentors are and what their plans are for this project. Any input from them would help me draft the proposal. I wish to make an impactful contribution to this project. Is anyone else planning to contribute to it? I'm open to discussion.