Name: Banala Saritha
Type: User
Company: India
Bio: Speaker Recognition and Identification, Meta-learning, Few Shot Learning & Speech Processing, Speech-activity-detection, T-F Representations.
Location: National Institute of Technology
Banala Saritha's Projects
Individual Research Project: Weakly supervised learning-based segmentation of surgical instruments in laparoscopic video frames
This is one of my trainings in DS & ML: ResNet image classification
The art of using t-SNE for single-cell transcriptomics
S-method implementation
Frequency and Multi-Scale Selective Kernel Attention for Speaker Verification
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
PyTorch implementation of SENet
SincNet with Attention Mechanism using PyTorch
Image Captions Generation with Spatial and Channel-wise Attention
Classification with a ResNet backbone and attention modules: SE-Channel Attention, BAM (Spatial Attention, Channel Attention, Joint Attention), CBAM (Spatial Attention, Channel Attention, Joint Attention)
Speaker identification task using deep learning models
Speaker Identification using Neural Net.
Speakerbox: Fine-tune Audio Transformers for speaker identification.
Unofficial implementation of the paper SpeakerGAN: Speaker identification with conditional generative adversarial network
Time-frequency representations of the RSR2015 database.
The TORGO database used for our NLP project. It contains speech from dysarthric and non-dysarthric speakers.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
In oceanic remote sensing operations, underwater acoustic target recognition is a difficult and extremely important task for sonar systems, especially under complex sound wave propagation conditions. The high cost of learning a recognition model for big data analysis is typically an obstacle for most traditional machine learning
Repository for our Interspeech 2020 general-purpose voice activity detection (GPVAD) paper
Training and evaluation of VGGVox neural network for speaker identification
Y-vector: Multiscale Waveform Encoder for Speaker Embedding
Materials for workshops on the Hugging Face ecosystem