emoremap_samples's Introduction

EmoRemap

Abstract

Although current Text-To-Speech (TTS) models are able to generate high-quality speech samples, emotion intensity controllable TTS remains a difficult challenge. Most existing TTS models achieve emotion intensity control by extracting intensity information from reference speeches. Unfortunately, limited by the lack of modeling for intra-class emotion intensity and the model's information decoupling capability, the generated speech cannot achieve fine-grained emotion intensity control and suffers from information leakage issues. In this paper, we propose an emotion transfer TTS model, which defines a remapping method to model intra-class relative intensity information, combined with mutual information (MI) to decouple speaker and emotion information, and synthesizes expressive speeches with clearly perceivable intensity. Experiments show that our model achieves fine-grained emotion control while preserving speaker information.

EmoRemap_demos

Our Demo Page: https://lemon-ustc.github.io/demo/

Recommend Projects

lemon-ustc / emoremap_samples Goto Github PK

emoremap_samples's Introduction

EmoRemap

Abstract

EmoRemap_demos

emoremap_samples's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent