This is a phase vocoder written in C++. The main purpose of a phase vocoder is to change the speed and pitch of a given voice recording file.
The current version supports the following time-stretching algorithms:
- Default: the number of frames remains the same, while the output synthesis-rate/frame-shift/hop-size is modified accordingly.
- --phaseLock: perform phase locking as described in [Laroche99].
- --fdTimeStretch: interpolate/extrapolate the spectrogram, so the number of frames changes while the output synthesis-rate/frame-shift/hop-size remains the same as in analysis.
- Both --phaseLock and --fdTimeStretch
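The difference between the default strategy and --fdTimeStretch comes down to which quantity absorbs the stretch factor: the hop size or the frame count. The following is a minimal sketch of that bookkeeping, not the project's actual code. The names StretchPlan, planDefault, planFrequencyDomain, and interpolateMagnitude are hypothetical, a stretch factor above 1 is assumed to mean a longer output, and linear interpolation of magnitudes (holding the last frame past the end) is an assumed stand-in for whatever interpolation/extrapolation the program really uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of a time-stretch configuration (not the project's API).
struct StretchPlan {
    std::size_t numFrames;     // frames passed to synthesis
    std::size_t synthesisHop;  // samples between successive output frames
};

// Default strategy: keep every analysis frame and scale the synthesis hop.
// Assumption: stretch > 1 makes the output longer.
StretchPlan planDefault(std::size_t numFrames, std::size_t analysisHop, double stretch) {
    assert(stretch > 0.0);
    return {numFrames, static_cast<std::size_t>(std::lround(analysisHop * stretch))};
}

// Frequency-domain strategy: keep the hop and scale the number of frames.
StretchPlan planFrequencyDomain(std::size_t numFrames, std::size_t analysisHop, double stretch) {
    assert(stretch > 0.0);
    return {static_cast<std::size_t>(std::lround(numFrames * stretch)), analysisHop};
}

// Magnitude spectrum at a fractional frame position, by linear interpolation
// between the two surrounding frames. Positions past the last frame return
// the last frame unchanged (a simple form of extrapolation, assumed here).
std::vector<double> interpolateMagnitude(const std::vector<std::vector<double>>& mags,
                                         double pos) {
    assert(!mags.empty() && pos >= 0.0);
    const std::size_t last = mags.size() - 1;
    const std::size_t i0 = std::min(static_cast<std::size_t>(pos), last);
    const std::size_t i1 = std::min(i0 + 1, last);
    const double frac = std::min(pos - static_cast<double>(i0), 1.0);
    std::vector<double> out(mags[i0].size());
    for (std::size_t k = 0; k < out.size(); ++k)
        out[k] = (1.0 - frac) * mags[i0][k] + frac * mags[i1][k];
    return out;
}
```

With 100 analysis frames, a hop of 256 samples, and a stretch factor of 1.5, planDefault gives 100 frames at a hop of 384, while planFrequencyDomain gives 150 frames at a hop of 256. Both cover the same 38400 samples of total hop advance.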
Heuristically, using --phaseLock for human speech and --fdTimeStretch for music yields the best results.
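Phase locking, as enabled by --phaseLock, aims to preserve the phase relationship between each spectral peak and the bins around it. The following is a minimal sketch of identity phase locking, one of the schemes described in [Laroche99]. Whether the program implements exactly this variant is an assumption, and the names Spectrum, findPeaks, and lockPhases are hypothetical. The synthesis phase of each peak is taken as an input here; in a full vocoder it would come from the usual phase propagation step.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Spectrum = std::vector<std::complex<double>>;

// A bin counts as a peak when its magnitude exceeds that of its four
// nearest neighbours (two on each side).
std::vector<std::size_t> findPeaks(const Spectrum& X) {
    std::vector<std::size_t> peaks;
    for (std::size_t k = 2; k + 2 < X.size(); ++k) {
        const double m = std::abs(X[k]);
        if (m > std::abs(X[k - 1]) && m > std::abs(X[k - 2]) &&
            m > std::abs(X[k + 1]) && m > std::abs(X[k + 2]))
            peaks.push_back(k);
    }
    return peaks;
}

// Identity phase locking: every bin in a peak's region of influence is
// rotated by the same angle as the peak, so analysis phase differences
// around the peak carry over to the synthesis frame.
// peakPhase[i] is the synthesis phase already chosen for peaks[i].
Spectrum lockPhases(const Spectrum& X,
                    const std::vector<std::size_t>& peaks,
                    const std::vector<double>& peakPhase) {
    assert(peaks.size() == peakPhase.size());
    Spectrum Y(X);  // with no peaks, the frame is returned unchanged
    for (std::size_t i = 0; i < peaks.size(); ++i) {
        // Region boundaries are placed midway between adjacent peaks.
        const std::size_t lo = (i == 0) ? 0 : (peaks[i - 1] + peaks[i]) / 2 + 1;
        const std::size_t hi = (i + 1 == peaks.size())
                                   ? X.size() - 1
                                   : (peaks[i] + peaks[i + 1]) / 2;
        // Rotation that moves the peak from its analysis phase to its synthesis phase.
        const std::complex<double> rot =
            std::polar(1.0, peakPhase[i] - std::arg(X[peaks[i]]));
        for (std::size_t k = lo; k <= hi; ++k)
            Y[k] = rot * X[k];
    }
    return Y;
}
```

Because each region is multiplied by a single unit-magnitude rotation, bin magnitudes are untouched and only the phases change.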
The current version supports the following pitch-shifting algorithms:
- Default: perform time stretching for compensation, then modify the output synthesis-rate/frame-shift/hop-size so that the total length stays the same but the pitch changes (due to up/down sampling).
- --fdPitchShift: directly modify the spectrogram; both the number of frames and the output synthesis-rate/frame-shift/hop-size remain the same as in analysis. The default approach is recommended; --fdPitchShift has not been optimized yet.
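The default pitch-shifting route relies on the fact that reading a signal at a different rate changes its length and its pitch by the same factor, so a preceding time stretch by that factor restores the original length. The following is a minimal sketch of the resampling half only. The name resampleLinear is hypothetical, a factor above 1 is assumed to raise the pitch, and linear interpolation is a deliberately simple stand-in for whatever resampling the program actually performs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Read the input at a step of `pitch` samples per output sample.
// pitch > 1 shortens the signal and raises its pitch by that factor;
// pitch < 1 lengthens it and lowers the pitch.
std::vector<double> resampleLinear(const std::vector<double>& in, double pitch) {
    assert(pitch > 0.0);
    if (in.empty()) return {};
    const std::size_t last = in.size() - 1;
    // Largest n with n * pitch <= last, plus one for n = 0.
    const std::size_t outLen =
        static_cast<std::size_t>(std::floor(static_cast<double>(last) / pitch)) + 1;
    std::vector<double> out(outLen);
    for (std::size_t n = 0; n < outLen; ++n) {
        const double pos = static_cast<double>(n) * pitch;
        const std::size_t i0 = std::min(static_cast<std::size_t>(pos), last);
        const std::size_t i1 = std::min(i0 + 1, last);
        const double frac = pos - static_cast<double>(i0);
        out[n] = (1.0 - frac) * in[i0] + frac * in[i1];
    }
    return out;
}
```

To shift the pitch by a factor p without changing the duration, the recording would first be time-stretched by p and the result passed through this step with the same p. For instance, a nine-sample ramp 0, 1, ..., 8 resampled with a factor of 2 becomes the five samples 0, 2, 4, 6, 8.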
Supported platforms: Linux, OS X
Requirements: g++ 4.2 or higher
Run make to create PhaseVocoder.exe.
After PhaseVocoder.exe is compiled, go to the test/ directory and run sh test.sh.
- Moulines, Eric, and Jean Laroche. "Non-parametric techniques for pitch-scale and time-scale modification of speech." Speech Communication 16.2 (1995): 175-205.
- Laroche, Jean, and Mark Dolson. "Improved phase vocoder time-scale modification of audio." IEEE Transactions on Speech and Audio Processing 7.3 (1999): 323-332.