dc.contributor.author: Degottex, Gilles
dc.contributor.author: Erro Eslava, Daniel
dc.date.accessioned: 2015-10-16T12:55:51Z
dc.date.available: 2015-10-16T12:55:51Z
dc.date.issued: 2014-10-16
dc.identifier.citation: Journal on Audio, Speech and Music Processing 2014 : (2014) // Article ID 38 [es]
dc.identifier.issn: 1687-4722
dc.identifier.uri: http://hdl.handle.net/10810/15924
dc.description.abstract: Feature-based vocoders, e.g., STRAIGHT, offer a way to manipulate the perceived characteristics of the speech signal in speech transformation and synthesis. For the harmonic model, which provides excellent perceived quality, features for the amplitude parameters already exist (e.g., Line Spectral Frequencies (LSF), Mel-Frequency Cepstral Coefficients (MFCC)). However, because of the wrapping of the phase parameters, phase features are more difficult to design. To randomize the phase of the harmonic model during synthesis, a voicing feature is commonly used, which distinguishes voiced and unvoiced segments. However, voice production allows smooth transitions between voiced and unvoiced states, which makes voicing segmentation sometimes tricky to estimate. In this article, two phase features are suggested to represent the phase of the harmonic model in a uniform way, without a voicing decision. The synthesis quality of the resulting vocoder has been evaluated, using subjective listening tests, in the context of resynthesis, pitch scaling, and Hidden Markov Model (HMM)-based synthesis. The experiments show that the suggested signal model is comparable to STRAIGHT or even better in some scenarios. They also reveal some limitations of the harmonic framework itself in the case of high fundamental frequencies. [es]
dc.description.sponsorship: G. Degottex has been funded by the Swiss National Science Foundation (SNSF) (grants PBSKP2_134325, PBSKP2_140021), Switzerland, and the Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece. D. Erro has been funded by the Basque Government (BER2TEK, IE12-333) and the Spanish Ministry of Economy and Competitiveness (SpeechTech4All, TEC2012-38939-C03-03). [es]
dc.language.iso: eng [es]
dc.publisher: Springer International Publishing [es]
dc.rights: info:eu-repo/semantics/openAccess [es]
dc.subject: speech synthesis [es]
dc.subject: harmonic model [es]
dc.subject: Phase modeling [es]
dc.subject: voice transformation [es]
dc.subject: parametric speech synthesis [es]
dc.subject: group delay functions [es]
dc.subject: spectral envelope [es]
dc.subject: time-scale [es]
dc.subject: vocoder [es]
dc.subject: HMM [es]
dc.subject: extraction [es]
dc.subject: instants [es]
dc.subject: sounds [es]
dc.subject: audio [es]
dc.subject: wave [es]
dc.title: A uniform phase representation for the harmonic model in speech synthesis applications [es]
dc.type: info:eu-repo/semantics/article [es]
dc.rights.holder: © 2014 Degottex and Erro; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. [es]
dc.relation.publisherversion: http://www.asmp.eurasipjournals.com/content/2014/1/38/abstract [es]
dc.identifier.doi: 10.1186/s13636-014-0038-1
dc.departamentoes: Ingeniería de comunicaciones [es_ES]
dc.departamentoeu: Komunikazioen ingeniaritza [es_ES]
dc.subject.categoria: ACOUSTICS
dc.subject.categoria: ELECTRICAL AND ELECTRONIC ENGINEERING

