CSLU Seminar Information


During the Spring 2013 term, the seminar will meet on Mondays at 10am in PC401.

E-mail Eric Morley (morleye [at] gmail dot com) to reserve a seminar date.

Upcoming Seminar

Date: Monday, May 20

Speaker: Brian Bush

Title: Estimating Phoneme Formant Targets and Coarticulation Parameters of Conversational and Clear Speech

Abstract: We present a data-driven formant model and methodology for discovering its parameters, namely phoneme targets and coarticulation functions for consonant-vowel-consonant (CVC) words from fully-automatic formant data. The model uses formant targets that are speaker dependent, but independent of speaking style and phonemic context. We used a global error measure to search for optimal formant targets for all phonemes, including classes of sounds where formants are not directly observable. Analysis of coarticulation parameters found significant differences in parameters between clear and conversational speech. Estimated formant targets were largely in agreement with acoustic-phonetic expectations. An intelligibility test validated that resynthesized CVC words using modeled formant trajectories were nearly as intelligible as resynthesized CVC words using observed formant trajectories.

Related Info and Links

Other CSLU Events

  • Lunch-n-Leisure: Enjoy lunch with your colleagues, every school day at 12:30 in the lab/lounge.


Date | Speaker | Title | Notes

Spring 2013

06/10/2013 | Guillaume Thibault | |
06/05/2013 | Maider Lehr | | Thesis proposal; on Wednesday
06/03/2013 | Géza Kiss | |
05/20/2013 | Brian Bush | Estimating Phoneme Formant Targets and Coarticulation Parameters of Conversational and Clear Speech | Practice talk for ICASSP poster session
05/13/2013 | Meysam Asgari | Improving the Accuracy and the Robustness of Harmonic Model for Pitch Estimation |
05/06/2013 | Alireza Bayestehtashk | Efficient and Accurate Multivariate Class-Conditional Densities Using Copula |
05/02/2013 | Andrew Fowler | Autotyping and Improved Bayesian Inference for Binary Typing Systems with Brain Computer Interfaces | RPE; On Thursday
04/29/2013 | Mahsa Langarani | Pitch decomposition for recombinant synthesis |
04/22/2013 | Eric Morley | The Utility of Manual and Automatic Linguistic Error Codes for Identifying Neurodevelopmental Disorders |
04/15/2013 | Ranjani Ramakrishnan | Predicting methylation levels at unique locations on the Genome Using Next-gen sequencing data |
04/08/2013 | Amanda Stead | Discourse in Aging & Dementia: What real speech tells us about cognitive change |

Spring 2013 Seminar Abstracts

Speaker: Brian Bush

Title: Estimating Phoneme Formant Targets and Coarticulation Parameters of Conversational and Clear Speech

Abstract: We present a data-driven formant model and methodology for discovering its parameters, namely phoneme targets and coarticulation functions for consonant-vowel-consonant (CVC) words from fully-automatic formant data. The model uses formant targets that are speaker dependent, but independent of speaking style and phonemic context. We used a global error measure to search for optimal formant targets for all phonemes, including classes of sounds where formants are not directly observable. Analysis of coarticulation parameters found significant differences in parameters between clear and conversational speech. Estimated formant targets were largely in agreement with acoustic-phonetic expectations. An intelligibility test validated that resynthesized CVC words using modeled formant trajectories were nearly as intelligible as resynthesized CVC words using observed formant trajectories.

Speaker: Meysam Asgari

Title: Improving the Accuracy and the Robustness of Harmonic Model for Pitch Estimation

Abstract: Accurate and robust estimation of pitch plays a central role in speech processing. Various methods in the time, frequency, and cepstral domains have been proposed for generating pitch candidates. Most algorithms excel when the background noise is minimal or for specific types of background noise. In this work, our aim is to improve the robustness and accuracy of pitch estimation across a wide variety of background noise conditions. For this we have chosen to adopt the harmonic model of speech, a model that has gained considerable attention recently. We address two major weaknesses of this model: the problem of pitch halving and doubling, and the need to specify the number of harmonics. We exploit the energy of frequency in the neighborhood to alleviate halving and doubling. Using a model complexity term with a BIC criterion, we choose the optimal number of harmonics. We evaluated our proposed pitch estimation method against other state-of-the-art techniques on the Keele data set in terms of gross pitch error and fine pitch error. Through extensive experiments on several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods under different noise levels and environments.
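
As a rough illustration of choosing the number of harmonics with a BIC-style penalty, the sketch below fits sines and cosines at multiples of a given pitch and picks the model order with the lowest score. It is a generic textbook form of BIC, not the speaker's actual complexity term. The function names, the zero-mean frame, the assumption that the frame spans a whole number of pitch periods (so simple projections equal the least-squares fit), and the variance floor are all assumptions made for this example.

```python
import math


def harmonic_residual(frame, f0, fs, num_harmonics):
    """Residual sum of squares after fitting num_harmonics harmonics of f0.

    Assumes the frame is zero-mean and spans a whole number of pitch
    periods, so projecting onto each sine/cosine gives the least-squares fit.
    """
    n = len(frame)
    model = [0.0] * n
    for k in range(1, num_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / fs
        cos_k = [math.cos(w * i) for i in range(n)]
        sin_k = [math.sin(w * i) for i in range(n)]
        a = 2.0 / n * sum(x * c for x, c in zip(frame, cos_k))
        b = 2.0 / n * sum(x * s for x, s in zip(frame, sin_k))
        for i in range(n):
            model[i] += a * cos_k[i] + b * sin_k[i]
    return sum((x - m) ** 2 for x, m in zip(frame, model))


def choose_num_harmonics(frame, f0, fs, max_harmonics):
    """Return the harmonic count with the lowest BIC (2 parameters per harmonic)."""
    n = len(frame)
    best_k, best_bic = None, float("inf")
    for k in range(1, max_harmonics + 1):
        rss = harmonic_residual(frame, f0, fs, k)
        # Floor the residual variance so a noiseless frame does not hit log(0).
        bic = n * math.log(max(rss / n, 1e-12)) + 2 * k * math.log(n)
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k
```

The caller is expected to keep `max_harmonics * f0` below the Nyquist frequency `fs / 2`.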

Speaker: Alireza Bayestehtashk

Title: Efficient and Accurate Multivariate Class-Conditional Densities Using Copula

Abstract: There is a clear dichotomy between univariate and multivariate generative models for continuous random variables. Univariate densities can be modeled accurately and efficiently using nonparametric kernel density estimators, which unfortunately cannot be easily extended to the multivariate case. Gaussian mixture models, on the other hand, have become the workhorse for multivariate densities because they capture multivariate dependencies effectively and efficiently. However, multivariate Gaussian mixture models impose a particular form on the marginal, a Gaussian mixture model. This is a strong assumption on the marginal and is violated in many practical applications. In this paper, we propose a simple generative method based on a copula model that takes advantage of the accuracy of the nonparametric univariate density estimator and the multivariate dependencies captured in the Gaussian mixture model. This alleviates the aforementioned limitations. We show that the proposed generative model consistently outperforms Gaussian mixture models on classification tasks from the UCI repository, with performance often comparable to, and sometimes better than, Support Vector Machines (SVMs).
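
One way to picture the copula idea is the per-feature transform that separates marginals from dependencies: each feature is pushed through a kernel-density estimate of its own CDF and then through the inverse standard normal CDF, and a multivariate model is fitted to the transformed values. The sketch below shows only that transform. It is an illustration under assumptions (Gaussian kernels, a fixed hand-picked bandwidth, hypothetical function names), not the method presented in the talk.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def kde_cdf(x, samples, bandwidth):
    """CDF of a Gaussian kernel density estimate built from the training samples."""
    return sum(NormalDist(s, bandwidth).cdf(x) for s in samples) / len(samples)


def to_copula_space(x, samples, bandwidth):
    """Map one feature value to a standard-normal score via its estimated marginal CDF."""
    u = kde_cdf(x, samples, bandwidth)
    # Keep u strictly inside (0, 1) so the inverse CDF stays finite.
    u = min(max(u, 1e-9), 1.0 - 1e-9)
    return _STANDARD_NORMAL.inv_cdf(u)
```

Applying `to_copula_space` to every feature yields vectors whose marginals are approximately standard normal, so a dependency model fitted afterwards does not have to describe the shape of each marginal.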

Speaker: Andrew Fowler

Title: Autotyping and Improved Bayesian Inference for Binary Typing Systems with Brain Computer Interfaces

Abstract: RSVP Keyboard is a successful typing system for people with severe physical disabilities, specifically those with locked-in syndrome (LIS). It uses signals from an electroencephalogram (EEG) combined with information from an n-gram language model to select letters to be typed. The main shortcoming of the system as it exists today is that it does not keep track of past EEG observations, i.e. observations made of brain signals while the user was in a different part of a typed message. We present a system for taking all past observations into account in a principled Bayesian manner, and show that this method results in an increase of over 20% in simulated typing speed. We also show that our method allows for better calculation of the probability of the backspace symbol, an important feature. Finally, we demonstrate the utility of automatically typing certain letters in certain contexts, a technique that allows for increased typing speed under our new method.
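
The general principle of combining a language-model prior with several EEG observations can be sketched with Bayes' rule. The example below is a generic illustration, not the system described in the talk: it assumes the observations are conditionally independent given the intended symbol, that all probabilities are strictly positive, and the function name and dictionary layout are hypothetical.

```python
import math


def fuse_observations(prior, observation_likelihoods):
    """Posterior over symbols given a prior and a list of per-observation likelihoods.

    prior: dict mapping symbol -> prior probability (e.g. from a language model).
    observation_likelihoods: list of dicts mapping symbol -> P(observation | symbol).
    """
    log_post = {s: math.log(p) for s, p in prior.items()}
    for likelihood in observation_likelihoods:
        for s in log_post:
            log_post[s] += math.log(likelihood[s])
    # Normalise in log space for numerical stability.
    peak = max(log_post.values())
    unnormalised = {s: math.exp(v - peak) for s, v in log_post.items()}
    total = sum(unnormalised.values())
    return {s: v / total for s, v in unnormalised.items()}
```

Because every past observation stays in the product, evidence gathered earlier keeps shaping the posterior instead of being discarded after each decision.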

Speaker: Mahsa Langarani

Title: Pitch decomposition for recombinant synthesis

Abstract: Recombinant synthesis is text-to-speech synthesis where both acoustic and prosodic units are stored in a database to get more natural-sounding speech. Prosodic units in this case are phrase curves and accent curves. The problem of extracting these curves from the raw F0 curves is not easy to solve. The first hurdle is to start with a robust F0 curve. We discuss an approach to deal with pitch halving and doubling errors by using normalized cross-correlation to compute F0 candidates and applying the Viterbi algorithm to find the best path through the candidates. We show an improvement in F0 curve estimation over the standard get_f0 method present in Snack. Next, we discuss a new method for decomposing F0 curves into phrase and accent curves. We assume phrase curves are piecewise linear. We can model accent curves using skewed normal distributions and sigmoid functions to deal with the end of an interrogative utterance. The number of parameters depends on the number of feet and phrases. All of the parameters are optimized at the same time using the Sequential Least Squares Programming optimization algorithm.
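
The idea of running the Viterbi algorithm over per-frame F0 candidates can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the local scores stand in for normalized cross-correlation peaks, and the transition cost (the absolute log2 frequency ratio, so an octave jump costs `jump_weight`) and all names are assumptions chosen for the example.

```python
import math


def viterbi_f0(candidates, scores, jump_weight=1.0):
    """Pick one F0 candidate per frame, trading local score against pitch jumps.

    candidates[t]: list of candidate F0 values in Hz for frame t.
    scores[t][i]: local score of candidate i at frame t (higher is better).
    """
    best = [list(scores[0])]
    back = [[-1] * len(candidates[0])]
    for t in range(1, len(candidates)):
        row, pointers = [], []
        for j, f in enumerate(candidates[t]):
            # Score of reaching candidate j from each candidate of the previous frame.
            options = [
                best[t - 1][i] - jump_weight * abs(math.log2(f / g))
                for i, g in enumerate(candidates[t - 1])
            ]
            i_best = max(range(len(options)), key=options.__getitem__)
            row.append(options[i_best] + scores[t][j])
            pointers.append(i_best)
        best.append(row)
        back.append(pointers)
    # Trace back from the best final candidate.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        path.append(candidates[t][j])
        j = back[t][j]
    path.reverse()
    return path
```

With a transition cost of this kind, a single frame whose doubled candidate scores slightly higher is overruled by its neighbours, which is the behaviour that suppresses isolated halving and doubling errors.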

Speaker: Eric Morley

Title: The Utility of Manual and Automatic Linguistic Error Codes for Identifying Neurodevelopmental Disorders

Abstract: We investigate the utility of linguistic features for automatically differentiating between children with varying combinations of two potentially comorbid neurodevelopmental disorders: autism spectrum disorder and specific language impairment. We find that certain manual codes for linguistic errors are useful for distinguishing between diagnostic groups. We investigate the relationship between coding detail and diagnostic classification performance, and find that a simple coding scheme is of high diagnostic utility. We propose a simple method to automate the pared down coding scheme, and find that these automatic codes are of diagnostic utility.

Speaker: Ranjani Ramakrishnan

Title: Predicting methylation levels at unique locations on the Genome Using Next-gen sequencing data

Abstract: In this talk I present the approach that we are taking to predict methylation levels at CpG sites on the genome using read count data from precipitation experiments. We use data from rhesus macaques that have been exposed to ethanol and that have had methylation levels measured using multiple techniques, followed by next-gen sequence analysis. I will talk about the external data sources that we are using, in addition to our experimental data, that we include in the model. I will present the results of our approach and compare it to an SVM-based classification approach.

Speaker: Amanda Stead

Title: Discourse in Aging & Dementia: What real speech tells us about cognitive change

Abstract: To engage in discourse, multiple cognitive systems must engage across various portions of the process. Discourse analysis has been a growing area of investigation in aging and dementia; however, because of some of its qualitative aspects and unpredictable nature, many researchers have yet to see discourse for its true clinical utility. Different types of discourse rely on different types of cognitive support systems. This talk presents the potential clinical utility of discourse in the study of aging and dementia as well as what features of language deteriorate at certain stages of cognitive decline and how discourse can be used to discover them.

Winter 2013



2012 Seminars

2011 Seminars

2010 Seminars