Lexical emergence from context: exploring unsupervised learning approaches on large multimodal language corpora

PhD Student: William Harvard
Laboratories: LIDILEM / LIG

An important task children face is recognizing and memorizing the words they perceive in surrounding speech, so that they can reuse those words to build their own utterances. Confronted with a continuous flow of speech, the young child must therefore extract meaningful units from it, a process called lexical segmentation. The aim of this thesis is to investigate several aspects of the segmentation process in greater depth, through computational simulations based on unsupervised machine learning methods applied to large language corpora. We will analyze the emergence of the lexicon in context, through the unsupervised processing of large speech corpora paired with visual scenes. Computational models for lexical segmentation and acquisition have already been proposed, but they are usually limited in that they can only process strings of symbols. The first original contribution of this thesis is to propose models that can be applied directly to the speech signal. For this purpose, we will use novel deep learning approaches such as "encoder-decoder" and "end-to-end" neural architectures. The second original contribution is to study the emergence of the lexicon in context. For this purpose, we will rely on pre-existing multimodal corpora, in particular the synthetic SPEECH-COCO corpus, which consists of more than 600,000 orally described visual scenes, and the DylNet corpus, collected from interactions among children in nursery school.

Published on June 11, 2018