Neural Higher-Order Factors in Conditional Random Fields for Phoneme Classification

Abstract

We explore neural higher-order input-dependent factors in linear-chain conditional random fields (LC-CRFs) for sequence labeling. This fuses two powerful models: higher-order LC-CRFs with linear factors are well established for sequence labeling tasks, but they cannot model non-linear dependencies. We therefore present neural higher-order input-dependent factors, which map sub-sequences of inputs to sub-sequences of outputs using distinct multilayer perceptron sub-networks. This is important in many tasks, in particular for phoneme classification, where the phone representation strongly depends on the context phonemes. Experimental results for phoneme classification with LC-CRFs and neural higher-order factors confirm this, and we achieve the best ever reported phoneme classification performance on TIMIT, i.e. a phoneme error rate of 15.8%. Furthermore, we show that this success is not obvious, as linear higher-order factors degrade phoneme classification performance on TIMIT.
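The idea of input-dependent neural factors can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes only unigram and bigram factors, each scored by its own small one-hidden-layer perceptron with random, untrained weights, and all names (make_mlp, factor_scores, log_partition and so on) are hypothetical. The bigram factor maps a window of two input frames to scores for every pair of neighbouring labels, and the forward recursion computes the log-partition function needed to normalise the sequence score.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def logsumexp(a, axis=None):
    """Numerically stable log(sum(exp(a))) along the given axis."""
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def make_mlp(n_in, n_hidden, n_out):
    """Random one-hidden-layer MLP parameters (illustrative, untrained)."""
    return {
        "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def mlp(params, x):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

def factor_scores(x, uni, bi, n_labels):
    """Map inputs x of shape (T, D) to factor scores.

    Returns unigram scores (T, K) and bigram scores (T-1, K, K), each
    produced by a distinct sub-network (an assumption of this sketch).
    """
    T = x.shape[0]
    unary = mlp(uni, x)
    # The bigram factor sees the two frames it spans, concatenated.
    pairs = np.concatenate([x[:-1], x[1:]], axis=1)
    pairwise = mlp(bi, pairs).reshape(T - 1, n_labels, n_labels)
    return unary, pairwise

def sequence_score(y, unary, pairwise):
    """Unnormalised log-score of a label sequence y."""
    y = np.asarray(y)
    score = unary[np.arange(len(y)), y].sum()
    score += pairwise[np.arange(len(y) - 1), y[:-1], y[1:]].sum()
    return score

def log_partition(unary, pairwise):
    """Forward algorithm for the log-partition function of the chain."""
    alpha = unary[0]
    for t in range(1, unary.shape[0]):
        alpha = logsumexp(
            alpha[:, None] + pairwise[t - 1] + unary[t][None, :], axis=0
        )
    return logsumexp(alpha)

def brute_force_log_partition(unary, pairwise):
    """Enumerate all label sequences; only feasible for tiny examples."""
    T, K = unary.shape
    scores = [
        sequence_score(y, unary, pairwise)
        for y in itertools.product(range(K), repeat=T)
    ]
    return logsumexp(np.array(scores))
```

The conditional log-probability of a labeling y is then sequence_score(y, unary, pairwise) - log_partition(unary, pairwise). Factors spanning more than two labels would need a larger state space in the recursion, which this sketch does not attempt.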

Authors
  • Ratajczak, Martin
  • Tschiatschek, Sebastian
  • Pernkopf, Franz
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
Conference of the International Speech Communication Association (INTERSPEECH)
Divisions
Data Mining and Machine Learning
Event Location
Dresden, Germany
Event Type
Conference
Event Dates
6-10 September 2015
Series Name
INTERSPEECH-2015, 16th Annual Conference of the International Speech Communication Association
ISSN/ISBN
1990-9770
Page Range
pp. 2137-2141
Date
6 September 2015
Official URL
https://www.tschiatschek.net/files/ratajczak15neur...