Out‐Of‐Vocabulary and Out‐Of‐Language recordings

 >>OOV Data<<        >>OOL Data<<

  • Two data sets of audio recordings have been produced by project partner BUT. The first is a set of utterances containing Out‐Of‐Vocabulary (OOV) words and non‐speech sounds. The second contains English In‐Language (IL) spontaneous speech with intermittent switches to a foreign language (Out‐Of‐Language, OOL). Segmentation and annotation were carried out in 10‐second chunks. Both the OOV and the OOL recordings are available, together with detailed information.
  • Furthermore, an NN‐based OOV word detection visualization tool, bundled with the OOV recordings shown at the 2009 review, has been made available at the request of reviewers and partners. It runs under Linux and Windows and can be recompiled.

>>Detection/Visualization Tool<<          >>Hybrid Word/Sub-word Recognition<<