Speaker Adaptation

The Olympus Prompt Recorder enables you to build acoustic models that are tailored to a specific person's speech. Using an adapted acoustic model has been shown to reduce recognition errors. These instructions assume you are working on Windows (XP/Vista/7). For more information on Sphinx's adaptation tools, and for a Linux-based solution, see the Sphinx tutorial at http://cmusphinx.sourceforge.net/wiki/tutorialadapt

Installing Python

The Olympus Prompt Recorder is written mostly in Python and requires Python 2.5.1 or higher.

If you need to install Python, download the appropriate Windows installer (.exe) from python.org. Python 2.5.2 is available here: http://www.python.org/download/releases/2.5.2/
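To confirm that an installed interpreter satisfies the 2.5.1 minimum, a short check along these lines can help. This is a minimal sketch, not part of the Olympus distribution; the function name meets_minimum is made up for illustration.

```python
import sys

# Minimum version taken from the requirement stated on this page.
MINIMUM_VERSION = (2, 5, 1)


def meets_minimum(version_info, minimum=MINIMUM_VERSION):
    """Return True if (major, minor, micro) is at least `minimum`."""
    return tuple(version_info[:3]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum(sys.version_info):
        print("Python %s meets the 2.5.1 minimum." % current)
    else:
        print("Python %s is older than 2.5.1; please upgrade." % current)
```

Save it under any file name and run it with the same python command you plan to use for the recorder.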

Checking DLLs (optional)

If you have problems running the Olympus Prompt Recorder, check that the following system DLLs are present at the locations shown. If you redistribute the recorder, make sure you are licensed to distribute any DLLs you include, and do not distribute files that belong to the operating system.

  USER32.dll - C:\WINDOWS\system32\USER32.dll
  IMM32.dll - C:\WINDOWS\system32\IMM32.dll
  SHELL32.dll - C:\WINDOWS\system32\SHELL32.dll
  comdlg32.dll - C:\WINDOWS\system32\comdlg32.dll
  COMCTL32.dll - C:\WINDOWS\system32\COMCTL32.dll
  ADVAPI32.dll - C:\WINDOWS\system32\ADVAPI32.dll
  GDI32.dll - C:\WINDOWS\system32\GDI32.dll
  KERNEL32.dll - C:\WINDOWS\system32\KERNEL32.dll
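The list above can be checked with a few lines of Python instead of by hand. The sketch below is illustrative only: missing_dlls is a hypothetical helper, and it assumes the DLLs live directly in the directory you pass in (C:\WINDOWS\system32 in the list above).

```python
import os

# DLL names copied from the list above.
REQUIRED_DLLS = [
    "USER32.dll", "IMM32.dll", "SHELL32.dll", "comdlg32.dll",
    "COMCTL32.dll", "ADVAPI32.dll", "GDI32.dll", "KERNEL32.dll",
]


def missing_dlls(system_dir, names=REQUIRED_DLLS):
    """Return the names that have no matching file in system_dir.

    The comparison ignores case, as Windows file names do.
    """
    present = set(entry.lower() for entry in os.listdir(system_dir))
    return [name for name in names if name.lower() not in present]
```

For example, missing_dlls(r"C:\WINDOWS\system32") returns an empty list when every DLL is found.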

Configuring Settings for the Olympus Prompt Recorder

Make sure the OLYMPUS_ROOT Windows environment variable is set to the location where Olympus is installed. For example, OLYMPUS_ROOT would be "D:\Olympus" if the repository is located there, so that the next level down contains the folders "Agents", "bin", "Build", "Configurations", "Libraries", "Resources", and "Tools".

Recording is set to "mono" by default. If this causes problems, you can switch your sound card in the Olympus Prompt Recorder menu.
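The OLYMPUS_ROOT setting can be verified before starting the recorder. The following is a minimal sketch under the assumption that the seven folders named above sit directly beneath OLYMPUS_ROOT; check_olympus_root is a hypothetical helper, not an Olympus tool.

```python
import os

# Folder names taken from the description of the Olympus layout above.
EXPECTED_FOLDERS = [
    "Agents", "bin", "Build", "Configurations",
    "Libraries", "Resources", "Tools",
]


def check_olympus_root(environ=os.environ):
    """Return (root, missing_folders).

    root is None when OLYMPUS_ROOT is unset or empty; in that case
    every expected folder is reported as missing.
    """
    root = environ.get("OLYMPUS_ROOT")
    if not root:
        return None, list(EXPECTED_FOLDERS)
    missing = [name for name in EXPECTED_FOLDERS
               if not os.path.isdir(os.path.join(root, name))]
    return root, missing


if __name__ == "__main__":
    root, missing = check_olympus_root()
    if root is None:
        print("OLYMPUS_ROOT is not set.")
    elif missing:
        print("Missing under %s: %s" % (root, ", ".join(missing)))
    else:
        print("OLYMPUS_ROOT looks complete: %s" % root)
```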

Running the Olympus Prompt Recorder

Open a command line window (Start Button -> Run... -> type "cmd"). Change to the directory $OLYMPUS_ROOT\Tools\Adaptation (e.g., D:\Olympus\Tools\Adaptation). In the Windows command prompt, the variable is written %OLYMPUS_ROOT%.

From $OLYMPUS_ROOT\Tools\Adaptation:

  > python runadapt.py userName baseModel
  userName = name of the user you'd like to enroll (e.g., "bob" or "alice")
  baseModel = wsj_all_sc.cd_semi_5000 (this is the default model used by TeamTalk)

An example command:

  > python runadapt.py hbovik wsj_all_sc.cd_semi_5000

Proceed through all of the recordings in a quiet area.

Note: Each time you start the recorder, you must choose a new, unique userName so that previous audio and acoustic models are not overwritten. Check the folder $OLYMPUS_ROOT\Tools\Adaptation\ArcPromptRec\Recorded to see which voices have already been recorded.
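Checking for a name collision can also be scripted. This page does not describe how entries inside the Recorded folder are named, so the sketch below assumes that each entry name begins with the userName it belongs to. That assumption, and the helper name conflicting_entries, are illustrative only. The prefix match is deliberately conservative: "bob" is also flagged when "bobby" exists.

```python
import os


def conflicting_entries(user_name, recorded_dir):
    """Return entries in recorded_dir whose names start with user_name.

    Assumption: recordings are stored under names that begin with the
    userName. The match ignores case, and a missing folder is treated
    as "nothing recorded yet".
    """
    if not os.path.isdir(recorded_dir):
        return []
    prefix = user_name.lower()
    return sorted(entry for entry in os.listdir(recorded_dir)
                  if entry.lower().startswith(prefix))
```

An empty result suggests the name is free. The folder to pass in would be os.path.join(os.environ["OLYMPUS_ROOT"], "Tools", "Adaptation", "ArcPromptRec", "Recorded").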

Olympus Prompt Recorder UI Instructions

Near the bottom of the display is the script area, which shows the prompt you will read aloud for each recording.

  1. To begin recording, press the Record button. Note that the Record button changes to a 'Stop' button once you press it. Speak the script prompt aloud into your microphone. Try to keep the microphone a few inches away from your mouth, and angle it so that you don't directly speak into it. Once you are finished recording, press 'Stop'.
  2. If the recording is good, press 'Yes'. Otherwise press 'No' and re-record. After a successful recording, the recorder will transition to the next script prompt to read.

If audio clipping (distortion caused by audio that is too loud) occurs frequently, reduce your microphone volume. To skip a prompt and come back to it later, press "Skip". To return to a previous prompt, press "Back". You must record all 138 prompts for adaptation to work properly.
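If you are unsure whether a recording clips, you can estimate how many samples sit at full scale. The sketch below assumes the recordings are 16-bit PCM WAV files, which this page does not confirm, and is written for a current Python 3 interpreter rather than the Python 2.5 the recorder itself uses. The function name clipped_fraction is made up for illustration.

```python
import array
import sys
import wave


def clipped_fraction(wav_path, threshold=32767):
    """Return the fraction of samples at or beyond full scale.

    Assumes 16-bit PCM audio; raises ValueError for other sample widths.
    """
    wav = wave.open(wav_path, "rb")
    try:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    finally:
        wav.close()

    samples = array.array("h")      # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":      # WAV data is little-endian
        samples.byteswap()
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= threshold)
    return clipped / float(len(samples))
```

A result well above zero across several recordings is a sign that the microphone volume should be reduced.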

When you are finished recording, close the Olympus Prompt Recorder window so that the script can finish performing adaptation. Do not close the command line window in which the script is running. The adapted acoustic model will be placed in the $OLYMPUS_ROOT\Resources\DecoderConfig directory.

Troubleshooting

Make sure your Windows default audio recording and playback devices work properly. The Olympus Prompt Recorder uses these devices by default. You can test your default configuration by going to Accessories->Entertainment->Voice Recorder and checking that you can play back a recorded voice.

If you encounter errors involving "TkSnack", an audio library for Python, download TkSnack from here and follow the installation instructions in its README: http://www.speech.kth.se/snack/dist/snack2210-py.zip
