How The CMU Communicator Architecture Works


File Structure:

- Agents: component servers (source code)
- Bin: executables
- Cepstral: synthesis (external, do not change)
- Galaxy: Galaxy (external, do not change)
- Logs: logs
- Documents: documentation
- Resources: language model, grammars, acoustic models

General Architecture (Galaxy):

The HUB is connected directly to each of these blocks through sockets:

- Gentner [Telephony]
- Audio Server [launches sphinxes on its own and sends all recognitions onwards]
- Phoenix [uses Grammar resources]
- Helios [confidence annotation for recognition results]
- Backend Server (STUB) [communicates with actual RoomLine Backend]
  - RoomLine Backend (Perl) [communicates with DB etc.]
- DateTime [figures out what "next Wednesday" means in terms of a real date]
- RavenClaw DM [does the actual Dialog Manager stuff]
- NlgServer (STUB) [communicates with Rosetta]
  - Rosetta (Perl) [generates utterance text]
- Kalliope [Cepstral]
- TTYSphinx [the terminal version of speechIn/speechOut]

In addition, there's the ProcessMonitor, which handles starting and monitoring all of these components.

The orchestration of the Hub with each of the individual modules is done through the .pgm file (e.g. RoomLine-hub-desktop.pgm). There are a couple of different aspects of this file:

1. Servers

This part gives each server a unique handle and the port through which the Hub communicates with it. The operations listed for a server are the function calls that may be made to that module through the Galaxy architecture.

2. Programs/Rules

I'm not too clear on what programs are. Rules in programs (like main) tell Galaxy what the Hub should do when it gets a certain message. For instance, in the following example:

;; Choose one of the parses based on confidence, and annotate
;; input with other features
;; ---
RULE: !:parse & :numparses & :parses --> helios.choose_parse
IN:  :numparses :parses :input_source :utts :npow :pow
OUT: :parse :input_features
LOG_OUT: :parse :input_features

The RULE line describes the conditions under which the rule is fired. In this case, if no slot called 'parse' is filled, and slots called 'numparses' and 'parses' are filled, the Hub calls the 'choose_parse' operation in the helios module. The parameters that go into this function call are the values of the slots listed in 'IN', namely 'numparses', 'parses', ..., 'pow'. The function should output values for the slots 'parse' and 'input_features' (listed in 'OUT'). 'LOG_OUT' says that the logfile should record these slot values too.
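To make the firing condition concrete, here is a minimal Python sketch of how such a condition could be checked against the slots currently filled. This is not Galaxy's actual code: the function name rule_fires and the dictionary representation of the slots are assumptions made for illustration, and only the '&' and '!' forms used in the example above are handled.

```python
def rule_fires(condition, frame):
    """Return True if every '&'-joined term of the condition holds.

    A term ':slot' requires the slot to be filled in the frame;
    a term '!:slot' requires the slot to be absent.
    (Hypothetical helper; covers only the forms shown in the example rule.)
    """
    for term in condition.split("&"):
        term = term.strip()
        if term.startswith("!"):
            # Negated term: the slot must not be present.
            if term[1:] in frame:
                return False
        elif term not in frame:
            # Plain term: the slot must be present.
            return False
    return True


condition = "!:parse & :numparses & :parses"

# Parses are available but none has been chosen yet: the rule fires.
frame = {":numparses": 2, ":parses": ["parse A", "parse B"]}
print(rule_fires(condition, frame))  # True

# Once :parse is filled, the same rule no longer fires.
frame[":parse"] = "parse A"
print(rule_fires(condition, frame))  # False
```

The second call shows why the '!:parse' term matters: after helios has written its 'OUT' slots, the condition stops matching, so the rule is not triggered again for the same input.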

In the following example, the 'IN' field takes in one item with a different format.

RULE: :beginout --> DialogManager.cancel_inactivity_timeout
IN: (:why "beginout")

Here, it means that the 'why' slot is hardwired to have the value "beginout" when it gets sent to the 'cancel_inactivity_timeout' function in the DialogManager module.

Finally, in the following example:

;; If we have a confidence annotated hypothesis, but
;; no parse, call phoenix to parse it
;; ---
RULE: :confhyps & !:parses --> phoenix.phoenixparse
IN: (:utts :confhyps) :npow :pow
OUT: :numparses :parses :input_source
LOG_OUT: :parses

The first item in 'IN' means simply that the slot 'utts' should be given the value in the 'confhyps' slot.
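The three kinds of 'IN' items seen so far (a plain slot, a slot hardwired to a literal, and a slot filled from a differently named slot) can be sketched in Python as follows. This is only an illustration under stated assumptions: build_in_frame is a hypothetical helper, the list passed to it is a hand-parsed form of the 'IN' line, the slot values are made up, and skipping slots that are not filled is a guess about the Hub's behavior rather than something confirmed here.

```python
def build_in_frame(in_spec, hub_frame):
    """Build the set of slots sent to a server from a hand-parsed IN spec.

    Each item of in_spec is one of:
      ":slot"                -> pass the slot through under its own name
      (":dst", ":src")       -> send the value of :src under the name :dst
      (":dst", '"literal"')  -> send a hardwired string under the name :dst
    Slots missing from hub_frame are skipped (an assumption of this sketch).
    """
    out = {}
    for item in in_spec:
        if isinstance(item, tuple):
            target, source = item
            if source.startswith(":"):
                # Renaming: copy the value from another slot.
                if source in hub_frame:
                    out[target] = hub_frame[source]
            else:
                # Literal: strip the surrounding quotes.
                out[target] = source.strip('"')
        elif item in hub_frame:
            out[item] = hub_frame[item]
    return out


hub_frame = {":confhyps": "i need a room", ":npow": 12.5}

# IN: (:utts :confhyps) :npow :pow
print(build_in_frame([(":utts", ":confhyps"), ":npow", ":pow"], hub_frame))
# {':utts': 'i need a room', ':npow': 12.5}

# IN: (:why "beginout")
print(build_in_frame([(":why", '"beginout"')], hub_frame))
# {':why': 'beginout'}
```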
