Marshall R. Mayberry, III, and Risto Miikkulainen (1999).
SARDSRN: A Neural Network Shift-Reduce Parser.
In Proceedings of the 16th Annual International Joint Conference
on Artificial Intelligence (IJCAI-99, Stockholm, Sweden).
Denver: Morgan Kaufmann.
The networks have been trained on a corpus of 263 sentences with one embedded
relative clause. Validation and testing are performed on the remaining
173 sentences from the grammar below:
Nouns:
    N(0) -> boy
    N(1) -> girl
    N(2) -> dog
    N(3) -> cat

Verbs:
    V(0,0) -> liked, saw
    V(0,1) -> liked, saw
    V(0,2) -> liked
    V(0,3) -> chased
    V(1,0) -> liked, saw
    V(1,1) -> liked, saw
    V(1,2) -> liked
    V(1,3) -> chased
    V(2,0) -> bit
    V(2,1) -> bit
    V(2,2) -> bit
    V(2,3) -> bit, chased
    V(3,0) -> saw
    V(3,1) -> saw
    V(3,3) -> chased

Rule Schemata:
    S -> NP(n) VP(n,m)
    VP(n,m) -> V(n,m) NP(m)
    NP(n) -> the N(n)
    RC(n) -> who VP(n,m)
    NP(n) -> the N(n) RC(n)
    RC(n) -> whom NP(m) V(m,n)
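As a sanity check on these figures, the grammar can be enumerated directly: generating every sentence with at most one relative clause from the schemata above yields 436 sentences, matching the 263 training sentences plus the 173 held-out sentences. A minimal Python sketch (not part of the demo; the function names are made up):

```python
NOUNS = {0: "boy", 1: "girl", 2: "dog", 3: "cat"}
VERBS = {
    (0, 0): ["liked", "saw"], (0, 1): ["liked", "saw"],
    (0, 2): ["liked"], (0, 3): ["chased"],
    (1, 0): ["liked", "saw"], (1, 1): ["liked", "saw"],
    (1, 2): ["liked"], (1, 3): ["chased"],
    (2, 0): ["bit"], (2, 1): ["bit"],
    (2, 2): ["bit"], (2, 3): ["bit", "chased"],
    (3, 0): ["saw"], (3, 1): ["saw"], (3, 3): ["chased"],
}

def nps(n):
    """Expand NP(n): the bare noun phrase, plus every one-clause RC(n)."""
    base = "the " + NOUNS[n]
    yield base
    for (s, o), verbs in VERBS.items():
        for v in verbs:
            if s == n:                       # RC(n) -> who VP(n,m)
                yield f"{base} who {v} the {NOUNS[o]}"
            if o == n:                       # RC(n) -> whom NP(m) V(m,n)
                yield f"{base} whom the {NOUNS[s]} {v}"

def sentences():
    """All S -> NP(n) VP(n,m) strings with at most one relative clause."""
    out = set()
    for (n, m), verbs in VERBS.items():
        for v in verbs:
            for subj in nps(n):              # RC on the subject (or none)
                out.add(f"{subj} {v} the {NOUNS[m]}")
            for obj in nps(m):               # RC on the object (or none)
                out.add(f"the {NOUNS[n]} {v} {obj}")
    return out
```

Note that V(3,2) has no entry, so no sentence has the cat acting on the dog; this asymmetry is part of the grammar.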
The task is shift-reduce parsing. The networks map a sequence of input
word representations into output patterns representing the parse results,
i.e., syntactic constituents. At each time step, a copy of the hidden
layer (or of the previous outputs, for NARX networks) is saved and used as
input during the next step, together with the next word. In this way each
new word is interpreted in the context of the entire sequence so far, and
the parse result is gradually formed at the output. Reductions are RAAM
representations, which have been trained beforehand. A RAAM
representation can be examined in the separate popup window by clicking the
left mouse button on the output layer, which will send that representation
through the decoder and display the RAAM components of the representation in
the two assemblies labelled "Left" and "Right". Clicking the left mouse
button on either of the decoder output assemblies (the ones above the "Left"
and "Right" targets) will further decode the representations. The initial
RAAM representation is always saved, so it can be restored by clicking the
middle mouse button if another path through the decoding process is desired.
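To make the stack discipline concrete, here is a symbolic toy version of the shift-reduce loop (plain Python, with nested tuples standing in for the distributed RAAM representations the network actually forms):

```python
NOUN_WORDS = {"boy", "girl", "dog", "cat"}
VERB_WORDS = {"liked", "saw", "bit", "chased"}

def shift_reduce(words):
    """Parse a simple S -> NP VP sentence: shift each word onto the
    stack, then apply the rule schemata as reductions while possible."""
    stack = []
    for w in words:
        stack.append(w)                                     # shift
        while True:                                         # reduce
            if len(stack) >= 2 and stack[-2] == "the" and stack[-1] in NOUN_WORDS:
                stack[-2:] = [("NP", "the", stack[-1])]     # NP -> the N
            elif (len(stack) >= 2 and stack[-2] in VERB_WORDS
                  and isinstance(stack[-1], tuple) and stack[-1][0] == "NP"):
                stack[-2:] = [("VP", stack[-2], stack[-1])]  # VP -> V NP
            elif (len(stack) >= 2
                  and isinstance(stack[-2], tuple) and stack[-2][0] == "NP"
                  and isinstance(stack[-1], tuple) and stack[-1][0] == "VP"):
                stack[-2:] = [("S", stack[-2], stack[-1])]   # S -> NP VP
            else:
                break
    return stack[0]
```

In the actual networks, each reduction replaces the top stack items with a single compressed RAAM vector rather than a symbolic tuple.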
Given this background, we can understand the components of the demo:
- Main Window:
- Menubar:
- RUN through all the examples; (The display rate is set at 1,
so this will still show every step of the parsing process.)
- STEP through a particular example; (Also the right mouse
button)
- CLEAR everything to start over again;
- RESET the network (NOTE: this will randomize the weights;
you will need to run the demo again to recover the previous weights);
- QUIT the demo.
- An entry widget giving the current path for the grammar being
investigated ("experiments/sardsrn/parse01_demo", by default).
Using the cursor, you can change the file to "parse01_train" to
see how the network performs on the training data.
- The Network: from top to bottom, left to right, the layers are:
- input: the input words.
- map: the SARDNET sequential SOM.
- previous outputs: a layer to hold up to six prior outputs
for NARX and SARDNARX.
- previous hidden: holds the previous hidden layer for SRN
and SARDSRN.
- hidden: the hidden layer for all networks.
- output: the output where shifts and reductions are
determined.
- target: the proper targets.
There are four networks that can be examined:
- SRN (SRN on, SARD and NARX off): only the previous hidden
layer is used.
- SARDSRN (SRN on, SARD on, and NARX off): both the previous
hidden layer and the map are used.
- NARX (SRN on, SARD off and NARX on): only the previous six
outputs are used.
- SARDNARX (SRN, SARD and NARX all on): the map and the previous
six outputs are used.
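The two feedback schemes can be contrasted with toy update rules. The following numpy sketch is illustrative only: the layer sizes, random weights, and activation functions are made up, and training is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
WORD, HID, OUT, K = 12, 16, 10, 6           # illustrative sizes; K = NARX memory

W_in = rng.normal(size=(HID, WORD))         # word -> hidden
W_rec = rng.normal(size=(HID, HID))         # previous hidden -> hidden (SRN)
W_fb = rng.normal(size=(HID, K * OUT))      # previous K outputs -> hidden (NARX)
W_out = rng.normal(size=(OUT, HID))         # hidden -> output

def srn_step(word, prev_hidden):
    """SRN: the hidden layer sees the current word plus a copy of
    itself from the previous time step."""
    h = np.tanh(W_in @ word + W_rec @ prev_hidden)
    return h, np.tanh(W_out @ h)

def narx_step(word, prev_outputs):
    """NARX: the hidden layer sees the current word plus the last K
    output patterns instead of the previous hidden layer."""
    h = np.tanh(W_in @ word + W_fb @ np.concatenate(prev_outputs))
    return h, np.tanh(W_out @ h)
```

Roughly, SARDSRN and SARDNARX additionally feed the SARDNET map activation into the hidden layer alongside the word input.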
The three components (SRN, SARD, and NARX) can be activated, tested,
and trained separately or together via the checkbuttons below the
menubar. For example, the network's performance without the SARD
component can be tested by clicking its checkbutton, thereby
deactivating it; clicking the checkbutton again reactivates the
SARDNET input. For the purposes of the demo, a checkbutton labelled
"FullTest" in the RAAM window allows full testing of the network to
be performed (see next item). If turned off, the network runs faster.
- A Status line showing the error signals from the SRN and SARD
components of the network, and the combined error of the two
components. Upon completion of each epoch, the combined error field
also shows the average error per output unit for that epoch
(normally just one epoch is needed for testing).
- RAAM Popup Window: This window only shows the decoder part of the
original RAAM network used to derive the partial parse representations.
It is used here to allow recursive decoding of these representations,
and to display the statistics from "FullTest".
The window shows the average error per output unit upon decoding;
the "Leaferror" and "Mismatch" statistics give additional, more
stringent measures of the network's performance, and are part of the
test suite performed by the "FullTest" routine. "Leaferror" shows
the leaf error, i.e., the error measured by descending through the
leaves of the RAAM representations, in addition to the output-target
error. The "Mismatch" measure shows the number of actual lexical
mismatches that occur as the leaves of the RAAM representations are
unwound.
If the RAAM representation at the output of the SARDSRN network
only superficially encodes the parse result, then descending the
RAAM representation tree will reveal that the noise in the leaves
themselves is too high. See
SARDSRN: A Neural Network Shift-Reduce Parser for details.
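The unwinding behind "Leaferror" and "Mismatch" can be sketched abstractly. In this hypothetical Python sketch, decode(vec) -> (left, right) stands in for the trained RAAM decoder, and a decoded leaf counts as a mismatch when its nearest lexicon vector is not the target word:

```python
import numpy as np

def nearest_word(vec, lexicon):
    """Snap a decoded leaf vector to the closest lexical entry."""
    return min(lexicon, key=lambda w: np.linalg.norm(vec - lexicon[w]))

def mismatches(vec, target, decode, lexicon):
    """Unwind a RAAM vector alongside the target parse tree, counting
    leaves whose nearest lexicon entry differs from the target word
    (the "Mismatch" statistic). `decode` is a stand-in for the trained
    RAAM decoder; targets are nested (left, right) tuples of words."""
    if isinstance(target, str):                  # leaf: compare words
        return 0 if nearest_word(vec, lexicon) == target else 1
    left, right = decode(vec)                    # internal node: recurse
    return (mismatches(left, target[0], decode, lexicon)
            + mismatches(right, target[1], decode, lexicon))
```

A noisy output vector can decode plausibly at the top level yet accumulate mismatches as the recursion reaches the leaves, which is exactly the failure mode "FullTest" is designed to expose.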
- Inspect Map (SARD): This small window will display the vectors
associated with each unit in the feature map (the big black square
in the main window). Clicking the left mouse button over the
feature map displays that unit's vector representation, along with
the unit's number and its coordinates within the map. This is useful
for getting an idea of how the map has (self-)organized during training.
The activations of the units themselves are distributed across the spectrum
with a rough breakout (depending on how many colors are actually allocated
on your monitor) as follows:
- 0.00-0.25: black to dark to light blue
- 0.25-0.50: light blue to olive
- 0.50-0.75: olive to dark violet
- 0.75-1.00: dark violet to white
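For reference, the map is a SARDNET, a self-organizing map modified for sequences: each input activates the closest still-available unit to 1.0, that unit is removed from competition, and earlier winners decay. A toy sketch of the activation rule only (the map size and decay factor are illustrative, and training-time weight adaptation is omitted):

```python
import numpy as np

def sardnet_activate(sequence, weights, decay=0.9):
    """Activate a SARDNET map on an input sequence: for each input, the
    closest still-available unit wins, is set to 1.0, and is removed
    from competition, while previous winners decay by `decay`. The
    final pattern encodes both the inputs and their order."""
    n_units = weights.shape[0]
    act = np.zeros(n_units)
    available = set(range(n_units))
    for x in sequence:
        dists = np.linalg.norm(weights - x, axis=1)
        winner = min(available, key=lambda u: dists[u])
        act *= decay               # older winners fade
        act[winner] = 1.0          # newest winner at full strength
        available.discard(winner)
    return act
```

Because no unit wins twice, repeated words in a sentence light up distinct map units, which is what lets the downstream network keep the word tokens apart.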
For the demo to work, you must be running X11 (not Windows, Macintosh,
or NeXTStep), and you must not have a firewall preventing all remote
access to your screen. The demo program runs on net.cs.utexas.edu, with
the display over the internet on your X11 screen. Click
here to set up the demo. Send comments, bug reports etc. to
martym@cs.utexas.edu
Last update: 1.17 2000/06/24 04:27:43 jbednar