
----------------------------- README ------------------------------------

This software was originally envisioned as an English-Esperanto translator.
Due to time constraints, the author was not able to achieve his goal.
Nevertheless, various components of this system may still be useful as
independent programs for sentence parsing and analysis.  

Please be warned that the author does not claim to be proficient in Esperanto.
Unexpected program failures can be attributed to the author's poor
programming skills as well as his poor language understanding
(including English, sigh).  The author, however, assumes no responsibility
for possible hardware, software, or mental damage induced by this package.
For more details, please read the file COPYING.

I am sorry for not being able to provide an Esperanto version of this README
file.  I welcome comments and suggestions on the programs as well as on
this very document.

               May 3, 1995
               Jui-Yuan Fred Hsu, juiyuan@cs.cornell.edu, (607) 844-3697 
               (607) 255-1041 (Upson Hall 323)   
               http://www.cs.cornell.edu/Info/People/fred/



HARDWARE AND SOFTWARE PLATFORM

I have only run this software in my own environment:
a Sun SPARCstation running SunOS 4.1.
C++ programs are compiled using GNU g++ 2.5.8, and Lisp programs run
under Lucid Common Lisp/SPARC 4.1.



INTRODUCTION

The translation of a sentence between two languages basically involves
two processes.  The first takes a sentence in the source language
and turns it into some sort of logical form; the second converts that
logical form into the target language.  I have made use of the Bottom-Up
(BU) chart parser that comes with "Natural Language Understanding" by
James Allen.  Given lexicon entries, grammar rules and semantic information
for a specific language, the BU parser can parse a sentence and produce the
corresponding logical form.  The NLP package also includes a simple sentence
generator that converts a logical form back into the original sentence.
The parser and the generator are written in Lisp, and work fine under
Lucid Common Lisp.

The lexicon and grammar rules for English processing are taken from a class
assignment on which YiChen Chen, my brother Richard and I worked as a group.
The grammar understands a great many English sentences and produces
human-readable logical forms, but the lexicon is too small.
So far, one cannot generate sentences from these logical forms.

The lexicon and grammar rules for Esperanto processing are fully working.
They do not encompass all Esperanto rules, but the lexicon is relatively
complete, and the system both parses and generates Esperanto sentences.

The parser, however, cannot process affixes.  It lacks the ability to
break words into basic morphemes.  This weakness can be tolerated for
English sentences, but for Esperanto it is a fatal blow, as Esperanto
relies heavily on affixes.  I have thus written a program to parse
"words" into morphemes.

I have taken the Esperanto-English dictionary compiled by Neal McBurnett
in 1992 as the base for the "word" parser and for Esperanto BU parsing.
This word list contains grammatical tags for each word.

At this point in time, the translator as a whole does not work,
as one can easily observe.  The components can, however, be used
independently, and they are described in the following sections.



INSTALLATION

1. Download a copy of the software by ftp; the file is named
   "translator.tar.Z".

2. Run "uncompress translator.tar.Z"; the file becomes "translator.tar".

3. Run "tar xvf translator.tar"; a new directory translator/ is created
   and the files are unpacked from the tar file into it.

4. "cd translator"

5. Customize "./Makefile"

   Specifically, you may need to modify CC, INCLUDE_PATH, and LIBS for
   the C++ programs to compile and link correctly. 

   Make sure your Common Lisp executable name is correctly identified
   by LISP_BIN 

6. "make all".  (Optionally, if this is not the first time you run "make",
    you may want to run "make clean" first to make sure old object files
    are removed.)



DIRECTORIES AND FILES


esperanto/   			main directory

  README    			this file

  COPYING			GNU public license information

  TODO				to-do list 

  Makefile 			main makefile

  allen/ 			BU parser and generator from James Allen's book
  				I have modified some of his code

  vortaro/			Dictionary directory

    esp-angla-vortoj.txt	main Esperanto dictionary, from Neal McBurnett
    suppl.vortaro		supplementary dictionary
    
    filter.cc.txt		filter file to retrieve useful lexical entries
    filter.lisp.txt

    exclude.cc.txt		filter file to exclude unwanted entries
    exclude.lisp.txt
    
    lex.for.cc			[final usable lexicon entries]
    lex.for.lisp

  tools/			tool C++ libraries

  src/				C++ programs

    common.h			common definitions and routines
    common.cc

    buildlex.cc			build lexicon files in the vortaro/ directory

    lispify.cc			PARSE WORDS INTO BASIC MORPHEMES (C++)

    glue.cc			GLUE MORPHEMES BACK TO WORDS (C++)

  eng/				ENGLISH GRAMMAR for BU parser (Lisp)
  
    eng.lisp
  
  esp/				ESPERANTO GRAMMAR for BU parser (Lisp)

    esp.lisp			LOOK AT THIS FILE!

  bin/				Executable directory

    buildlex*			one-time setup of lexicon files

    lispify*			filter: input Esperanto sentence, output
    				morphemes.  Output can be fed into Lisp 
    				
    translate*			filter (script): send morphemes to BU parser,
    				obtain logical form, then apply Esperanto 
    				sentence generator to get back the original
    				Esperanto sentence (Esperanto-Esperanto).
    
    glue*			filter: glue morphemes back to words
    
    run*			given a file containing an Esperanto sentence, 
    				does lispify | translate | glue
    
    sentence?			Esperanto testing sentences
    
    lispify?.test		testing files for lispify
    				"lispify" can parse more sentences than
    				BU parser can understand

  doc/				Postscript documentation. 

    english.semantics.ps	on English grammar and lexicon
    translator.ps		on Esperanto-English translator



SAMPLE RUNS


   While trying out this software, please keep in mind that it is
   just a class project.  Do not set your expectations too high.
   To find out the kinds of sentences it can process, please look at
   the testing section at the bottom of the file esp.lisp.


   The following are sample parses.
   Some samples contain very long sentences.
   Users are not encouraged to try sentences of such length
   unless a great deal of patience resides in their hearts.


------------------------------------------------------------------------

~translator/bin>  run sentence1
 
   -- runing Esperanto-Esperanto on sentence 
      < de mi sur tablo al ni letero estas skribita >

   -- result: 

   LETERO ESTAS SKRIBITA AL NI DE MI SUR TABLO

------------------------------------------------------------------------

~translator/bin>  cat sentence2

   la libroj estas bonaj 


~translator/bin>  cat sentence2 | lispify

   la libr +o +j est +as bon +a +j 


~translator/bin>  cat sentence2 | lispify | translate

   ;;; Lucid Common Lisp/SPARC
   ;;; Development Environment Version 4.1 DBCS, 12 October 1992
   ;;; Copyright (C) 1985, 1986, 1987, 1988, 1992 by Lucid, Inc.
   ;;; All Rights Reserved

   [omitted...]

   #<Package "USER" 53CB8E>
   > 
   (LA LIBR +O +J EST +AS BON +A +J)
   > 
   ;;; Loading source file "../allen/loadFunction"
   ;;; Loading source file "/amd/sundown/b/juiyuan/cornell/674/translator/allen

   [omitted...]


   Semantic Interpretation

   (LA LIBR +O +J EST +AS BON +A +J)

   <S ((SEM (SENT= 
	 (VP= (TRANS N) (V-FORM PRESENT) (V-TENSE SIMPLE) (VOICE ACTIVE)
              (AGR_N P) (S-INFO ((VERB= (LEX EST) (V-FORM PRESENT)
                                	(TRANS N) (S-COMP -) (S-PREF -) 
                                	(S-VPRT -) (S-SUFF -)) 
                        	 (ADJ=  (A-INFO N) (LEX BON) (SUB S_ADJ) 
                                	(AGR_N P) (AGR_DO N) (S-COMP -) 
                                	(S-PREF -) (S-VPRT -) (S-SUFF -)))))

	 (S-SUB (NP= ((NOUN DEF - (NOUN= (LEX LIBR) (AGR_P 3) (AGR_N P) 
                        	  (AGR_DO N) (S-COMP -) (S-PREF -) (S-VPRT -)
                        	  (S-SUFF -))) -)))

	 (S-DO -) (S-IO -) (S-AGENT -) (S-PP -) 
	 (DO N)   (IO N)   (SUBJ Y)    (AGENT N))
   ))>


   [omitted...]


   result

   LA LIBR +O +J EST +AS BON +A +J 


~translator/bin>  cat sentence2 | lispify | translate | glue

   [omitted...]

   LA LIBR +O +J EST +AS BON +A +J 

   LA LIBROJ ESTAS BONAJ    


------------------------------------------------------------------------

~translator/bin>  run sentence4
 
   -- runing Esperanto-Esperanto on sentence 
   < l' lernanto estas esperanta l' libron >

   -- result: 

   LA LERNANTO ESTAS ESPERANTA LA LIBRON

---------------------------------------------------------------

~translator/bin>  cat sentence3

 viaj eksgepatretoj estas donintaj tiun cxi en mia malgrandega domego  

 
~translator/bin> cat sentence3 | lispify

   vi +a +j eks+ ge+ patr +et +o +j est +as don +int +a +j 
   
   tiu +n cxi en mi +a mal+ grand +eg +a dom +eg +o 


~translator/bin>  cat sentence3 | lispify | translate

   [omitted...]

   Semantic Interpretation

   (VI +A +J EKS+ GE+ PATR +ET +O +J EST +AS DON +INT +A +J 

   TIU +N CXI EN MI +A MAL+ GRAND +EG +A DOM +EG +O)


   <S ((SEM (SENT= 
   
     (VP= (TRANS Y) (V-FORM PRESENT) (V-TENSE PERF) (VOICE ACTIVE) (AGR_N P) 
          (S-INFO ((VERB= (LEX EST) (V-FORM PRESENT) (TRANS N) (S-COMP -) 
                          (S-PREF -) (S-VPRT -) (S-SUFF -))
                   (ADJ= (A-INFO PERF) (LEX DON) (SUB S_VERB_TRAN) (AGR_N P) 
                	 (AGR_DO N) (S-COMP -) (S-PREF -) (S-VPRT (+INT -))
                	 (S-SUFF -)))))

     (S-SUB (NP= ((NOUN INDEF ((ADJ= (A-INFO PRONOUN) (LEX VI) (SUB -) (AGR_N P)
                                     (AGR_DO N) (S-COMP -) (S-PREF -) (S-VPRT -)
                                     (S-SUFF -)) -)
                              (NOUN= (LEX PATR) (AGR_P 3) (AGR_N P) (AGR_DO N)
                              (S-COMP -) (S-PREF (EKS+ (GE+ -))) (S-VPRT -) 
                              (S-SUFF (+ET -)))) -)))

     (S-DO (NP= ((PRON-DEMON= (LEX TIU) (AGR_P 3) (AGR_N S) (AGR_DO Y)
                              (NEAR Y)) -)))

     (S-IO -) 
     (S-AGENT -) 

     (S-PP ((PP= 
             (LEX EN) (MOVE NO) 
             (INFO (NP= ((NOUN INDEF 
                      ((ADJ=  (A-INFO PRONOUN) (LEX MI) (SUB -)
                              (AGR_N S) (AGR_DO N) (S-COMP -)
                              (S-PREF -) (S-VPRT -) (S-SUFF -))
                      ((ADJ= (A-INFO N) (LEX GRAND) (SUB S_ADJ) 
                             (AGR_N S) (AGR_DO N) (S-COMP -)
                             (S-PREF (MAL+ -)) (S-VPRT -) (S-SUFF (+EG -))) -))
                      (NOUN= (LEX DOM) (AGR_P 3) (AGR_N S) (AGR_DO N) (S-COMP -)
                             (S-PREF -) (S-VPRT -) (S-SUFF (+EG -)))) -)))) -))

     (DO Y) (IO N) (SUBJ Y) (AGENT N))))>


~translator/bin>  cat sentence3 | lispify | translate | glue


~translator/bin>  run sentence3

 
   -- runing Esperanto-Esperanto on sentence 

   < viaj eksgepatretoj estas donintaj tiun cxi en mia malgrandega domego >

   -- result: 

   VIAJ EKSGEPATRETOJ ESTAS DONINTAJ CXI TIUN EN MIA MALGRANDEGA DOMEGO
 


---------------------------------------------------------------

~translator/bin>  cat sentence5 | lispify | translate

   [omitted...]

   Semantic Interpretation

   (DE EN LA TABL +O SALT +OS LA KAT +O)

   <S ((SEM (SENT= 
       (VP= (TRANS N) (V-FORM FUTURE) (V-TENSE SIMPLE) (VOICE ACTIVE) (AGR_N S)
            (S-INFO (VERB= (LEX SALT) (V-FORM FUTURE) (TRANS N) (S-COMP -) 
                           (S-PREF -) (S-VPRT -) (S-SUFF -))))

       (S-SUB (NP= ((NOUN DEF - (NOUN= (LEX KAT) (AGR_P 3) (AGR_N S) 
                                       (AGR_DO N) (S-COMP -) (S-PREF -) 
                                       (S-VPRT -) (S-SUFF -)
                        	 )) -)))                                    
       (S-DO -) 
       (S-IO -) 
       (S-AGENT -) 

       (S-PP ((PP= (LEX EN) (MOVE FROM) 
                   (INFO (NP= ((NOUN DEF - (NOUN= (LEX TABL) (AGR_P 3) (AGR_N S)
                                           (AGR_DO N) (S-COMP -) (S-PREF -) 
                                           (S-VPRT -) (S-SUFF -))) -)))) -))

       (DO N) (IO N) (SUBJ Y) (AGENT N))))>


   result

   LA KAT +O SALT +OS EL LA TABL +O 

---------------------------------------------------------------

~translator/bin>  run sentence6
 
   -- runing Esperanto-Esperanto on sentence < ili iras okcidente >

   -- result: 

   ILI IRAS OKCIDENTE

---------------------------------------------------------------

~translator/bin>  run sentence7
 
   -- runing Esperanto-Esperanto on sentence

    < al fred hsu de la universitato de cornell 
      viaj eksgepatretoj estas donintaj tiun cxi en mia malgrandega domego >

   -- result: 

   VIAJ EKSGEPATRETOJ ESTAS DONINTAJ CXI TIUN 
   AL FRED HSU DE LA UNIVERSITATO DE CORNELL EN MIA MALGRANDEGA DOMEGO


---------------------------------------------------------------

~translator/bin>  run sentence8
 
   -- runing Esperanto-Esperanto on sentence 
   < sur la tablo al ni skribas leteron mi >

   -- result: 

   MI SKRIBAS LETERON AL NI SUR LA TABLO

---------------------------------------------------------------

~translator/bin>  run sentence9
 
   -- runing Esperanto-Esperanto on sentence 
   < letero estas skribita de mi al ni sur la tablo >

   -- result: 

   LETERO ESTAS SKRIBITA AL NI DE MI SUR LA TABLO
 

