IRLIST Digest           ISSN 1064-6965
April 25, 1994
Volume XI, Number 17    Issue 210
**********************************************************
I. QUERIES
   1. Strategies of Experienced Searchers
III. NOTICES
   A. Publications
      1. Machine Translation: Special Issue: Call for Submissions
      2. TEI Guidelines: P3 Publication
   C. Miscellaneous
      1. Graduate Open House, Drexel University
      2. M.SC., Language, Speech, and Auditory Processing, U. Sheffield
**********************************************************
I. QUERIES

I.A.1.
Fr: Yakov Bar
Re: Strategies of Experienced Searchers

Shalom!

I would very much appreciate your help. I am conducting a research project on the behaviors and habits of experienced searchers. In order to advance a particular aspect of my research I need to investigate the thought processes of the searcher. I want to know the initial search strategy one would compose for the following patent searches, before going online. Include the logical connectors, e.g. AND, OR, NOT.

1. Patents registered worldwide for devices, mechanisms, techniques and/or other solutions to keep a driver of a moving vehicle from falling asleep at the wheel.

2. Patents registered worldwide for devices, mechanisms, techniques and/or other solutions to the problem of children becoming lost or separated from their parents in a public place.

Thanks for all your assistance.

Menahem Dolinsky
Information Service Manager
Messer - Center for Innovative Ideas
33 HaHashmal St.
Tel Aviv, ISRAEL
matimop9@ccsg.tau.ac.il

**********************************************************
III. NOTICES

III.A.1.
Fr: Judith Klavans
Re: Machine Translation Special Issue - Call for Submissions

THE MACHINE TRANSLATION JOURNAL
SPECIAL ISSUE ON BUILDING LEXICONS FOR MACHINE TRANSLATION

Editor: Sergei Nirenburg
Guest Editors: Bonnie J. Dorr and Judith L. Klavans

The Journal of Machine Translation is planning a Special Issue on the Lexicon in Machine Translation (MT).
The lexicon plays a central role in any MT system, regardless of the theoretical foundations upon which the system is based. However, it is only recently that MT researchers have begun to focus more specifically on issues that concern the lexicon, e.g., the automatic construction of cross-linguistically valid lexical-semantic and knowledge-based representations for use by multi-lingual systems. The need for large dictionaries is overwhelming in any natural language application, but the problem is especially difficult for MT because of cross-linguistic divergences and mismatches that arise from the perspective of the lexicon. Furthermore, scaling up dictionaries is an essential requirement for MT that can no longer be dismissed; researchers need to move from toy-dictionary MT systems into larger-scale MT systems so that they will be in a better position to demonstrate the validity of the theoretical underpinnings of their systems. The intent of this Issue is to address critical issues concerning the automatic and semi-automatic acquisition of lexical representations for MT dictionaries. Among traditional approaches to constructing dictionaries for natural language applications has been the massaging of on-line dictionaries that are primarily intended for human consumption. Given that many natural language applications have focused primarily on syntactic information that can be extracted from the lexicon, these methods have constituted a reasonable first-pass approach to the problem. However, it is now widely accepted that natural language processing in general, and MT in particular, requires language-independent conceptual information in order to successfully process a wide range of phenomena in more than one language. Thus, the task of lexicon construction has become a much more difficult problem as researchers endeavor to extend the concept base to support more phenomena and additional languages. 
Added to this is the standard trade-off among size, coverage, and efficiency, combined with the fundamental question of anticipated vs. actual functionality.

High-quality original research papers are invited on issues relevant to this topic including, but not limited to:

- Lexical levels required by a machine translation system (syntactic, lexical semantic, ontological, etc.) and interdependencies between these levels.
- Automatic procedures for the construction of lexical representations.
- Semi-automatic methods for the acquisition of lexical knowledge.
- Use of existing resources and aids for transforming these resources into appropriate representations for MT.
- Augmentation of statistically driven corpus analysis with linguistically motivated techniques for extracting lexical knowledge.
- Role of bilingual dictionaries, including example sentences and phrases. Extraction of information from pairwise data in dictionaries.
- MT mappings (transfer, interlingual, statistically based, memory-based, etc.) and the effect of these mappings on the representation that is used in the lexicon.
- Language universals in the lexicon and the construction of an interlingua for MT.
- Incorporation of lexical/non-lexical knowledge for selection of suitable candidates for target constructions in MT.
- Accommodation of MT divergences and mismatches in the lexicon; implication for automatic construction of lexicons.

DEADLINE for submission of articles: July 15, 1994

Articles may be submitted in hard-copy or electronic form (either plain text or .ps format) to either guest editor. If submitting hard-copy, please send four copies of the paper.

Bonnie J. Dorr                      Judith L. Klavans
Department of Computer Science      Department of Computer Science
A.V. Williams Building              Mudd Building Room 420
University of Maryland              520 W. 120th Street
College Park, MD 20742              New York, New York 10027
Email: bonnie@umiacs.umd.edu        Email: klavans@cs.columbia.edu
Fax: 301-314-9658                   Fax: 914-478-1802

**********
III.A.2.
Fr: Lou Burnard
Re: TEI Guidelines: Publication Date for P3 announced

The Editors of the Text Encoding Initiative take great pleasure in announcing that TEI P3, the ACH/ALLC/ACL Guidelines for Text Encoding for Interchange, will be officially published by the three sponsoring organizations on 16 May 1994. Full ordering information is given below.

TEI P3 is a completed and much revised version of the P2 draft which we have been publishing in fascicle form since April 1992. It contains 1300 pages of detailed analysis of problem areas, presentation of the TEI tag sets, exhaustive examples and full reference documentation. The TEI Guidelines constitute one of the fullest and most systematic attempts yet undertaken to describe the whole range of texts of interest to scholars in our research communities, using the international standard SGML. With their completion a new chapter opens in the use of computers in research and teaching.

Part I of the Guidelines contains introductory chapters on the TEI DTD structure, on SGML and on the TEI itself, together with a full description of the header and core tag sets, and of the default text structure. Part II describes the base tag sets for prose, verse, drama, transcriptions of spoken text, print dictionaries, and terminological databases. In Part III, ten additional tag sets are described (for linkage and alignment; simple analysis; feature structure analysis; certainty and responsibility; transcriptions of primary sources; critical apparatus; names and dates; graphs, networks, and trees; tables, formulae, and graphics; and language corpora). Part IV describes four auxiliary DTDs: for independent headers, writing system declarations, feature system declarations, and tag set documentation. A number of technical topics are addressed in Part V: the notion of conformance, methods of modifying the TEI DTD, rules for interchange of TEI-conforming documents, and ways of handling multiple hierarchies.
Part VI comprises a complete alphabetical list of all classes, entities, and elements defined in the TEI DTD, each of which is independently documented. The volume includes reference material on obtaining and using the TEI DTD and WSDs, a formal specification of the TEI interchange format, a bibliography, and an index.

On publication, the TEI Guidelines will be available in paper form from each of the addresses below, at the price of $75 (50 pounds or 7500 yen). A discount price is available to members of the three sponsoring organizations ($50, 35 pounds, or 5000 yen). Orders may be placed at any time using the form at the end of this message. The Guidelines will also be made available in electronic form by anonymous FTP by the publication date, if not before: watch this list for further announcements!

The editors and steering committee would like to take this opportunity to express our thanks to our funders (the US National Endowment for the Humanities, DG XIII of the European Commission, the Andrew W. Mellon Foundation, and the Social Sciences and Humanities Research Council of Canada) and also to the many institutions and software houses who have provided support during the long process of bringing these Guidelines to fruition. Our chief thanks, however, go to all of those volunteers from within the research community who have worked with us over the last five years of drafting, re-drafting, revising, criticizing, discussing, and revising yet again. If there is merit in the results, it belongs to the community which identified the need for this work and encouraged and made it possible for us to carry it out. We would also like to record our personal indebtedness to Don Walker, to whose memory the work is dedicated.

We hope that you will find these Guidelines useful and look forward to your comments for their future improvement!

C. M. Sperberg-McQueen
Lou Burnard
Paris, 24 April 1994

FOR COMPLETE INFORMATION, CONTACT:

EITHER
C. M. Sperberg-McQueen
University of Illinois at Chicago
Academic Computing Center (M/C 135)
1940 W. Taylor, Rm. 124
Chicago IL 60612-7352
U.S.A.
fax: +1 (312) 668 6834

OR
TEI Orders
Oxford University Computing Services
13 Banbury Road
Oxford OX2 6NN
fax: +44 (865) 273275

OR
Prof. Syun Tutiya
Department of Philosophy
Chiba University
1-33 Yayoi-cho Inage
Chiba Chiba 263
JAPAN
fax: +81 (43) 256-7032

**********
III.C.1.
Fr: Maryellen McDonald, Program Manager
Re: Graduate Open House, Drexel University

INTERESTED IN THE INFORMATION PROFESSION?

The College of Information Studies, Drexel University, invites you to attend an OPEN HOUSE for graduate students on Saturday, May 7, 1994, 2:00-4:30 P.M. in the Rush Building, Drexel Campus, Philadelphia, Pennsylvania.

This is an opportunity to:

* Talk informally with faculty, students, and alumni about careers, admission requirements, financial aid, curriculum, etc.
* Learn about the M.S. in library and information science, the new M.S. in information systems (M.S.I.S.), and the post-master's and Ph.D. programs.
* See the facilities and enjoy the refreshments.

For reservations, call: (215) 895-2474 or FAX: (215) 895-2494

Maryellen McDonald
Program Manager
College of Information Studies
Drexel University
Philadelphia, PA 19104
Office (215) 895-2483
FAX (215) 895-2494
INTERNET: mcdonalm@duvm.ocs.drexel.edu

**********
III.C.2.
Fr: Paul Mc Kevitt
Re: M.SC., Language, Speech, and Auditory Processing, U. Sheffield

M.SC. in LANGUAGE, SPEECH AND AUDITORY PROCESSING
ONE-YEAR M.SC. COURSE

Department of Computer Science
in collaboration with
Institute for Language, Speech and Hearing (ILASH)
Department of Information Studies
Department of Psychology
Speech Science Unit

UNIVERSITY OF SHEFFIELD
United Kingdom

THE AIMS OF THE COURSE:

This advanced M.Sc. programme provides a sound professional education and research training in new areas of information technology concerned with computer perception and processing of human language in all its forms.
It is designed to provide an academic and practical grounding in part of what is known in Europe as `The Language Industry'. It aims to provide training for further research in this rapidly growing field in this Department or elsewhere.

Language, speech and auditory processing is an inherently interdisciplinary field, involving elements of linguistics, phonetics, computer science, signal processing and artificial intelligence. Graduates generally come into the field with training in a subset of these disciplines, which will vary from person to person. One role of this Master's degree is to fill out the profile of each student in the areas which are appropriate for that person. We therefore aim for a wide choice of modules which can be tailored to individual needs.

The course also provides skills in demand in today's world of language and information: electronic publishing, political/economic and scientific information handling, computer aids to translation, speech technology, composition, language learning, legal retrieval and information handling, etc.

This course is offered subject to final approval by the University Senate.

THE ACADEMIC PROFILE:

The Department has a substantial research base in these areas, which has now resulted in University funding for ILASH, the Institute for Language, Speech and Hearing, with which the M.Sc. is associated. ILASH has its own machines and support staff, and academic staff attached to it from nine departments. Sheffield is a node on the EU-funded ELSNET (European Network in Language and Speech) network and participates in many Europe-wide programmes that give opportunities to link to work across the Community. We are coordinating the 11-laboratory Human Capital and Mobility (HCM) EU network SPHERE: `Representations in Speech and Hearing'. We also participate in EU ERASMUS programmes in speech and language where students can complete their dissertations abroad.
STAFF:

The course teaching will draw on staff in the Computer Science Department and other Departments in the University. The following is a list of current Computer Science academic staff working in Language, Speech and Hearing together with their research interests:

*Guy Brown: auditory models, sound source separation, audition, speech
*Martin Cooke: auditory models, sound source separation, audition, speech
*Robert Gaizauskas: logical models of natural language texts, information extraction from corpora
*Phil Green: speech perception, automatic speech recognition
*Mark Hepple: computational linguistics, grammatical formalisms, parsing, categorial grammar
*Mike Holcombe: formal models of NLP, formal models of user modelling, visual formal specification languages
*Jim McGregor: user modelling, parsing, Prolog, tutoring systems
*Paul Mc Kevitt: pragmatics, intentions, natural language dialogue, revision in dialogue, user-computer interfaces, hyper/multimedia, user modelling, integration of speech, language and vision processing
*Bob Minors: modelling arguments in discourse, illogic of argumentation, belief processing
*Amanda Sharkey: connectionist and cognitive models of language: language acquisition, symbol grounding, parsing, translation
*Noel Sharkey: connectionist natural language processing, neural network models of cognition, neural representations underlying language and thought, sensory and action grounding of concepts
*Tony Simons: machine translation, syntactic, chart, and object-oriented parsing
*Yorick Wilks: artificial intelligence, natural language understanding, belief pragmatics, lexical computation, parsing, information extraction

ENTRANCE REQUIREMENTS:

Applicants will normally be expected to have, or to obtain before joining the programme, a 2-2 or better in any subject, but those with degrees in computing, mathematics, psychology, physics, electrical engineering, linguistics, phonetics and cognitive science will be preferred.
Work in an information service, computer department, advanced publishing environment or anything similar is considered advantageous, but candidates without such experience will be given equal consideration. International student applicants whose first language is not English will be required to provide evidence of English language competence.

STRUCTURE AND CONTENT:

The course consists of a taught part for two University Semesters, followed by examinations and then a project examined by dissertation and oral examination.

The taught part of the course will consist of twelve modules. (A module occupies 1 semester and typically breaks down into 20 lecture hours and 10 practical/tutorial hours.) Since we aim to cater for students coming from multidisciplinary backgrounds, we endeavour to make the course as flexible as possible. Students take six core modules and choose six electives. The advice and approval of tutors must be sought before deciding on the choice of electives.

The six core modules are `Natural Language Processing (I and II)', `Speech and Hearing (I and II)', and `Research topics in speech and language (I and II)'. The latter consists of a series of guest lectures and local seminars which students must attend, discuss, analyse and write essays on. Such modules are valuable both for technical content and for research skills, since understanding the research of others is a valuable asset which requires practice.

The elective modules offered from year to year depend upon the availability of staff and the trends in research and professional practice. Among possible elective modules are (with other departments noted where the courses are theirs): `(Psych/CS) Language and Logic', `Knowledge Engineering (I and II)',
`Data Structures', `Connectionism', `Graphics and HCI', `Machine Reasoning', `Functional Programming', `Logic Programming', `(Speech Science) Phonetics', `(IS) Information Resources I', `(IS) Information Storage and Retrieval I', `(IS) Computers and Information II', `(IS) Information Storage and Retrieval II', and `(IS) Scientific and Technological Information'.

The period from June to 31st August will be devoted to the preparation of a supervised dissertation to be submitted on or before 30th September.

ASSESSMENT:

Students will be required to pass continuous assessment and examinations for all twelve modules, and produce an acceptable dissertation. These three hurdles will be independent, in that to pass, a student must pass all of them, and to get a distinction, a student must at least approach distinction standard in all of the continuous assessment, the examinations and the dissertation.

FEES:

The University charges the standard fees: 2260 for EU students and 7360 for non-EU students (figures in pounds sterling).

SHEFFIELD:

Sheffield is one of the friendliest cities in Britain and is well-situated, having the best and closest surrounding countryside of any major city. The Peak District National Park is only minutes away. It is a good city for walkers, runners, and climbers. It has two theatres, the Crucible and the Lyceum. The Lyceum, a beautiful Victorian theatre, has recently been renovated. Also, the city has three multiplex cinemas. There is a library theatre which shows more artistic films. The city has a number of museums, many of which demonstrate Sheffield's industrial past, and there are a number of galleries in the city, including the Mappin Gallery and the Ruskin Gallery. A number of important 'stately homes' are close to Sheffield, such as Chatsworth House and Hardwick Hall. By 1995 Sheffield will be served by a 'supertram' system: the line to the Meadowhall shopping and leisure complex is already open.
Sheffield has outstanding sporting facilities, many constructed for the World Student Games in 1991. We have an Olympic-standard swimming pool and sports complex that is regularly used for international competition. The Sheffield Arena is becoming an increasingly important venue for touring rock bands.

ENQUIRIES AND APPLICATIONS:

Please send enquiries and requests for application forms to:

Ms. Liz Compton
M.Sc. Admissions
Department of Computer Science
Regent Court
211 Portobello Street
University of Sheffield
GB-S1 4DP, Sheffield
England.

E-mail: liz@dcs.shef.ac.uk
Fax: 44 742 780972
Phone: 44 742 825590

**********************************************************
IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
Clifford Lynch   calur@uccmvsa.ucop.edu or calur@uccmvsa.bitnet
Nancy Gusack     ncgur@uccmvsa.ucop.edu or nancy.gusack@ucop.edu
Mary Engle       meeur@uccmvsa.ucop.edu or mary.engle@ucop.edu

The IRLIST Archives are now available via anonymous FTP, as well as via the LISTSERV. Using anonymous FTP via the host dla.ucop.edu, the files will be found in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993). Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET. To get a specific issue listed in the Index, send the message GET IR-L LOGYYMM, where YY is the year and MM is the numeric month in which the issue was mailed, to LISTSERV@UCCVMA (Bitnet) or LISTSERV@UCCVMA.UCOP.EDU. You will receive the issues for the entire month you have requested.

These files are not to be sold or used for commercial purposes.

Contact Nancy Gusack or Mary Engle for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA.
AUTHORS ASSUME FULL RESPONSIBILITY FOR THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.
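
Editorial illustration: the archive retrieval convention described above (GET IR-L LOGYYMM for LISTSERV, and one FTP subdirectory per year under pub/irl) can be sketched in a few lines of Python. This is a minimal sketch, not an official tool: the function names and the input checks are assumptions made for the example, and only the message format and directory layout come from the notice itself.

```python
def listserv_get_command(year: int, month: int, list_name: str = "IR-L") -> str:
    """Build the text of a LISTSERV request for one month of issues.

    Follows the 'GET IR-L LOGYYMM' convention quoted in the digest footer.
    The function name and the validation are illustrative assumptions.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if year < 0:
        raise ValueError("year must be non-negative")
    yy = year % 100  # two-digit year, e.g. 1994 -> 94
    return f"GET {list_name} LOG{yy:02d}{month:02d}"


def ftp_directory(year: int) -> str:
    """Return the per-year archive directory (hypothetical helper)."""
    return f"/pub/irl/{year}"


if __name__ == "__main__":
    # Request for the month in which this issue (April 1994) was mailed.
    print(listserv_get_command(1994, 4))
    print(ftp_directory(1993))
```

The resulting text would be sent as the body of a message to the LISTSERV address given above; as the notice says, the reply contains every issue mailed in the requested month.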