IRList Digest           Monday, 13 Jan 1986      Volume 2 : Issue 1

Today's Topics:
   Query - Good text for Natural Language Processing Course?
           Faster way to get dictionary than from Oxford Text Archive?
           (Reply) Collins dictionary and speed of Oxford delivery
           Full text retrieval experiments?
   Discussion - FTP, Back Issues, BITNET searching of IRList
   Announcement - CRTNET extract about 'Freeware' Information-Retrieval System
   Call for Papers - 3rd Symp. Logic Programming, IEEE, 9/21-5, Salt Lake City

----------------------------------------------------------------------

Date: Wed, 1 Jan 86 11:56:32 est
From: fox@vtcs1.VT
Subject: Query on Text for Natural Language Processing Course

For a 1 semester course on Natural Language Processing, are there any
single texts, or pairs of texts, that people have found to work well?
One suggestion has been to use either
    a) Simmons, Computations from the English
  - or
    b) Schank & Riesbeck, Inside Computer Understanding
and a general book like
    Tennant, Natural Language Processing
Does that sound reasonable? Any other suggestions? Have I missed some
new book that would be ideal?

Thanks for your comments, Ed Fox

------------------------------

From: James Peterson
Date: Tue, 31 Dec 85 07:51:17 cst
Subject: Collins dictionary

Your recent note in IRList indicated "we have spent months trying to
decode the typesetting and cleaning things up and will give the final
result back to the Archive for redistribution."

Can you tell us when the cleaned up version will be available, and is
there a "faster" way to get it than going thru the Archive. I am
thinking about the postage and shipping delays in dealing with the UK
mainly.

jim

------------------------------

From: fox@vtcs1.VT
Date: Sun, 5 Jan 86 14:38 EST
To: peterson at mcc
Subject: Collins

Jim - I appreciate your report earlier on your dictionary work and
understand your desire for haste.
Actually the processing time for us for Oxford files has been very
good - biggest delay is on this end getting check in pounds and all
the related paperwork. I have a student working on this and he should
be finished by April at which time we can send it to Oxford. You can
contact them to get permission for me to send it directly to you if
you like, but the arrangement with them is to use it for research here
so I can't do otherwise without authorization.

Regards, Ed

------------------------------

From: vtcs1::"fraenkel@wisdom"@vtcs1.VT
From: Aviezri Fraenkel
Date: Thu, 2 Jan 86 16:17:09 -0200
Subject: Full text retrieval

Remember the Blair & Maron CACM paper and your inquiry of me about our
tests of full text systems?
[Note: I was wondering about similar experiments to that described in
March 85 CACM. - Ed]
I suppose you know of the Lexis and Nexis full text systems (70
billion characters or so). Perhaps they ran some tests...
Sorry for being so slow with this information.

Best, Aviezri Fraenkel

[Question: Does anyone know of results by these organizations or
others to share? - Ed]

------------------------------

From: fox (Ed Fox)
Date: Sat, 11 Jan 86 17:41:12 est
Subject: IRList archives, FTP, BITNET searching for ARPANET digests

There appears to be some confusion on IRList archives and on search
access.

First, in Vol 1 Issue 12, 20 Sept., Christopher announced that
anonymous FTP will give access to IRList archives in
    [SUMEX-AIM.arpa]IRLIST.TXT

Second, I keep archives here and will be happy to mail missing issues
on request. Vol. 1 had issues 1-28.

Third, there are archives on BITNET. In IRList V1 Issue 15 of 19 Oct
(sorry, but it was labeled Sep in hdr), Henry Nussbacher announced
that there is a Spires database that can be searched by submitting
queries, for several ARPAnet digests including IRList.
If one sends a msg to
    database%bitnic.bitnet@wiscvm.arpa (or wiscvm.wisc.edu if you can)
with 1st 3 lines as follows:
    help
    help arpanet
    help design
you will get back 3 files explaining further.

In IRList issue 23, Werner Uhrig asked how to get back files since his
attempt to use the BITNET system did not succeed. I don't know if
Henry and Werner have been in touch, but did send the 3 line msg and
received instructions, without a hitch, so I trust things do work.

Please let me know if there are any problems or questions.

Regards, Ed

------------------------------

From: T3B%PSUVM.BITNET%wiscvm.wisc.edu@CSNET-RELAY
Date: Fri, 3 Jan 86 10:07 EST
Subject: CRTNET NEWSLETTER 20 [edited extract - Ed]

                    The Abstract SEARCH Program
             A 'Freeware' Information-Retrieval System

                         Gerald M. Santoro
                          December - 1985

The Abstract SEARCH Program is a simple information-retrieval system
for the IBM-PC microcomputer and close compatibles. I developed SEARCH
last year to keep track of articles in microcomputer journals, file
drawer contents and the contents of ad-hoc notebooks.

The operating theory behind SEARCH is quite simple. The user uses a
file editor (one is provided with SEARCH) to create one or more files
containing abstracts. Each abstract contains text describing the item
in question. The text may be up to 18 lines long. In the case of a
journal article this text may contain the subject, author and some
descriptive keywords, along with a page reference.

All abstracts are grouped under one-line text strings which serve as
locators. In the case of journal articles the locator may be the
volume, date and issue of a given journal. Whenever an abstract is
selected by SEARCH as meeting the user's retrieval criteria, its
corresponding locator line is also presented. The locator then
provides the necessary link to the actual article.

To retrieve information, the user runs SEARCH and provides the program
with a search domain and a search object.
The search domain is the names of all files to be searched. (SEARCH
will accept up to 50 files) The search object is a list of strings
which define the criteria to be met by the desired abstracts. The
individual strings must match exactly with strings in the abstracts,
and may be related by Boolean 'AND' and 'OR' operators. (SEARCH will
accept up to 10 strings in the search object)

The output generated by SEARCH will contain the locators and abstracts
for all abstracts in the search domain which meet the criteria
established by the search object. This output may be directed to an
attached printer or to a disk file. If it is directed to a disk file
it is written in a form which allows it to be subsequently read as an
abstract file. This allows 'subset' files to be created from 'master'
abstract files.

SEARCH has one major drawback: since it reads all of the text in all
of the abstracts it can be rather slow. For that reason I built 3
operating modes into SEARCH.

The simplest is interactive mode. In interactive mode SEARCH will
display an abstract 'hit' and wait for the user to indicate whether
this abstract is to be saved to the output or ignored. This mode may
be useful for quick searches of small files.

The second operating mode is batch mode. In batch mode the user
specifies the search domain, search object and output destination.
SEARCH then saves all 'hits' to the output destination while the user
does something supposedly more constructive than watching the PC
juggle bytes.

The third operating mode is command file mode. In this mode the user
creates a file (using a file editor) which contains a number of search
problems. Each problem is a specification of search domain, search
object and output destination. When SEARCH is put into this mode it
will ask for the name of the command file. It will then open this file
and work on each of the problems until the command file has been
completely processed.
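
As an illustration only (this is not the BASIC source distributed with
SEARCH), the retrieval step described above can be sketched in Python.
Several things here are assumptions, since the article does not give
them: abstract files are modelled as in-memory lists of
(locator, text) pairs rather than a real file format, the search
object is written as a list of AND-groups that are ORed together, and
the names search, matches and keep are hypothetical. "Match exactly"
is read as a case-sensitive substring test.

```python
MAX_FILES = 50     # file limit stated in the article
MAX_STRINGS = 10   # search-string limit stated in the article


def matches(text, search_object):
    """Return True if `text` satisfies the search object.

    Assumed layout: `search_object` is a list of groups; the strings
    inside a group are ANDed, and the groups are ORed together.
    Matching is an exact, case-sensitive substring test.
    """
    return any(
        group and all(term in text for term in group)
        for group in search_object
    )


def search(domain, search_object, keep=lambda hit: True):
    """Scan every abstract in the search domain and collect the hits.

    `domain` maps a file name to a list of (locator, text) pairs
    (a hypothetical stand-in for abstract files on disk).
    `keep` decides whether a hit is saved: the default keeps every
    hit, as in batch mode; passing a function that asks the user
    would mimic interactive mode.
    """
    if len(domain) > MAX_FILES:
        raise ValueError("search domain is limited to 50 files")
    if sum(len(group) for group in search_object) > MAX_STRINGS:
        raise ValueError("search object is limited to 10 strings")

    hits = []
    for abstracts in domain.values():
        for locator, text in abstracts:
            if matches(text, search_object):
                hit = (locator, text)
                if keep(hit):
                    hits.append(hit)  # the locator travels with its abstract
    return hits
```

Under these assumptions, a search object of [["ink-jet", "printers"]]
asks for abstracts containing both strings, while
[["ink-jet"], ["modems"]] asks for abstracts containing either one.
Command file mode would then amount to calling search once per problem
listed in the command file. Because every abstract is read in full on
each call, the sketch shares the speed drawback noted above.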

To illustrate the utility of this system, I will describe how it is
used at the Microcomputer Information and Support Center at Penn
State. Over the past year we have created abstract files containing
references to articles in a number of microcomputer-related journals.
We have also created abstract files containing references to vendor
literature in physical filing cabinets. When a client comes in with a
consulting need we can use SEARCH and our abstract files to quickly
access the technical information cross-referenced by the system.

For example, if a question arises regarding ink-jet printers we can
run SEARCH and in a few minutes have a list of journal articles and
vendor literature relating to ink-jet printers. Since our master
abstract files are organized by journal, and since they have grown
rather large, a command file is periodically run to create subset
abstract files for printers, modems, displays and various types of
application software. This provides for topic-oriented domain files at
the user's discretion.

Although SEARCH is not a general-purpose database management system,
it does have some applicability to simple database problems. I am
aware of users maintaining databases of potato seedlings, criminal
sentencing reports and private business referrals under SEARCH.

SEARCH is distributed as a 'Freeware' package. I have copyrighted the
system and allow anyone to copy, distribute and use it as long as 2
rules are always observed: (1) the copyright notice shall not be
removed from the software or the documentation, and (2) the software
and documentation may not be distributed in modified form without my
prior written permission.

I do include the BASIC source files for the program, in case any users
want to make modifications. However I will not provide programming
assistance to those wishing to do so. I also ask for a $10 donation in
case the user feels that the software is worth it. However, a donation
is not required to use the program.

Anyone wanting to obtain a copy of SEARCH should contact me by
electronic mail or US Mail at the address below.

    Gerry Santoro
    Microcomputer Information & Support Center
    Penn State University
    101 Computer Building
    University Park, PA 16802
    (814) 863-4356

    GMS @ PSUVM                        (bitnet)
    santoro @ penn-state               (csnet)
    ...!psuvax1!psuvm.bitnet!gms       (uucp)
    gms%psuvm.bitnet@wscvm.arpa        (arpa)

------------------------------

From: Bob Keller
Date: Mon, 6 Jan 86 20:33:58 MST
Subject: Symposium on Logic Programming

                              '86 SLP

                          Call for Papers

                Third Symposium on Logic Programming

               Sponsored by the IEEE Computer Society

                       September 21-25, 1986
                         Westin Hotel Utah
                         Salt Lake City, UT

The conference solicits papers on all areas of logic programming,
including, but not confined to:

    Applications of logic programming
    Computer architectures for logic programming
    Databases and logic programming
    Logic programming and other language forms
    New language features
    Logic programming systems and implementation
    Parallel logic programming models
    Performance
    Theory

Please submit full papers, indicating accomplishments of substance and
novelty, and including appropriate citations of related work. The
suggested page limit is 25 double-spaced pages. Send eight copies of
your manuscript no later than 15 March 1986 to:

    Robert M. Keller
    SLP '86 Program Chairperson
    Department of Computer Science
    University of Utah
    Salt Lake City, UT 84112

Acceptances will be mailed by 30 April 1986. Camera-ready copy will be
due by 30 June 1986.

Conference Chairperson
    Gary Lindstrom, University of Utah

Exhibits Chairperson
    Ross Overbeek, Argonne National Lab.

Tutorials Chairperson
    George Luger, University of New Mexico

Local Arrangements Chairperson
    Thomas C. Henderson, University of Utah

Program Committee
    Francois Bancilhon, MCC
    John Conery, University of Oregon
    Al Despain, U.C. Berkeley
    Herve Gallaire, ECRC, Munich
    Seif Haridi, SICS, Sweden
    Lynette Hirschman, SDC, Paoli
    Peter Kogge, IBM, Owego
    William Kornfeld, Quintus Systems
    Gary Lindstrom, University of Utah
    George Luger, University of New Mexico
    Rikio Onai, ICOT/NTT, Tokyo
    Ross Overbeek, Argonne National Lab.
    Mark Stickel, SRI International
    Sten Ake Tarnlund, Uppsala University