IRList Digest           Tuesday, 11 November 1986      Volume 2 : Issue 58

Today's Topics:
   Query - References on text processing/NLP approaches to IR
   Announcement - New Digest for Writers and Educators
   Seminar - Knowledge Gateways: Building Blocks and Beyond
   Article - Y. Choueka of RESPONSA Project at Bellcore on Sabbatical

News addresses are
   ARPANET: fox%vt@csnet-relay.arpa
   BITNET: foxea@vtvax3.bitnet
   CSNET: fox@vt
   UUCPNET: seismo!vtisr1!irlistrq

----------------------------------------------------------------------

Date: Mon, 3 Nov 86 13:18:01 CST
From: Raymonde Guindon
Subject: references/bibliography for text processing approaches
Cc: guindon%sw.mcc.com@mcc.com

I am looking for any references on text processing/natural language processing approaches to retrieval of information from documents. If such a bibliography has not been done, I'd volunteer to create it from the received responses. One example of such work is by Michael Mauldin (IRList, May 22).

Thanks
Raymonde Guindon

[Note: this is a broad subject area! When you say "text processing" you open up discussion to most of the information retrieval work done in past decades, and when you mention NLP that might suggest a wide area of AI work. Can you clarify, or do you really want everything implied by your statement? Issues 52-54 had references on automatic indexing; are you focusing only on retrieval or on document analysis and indexing too? Are you concerned with bibliographic references, abstracts, full text, or passages in full text? I suggest that whatever you produce be somehow organized -- maybe you might make it in the form of a course outline with references attached at the leaves of the outline tree. I suggest that, along with people sending you items, we have more discussion by IRList participants on how to develop such bibliographies as an ongoing service to the community. I believe Dr. J.
Deken of NSF was interested in development of such bibliographies, and that a truly cooperative effort might be very beneficial. I would be happy to have a very long bibliography published occasionally in SIGIR Forum if that will help the process. - Ed]

------------------------------

Date: Thu, 30 Oct 86 09:09 EDT
From: (Composition Digest)
Organization: University of Louisville
Subject: New Moderated Digest for Writers and Educators Announced

This is to announce a new interest group devoted to the study of computers and writing, specifically writing instruction in computer-based classrooms. We are interested in articles pertaining to, but not limited to, the following topics:

   Human factors research and writing environments
   Text editor design
   Natural Language adjuncts to writing instruction
   Computers and the soft sciences
   Psychological effects of computer writing/instruction
   Composition theory applied to computer-based instruction
   Anecdotal accounts of computer writing experiences
   Using the NET in the classroom
   Computer-based conferences
   Public domain software for the classroom
   Reviews of writing and editing packages
   Conference announcements and proceedings
   Telecommunications and its effects on language
   Writing without paper
   Computers and hearing impaired students
   Computers and learning disabled students
   Computers and basic writers
   Computers and humanists
   Computers and writing professionals

This will be a moderated newsgroup with issues released weekly. It will be a forum for writing professionals (those who must use computers for their writing) and computing professionals (those who design the hardware and software that writers depend upon) to meet and discuss issues relevant to both fields, but we welcome notes from novice computer writers.
Notes and requests to be included on a mailing list should be sent to

   BITNET: compos01@ulkyvx.bitnet
   ARPA: compos01%ulkyvx.bitnet@wiscvm.arpa

------------------------------

Date: Mon, 13 Oct 86 01:47:00 EDT
From: seismo!allegra!hoqam!wbf
Subject: Misc., Seminar

Ed,

Thanks for forwarding the messages about the reuse paper. Have you sent copies to the two people who were interested, or should I? By the way, the paper has been accepted for the HICSS conference next January. Brian and I are planning to rewrite parts of the paper, and also do further work in this area. I'll keep you posted.

[Note: the paper mentioned appeared in Issues 47-49 and should have been received by all. - Ed]

..

Don Hawkins and Louise Levy of the library system were down last week and gave us an interesting talk on knowledge gateways. I've included the abstract of their talk.

..

Bill

========================================

QAC RESEARCH SEMINAR

TITLE: Knowledge Gateways: Building Blocks and Beyond

SPEAKERS: Donald T. Hawkins and Louise R. Levy
          AT&T Bell Laboratories
          Murray Hill, NJ 07974

DATE: Friday, October 10, 1986
TIME: 2 pm
LOCATION: HO 2N-431

ABSTRACT

Technological advances over the past two decades have made data retrieval faster and easier. Some progress has even been made towards increasing the relevancy of the data obtained by the information user. Meanwhile, a whole industry providing access to business, professional, and sci/tech information electronically has sprung up. Despite these activities, information sources remain scattered, hard to find, and difficult to access. It remains the task of technology and visionary individuals to build knowledge gateways capable of leading knowledge-seekers to the needed information, wherever it may be stored.
This talk will discuss the following topics relating to gateways:

   o Definition
   o State of the art
   o Technologies (building blocks)
   o Examples of some gateways
   o Visions of the future

SPONSOR: Bill Frakes HO 2H-530 x7186 hoqam!wbf

------------------------------

Date: Wed, 22 Oct 86 15:37:48 edt
From: choueka@thunder.bellcore.com (Yaacov Choueka)
Subject: Re: research in IR

Thanks for your note and sorry for my late response. I found what I was looking for in a note by Salton in Forum 1980.

[Note: this refers back to the question in Issue 55 - Ed]

..

I have been at Bellcore since September and will be here for one year. I certainly hope to be able to visit a few institutions in the US, and to have some talks about problems of mutual interest. Where exactly in Virginia are you located?

By the way, it is now possible to connect to the database in Israel from anywhere in the US using a simple PC with a modem and a regular telephone line, via Telenet or Tymnet, and I am connected, for example, from Bellcore, so live demonstrations of the system can be given.

I am appending a memo sent recently to a few people in Bellcore, which will remind you of us. Best regards.

LET'S GET ACQUAINTED

I just arrived from Israel, on a sabbatical from the Dept. of Math. and Comp. Sc. at Bar-Ilan Univ., invited by Don Walker to spend a year of research at Bellcore. (You can easily recognize me by the knitted "Kippah" I usually have on my head.) Judging from the few days I have already been here, I do have wild expectations for a very interesting and fruitful year. I am already overimpressed by this Garden of Eden of equipment and hardware, in which so many Suns are shining, and in which no fruit seems to be forbidden, except maybe for (what else) Apples... I am sure I am going to learn a lot here, although, hopefully, it will be a mutually enriching experience.
I am bringing with me almost twenty years of experience in teaching and research in computer science, some of it (in the early years) in finite automata and formal languages theory, but most of it in information retrieval, computational linguistics and text processing.

I was part of the team that initiated the RESPONSA project back in 1966, and I have served as its Director and Principal Investigator since 1975. This is basically a full-text retrieval system, one of the very first to be operational on a sizable database. The batch version was ready in 1968, and the on-line version in 1980.

The database consists today of some seventy million words of running text in Hebrew, representing major parts of the Jewish Heritage and classical writings. The largest part of the database is the Responsa ("questions and answers") collections, containing the full and unaltered text of 250 volumes (50,000 documents) of Rabbinical "answers", each document being in fact a juridical decision given according to the Jewish-Talmudic legal system, and related to an actual case presented to a Rabbinical court or brought to the attention of a prominent Rabbi, from various Jewish communities all over the world. The database, spanning about a thousand years and originating from more than thirty different countries, is a fantastic storehouse of information on Jewish law, history, philosophy, ethics, local customs and folklore.

The system is available on-line 24 hours a day, six days a week (never on Saturdays and Jewish Holidays!) and can be accessed by PCs with regular telephone lines via telecommunications networks (Tymnet or Telenet, Isranet, etc.). It is routinely accessed by tens of workstations in Israel, as well as from Chicago, Los Angeles and London. It is expected that in the next couple of years hundreds of terminals will be connected to the system in major universities, libraries, information centers, Jewish institutions, etc., in the United States and in Europe.
I hope to be able to connect to the system and to demonstrate it here in Bellcore soon. The software, by the way, has been adapted for English databases too, and can handle quite large ones (several hundreds of megabytes of text).

Besides serving the real needs of a real community of users, the system is also used as an experimental laboratory for information retrieval, computational linguistics and text processing. Three Ph.D. and about fifteen graduate theses were written in this environment. Some of the areas researched are: feedback in document retrieval systems, text compression (on both experimental and theoretical levels), character manipulation, detection and correction of spelling errors, automatic lemmatization, mechanical resolution of morphological ambiguity, retrieval of collocations and idioms, expert systems for citation retrieval, etc.

Among the many unique features available in the system that distinguish it from currently available full-text packages is a full-fledged morphological component embedded in the retrieval part, as well as a subsystem that gives a correct and full morphological analysis of any word in Hebrew (the number of which is estimated to be on the order of a hundred million).

I am now preparing a short report on the Project with some more details on the research associated with it, including references to published papers. The report will be distributed to the 21230 division, but if you know of anybody else who might be interested in it, please let me know. I will be happy to discuss any of the papers to be mentioned in this report with anyone who would be interested, as well as to have informal talks with small groups if there is any echo to this note...

"The beginning of wisdom is to acquire wisdom..." (Proverbs)

Yaacov Choueka (pronounced Shweka)
MRE-2A325, #8295175.