IRLIST Digest     December 1989     Volume VI Number 2 Issue 2
***************************************************************
IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
Clifford Lynch   lynch@postgres.berkeley.edu   calur@uccmvsa.bitnet
Mary Engle       engle@cmsa.berkeley.edu       meeur@uccmvsa.bitnet
Nancy Gusack     ncgur@uccmvsa.bitnet

The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues.

These files are not to be sold or used for commercial purposes. Contact Mary Engle or Nancy Gusack for more information on IRLIST.

The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.

Continued from Volume VI Number 1, Issue 1
***************************************************************
I. NOTICES (last minute urgent additions!)
   A.6. Interactive Digital Video-Technology and Applications for the Next Generation of Multimedia Systems, December 13, 1989
   A.7. Hypertext Standardization Workshop, January 16-18, 1990

II. QUERIES: Questions and answers / Requests for information
   B.6. Finder LAYO Resources
   B.7. Large-Capacity Graphics Program
   B.8. Mac SNOBOL
   B.9. Hypertext
   B.10. Multimedia Information Management Systems
   B.11. Test Collections
   B.12. Pascal Programs - Metric Analysis
   B.13. UNIX Retrieval Engine
   B.14. Free-Text Search Software
   B.15. Test Collections
   B.16. Digital Data to Holography
   B.17. Neural Nets and IR
   B.18. ASCII Dictionary
   B.19. Personal Name Matching Algorithms

III. JOB ANNOUNCEMENTS:
   1. Institute of Systems Science, National University of Singapore
   2. AI/Computer Science, Griffith University, Australia
***************************************************************
I. NOTICES:

I.A.6.
Fr: Fazli Can
Re: Multimedia Systems

MIAMI UNIVERSITY
Systems Analysis Department
Oxford, OH 45056

ANNOUNCES
A Faculty Colloquium Presentation
By Guest Speaker
EDWARD A. FOX
Department of Computer Science
Virginia Polytechnic Institute and State University

INTERACTIVE DIGITAL VIDEO-TECHNOLOGY AND APPLICATIONS FOR THE NEXT GENERATION OF MULTIMEDIA SYSTEMS

Date: Wednesday, December 13, 1989
Time: 4:00 pm
Place: Room 135 Kreger Hall

ABSTRACT

Due to revolutions in storage and processing with digital computer systems, we are at the threshold of a new generation of interactive technologies that integrate text, images, graphics, animation, motion video, and audio for education, entertainment, point-of-information, design, simulations, and a variety of other purposes. This presentation will explain some of the basic technology involved, including the use of CD-ROM systems, and focus on DVI, Digital Video Interactive, now being developed by Intel, whereby an hour of full motion video can be played back from a compact disc. The presentation will include slides, videotapes and discussion, and will be oriented toward interested parties with background in computer science, library science, and educational technologies.

**********

I.A.7.
Fr: furuta@mimsy.umd.edu (Richard Furuta)
Re: Hypertext Standardization Workshop, January 16-18, 1990

- CALL FOR PAPERS -

Hypertext Standardization Workshop
sponsored by the
National Institute of Standards and Technology
National Computer Systems Laboratory
January 16-18, 1990

Hypertext and Hypermedia technologies have reached the point where it makes sense to consider their potential for formal standardization. A number of authors have stated requirements for hypertext standards and some have offered definitions and initial specifications for consideration. In several cases, specialized standardization efforts have already been initiated through interested organizations.
The purpose of this workshop is to provide a forum for presentation of existing and proposed approaches to hypertext standardization in a setting where authors can expect immediate feedback and possible definitive action on their ideas. Workshop goals are to consider hypertext system definitions, to identify viable approaches for pursuing standards, to seek commonality among alternatives wherever possible, and to make progress towards a coordinated plan for standards development, i.e., a hypertext reference model.

A workshop proceedings will capture position papers provided as input, as well as the deliberations and conclusions of working groups set up in response. An additional workshop output could be the initial draft of a candidate hypertext reference model and an organizational structure for its further development.

We are seeking a small number of detailed position papers that address hypertext standardization. Papers may focus on global issues such as abstract specifications, classification schemes, interface techniques, exchange mechanisms, or discussion of which hypertext components may or may not be appropriate for standardization. Papers that propose specific requirements, definitions, or component specifications, or demonstrate how hypertext standards might interface with existing graphics, image, document, database, or language standards are also welcome. Even papers that question the wisdom of hypertext standardization or assert that standardization is premature are welcome.

Papers with general interest and substantial content will be selected for presentation at a workshop plenary session and be the subject of a follow-on working group session. Other contributions will be identified for discussion and consideration by one or more working groups.
Papers, or detailed abstracts, should be submitted before December 8, 1989, to:

Hypertext Standardization Workshop
Attn: Leonard Gallagher
National Institute of Standards and Technology
Technology Bldg 225, Room A-266
Gaithersburg, MD 20899
Telephone: 301-975-3251
Facsimile: 301-590-0932
E-mail: gallagher@ise.ncsl.nist.gov

Authors of papers selected for presentation at the plenary session will be notified by December 21, 1989. All other contributions will be identified in a document register and made available for working group discussions.

Workshop sessions will be held at NIST in the main Administration building. All parties interested in Hypertext/Hypermedia standardization are urged to attend. Participants must register in advance for this workshop and pay a modest registration fee to cover meeting expenses. Registration details will be provided in a separate announcement, but interested parties may write to the above address or contact Dan Benigni at 301-975-3266 or Jean Baronas at 301-975-3338.

NIST is located in Gaithersburg, MD about 15 miles North of Washington, DC on Interstate 270 and 45 to 55 minutes driving distance from any of the three Washington/Baltimore area airports. A number of hotels ranging from very economical to full business class are located nearby.

**********

II. QUERIES:

II.B.6.
Fr: Tom Coradeschi
Re: Finder LAYO Resources

Back when system release 6 came out, there was a fair amount of discussion as to the purpose of several of the radio button selections in the Finder LAYO resource. At the very end, are four: Use Phys Icon, Copy Inherit, New Fold Inherit and Title Click. I know that Title Click allows you to pop one level up in the hierarchy, by clicking on the title bar of an open folder to move up to its parent, but what do the others do? I seem to recall some mutterings that they only worked on SEs or MacIIs, but now that I have an SE (finally!!), they still do what they did on my Mac+. Nothing. Suggestions?
Tom C

Bill the Cat sez: "Remember. If some weirdo in a blue suit offers you some MS-DOS. JUST SAY NO!"
ARPA: tcora@pica.army.mil
UUCP: ...!{uunet,rutgers}!pica.army.mil!tcora

**********

II.B.7.
Fr: jpl06!john@jato.jpl.nasa.go
Re: Looking for large-capacity graphics program

I am looking for a Mac application that will allow me to display and scroll through a very large time series (> 100,000 y-values, equally spaced in the time index, t). I would like to be able to display all or part of the time series, zooming in and out (perhaps defining the region to be displayed via a click-and-drag box around the "interesting" part?). It would also be nice to be able to force plotting scales, mark points as "interesting" or "to-be-deleted", and have a "bad-data" flag associated with each point (or a time interval) so that points thus flagged are not plotted.

Has anyone heard of or used such a program? Thanks in advance.

John Armstrong
jpl06!john@jato.jpl.nasa.go

**********

II.B.8.
Fr: "Jeff Balvanz"
Re: Looking for Mac SNOBOL

To all: we are looking for a Macintosh implementation of either SPITBOL or SNOBOL. A check of our software catalogs found no commercially available interpreters for either. Does anyone know of anyone offering either language for the Mac, either commercial or public domain?

Thanks to all for the assistance.

Jeff Balvanz
Senior Technical Consultant
Microcomputer Services
Iowa State University Computation Center
BITNET: GR.JLB@ISUMVS (preferred)
INTERNET: GMMPC@CCVAX.IASTATE.EDU
PHONE: (515) 294-8683
USMail: 104 ATANASOFF* HALL, ISU, AMES, IA 50011
*Inventor of the digital electronic computer.

**********

II.B.9.
Fr: Ben Shneiderman
Re: Information search

I have not heard from Bernard Rous with plans to give professional credit for the Hypertext on Hypertext project to myself, Nicole, etc. I consider this extremely important in setting a proper precedent for the field. I hope to hear from you with some response to this issue.
We continue to have fun with the hypertext issue and are pursuing several directions:
- multi-window browsing
- automatic import
- graphic browser ideas
- more elaborate search strategies
plus some fun hypertext creation projects...our Hypertext Hands-On! project will be published by Addison Wesley in March...it will be a book/disk with Greg Kearsley.

Best wishes for 1989...Ben

**********

II.B.10.
Fr: cordes@dbsinf6.bitnet
Re: Keywords: Multimedia Information Management Systems (Products and Prototypes)

Hello out there in the IR community,

I'm searching for some information about multimedia information management systems. In particular, I'm searching for so-called kernel systems, like non-standard DBMSs or object bases (like ORION), which are able to support multimedia applications. My special interests are prototypes or products which are available, running under UNIX on SUN 3/xxx, SUN 4/xxx or Apple Mac II.

Thanx in advance
Cheers
Ralf Cordes

Ralf Cordes
Postal Address:
Technische Universitaet Braunschweig
Inst. fuer Betriebssysteme und Rechnerverbund
Bueltenweg 74/75
D -- 3300 Braunschweig
WEST -- GERMANY
USENET: mcvax!unido!infbs!cordes
BITNET: CORDES@DBSINF6.BITNET

**********

II.B.11.
Fr: LEWIS@cs.umass.EDU
Re: For IRLIST: Query on availability of test collections

In December I attended the Workshop on Evaluation of NLP Systems, and found there was interest on the part of several NLP researchers in obtaining IR test collections to be used in experiments on natural language analysis. In response to this I made sure that all participants received information on the CD-ROM that Ed has put together, but I also want to create a list of other sources that are available.
So if you have a test collection you would be willing to distribute, I would appreciate it if you could send me the following information: some basics about the collection (number of queries, number of documents, type of text), address and name of the contact to get the collection, medium it is available through (tape format, CD-ROM, FTP, etc.), and any charges or conditions associated with obtaining the collection.

Also please specify whether you want this information distributed only to the (approximately 50) participants in the NLP workshop, or whether you are willing to have it posted on IRLIST as well. (There has been some concern from institutions I've contacted directly about getting swamped with requests. I consider this unlikely, but I'm being very careful to respect people's wishes.)

It is worth pointing out that getting standard test collections into the hands of NLP researchers will hopefully result in parsed versions of some of these collections, thus allowing IR researchers without access to NLP software to do research on NLP applications to IR.

Best, Dave

David D. Lewis
ph. 413-545-0728
Computer and Information Science (COINS) Dept.
University of Massachusetts, Amherst
Amherst, MA 01003 USA
BITNET: lewis@umass
ARPA/MIL/CS/INTERnet: lewis@cs.umass.edu

**********

II.B.12.
Fr: pruitt@PEDEV.Columbia.NCR.COM (Daniel Pruitt)
Re: Pascal Programs -- Metric Analysis

Summary: Need working programs from ugrad/grad classes.

My thesis is a Complexity Analysis of Software. I am using very basic metrics that were developed by Halstead and McCabe. I need working Pascal programs to collect various complexity metrics. I intend to compare the results of my data with results published in the literature. This study is merely exploratory data analysis. I am not attempting to validate these measures, but to see if I come up with similar results.

The programs I need must be programs developed in an undergraduate or a graduate class.
Please send a description of the class along with the Pascal source code.

Daniel Pruitt
...!ncrcae!pedev!pruitt
pruitt@pedev.columbia.ncr.com
Work: (803) 739-7360
Home: (803) 254-7024

**********

II.B.13.
Fr: "Robin C. Cover"
Re: UNIX RETRIEVAL ENGINE SMART

I posted a query on the HUMANIST discussion group asking for recommendations on a good UNIX-based full-text retrieval engine. I should have started with you. Someone mentioned that you implemented SMART under UNIX. That was BSD, I assume? Unfortunately, we have HP-UX (2.1) running on an HP 9000/840s, and it's System V with a few BSD extensions. In any case:

(a) could you point me to some information on SMART
(b) could you indicate whether you think we could run it under HP-UX?
(c) is it affordable?

Alternately, do you know of other candidates we might look at? The applications would be pretty general (ASCII text documents with very simple structure, like electronic digests, bibliographies, etc.). But if the engine is really good, I would press it into service for searching in literary text too, using finer levels of control on proximity, precedence and document structure.

I am also looking forward to your experimental CDROM, for which I sent a request some months ago.

Thank you for your help with the above details. I am grateful.

Professor Robin C. Cover
Director, Academic Computing (Dallas Seminary)
3909 Swiss Avenue
Dallas, TX 75204
214/296-1783(h), 824-3094(w)
zrcc1001@smuvm1.BITNET
convex!txsil!robin.UUCP
killer!dtseap!robin.UUCP

**********

II.B.14.
Fr: Beng
Re: Information search

I would like to find out from the other subscribers whether they know of any free-text search software package. I am designing a database application (1000 records daily) in which the records are not stored with any indexing or assigned keywords. I have heard of "N-grams" and "superimposed coding", whereby a record signature is generated automatically by the system when a text record is inserted into the database.
It is possible to search on any word inside the record. I am very interested in acquiring such software.

Thank you very much for your attention.

Regards,
Beng

**********

II.B.15.
Fr: Kim Young Hwan
Re: Information search

I am writing to you to inquire about the availability of test collections of documents with queries and a related hierarchical thesaurus. For the past five months, I have been working on designing a knowledge-based Information Retrieval model based on a hierarchical thesaurus. The hierarchical thesaurus represents the "generalization" relationship among index terms. I developed a matching algorithm based on conceptual distance, using the hierarchical thesaurus (knowledge base).

I now need to evaluate my model and matching algorithm. Therefore I need test collections with which I can compare the performance of my model with existing models and with the output of human retrieval ranking. But here in Korea, I can't find a relevant test collection. If you can send me some test collections, it will be very helpful to my thesis research. If you don't have any, could you tell me where I can get them?

I am particularly interested in getting two test collections: one indexed by a binary indexing scheme and the other indexed by a weighted indexing scheme. A sample query with vector representation is enough. Of course, the associated hierarchical thesaurus is also needed. Because I want to compare with the output of human retrieval ranking, a collection about computer science or artificial intelligence is preferred. The computing facilities I can use are UNIX-based workstations (SUN-3, HP-9000,..), a minicomputer (VAX-11), and a LISP machine (Symbolics 3650). If needed, don't hesitate to bill me for expenses.

Thanks in advance. I am looking forward to hearing good news from you soon.

Sincerely.
Young Whan Kim.

**********

II.B.16.
Fr: dave davis
Re: Information search

My particular interest is in translating digital data into holographic form for storage on optical media.
I have done some searching in the technical literature, but I haven't found anything that is close to this idea. I suspect that storing a holographic image of information could take advantage of holograms' built-in redundancy. For example, damage to a part of the data would result in very little practical information loss. It is also possible that greater densities of storage might be achievable.

I am interested in knowing if anyone has a similar research interest within the IR interest group.

Dave Davis
MITRE Corp.
7525 Colshire Dr.
McLean, VA 22102

**********

II.B.17.
Fr: munnari!trlamct.oz.au!andrew@uunet.UU.NET (Andrew Jennings)
Re: neural nets and information retrieval

What I need: Being one of the world's most disorganised people, I have information, letters, and papers stacked everywhere within my home directory. At least all of the stuff is on the computer. What I'd really like is a computer-based tool that looks for the file, letter or document I'm after. Grep based on keywords is too dumb. What I'd like is a tool that selects likely documents based on their content, given some NL from me. It doesn't matter too much if I get two documents and one of them is not what I want.

Possible links: Someone suggested to me that neural nets may be useful here.

If someone has already developed such a thing: please mail me. If you are doing research in this, again I'd like to hear from you. I'll summarise replies.

ARPA: andrew%trlamct.oz@uunet.uu.net
fax : 61 3 543 8863
Andrew Jennings
AI Systems
Telecom Australia Research Labs

**********

II.B.18.
Fr: beng
Re: ASCII DICTIONARY

I am interested in acquiring a machine-readable dict/thes/synonyms for my projects. Could you kindly inform me if you know of any existing products available on the market? I am particularly interested in products that allow spell-checking and word-searching facilities to be incorporated into my programs.

Thank you for your attention.

Best Regards,
Beng

**********

II.B.19.
Fr: Christine Borgman, UCLA: IIN4CLB@UCLAMVS.BITNET
Re: Personal name matching algorithms

I am doing research on the problem of matching variant forms of personal names in medium to large size databases. The immediate problem is methods of bringing together variant forms of artists' names (mostly European painters) in a museum database. Not only do the names include all the variants that have been introduced over several centuries, but the forms changed as the paintings were sold in different countries -- German names became Italianized, etc. We need to be able to recognize whether two names may indeed be the same artist.

While the immediate problem applies to museum collections, it is a problem critical to information retrieval in libraries (author names) and to other personal-name databases such as police, government, credit, airlines, personnel files, etc.

Through the literature and my own experience I have identified some n-gram matching algorithms, phonetic algorithms (e.g. Soundex), and data-driven techniques that check for known likely errors (transposition, omissions, insertions, etc.). I want to identify any operational algorithms for personal name matching that might be applicable to the artists' name problem, as well as any new theoretical approaches.

If anyone is working on this problem, has an operational or experimental algorithm, or knows anyone who does, I'd be pleased to hear from you soon! I will post a summary of any messages I receive. Thanks in advance.

***************************************************************
III. JOB ANNOUNCEMENTS

III.1.
Fr: Desai
Re: Help in recruitment

"The Institute of Systems Science (ISS) is an applied research unit attached to the National University of Singapore. It has 50 people in its R&D group and is poised for further growth. Its research focus is primarily to build prototypes of multimedia, multilingual systems.
In the past it has developed many interesting prototypes including multimedia DBMS, Visual Language interfaces to database systems, Data models, Computer aided Chinese Input System, Animation languages, hypermedia etc.

We are now planning to launch a major project in Multimedia Information Retrieval. The scenario is to store hundreds of thousands of images and their descriptions and build a system that will allow retrieval of suitable pictures based on a variety of queries. The system is likely to include optical juke boxes, free text search software, image browsers and other relevant features.

We are now looking for graduates who have specialized in information retrieval techniques or systems.

Singapore is located very strategically and enjoys a sunny climate throughout the year. Temperatures range from 28 to 35 degrees Celsius. It is a gateway to the rest of Asia. Singapore provides the best of east and west - good amenities and a host of different cultures. The Singapore dollar is very steady and is valued roughly at 50 US cents. Our compensation is among the best in the region.

For more details please contact Desai Narasimhalu, BITNET ID ISSAD at NUSVM."

I hope that it is not too long. Thanks for your help.

With best regards
Desai

**********

III.2.
Fr: Emma Pease
Re: Job announcement

The following might be of interest to people involved in CSLI. The University where I work in Australia is currently looking for more faculty. There are 2 chairs in AI/Computer Science, plus positions for lecturers and senior lecturers (which are like assistant and associate professors). The ad follows. If anyone would like more details you can email me at ford@csli.

GRIFFITH UNIVERSITY
Computing and Information Technology
Brisbane, Queensland, Australia

TWO CHAIRS
SENIOR LECTURER/LECTURERS

Griffith University has an innovative and rapidly expanding degree programme in Computing and Information Technology.
The University is currently looking to make a number of senior academic appointments to support new initiatives in Artificial Intelligence, Software Engineering, and other aspects of this programme. Currently there are twenty four faculty positions associated with teaching in the School of Computing and Information Technology's degree programme, the Bachelor of Informatics Degree. Major areas of study offered in the degree programme are Artificial Intelligence, Software Engineering, and Microelectronics and Information Systems. Honours, MPhil and PhD degree programmes are also offered.

Major areas of research include programming methodology, the formal aspects of program specification and derivation, declarative programming, theoretical data flow studies, expert systems, natural language processing, artificial intelligence, computing education, determinants of success and failure in systems development, human-computer interaction and the social, ethical, and organizational implications of information technology.

POSITIONS AVAILABLE:
% Professor (2 positions Artificial Intelligence/Computing Science)
% Senior Lecturer
% Lecturer several positions

AREAS WHERE POSITIONS ARE AVAILABLE:
% Artificial Intelligence/Cognitive Science (principally Knowledge Engineering/Expert Systems)
% Software Engineering
% Project Management
% Networking/Distributed Systems
% Organizational Systems
% Organizational Psychology
% Human Computer Interaction
% Social and Philosophical Aspects of Computing
% Data Base Systems

Applications from other areas of Computing Science will also be considered.

Applications from both women and men are encouraged. Employment benefits include parental leave and the possibility of access to full-time child care and after school care.
Applications, which should include a curriculum vitae and the names and addresses of at least two referees, should be addressed to the Secretary, Senior Selection Committee (Professorial applicants) or the Divisional Administrator, Division of Science and Technology (other senior appointments), Griffith University, Nathan, Queensland, 4111, Australia. Telephone enquiries should be directed to the Head of School, School of Computing and Information Technology (Professor Geoff Dromey) on (07) 274 8010.

Closing date for the Chair Posts is May 31, 1989.

INFORMATION FOR APPLICANTS
(Contains some information relevant only to the Chair positions -- but of interest to others)

Griffith University is located on the southern outskirts of Brisbane (11 km from the City Centre) in the State of Queensland, Australia. Brisbane has a population of 1,171,340. The University, which is growing rapidly, currently has nearly 5,000 students.

The Division is the basic academic organizational unit at Griffith University rather than the traditional discipline-based department. Each Division brings together faculty staff with training in different academic disciplines, who focus their research and teaching activities on problems of interest to the Division. The existing five Divisions within the University are: Australian Environmental Studies, Commerce and Administration, Humanities, Asian and International Studies, and Science and Technology.

There are two Schools within the Division of Science and Technology, the School of Computing and Information Technology and the School of Science. The School of Computing and Information Technology is responsible for a Bachelor of Informatics degree while the School of Science offers a Bachelor of Science degree. MPhil and PhD programmes are also offered. There are over 400 students enrolled in the Bachelor of Informatics programme.
The Bachelor of Informatics degree programme in the School of Computing and Information Technology involves 3 years of full-time study. In the foundation year students concentrate their studies in three primary areas - mathematics and programming methodology, group dynamics and communication skills, and the social and organizational implications of Information Technology. In the second and third year, students do core studies plus a specialization in either Software Engineering, Artificial Intelligence, or Microelectronics and Information Systems or specializations in any two of these areas. A new specialization in Microelectronics is planned.

The staff responsible for teaching in the School of Computing and Information Technology have recently moved into a new Technology Building. The Technology Building is shared with staff engaged in Microelectronics and Biotechnology teaching and research.

Current research interests of staff include the formal aspects of program specification and derivation, declarative programming, theoretical data flow studies, expert systems, natural language processing, artificial intelligence, computing education, determinants of success and failure in systems development, human-computer interaction and the social, ethical, and organizational implications of information technology.

The State Government Department of Industry Development is establishing a Research Park next to the University campus which is designed to provide both an incubation centre and a setting for research laboratories of "high-tech" companies. A close liaison between the University and firms located in the Park is seen as highly desirable.

The professorial salary in Australia is presently $58,348 per annum. The University will assist with the cost of fares and removal expenses and housing. The successful applicant will be required to join the Superannuation Scheme for Australian Universities.
Form of Application

Applicants are requested to set out the following information in tabular form:
(a) Full name, address and telephone number (also contact address for the next three months), telex, FAX, and/or electronic mail if available.
(b) Country of permanent residence.
(c) Date and place of birth.
(d) Citizenship.
(e) Marital status and names and dates of birth of dependent family.
(f) Present appointment. Notice required.
(g) Details of education and professional training and qualifications, including summary of undergraduate academic record.
(h) Details of teaching, research and industrial experience.
(i) Research interests and list of publications.
(j) Any other relevant information such as offices held in professional bodies, community service, etc.
(k) Name and address of at least three referees to whom the University may write direct.

Candidates are also asked to state their views on interdisciplinary and problem-oriented teaching and research; and to indicate the contribution they would expect to make in a team engaged in work of this kind.

Applications should be submitted by 31 May 1989 to:

The Secretary
Senior Selection Committee
Griffith University
Nathan Qld 4111
AUSTRALIA

***************************************************************
Continued in Volume VI Number 3, Issue 3
***************************************************************