IRLIST Digest   ISSN 1064-6965
January 15, 1996
Volume XIII, Number 3   Issue 290

**********************************************************

II. JOBS
    1. U. North Texas: Dean, SLIS
III. NOTICES
    B. Meetings
        1. Digital Image Access and Retrieval '96
        2. Relevance in Knowledge Representation and Reasoning
IV. PROJECTS
    A. Abstracts
        1. IR-Related Dissertation Abstracts

**********************************************************

II. JOBS

II.1.
Fr: Dr. Rollie R. Schafer
Re: U. North Texas: Dean, SLIS

The University of North Texas invites applications and nominations for the position of Dean of the School of Library and Information Sciences. This 12-month, tenured faculty appointment will be available June 1, 1996.

The Dean reports to the Provost, directs the operation and administration of the School, and serves as its advocate within the University and outside to its external constituencies. As head of the Interdisciplinary Ph.D. Program in Information Science, the Dean coordinates the participation of Business Computer Information Systems, Communication Studies, Computer Science, Library and Information Science, and Technology and Cognition.

The successful candidate must articulate a vision of library and information sciences that demonstrates a broad understanding of the various interdisciplinary components of the field. He or she should have a distinguished record of teaching and research and experience in external funding. The successful candidate must have superior management, leadership, and interpersonal skills. A Ph.D. or applicable terminal degree is essential.

The University of North Texas, located in the Dallas-Fort Worth metroplex, is a metropolitan research university with an enrollment of over 25,000, approximately one-fourth of whom are graduate students. The School of Library and Information Sciences, established in 1939, has 16 full-time faculty and 400 master's and 60 doctoral students.
It operates thriving off-campus programs in Houston and Lubbock and a highly successful continuing education program.

The University will receive nominations and applications until the position is filled. Please send a curriculum vitae and a brief vision statement to:

    Dr. Rollie R. Schafer
    SLIS Dean Search Committee
    University of North Texas
    P.O. Box 13707
    Denton, TX 76203-6707

**********************************************************

III. NOTICES

III.B.1.
Fr: Bryan Heidorn
Re: DPC '96 - Digital Image Access and Retrieval

PRELIMINARY ANNOUNCEMENT

The 33rd Annual Clinic on Library Applications of Data Processing:
Digital Image Access and Retrieval

at the Beckman Institute for Advanced Science and Technology,
The University of Illinois at Urbana-Champaign
March 24-26, 1996

Sponsored by the Graduate School of Library and Information Science and the Beckman Institute.

In the last several years digital images have changed from being an expensive high technology oddity in libraries into a necessity. Libraries are being asked to house and index large digital collections. These collections are now being created in nearly every art and science. Digital image technologies can also improve efficiency in preservation, interlibrary loan and classroom support. This conference will explore digital image technology and the many facets of its impact on the libraries of today and tomorrow.

WHO SHOULD ATTEND: This conference will be of interest to librarians, academic computing staff, digital collection developers, and educators who use visual media.

Sunday, March 24, 1996
    Keynote Address: Howard Besser, Visiting Associate Professor, School of Information & Library Studies, University of Michigan.

Monday, March 25
    8am-5pm  Technical Sessions
    5-7pm    Dinner
    7-9pm    Demonstrations

Tuesday, March 26
    Technical Sessions
    Panel Discussion on Impacts on Libraries

The full conference program will be published on February 1.

GENERAL INFORMATION:

LOCATION: All conference events will take place in the Beckman Institute, a new, high-tech interdisciplinary research institute located on the campus of the University of Illinois, 405 N. Mathews, Urbana, Illinois.

REGISTRATION AND FEES: The fee for the conference is $340 ($380 after March 4, 1996), which includes the Sunday night dinner, refreshments, and a copy of the Clinic proceedings.

REFUNDS: Refunds will be made if you find that you cannot attend and you notify us in writing by March 4, 1996. You must cancel your own hotel reservations.

FOR COMPLETE INFORMATION:

    DPC '96
    Graduate School of Library and Information Science
    University of Illinois at Urbana-Champaign
    501 E. Daniel Street
    Champaign, IL 61820-6211

**********

III.B.2.
Fr: Russell Greiner
Re: Relevance in Knowledge Representation and Reasoning

KR'96 Pre-Conference Workshop on
Relevance in Knowledge Representation and Reasoning

3-4 November, 1996
Boston, Massachusetts

http://www.research.att.com/orgs/ssr/people/levy/rrr-cfp.html

Essentially all reasoning systems use a corpus of information to reach appropriate conclusions. For example, deductive systems use initial theories (possibly encoded as predicate calculus statements) from which they draw conclusions, probabilistic systems use prior distributions (possibly encoded as a Bayesian network) to compute event probabilities, and abductive processes produce explanations based on both background theories and observations.

With too little information, these systems clearly cannot work correctly. Surprisingly, too *much* information is also problematic, as it too can cause significant degradation in system performance. It is therefore critical to determine what information is irrelevant, to know what can be ignored or downplayed when considering a specific task (e.g., a specific query, or distribution of queries, to the system, or a specific observation to be explained).
In some cases, ignoring irrelevant information is needed in order to draw the correct conclusions.

There are many forms of irrelevance. In some contexts, the initial theory may include more information than the task requires, or information at a level of granularity that is more detailed than necessary. Here, the system may perform more effectively if it ignores or deletes certain irrelevant facts or if it ignores certain distinctions made in the representation. Another flavor of irrelevance arises during the course of reasoning: A reasoning process can ignore certain intermediate results, once it has established that they will not contribute to the eventual answer.

This workshop follows the very eclectic 1994 Relevance Symposium, which investigated the notion of relevance across various fields of Artificial Intelligence and Computer Science. The current workshop, however, will focus on the use of relevance in knowledge representation and reasoning, specifically, on understanding different forms of irrelevance, and exploiting this "relevance information" to improve the performance of reasoning systems.

Submissions are requested in areas relating to relevance in KR&R, including, but not limited to, the following:

  o Speeding up inference using relevance reasoning.
  o Relevance in probabilistic reasoning.
  o Relevance in explanation.
  o Relationships between relevance and belief revision and updates.
  o Relevance reasoning as a basis for abstraction and reformulation.
  o Using relevance of information to enable drawing appropriate conclusions.
  o Applications of relevance reasoning.
  o Reasoning about relevance of information, and foundations of relevance reasoning.

SUBMISSION INFORMATION:

Authors wishing to present a paper should submit an extended abstract of at most 5000 words. Accepted participants will be invited to submit full papers for the workshop proceedings, which will be distributed to the workshop participants.
Persons wishing to attend the workshop and not to present papers should submit a 1-2 page research summary that includes a list of relevant publications.

Authors are encouraged to submit PostScript versions of their paper by email to either Russ Greiner (greiner@scr.siemens.com) or Alon Levy (levy@research.att.com). Authors unable to submit by email should send 4 copies of their paper to the address below. All submissions should be received by July 8, 1996.

Please be sure to include e-mail address, telephone number and mailing address of the principal author. In case of multiple authors, please indicate which authors wish to participate. Notification of acceptance or rejection will be mailed to the principal author by August 16, 1996. Camera-ready copies of papers accepted for inclusion in the proceedings will be due September 17, 1996.

ADDRESS FOR HARDCOPY SUBMISSIONS:

    Russell Greiner
    Siemens Corporate Research, Inc
    755 College Road East
    Princeton, NJ 08540-6632

IMPORTANT DATES:

  - Submissions due: July 8, 1996.
  - Notification of acceptance: August 16, 1996.
  - Final version due: September 17, 1996.
  - Workshop dates: November 3-4, 1996.

**********************************************************

IV. PROJECTS

IV.A.1.
Fr: Susanne M. Humphrey
Re: Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being of potential interest to the Information Retrieval (IR) community, resulting from a computer search, using the CDP/Online system, of the Dissertation Abstracts International (DAI) database produced by University Microfilms International (UMI).

Included are accession number (AN); author (AU); title (TI); degree, institution, year, number of pages (IN); UMI order number (DD); reference to the published DAI (SO); abstract (AB); one or more DAI subject descriptors chosen by the author (DE); thesis adviser (AR); and dates associated with the monthly update file (UP).
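
For readers who want to load these citations into their own files, the two-letter field tags make the records straightforward to split by program. The following is only a minimal sketch, not part of the CDP/Online or UMI tooling: the function name `parse_record` is hypothetical, and it assumes that each field starts at the beginning of a line with its tag and that wrapped text is indented.

```python
import re

# Field tags as listed in the paragraph above.
TAG_LINE = re.compile(r"^(AN|AU|TI|IN|DD|SO|AB|DE|AR|UP)\s+(.*)$")


def parse_record(text):
    """Split one citation into a {tag: value} dictionary.

    Assumption: a tag counts only when it starts in column 0;
    indented lines are treated as continuations of the previous field.
    """
    record = {}
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue  # skip blank lines
        match = TAG_LINE.match(line)
        if match:
            current = match.group(1)
            record[current] = match.group(2)
        elif current is not None:
            record[current] += " " + line.strip()
    return record


# Abbreviated sample built from one of the citations in this issue.
sample = """\
AN AAI9516842
AU Hull, David A.
TI INFORMATION RETRIEVAL USING
   STATISTICAL CLASSIFICATION.
DD Order Number: AAI9516842.
DE Statistics. Information Science.
UP 9506. Revised: 950629.
"""

record = parse_record(sample)
print(record["TI"])
```

An unindented continuation line that happens to begin with a tag followed by a space (for example a title line starting with "AN " or "IN ") would be misread as a new field, so the layout assumption should be checked against the actual file before relying on this.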

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-343-5299; fax: 313-973-1540. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN AAI9517468
AU Jantan, Jaafar.
TI DIFFERENCES AND SIMILARITIES IN TEACHERS' INFORMATION EXPLORATION STRATEGIES FOR LESSON PLANNING USING THE PHYSICS INFOMALL: A LARGE PHYSICS DATABASE ON CD-ROM.
IN Thesis (PH.D.)--KANSAS STATE UNIVERSITY, 1994, 166p.
DD Order Number: AAI9517468.
SO Dissertation Abstracts International. Volume: 56-01, Section: B, page: 0286.
AB Physics teachers in high schools and elsewhere have a wide range of specialization and teaching experiences. A recent survey found seventy-five percent of physics teachers do not earn their degrees in physics. Physics InfoMall, a resource on CD-ROM was developed to provide these teachers and other physics educators access to a wide variety of teaching materials. Along with a search engine, Physics InfoMall included materials such as textbooks, reference books, laboratory and demonstration books and activities, indexes and selected articles to physics education journals, pamphlets and other documents related to teaching. Field test versions of the Physics InfoMall were distributed to teachers around the country and feedback on the use of the Physics InfoMall for lesson planning was collected over a period of a year.
This study investigated how teachers with different specialization and experiences used the Physics InfoMall. Specifically, this study probed into teachers' choice of lesson components, stores entered, query modes and information seeking categories when preparing lesson plans. Overall, demonstrations, laboratories, lectures, and teacher background readings were the primary lesson components chosen and these choices were highly reflected in the stores teachers entered for shopping. Additionally, teachers queried for information using Boolean searching but failed to refine their searches by searching in selected fields of documents. Comparisons were made between crossover teachers (little or no physics background) and prepared teachers (significant physics background), between teachers with different teaching experiences and between crossover and prepared teachers with similar teaching experiences. Differences were tested for statistical significance using z-test for proportion, ANOVA, Chi Square and t-test for independent samples. Crossover teachers queried materials for lecture more than prepared teachers but no differences were observed in their query modes. On average, crossover teachers chose more lesson components and spent more time than prepared teachers for each observation. Teaching experiences were found to have no influence on choice of lesson components, stores entered, query modes and information seeking categories. However, comparisons between crossover and prepared teachers with similar teaching experiences revealed a number of variables showing significant differences.
DE Physics, General. Education, Sciences. Education, Technology.
AR Zollman, Dean.
UP 9506. Revised: 950629.

AN AAI9516842
AU Hull, David A.
TI INFORMATION RETRIEVAL USING STATISTICAL CLASSIFICATION.
IN Thesis (PH.D.)--STANFORD UNIVERSITY, 1995, 166p.
DD Order Number: AAI9516842.
SO Dissertation Abstracts International. Volume: 56-01, Section: B, page: 0329.
AB In the classical information retrieval (IR) problem, the system must find all documents in a collection that are related to a topic defined by a user's query. A common approach to the IR problem is to represent documents and the query as vectors of term frequencies and rank the documents in the collection according to their inner product similarity with respect to the query. When a sample of evaluated documents is available in addition to the query (often called routing), the problem can be attacked using techniques based on statistical classification. In order for statistical classification to be a feasible approach, the system must produce a relatively small set of high quality feature variables. It turns out that individual words, due to their quantity and ambiguity, are not optimal features. Previous work has focused on a technique known as Latent Semantic Indexing (LSI), which applies the singular value decomposition to a term-document matrix and represents terms and documents by linear combinations of orthogonal indexing variables. The research presented in this thesis accomplishes the following goals. It provides a thorough discussion of evaluation in information retrieval experiments. It introduces the concept of a local LSI decomposition. LSI is used separately on a set of documents in the local region surrounding each query, creating query-specific feature variables and making the LSI technique feasible for very large document collections. It applies the classification technique known as Discriminant Analysis to the routing problem and presents experimental results on two text collections. It demonstrates that using a local LSI decomposition improves retrieval performance and represents documents using a relatively small number of feature variables. It finds that Discriminant Analysis sometimes leads to additional performance gains but that more research is needed to determine the optimal size and shape of the local region.
DE Statistics. Information Science.
AR Friedman, Jerome.
UP 9506. Revised: 950629.

AN AAI9514439
AU Liu, Qianhong.
TI AN OFFICE DOCUMENT RETRIEVAL SYSTEM WITH THE CAPABILITY OF PROCESSING INCOMPLETE AND VAGUE QUERIES (INCOMPLETE QUERIES, QUERIES).
IN Thesis (PH.D.)--NEW JERSEY INSTITUTE OF TECHNOLOGY, 1994, 181p.
DD Order Number: AAI9514439.
SO Dissertation Abstracts International. Volume: 56-01, Section: B, page: 0346.
AB TEXPROS (TEXt PROcessing System) is an intelligent document processing system. The system is a combination of filing and retrieval systems, which supports storing, classifying, categorizing, retrieving and reproducing documents, as well as extracting, browsing, retrieving and synthesizing information from a variety of documents. This dissertation presents a retrieval system for TEXPROS, which is capable of processing incomplete or vague queries and providing semantically meaningful responses to the users. The design of the retrieval system is highly integrated with various mechanisms for achieving these goals. First, a system catalog including a thesaurus is used to store the knowledge about the database. Secondly, there is a query transformation mechanism which consists of context construction and algebraic query formulation modules. Given an incomplete query, the context construction module searches the system for the required terms and constructs a query that has a complete representation. The resulting query is then formulated into an algebraic query. Thirdly, in practice, the user may not have a precise notion of what he is looking for. A browsing mechanism is employed for such situations to assist the user in the retrieval process. With the browser, vague queries can be entered into the system until sufficient information is obtained to the extent that the user is able to construct a query for his request.
Finally, when processing of queries responds with an empty answer to the user, a query generalization mechanism is used to give the user a cooperative explanation for the empty answer. The generalizations of any given failed queries (i.e., with an empty answer) are derived by applying both the folder and type substitutions and weakening the search criteria in the original query. An efficient way is investigated for determining whether the empty answer is genuine and whether the original query reflects erroneous presuppositions, and therefore answering any failed query with a meaningful and cooperative response. It incorporates with a methodical approach to reducing the search space of generalized subqueries by analyzing the results of executing the query generalization and by efficiently applying the possible substitutions in a query to generate a small subset of relevant subqueries which are to be evaluated.
DE Computer Science. Information Science.
AR Ng, Peter A.
UP 9506. Revised: 950629.

AN AAI9514595
AU Wang, Peiling.
TI A COGNITIVE MODEL OF DOCUMENT SELECTION OF REAL USERS OF INFORMATION RETRIEVAL SYSTEMS.
IN Thesis (PH.D.)--UNIVERSITY OF MARYLAND COLLEGE PARK, 1994, 248p.
DD Order Number: AAI9514595.
SO Dissertation Abstracts International. Volume: 56-01, Section: A, page: 0017.
AB Purpose. This is an exploratory study to examine document selection behavior of real users of bibliographic information retrieval (IR) systems. The purpose of the study is to build a model of the document selection process which can be used in improving the design of IR systems. Methods. Twenty-five faculty and students from an academic department submitted search requests related to their work. After a reference interview, the researcher conducted online searches on DIALOG. The retrieved documents were printed out in full record format and presented to the users for selection.
Participants went through the list and selected documents in the presence of the researcher; they were asked to read and think aloud. These concurrent verbal reports were audio-taped, transcribed, and analyzed. Results. Document selection is conceptualized as a decision-making process in which users process document information elements (DIEs), apply criteria, and make decisions on whether the retrieved documents should be obtained. Document selection is situational, multidimensional, dynamic, and cognitive. Four document values adapted from the consumer choice literature are tentatively supported by the data: epistemic value, functional value, social value, and conditional value. Users employed the following criteria, listed by decreasing importance; DIEs and personal knowledge used to judge each criterion are given in parentheses: topicality (title, abstract, geographic location), orientation/level (title, abstract, author, journal), quality (author, journal, document type), subject area (author's subject area, journal), novelty (title, author), recency (publication date), authority (author), and relation/origin (author). The combinations of DIEs used and their sequencing varied from user to user and for the same user from document to document. To minimize cognitive effort, users apply decision rules in processing information and in balancing decisions among alternatives: elimination, multiple criteria, dominance, scarcity, "satisfice," and chain rules. Implications. These results suggest that the DIEs should be displayed to support document selection decisions; search output should be organized to facilitate decisions. A knowledge-based system incorporating knowledge about authors, organizations, journals, and subjects, including evaluations specific to each individual user, can help users in both document selection and IR.
DE Library Science. Information Science. Education, Psychology.
AR Soergel, Dagobert.
UP 9506. Revised: 950629.

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA 94612-3550.

Send subscription requests and submissions to: NCGUR@UCCMVSA.UCOP.EDU

Editorial Staff:
    Clifford Lynch   calur@uccmvsa.ucop.edu
    Nancy Gusack     ncgur@uccmvsa.ucop.edu

The IRLIST Archives is set up for anonymous FTP. Using anonymous FTP via the host dla.ucop.edu, the files will be found in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993).

These files are not to be sold or used for commercial purposes.

Contact Nancy Gusack for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR