User:KYPark/Internet

PREFACE

The automatic SMART document retrieval system was designed at Harvard University between 1961 and 1964, and has been operating on IBM 7094 and 360 equipment both at Harvard and at Cornell University for several years. The system takes documents and search requests in the natural language, performs a fully automatic content analysis of the texts using one of several dozen programmed language analysis methods, matches analyzed documents with analyzed search requests, and retrieves for the user's attention those stored items believed to be most similar to the submitted queries.

Unlike other computer-based retrieval systems, the SMART system does not rely on manually assigned keywords or index terms for the identification of documents and search requests, nor does it use primarily the frequency of occurrence of certain words or phrases included in the texts of documents. Instead, an attempt is made to go beyond simple word-matching procedures by using a variety of intellectual aids in the form of synonym dictionaries, hierarchical arrangements of subject identifiers, statistical and syntactic phrase generation methods, and the like, in order to obtain content identifications useful for the retrieval process.

By comparing the retrieval performance obtained with the various programmed procedures, the SMART system can be used as a unique experimental tool for the evaluation in a controlled laboratory environment of many fully automatic language analysis methods. In addition, the system has been used to simulate a user-environment by making it possible for the user to participate in the search process. Specifically, the system utilizes feedback information supplied by the user during the search to construct improved search formulations, and to generate document representations reflecting the interests of the user population. By combining automatic text processing methods with interactive search and retrieval techniques, the SMART system may then lead to the design and implementation of modern information services of the type which may become current in operational environments some years hence.
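The feedback step described above, in which user judgments are used to construct an improved search formulation, can be illustrated with a small sketch. The passage does not state the formula used, so the Rocchio-style weighted update below is an assumption chosen for illustration. The function name `reformulate` and the weights `alpha`, `beta`, and `gamma` are hypothetical.

```python
from collections import defaultdict


def reformulate(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Build a new query vector from user feedback (Rocchio-style, assumed).

    query, and each item of relevant / nonrelevant, is a dict mapping a
    term to a numeric weight. The weights alpha, beta, gamma are
    illustrative defaults, not values taken from the SMART system.
    """
    new_query = defaultdict(float)

    # Keep the original request, scaled by alpha.
    for term, weight in query.items():
        new_query[term] += alpha * weight

    # Move toward the average of documents the user judged relevant.
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += beta * weight / len(relevant)

    # Move away from the average of documents judged non-relevant.
    for doc in nonrelevant:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight / len(nonrelevant)

    # Drop terms whose weight is no longer positive.
    return {term: w for term, w in new_query.items() if w > 0}


updated = reformulate(
    {"retrieval": 1.0},
    relevant=[{"retrieval": 1.0, "feedback": 1.0}],
    nonrelevant=[{"weather": 1.0}],
)
```

In this toy run the term "feedback" enters the query because it occurs in a document judged relevant, while "weather" is excluded because its weight ends up negative.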

[...]

1

THE SMART PROJECT - STATUS REPORT AND PLANS

G. SALTON

1-1 INTRODUCTION

The SMART document retrieval system has been operating on an IBM 7094 computer since the end of 1964 and on an IBM 360/65 since 1968. The system takes documents and search requests in English, performs a fully automatic content analysis of the texts, matches analyzed documents with analyzed search requests, and retrieves those stored items believed to be most similar to the queries. Among the language analysis procedures incorporated into the system are word suffix cutoff methods, thesaurus lookup procedures, phrase generation methods, and others. These analysis methods are used to reduce document and query texts into a form actually utilized during the search and retrieval process.
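As a rough illustration of the pipeline described above (suffix cutoff, thesaurus lookup, then matching analyzed documents against an analyzed request), here is a minimal sketch. It is not the SMART implementation. The suffix list, the tiny thesaurus, the function names, and the choice of cosine similarity as the matching measure are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical suffix list and thesaurus, invented for this example.
SUFFIXES = ("ing", "ed", "es", "s")
THESAURUS = {
    "retriev": "search",
    "search": "search",
    "document": "text",
    "text": "text",
}


def stem(word):
    """Cut off the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Reduce a text to a bag of concept identifiers via stem + thesaurus."""
    stems = (stem(word) for word in text.lower().split())
    return Counter(THESAURUS.get(s, s) for s in stems)


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Return indexes of the top_k documents most similar to the query."""
    q = analyze(query)
    scored = [(cosine(q, analyze(doc)), i) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored[:top_k] if score > 0]


docs = [
    "retrieving documents automatically",
    "weather report for tuesday",
    "searches of stored texts",
]
ranking = retrieve("search text", docs)
```

Neither matching document shares a literal word with the request "search text". They are matched only because suffix cutoff and the thesaurus map different surface words to the same concept identifiers, which is the point of going beyond simple word matching.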

[...]

— GERARD SALTON, Editor. THE SMART RETRIEVAL SYSTEM: Experiments in Automatic Document Processing. 1971. Prentice-Hall, Inc. Englewood Cliffs, N.J.