An Apple patent (number 7,823,214) for a system for ranking the relevance of information objects accessed by computer users has appeared at the US Patent & Trademark Office. It covers information access in multiuser computer systems, and in particular how to rank the relevance of information that a user reaches through a computer.
Information presented to a user via an information access system is ranked according to a prediction of the likely degree of relevance to the user's interests. A profile of interests is stored for each user having access to the system. Items of information to be presented to a user are ranked according to their likely degree of relevance to that user and displayed in order of ranking. The prediction of relevance is carried out by combining data pertaining to the content of each item of information with other data regarding correlations of interests between users.
A value indicative of the content of a document can be added to another value which defines user correlation, to produce a ranking score for a document. Alternatively, multiple regression analysis or evolutionary programming can be carried out with respect to various factors pertaining to document content and user correlation, to generate a prediction of relevance. The user correlation data is obtained from feedback information provided by users when they retrieve items of information. Preferably, the user provides an indication of interest in each document which he or she retrieves from the system. The inventors are Daniel E. Rose, Jeremy J. Borstein, Kevin Tiene and Dulce B. Ponceleon.
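To make the additive scoring described above concrete, here is a minimal Python sketch. It is not taken from the patent. The +1/-1 encoding of feedback, the use of Pearson correlation between users, the correlation-weighted average, and every function and variable name are assumptions chosen for illustration. The content values are simply given as inputs, since the summary does not say how they are computed.

```python
from math import sqrt

# Hypothetical feedback store: user -> {document id -> indication of interest}.
# Assumed encoding: +1 = interested, -1 = not interested.
Feedback = dict[str, dict[str, float]]


def user_correlation(a: dict[str, float], b: dict[str, float]) -> float:
    """Pearson correlation of two users' feedback over documents both rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    xs = [a[d] for d in common]
    ys = [b[d] for d in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def predicted_interest(user: str, doc_id: str, feedback: Feedback) -> float:
    """Correlation-weighted average of other users' feedback on doc_id."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, ratings in feedback.items():
        if other == user or doc_id not in ratings:
            continue
        weight = user_correlation(feedback[user], ratings)
        weighted_sum += weight * ratings[doc_id]
        total_weight += abs(weight)
    return weighted_sum / total_weight if total_weight else 0.0


def ranking_score(content_value: float, correlation_value: float) -> float:
    """Add the content value to the user-correlation value."""
    return content_value + correlation_value


def rank(user: str, content_values: dict[str, float], feedback: Feedback) -> list[str]:
    """Return document ids in descending order of ranking score."""
    scores = {
        doc_id: ranking_score(value, predicted_interest(user, doc_id, feedback))
        for doc_id, value in content_values.items()
    }
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Illustrative data only: alice has not yet seen d4 or d5.
feedback = {
    "alice": {"d1": 1, "d2": -1, "d3": 1},
    "bob": {"d1": 1, "d2": -1, "d3": 1, "d4": 1, "d5": -1},
    "carol": {"d1": -1, "d2": 1, "d3": -1, "d4": -1, "d5": 1},
}
content_values = {"d4": 0.2, "d5": 0.6}

# d4 ranks ahead of d5 despite its lower content value, because the user
# whose past feedback matches alice's (bob) was interested in it.
order = rank("alice", content_values, feedback)
```

In this sketch the two values are simply summed with equal weight. The regression or evolutionary-programming alternatives mentioned above would instead learn how much weight each factor deserves.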
Here's Apple's background and summary of the invention: "The use of computers to obtain and/or exchange information is becoming quite widespread. Currently, there are three prevalent types of systems that can be employed to distribute information via computers. One of these systems comprises electronic mail, also known as e-mail, in which a user receives messages, such as documents, that have been specifically sent to his or her electronic mailbox. Typically, to receive the documents, no explicit action is required on the user's part, except to access the mailbox itself. In most systems, the user is informed whenever new messages have been sent to his or her mailbox, enabling them to be read in a timely fashion.
"Another medium that is used to distribute information is an electronic bulletin board system. In such a system, users can post documents or files to directories corresponding to specific topics, where they can be viewed by other users who need not be explicitly designated. In order to view the documents, the other users must actively select and open the directories containing topics of interest. Articles and other items of information posted to bulletin board systems typically expire after some time period, and are then deleted.
"The third form of information exchange is by means of text retrieval from static data bases, which are typically accessed through dial-up services. A group of users, or a service bureau, can place documents of common interest on a file server. Using a text searching tool, individual users can locate documents matching a specific topical query. Some services of this type enable users to search personal databases, as well as databases of other users.
"As the use of these types of systems becomes ever more common, the amount of information presented to users can reach the point of becoming unmanageable. For example, users of electronic mail services are increasingly finding that they receive more mail than they can usefully handle. Part of this problem is due to the fact that junk mail of no particular interest is regularly sent in bulk to lists of user accounts. In order to view messages of interest, the user may be required to sift through a large volume of undesirable mail.
"Similarly, in bulletin board systems, the number of documents in a particular topical category at any given time can be quite significant. The user must try to identify documents of interest on the basis of cryptic titles. As a result, an opportunity to view documents that are critically relevant may be missed if the user cannot take the time to view all documents in the category.
"Along similar lines, in a text retrieval system, a broadly framed query can result in the identification of a large number of documents for the user to view. In an effort to reduce the number of documents, the user may modify the query to narrow its scope. In doing so, however, documents of interest may be eliminated because they do not exactly match the modified query.
"In the past, some information access systems, particularly e-mail systems, have provided the user with the ability to have incoming information filtered, so that only items of interest would be presented to the user. The filtering was carried out on the basis of objective criteria specified by the user. Any messages not meeting the filtering criteria would be blocked. There is always the danger in such an objective approach that potentially relevant items of information can be missed. It is desirable, therefore, to employ a system for predicting the likely relevance of items of information to a particular user, so that the items of interest can be ranked and the need to deal with large amounts of irrelevant information can be avoided.
"Some types of relevance predictors have already been proposed. For example, the contents of a document can be examined to make a determination as to whether a user might find that document to be of interest, based on user-supplied information. While approaches of this type have some utility, they are limited because the prediction of relevance is made only on the basis of one attribute, e.g., word content. It is desirable to improve upon existing relevance predicting techniques, and provide a system which takes into account a variety of attributes that are relevant to a user's likely interest in a particular item of information. In this regard, it is particularly desirable to provide an information relevance predicting technique which utilizes community feedback as one of the factors in the prediction.
"In accordance with the present invention, information to be presented to a user via an information access system is ranked according to a prediction of the likely degree of relevance to the user's interests. A profile of interests is stored for each user having access to the system. Using this profile, items of information to be presented to the user, e.g., messages in an electronic mail network or documents within a particular bulletin board category, are ranked according to their likely degree of relevance and displayed with an indication of their relative ranking. For example, they can be displayed in order of rank.
"The prediction of relevance is carried out by combining data pertaining to one or more attributes of each item of information with other data regarding correlations of interests between users. For example, a value indicative of the content of a document can be added to another value which defines user correlation, to produce a ranking score for a document. Other information evaluation techniques, such as multiple regression analysis or evolutionary programming, can alternatively be employed to evaluate various factors pertaining to document content and user correlation, and thereby generate a prediction of relevance.
"The user correlation data is obtained through feedback information provided by users when they retrieve items of information. Preferably, the user provides an indication of interest in each document which he or she retrieves from the system.
"The relevance predicting technique of the present invention is applicable to all different types of information access systems. For example, it can be employed to filter messages provided to a user in an electronic mail system and search results obtained through an on-line text retrieval service. Similarly, it can be employed to route relevant documents to users in a bulletin board system."