CONTENTS
Preface ... xv
1 Introduction
1.1 What Is Information? ... 1
1.2 What Is Information Retrieval? ... 2
1.3 How Does Information Retrieval Work? ... 4
1.4 Who Uses Information Retrieval? ... 15
1.5 What Are the Problems in IRS Design and Use? ... 17
1.6 A Brief History of Information Retrieval ... 19
2 Data, Information, and Knowledge
2.1 Introduction ... 34
2.2 Definitions ... 35
2.3 Metadata ... 41
2.4 Knowledge Base ... 43
2.5 Credence, Justified Belief, and Point of View ... 46
2.6 Summary ... 48
3 Representation of Information
3.1 Information to Be Represented ... 49
3.2 Types of Representation ... 53
3.3 Characteristics of Information Representations ... 60
3.4 Relationships among Entities and Attribute Values ... 62
3.5 Summary ... 67
4 Attribute Content and Values
4.1 Types of Attribute Symbols ... 68
4.2 Class Relationships ... 71
4.3 Transformations of Values ... 77
4.4 Uniqueness of Values ... 89
4.5 Ambiguity of Attribute Values ... 91
4.6 Indexing of Text ... 93
4.7 Control of Vocabulary ... 95
4.8 The Importance of Point of View ... 97
4.9 Summary ... 98
5 Models of Virtual Data Structure
5.1 The Concept of Models of Data ... 99
5.2 Basic Data Elements and Structures ... 102
5.3 The Common Structural Models ... 108
5.4 Applications of the Basic Models ... 114
5.5 The Entity-Relationship Model ... 117
5.6 Summary ... 121
6 The Physical Structure of Data
6.1 Introduction to Physical Structures ... 122
6.2 Record Structures and Their Effects ... 123
6.3 Basic Concepts of File Structure ... 127
6.4 Organizational Methods ... 128
6.5 Parsing of Data Elements ... 143
6.6 Combination Structures ... 145
6.7 Summary ... 150
7 Querying the Information Retrieval System
7.1 Introduction ... 152
7.2 Language Types ... 153
7.3 Query Logic ... 155
7.4 Functions Performed ... 163
7.5 The Basis for Charging for Searches ... 174
8 Interpretation and Execution of Query Statements
8.1 Problems of Query Language Interpretation ... 176
8.2 Executing Retrieval Commands ... 183
8.3 Executing Record Analysis and Presentation Commands ... 191
8.4 Executing Other Commands ... 195
8.5 Feedback to Users and Error Messages ... 199
9 Text Searching
9.1 The Special Problems of Text Searching ... 203
9.2 Some Characteristics of Text and Their Applications ... 204
9.3 Command Language for Text Searching ... 211
9.4 Term Weighting ... 215
9.5 Word Association Techniques ... 218
9.6 Text or Record Association Techniques ... 221
9.7 Other Processes with Words of a Text ... 232
10 System-Computed Relevance and Ranking
10.1 The Retrieval Status Value (rsv) ... 238
10.2 Ranking ... 239
10.3 Methods of Evaluating the rsv ... 239
11 Search Feedback and Iteration
11.1 Basic Concepts of Feedback and Iteration ... 246
11.2 Command Sequences ... 247
11.3 Information Available as Feedback ... 249
11.4 Adjustments in the Search ... 255
11.5 Feedback from User to System ... 257
12 Multidatabase Searching and Mapping
12.1 Basic Concepts ... 261
12.2 Multidatabase Search ... 262
12.3 Mapping ... 268
12.4 Value of Mapping ... 270
13 Search Strategy
13.1 The Nature of Searching Reconsidered ... 272
13.2 The Nature of Search Strategy ... 275
13.3 Types of Strategies ... 278
13.4 Tactics ... 281
13.5 Summary ... 282
14 The Information Retrieval System Interface
14.1 General Model of Message Flow ... 283
14.2 Sources of Ambiguity ... 286
14.3 The Role of a Search Intermediary ... 287
14.4 Automated Search Mediation ... 290
14.5 The User Interface as a Component of All Systems ... 295
15 A Sampling of Information Retrieval Systems
15.1 Introduction ... 296
15.2 Dialog ... 296
15.3 Alta Vista ... 302
15.4 Northern Light Technology ... 305
15.5 The Canadian Encyclopedia and X-Portal ... 306
15.6 Summary ... 309
16 Measurement and Evaluation
16.1 Basics of Measurement ... 310
16.2 Relevance, Value, and Utility ... 314
16.3 Measures Based on Relevance ... 321
16.4 Measures of Process ... 328
16.5 Measures of Outcome ... 333
16.6 Measures of Environment ... 336
16.7 Conclusion ... 337
Bibliography ... 338
Recommended Reading ... 349
Index ... 351