Data mining: concepts and techniques / (Record no. 3701)
000 -LEADER | |
---|---|
fixed length control field | 14899nam a2200145 4500 |
040 ## - CATALOGING SOURCE | |
Transcribing agency | CUS |
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER | |
Classification number | 006.312 |
Item number | HAN/D |
100 ## - MAIN ENTRY--PERSONAL NAME | |
Personal name | Han, Jiawei |
245 ## - TITLE STATEMENT | |
Title | Data mining: concepts and techniques / |
Statement of responsibility, etc. | Jiawei Han and Micheline Kamber |
250 ## - EDITION STATEMENT | |
Edition statement | 2nd ed. |
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) | |
Place of publication, distribution, etc. | Amsterdam : |
Name of publisher, distributor, etc. | Elsevier, |
Date of publication, distribution, etc. | 2006. |
300 ## - PHYSICAL DESCRIPTION | |
Extent | xxviii, 770 p. |
Other physical details | ill. ; |
505 ## - FORMATTED CONTENTS NOTE | |
Formatted contents note | Chapter 1 Introduction 1<br/>1.1 What Motivated Data Mining? Why Is It Important? 1<br/>1.2 So, What Is Data Mining? 5<br/>1.3 Data Mining—On What Kind of Data? 9<br/>1.3.1 Relational Databases 10<br/>1.3.2 Data Warehouses 12<br/>1.3.3 Transactional Databases 14<br/>1.3.4 Advanced Data and Information Systems and Advanced Applications 15<br/>1.4 Data Mining Functionalities—What Kinds of Patterns Can Be Mined? 21<br/>1.4.1 Concept/Class Description: Characterization and Discrimination 21<br/>1.4.2 Mining Frequent Patterns, Associations, and Correlations 23<br/>1.4.3 Classification and Prediction 24<br/>1.4.4 Cluster Analysis 25<br/>1.4.5 Outlier Analysis 26<br/>1.4.6 Evolution Analysis 27<br/>1.5 Are All of the Patterns Interesting? 27<br/>1.6 Classification of Data Mining Systems 29<br/>1.7 Data Mining Task Primitives 31<br/>1.8 Integration of a Data Mining System with a Database or Data Warehouse System 34<br/>1.9 Major Issues in Data Mining 36<br/>1.10 Summary 39<br/>Exercises 40<br/>Bibliographic Notes 42<br/>Chapter 2 Data Preprocessing 47<br/>2.1 Why Preprocess the Data? 
48<br/>2.2 Descriptive Data Summarization 51<br/>2.2.1 Measuring the Central Tendency 51<br/>2.2.2 Measuring the Dispersion of Data 53<br/>2.2.3 Graphic Displays of Basic Descriptive Data Summaries 56<br/>2.3 Data Cleaning 61<br/>2.3.1 Missing Values 61<br/>2.3.2 Noisy Data 62<br/>2.3.3 Data Cleaning as a Process 65<br/>2.4 Data Integration and Transformation 67<br/>2.4.1 Data Integration 67<br/>2.4.2 Data Transformation 70<br/>2.5 Data Reduction 72<br/>2.5.1 Data Cube Aggregation 73<br/>2.5.2 Attribute Subset Selection 75<br/>2.5.3 Dimensionality Reduction 77<br/>2.5.4 Numerosity Reduction 80<br/>2.6 Data Discretization and Concept Hierarchy Generation 86<br/>2.6.1 Discretization and Concept Hierarchy Generation for Numerical Data 88<br/>2.6.2 Concept Hierarchy Generation for Categorical Data 94<br/>2.7 Summary 97<br/>Exercises 97<br/>Bibliographic Notes 101<br/>Chapter 3 Data Warehouse and OLAP Technology: An Overview 105<br/>3.1 What Is a Data Warehouse? 105<br/>3.1.1 Differences between Operational Database Systems and Data Warehouses 108<br/>3.1.2 But, Why Have a Separate Data Warehouse? 
109<br/>3.2 A Multidimensional Data Model 110<br/>3.2.1 From Tables and Spreadsheets to Data Cubes 110<br/>3.2.2 Stars, Snowflakes, and Fact Constellations: Schemas for Multidimensional Databases 114<br/>3.2.3 Examples for Defining Star, Snowflake, and Fact Constellation Schemas 117<br/>3.2.4 Measures: Their Categorization and Computation 119<br/>3.2.5 Concept Hierarchies 121<br/>3.2.6 OLAP Operations in the Multidimensional Data Model 123<br/>3.2.7 A Starnet Query Model for Querying Multidimensional Databases 126<br/>3.3 Data Warehouse Architecture 127<br/>3.3.1 Steps for the Design and Construction of Data Warehouses 128<br/>3.3.2 A Three-Tier Data Warehouse Architecture 130<br/>3.3.3 Data Warehouse Back-End Tools and Utilities 134<br/>3.3.4 Metadata Repository 134<br/>3.3.5 Types of OLAP Servers: ROLAP versus MOLAP versus HOLAP 135<br/>3.4 Data Warehouse Implementation 137<br/>3.4.1 Efficient Computation of Data Cubes 137<br/>3.4.2 Indexing OLAP Data 141<br/>3.4.3 Efficient Processing of OLAP Queries 144<br/>3.5 From Data Warehousing to Data Mining 146<br/>3.5.1 Data Warehouse Usage 146<br/>3.5.2 From On-Line Analytical Processing to On-Line Analytical Mining 148<br/>3.6 Summary 150<br/>Exercises 152<br/>Bibliographic Notes 154<br/>Chapter 4 Data Cube Computation and Data Generalization 157<br/>4.1 Efficient Methods for Data Cube Computation 157<br/>4.1.1 A Road Map for the Materialization of Different Kinds of Cubes 158<br/>4.1.2 Multiway Array Aggregation for Full Cube Computation 164<br/>4.1.3 BUC: Computing Iceberg Cubes from the Apex Cuboid Downward 168<br/>4.1.4 Star-cubing: Computing Iceberg Cubes Using a Dynamic Star-tree Structure 173<br/>4.1.5 Precomputing Shell Fragments for Fast High-Dimensional OLAP 178<br/>4.1.6 Computing Cubes with Complex Iceberg Conditions 187<br/>4.2 Further Development of Data Cube and OLAP Technology 189<br/>4.2.1 Discovery-Driven Exploration of Data Cubes 189<br/>4.2.2 Complex 
Aggregation at Multiple Granularity: Multifeature Cubes 192<br/>4.2.3 Constrained Gradient Analysis in Data Cubes 195<br/>4.3 Attribute-Oriented Induction—An Alternative Method for Data Generalization and Concept Description 198<br/>4.3.1 Attribute-Oriented Induction for Data Characterization 199<br/>4.3.2 Efficient Implementation of Attribute-Oriented Induction 205<br/>4.3.3 Presentation of the Derived Generalization 206<br/>4.3.4 Mining Class Comparisons: Discriminating between Different Classes 210<br/>4.3.5 Class Description: Presentation of Both Characterization and Comparison 215<br/>4.4 Summary 218<br/>Exercises 219<br/>Bibliographic Notes 223<br/>Chapter 5 Mining Frequent Patterns, Associations, and Correlations 227<br/>5.1 Basic Concepts and a Road Map 227<br/>5.1.1 Market Basket Analysis: A Motivating Example 228<br/>5.1.2 Frequent Itemsets, Closed Itemsets, and Association Rules 230<br/>5.1.3 Frequent Pattern Mining: A Road Map 232<br/>5.2 Efficient and Scalable Frequent Itemset Mining Methods 234<br/>5.2.1 The Apriori Algorithm: Finding Frequent Itemsets Using Candidate Generation 234<br/>5.2.2 Generating Association Rules from Frequent Itemsets 239<br/>5.2.3 Improving the Efficiency of Apriori 240<br/>5.2.4 Mining Frequent Itemsets without Candidate Generation 242<br/>5.2.5 Mining Frequent Itemsets Using Vertical Data Format 245<br/>5.2.6 Mining Closed Frequent Itemsets 248<br/>5.3 Mining Various Kinds of Association Rules 250<br/>5.3.1 Mining Multilevel Association Rules 250<br/>5.3.2 Mining Multidimensional Association Rules from Relational Databases and Data Warehouses 254<br/>5.4 From Association Mining to Correlation Analysis 259<br/>5.4.
1 Strong Rules Are Not Necessarily Interesting: An Example 260<br/>5.4.2 From Association Analysis to Correlation Analysis 261<br/>5.5 Constraint-Based Association Mining 265<br/>5.5.1 Metarule-Guided Mining of Association Rules 266<br/>5.5.2 Constraint Pushing: Mining Guided by Rule Constraints 267<br/>5.6 Summary 272<br/>Exercises 274<br/>Bibliographic Notes 280<br/>Chapter 6 Classification and Prediction 285<br/>6.1 What Is Classification? What Is Prediction? 285<br/>6.2 Issues Regarding Classification and Prediction 289<br/>6.2.1 Preparing the Data for Classification and Prediction 289<br/>6.2.2 Comparing Classification and Prediction Methods 290<br/>6.3 Classification by Decision Tree Induction 291<br/>6.3.1 Decision Tree Induction 292<br/>6.3.2 Attribute Selection Measures 296<br/>6.3.3 Tree Pruning 304<br/>6.3.4 Scalability and Decision Tree Induction 306<br/>6.4 Bayesian Classification 310<br/>6.4.1 Bayes' Theorem 310<br/>6.4.2 Naïve Bayesian Classification 311<br/>6.4.3 Bayesian Belief Networks 315<br/>6.4.4 Training Bayesian Belief Networks 317<br/>6.5 Rule-Based Classification 318<br/>6.5.1 Using IF-THEN Rules for Classification 319<br/>6.5.2 Rule Extraction from a Decision Tree 321<br/>6.5.3 Rule Induction Using a Sequential Covering Algorithm 322<br/>6.6 Classification by Backpropagation 327<br/>6.6.1 A Multilayer Feed-Forward Neural Network 328<br/>6.6.2 Defining a Network Topology 329<br/>6.6.3 Backpropagation 329<br/>6.6.4 Inside the Black Box: Backpropagation and Interpretability 334<br/>6.7 Support Vector Machines 337<br/>6.7.1 The Case When the Data Are Linearly Separable 337<br/>6.7.2 The Case When the Data Are Linearly Inseparable 342<br/>6.8 Associative Classification: Classification by Association Rule Analysis 344<br/>6.9 Lazy Learners (or Learning from Your Neighbors) 347<br/>6.9.1 k-Nearest-Neighbor Classifiers 348<br/>6.9.2 Case-Based Reasoning 350<br/>6.10 Other Classification Methods 351<br/>6.10.1 Genetic Algorithms 
351<br/>6.10.2 Rough Set Approach 351<br/>6.10.3 Fuzzy Set Approaches 352<br/>6.11 Prediction 354<br/>6.11.1 Linear Regression 355<br/>6.11.2 Nonlinear Regression 357<br/>6.11.3 Other Regression-Based Methods 358<br/>6.12 Accuracy and Error Measures 359<br/>6.12.1 Classifier Accuracy Measures 360<br/>6.12.2 Predictor Error Measures 362<br/>6.13 Evaluating the Accuracy of a Classifier or Predictor 363<br/>6.13.1 Holdout Method and Random Subsampling 364<br/>6.13.2 Cross-validation 364<br/>6.13.3 Bootstrap 365<br/>6.14 Ensemble Methods—Increasing the Accuracy 366<br/>6.14.1 Bagging 366<br/>6.14.2 Boosting 367<br/>6.15 Model Selection 370<br/>6.15.1 Estimating Confidence Intervals 370<br/>6.15.2 ROC Curves 372<br/>6.16 Summary 373<br/>Exercises 375<br/>Bibliographic Notes 378<br/>Chapter 7 Cluster Analysis 383<br/>7.1 What Is Cluster Analysis? 383<br/>7.2 Types of Data in Cluster Analysis 386<br/>7.2.1 Interval-Scaled Variables 387<br/>7.2.2 Binary Variables 389<br/>7.2.3 Categorical, Ordinal, and Ratio-Scaled Variables 392<br/>7.2.4 Variables of Mixed Types 395<br/>7.2.5 Vector Objects 397<br/>7.3 A Categorization of Major Clustering Methods 398<br/>7.4 Partitioning Methods 401<br/>7.4.1 Classical Partitioning Methods: k-Means and k-Medoids 402<br/>7.4.2 Partitioning Methods in Large Databases: From k-Medoids to CLARANS 407<br/>7.5 Hierarchical Methods 408<br/>7.5.
1 Agglomerative and Divisive Hierarchical Clustering 408<br/>7.5.2 BIRCH: Balanced Iterative Reducing and Clustering Using Hierarchies 412<br/>7.5.3 ROCK: A Hierarchical Clustering Algorithm for Categorical Attributes 414<br/>7.5.4 Chameleon: A Hierarchical Clustering Algorithm Using Dynamic Modeling 416<br/>7.6 Density-Based Methods 418<br/>7.6.1 DBSCAN: A Density-Based Clustering Method Based on Connected Regions with Sufficiently High Density 418<br/>7.6.2 OPTICS: Ordering Points to Identify the Clustering Structure 420<br/>7.6.3 DENCLUE: Clustering Based on Density Distribution Functions 422<br/>7.7 Grid-Based Methods 424<br/>7.7.1 STING: STatistical INformation Grid 425<br/>7.7.2 WaveCluster: Clustering Using Wavelet Transformation 427<br/>7.8 Model-Based Clustering Methods 429<br/>7.8.1 Expectation-Maximization 429<br/>7.8.2 Conceptual Clustering 431<br/>7.8.3 Neural Network Approach 433<br/>7.9 Clustering High-Dimensional Data 434<br/>7.9.1 CLIQUE: A Dimension-Growth Subspace Clustering Method 436<br/>7.9.2 PROCLUS: A Dimension-Reduction Subspace Clustering Method 439<br/>7.9.3 Frequent Pattern-Based Clustering Methods 440<br/>7.10 Constraint-Based Cluster Analysis 444<br/>7.10.1 Clustering with Obstacle Objects 446<br/>7.10.2 User-Constrained Cluster Analysis 448<br/>7.10.3 Semi-Supervised Cluster Analysis 449<br/>7.11 Outlier Analysis 451<br/>7.11.1 Statistical Distribution-Based Outlier Detection 452<br/>7.11.2 Distance-Based Outlier Detection 454<br/>7.11.3 Density-Based Local Outlier Detection 455<br/>7.11.4 Deviation-Based Outlier Detection 458<br/>7.12 Summary 460<br/>Exercises 461<br/>Bibliographic Notes 464<br/>Chapter 8 Mining Stream, Time-Series, and Sequence Data 467<br/>8.1 Mining Data Streams 468<br/>8.1.1 Methodologies for Stream Data Processing and Stream Data Systems 469<br/>8.1.2 Stream OLAP and Stream Data Cubes 474<br/>8.1.3 Frequent-Pattern Mining in Data Streams 479<br/>8.1.4 Classification of 
Dynamic Data Streams 481<br/>8.1.5 Clustering Evolving Data Streams 486<br/>8.2 Mining Time-Series Data 489<br/>8.2.1 Trend Analysis 490<br/>8.2.2 Similarity Search in Time-Series Analysis 493<br/>8.3 Mining Sequence Patterns in Transactional Databases 498<br/>8.3.1 Sequential Pattern Mining: Concepts and Primitives 498<br/>8.3.2 Scalable Methods for Mining Sequential Patterns 500<br/>8.3.3 Constraint-Based Mining of Sequential Patterns 509<br/>8.3.4 Periodicity Analysis for Time-Related Sequence Data 512<br/>8.4 Mining Sequence Patterns in Biological Data 513<br/>8.4.1 Alignment of Biological Sequences 514<br/>8.4.2 Hidden Markov Model for Biological Sequence Analysis 518<br/>8.5 Summary 527<br/>Exercises 528<br/>Bibliographic Notes 531<br/>Chapter 9 Graph Mining, Social Network Analysis, and Multirelational Data Mining 535<br/>9.1 Graph Mining 535<br/>9.1.1 Methods for Mining Frequent Subgraphs 536<br/>9.1.2 Mining Variant and Constrained Substructure Patterns 545<br/>9.1.3 Applications: Graph Indexing, Similarity Search, Classification, and Clustering 551<br/>9.2 Social Network Analysis 556<br/>9.2.1 What Is a Social Network? 556<br/>9.2.2 Characteristics of Social Networks 557<br/>9.2.3 Link Mining: Tasks and Challenges 561<br/>9.2.4 Mining on Social Networks 565<br/>9.3 Multirelational Data Mining 571<br/>9.3.1 What Is Multirelational Data Mining? 
571<br/>9.3.2 ILP Approach to Multirelational Classification 573<br/>9.3.3 Tuple ID Propagation 575<br/>9.3.4 Multirelational Classification Using Tuple ID Propagation 577<br/>9.3.5 Multirelational Clustering with User Guidance 580<br/>9.4 Summary 584<br/>Exercises 586<br/>Bibliographic Notes 587<br/>Chapter 10 Mining Object, Spatial, Multimedia, Text, and Web Data 591<br/>10.1 Multidimensional Analysis and Descriptive Mining of Complex Data Objects 591<br/>10.1.1 Generalization of Structured Data 592<br/>10.1.2 Aggregation and Approximation in Spatial and Multimedia Data Generalization 593<br/>10.1.3 Generalization of Object Identifiers and Class/Subclass Hierarchies 594<br/>10.1.4 Generalization of Class Composition Hierarchies 595<br/>10.1.5 Construction and Mining of Object Cubes 596<br/>10.1.6 Generalization-Based Mining of Plan Databases by Divide-and-Conquer 596<br/>10.2 Spatial Data Mining 600<br/>10.2.1 Spatial Data Cube Construction and Spatial OLAP 601<br/>10.2.2 Mining Spatial Association and Co-location Patterns 605<br/>10.2.3 Spatial Clustering Methods 606<br/>10.2.4 Spatial Classification and Spatial Trend Analysis 606<br/>10.2.5 Mining Raster Databases 607<br/>10.3 Multimedia Data Mining 607<br/>10.3.1 Similarity Search in Multimedia Data 608<br/>10.3.2 Multidimensional Analysis of Multimedia Data 609<br/>10.3.3 Classification and Prediction Analysis of Multimedia Data 61<br/>10.3.4 Mining Associations in Multimedia Data 612<br/>10.3.5 Audio and Video Data Mining 613<br/>10.4 Text Mining 614<br/>10.4.1 Text Data Analysis and Information Retrieval 615<br/>10.4.2 Dimensionality Reduction for Text 621<br/>10.4.3 Text Mining Approaches 624<br/>10.5 Mining the World Wide Web 628<br/>10.5.1 Mining the Web Page Layout Structure 630<br/>10.5.2 Mining the Web's Link Structures to Identify Authoritative Web Pages 631<br/>10.5.3 Mining Multimedia Data on the Web 637<br/>10.5.4 Automatic Classification of Web Documents 638<br/>10.5.5 Web 
Usage Mining 640<br/>10.6 Summary 641<br/>Exercises 642<br/>Bibliographic Notes 645<br/>Chapter 11 Applications and Trends in Data Mining 649<br/>11.1 Data Mining Applications 649<br/>11.1.1 Data Mining for Financial Data Analysis 649<br/>11.1.2 Data Mining for the Retail Industry 651<br/>11.1.3 Data Mining for the Telecommunication Industry 652<br/>11.1.4 Data Mining for Biological Data Analysis 654<br/>11.1.5 Data Mining in Other Scientific Applications 657<br/>11.1.6 Data Mining for Intrusion Detection 658<br/>11.2 Data Mining System Products and Research Prototypes 660<br/>11.2.1 How to Choose a Data Mining System 660<br/>11.2.2 Examples of Commercial Data Mining Systems 663<br/>11.3 Additional Themes on Data Mining 665<br/>11.3.1 Theoretical Foundations of Data Mining 665<br/>11.3.2 Statistical Data Mining 666<br/>11.3.3 Visual and Audio Data Mining 667<br/>11.3.4 Data Mining and Collaborative Filtering 670<br/>11.4 Social Impacts of Data Mining 675<br/>11.4.1 Ubiquitous and Invisible Data Mining 675<br/>11.4.2 Data Mining, Privacy, and Data Security 678<br/>11.5 Trends in Data Mining 681<br/>11.6 Summary 684<br/>Exercises 685<br/>Bibliographic Notes 687 |
942 ## - ADDED ENTRY ELEMENTS (KOHA) | |
Koha item type | GN Books |
| Withdrawn status | Lost status | Damaged status | Not for loan | Home library | Current library | Shelving location | Date acquired | Full call number | Accession number | Date last seen | Date last checked out | Koha item type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | Central Library, Sikkim University | Central Library, Sikkim University | General Book Section | 26/06/2016 | 006.312 HAN/D | P21220 | 15/05/2023 | 13/04/2023 | General Books |