Details

Data Mining



Concepts, Models, Methods, and Algorithms
3rd ed.

by: Mehmed Kantardzic

103,99 €

Publisher: Wiley
Format: PDF
Published: 21 October 2019
ISBN/EAN: 9781119515982
Language: English
Number of pages: 672

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<p><b>Presents the latest techniques for analyzing and extracting information from large amounts of data in high-dimensional data spaces</b></p> <p>The revised and updated third edition of <i>Data Mining</i> contains in one volume an introduction to a systematic approach to the analysis of large data sets that integrates results from disciplines such as statistics, artificial intelligence, databases, pattern recognition, and computer visualization. Advances in deep learning technology have opened an entirely new spectrum of applications. The author, a noted expert on the topic, explains the basic concepts, models, and methodologies that have been developed in recent years.</p> <p>This new edition introduces and expands on many topics, and provides revised sections on software tools and data mining applications. Additional changes include an updated list of references for further study and an extended list of problems and questions that relate to each chapter. This third edition presents new and expanded information that:</p> <p>•    Explores big data and cloud computing</p> <p>•    Examines deep learning</p> <p>•    Includes information on convolutional neural networks (CNN)</p> <p>•    Covers reinforcement learning</p> <p>•    Covers semi-supervised learning and S3VM</p> <p>•    Reviews model evaluation for unbalanced data</p> <p>Written for graduate students in computer science, computer engineers, and computer information systems professionals, the updated third edition of <i>Data Mining</i> continues to provide an essential guide to the basic principles of the technology and the most recent developments in the field.</p>
<p>Preface</p> <p>Preface to the Second Edition</p> <p>Preface to the First Edition</p>
<p><b>1 Data-Mining Concepts</b></p> <p>1.1 Introduction</p> <p>1.2 Data-Mining Roots</p> <p>1.3 Data-Mining Process</p> <p>1.4 From Data Collection to Data Preprocessing</p> <p>1.5 Data Warehouses for Data Mining</p> <p>1.6 From Big Data to Data Science</p> <p>1.7 Business Aspects of Data Mining: Why a Data-Mining Project Fails?</p> <p>1.8 Organization of This Book</p> <p>1.9 Review Questions and Problems</p> <p>1.10 References for Further Study</p>
<p><b>2 Preparing the Data</b></p> <p>2.1 Representation of Raw Data</p> <p>2.2 Characteristics of Raw Data</p> <p>2.3 Transformation of Raw Data</p> <p>2.4 Missing Data</p> <p>2.5 Time-Dependent Data</p> <p>2.6 Outlier Analysis</p> <p>2.7 Review Questions and Problems</p> <p>2.8 References for Further Study</p>
<p><b>3 Data Reduction</b></p> <p>3.1 Dimensions of Large Data Sets</p> <p>3.2 Features Reduction</p> <p>3.3 Relief Algorithm</p> <p>3.4 Entropy Measure for Ranking Features</p> <p>3.5 Principal Component Analysis</p> <p>3.6 Value Reduction</p> <p>3.7 Feature Discretization: ChiMerge Technique</p> <p>3.8 Case Reduction</p> <p>3.9 Review Questions and Problems</p> <p>3.10 References for Further Study</p>
<p><b>4 Learning from Data</b></p> <p>4.1 Learning Machine</p> <p>4.2 Statistical Learning Theory</p> <p>4.3 Types of Learning Methods</p> <p>4.4 Common Learning Tasks</p> <p>4.5 Support Vector Machines</p> <p>4.6 Semi-Supervised Support Vector Machines (S3VM)</p> <p>4.7 kNN: Nearest Neighbor Classifier</p> <p>4.8 Model Selection vs. Generalization</p> <p>4.9 Model Estimation</p> <p>4.10 Imbalanced Data Classification</p> <p>4.11 90% Accuracy … Now What?</p> <p>4.12 Review Questions and Problems</p> <p>4.13 References for Further Study</p>
<p><b>5 Statistical Methods</b></p> <p>5.1 Statistical Inference</p> <p>5.2 Assessing Differences in Data Sets</p> <p>5.3 Bayesian Inference</p> <p>5.4 Predictive Regression</p> <p>5.5 Analysis of Variance</p> <p>5.6 Logistic Regression</p> <p>5.7 Log-Linear Models</p> <p>5.8 Linear Discriminant Analysis</p> <p>5.9 Review Questions and Problems</p> <p>5.10 References for Further Study</p>
<p><b>6 Decision Trees and Decision Rules</b></p> <p>6.1 Decision Trees</p> <p>6.2 <i>C4.5 Algorithm</i>: Generating a Decision Tree</p> <p>6.3 Unknown Attribute Values</p> <p>6.4 Pruning Decision Trees</p> <p>6.5 <i>C4.5 Algorithm</i>: Generating Decision Rules</p> <p>6.6 CART Algorithm and Gini Index</p> <p>6.7 Limitations of Decision Trees and Decision Rules</p> <p>6.8 Review Questions and Problems</p> <p>6.9 References for Further Study</p>
<p><b>7 Artificial Neural Networks</b></p> <p>7.1 Model of an Artificial Neuron</p> <p>7.2 Architectures of Artificial Neural Networks</p> <p>7.3 Learning Process</p> <p>7.4 Learning Tasks Using ANNs</p> <p>7.5 Multilayer Perceptrons</p> <p>7.6 Competitive Networks and Competitive Learning</p> <p>7.7 Self-Organizing Maps</p> <p>7.8 Deep Learning</p> <p>7.9 Convolutional Neural Networks (CNNs)</p> <p>7.10 Review Questions and Problems</p> <p>7.11 References for Further Study</p>
<p><b>8 Ensemble Learning</b></p> <p>8.1 Ensemble Learning Methodologies</p> <p>8.2 Combination Schemes for Multiple Learners</p> <p>8.3 Bagging and Boosting</p> <p>8.4 AdaBoost</p> <p>8.5 Review Questions and Problems</p> <p>8.6 References for Further Study</p>
<p><b>9 Cluster Analysis</b></p> <p>9.1 Clustering Concepts</p> <p>9.2 Similarity Measures</p> <p>9.3 Agglomerative Hierarchical Clustering</p> <p>9.4 Partitional Clustering</p> <p>9.5 Incremental Clustering</p> <p>9.6 DBSCAN Algorithm</p> <p>9.7 BIRCH Algorithm</p> <p>9.8 Clustering Validation</p> <p>9.9 Review Questions and Problems</p> <p>9.10 References for Further Study</p>
<p><b>10 Association Rules</b></p> <p>10.1 Market-Basket Analysis</p> <p>10.2 Algorithm <i>Apriori</i></p> <p>10.3 From Frequent Itemsets to Association Rules</p> <p>10.4 Improving the Efficiency of the <i>Apriori</i> Algorithm</p> <p>10.5 Frequent Pattern Growth Method</p> <p>10.6 Associative-Classification Method</p> <p>10.7 Multidimensional Association Rule Mining</p> <p>10.8 Review Questions and Problems</p> <p>10.9 References for Further Study</p>
<p><b>11 Web Mining and Text Mining</b></p> <p>11.1 Web Mining</p> <p>11.2 Web Content, Structure, and Usage Mining</p> <p>11.3 HITS and LOGSOM Algorithms</p> <p>11.4 Mining Path-Traversal Patterns</p> <p>11.5 PageRank Algorithm</p> <p>11.6 Recommender Systems</p> <p>11.7 Text Mining</p> <p>11.8 Latent Semantic Analysis</p> <p>11.9 Review Questions and Problems</p> <p>11.10 References for Further Study</p>
<p><b>12 Advances in Data Mining</b></p> <p>12.1 Graph Mining</p> <p>12.2 Temporal Data Mining</p> <p>12.3 Spatial Data Mining</p> <p>12.4 Distributed Data Mining</p> <p>12.5 Correlation Does Not Imply Causality!</p> <p>12.6 Privacy, Security, and Legal Aspects of Data Mining</p> <p>12.7 Cloud Computing Based on Hadoop and Map/Reduce</p> <p>12.8 Reinforcement Learning</p> <p>12.9 Review Questions and Problems</p> <p>12.10 References for Further Study</p>
<p><b>13 Genetic Algorithms</b></p> <p>13.1 Fundamentals of Genetic Algorithms</p> <p>13.2 Optimization Using Genetic Algorithms</p> <p>13.3 A Simple Illustration of a Genetic Algorithm</p> <p>13.4 Schemata</p> <p>13.5 Traveling Salesman Problem</p> <p>13.6 Machine Learning Using Genetic Algorithms</p> <p>13.7 Genetic Algorithms for Clustering</p> <p>13.8 Review Questions and Problems</p> <p>13.9 References for Further Study</p>
<p><b>14 Fuzzy Sets and Fuzzy Logic</b></p> <p>14.1 Fuzzy Sets</p> <p>14.2 Fuzzy Set Operations</p> <p>14.3 Extension Principle and Fuzzy Relations</p> <p>14.4 Fuzzy Logic and Fuzzy Inference Systems</p> <p>14.5 Multifactorial Evaluation</p> <p>14.6 Extracting Fuzzy Models from Data</p> <p>14.7 Data Mining and Fuzzy Sets</p> <p>14.8 Review Questions and Problems</p> <p>14.9 References for Further Study</p>
<p><b>15 Visualization Methods</b></p> <p>15.1 Perception and Visualization</p> <p>15.2 Scientific Visualization and Information Visualization</p> <p>15.3 Parallel Coordinates</p> <p>15.4 Radial Visualization</p> <p>15.5 Visualization Using Self-Organizing Maps</p> <p>15.6 Visualization Systems for Data Mining</p> <p>15.7 Review Questions and Problems</p> <p>15.8 References for Further Study</p>
<p><b>Appendix A: Information on Data Mining</b></p> <p>A.1 Data-Mining Journals</p> <p>A.2 Data-Mining Conferences</p> <p>A.3 Data-Mining Forums/Blogs</p> <p>A.4 Data Sets</p> <p>A.5 Commercially and Publicly Available Tools</p> <p>A.6 Web Site Links</p>
<p><b>Appendix B: Data-Mining Applications</b></p> <p>B.1 Data Mining for Financial Data Analyses</p> <p>B.2 Data Mining for the Telecommunication Industry</p> <p>B.3 Data Mining for the Retail Industry</p> <p>B.4 Data Mining in Healthcare and Biomedical Research</p> <p>B.5 Data Mining in Science and Engineering</p> <p>B.6 Pitfalls of Data Mining</p>
<p>Bibliography</p> <p>Index</p>
<p><b>MEHMED KANTARDZIC, P<small>H</small>D,</b> is a Professor in the Department of Computer Engineering and Computer Science (CECS) at the University of Louisville, and is Director of the Data Mining Lab and CECS Graduate Programs. He is a member of IEEE, ISCA, KAS, WSEAS, IEE, and SPIE.</p>

You might also be interested in these products:

Quantifiers in Action
by: Antonio Badia
PDF ebook
96,29 €
Managing and Mining Uncertain Data
by: Charu C. Aggarwal
PDF ebook
96,29 €