Data mining / Pieter Adriaans, Dolf Zantinge.

Material type: Text
Publication details: Harlow, England ; Reading, Mass. : Addison-Wesley, 1996.
Description: xi, 158 p. : ill. ; 24 cm
ISBN:
  • 0201403803
DDC classification:
  • 005.7565 ADR
Holdings
Item type Current library Call number Copy number Status Date due Barcode
Standard Loan Clonmel Library Main Collection 005.7565 ADR Available R02494XKRCC
Standard Loan Moylish Library Main Collection 005.7565 ADR 1 Available 39002100500488
Standard Loan Thurles Library Main Collection 005.7565 ADR Available R05403KKRCT
Standard Loan Thurles Library Main Collection 005.7565 ADR Available R05408FKRCT
Standard Loan Thurles Library Main Collection 005.7565 ADR Available R07350KRCT
Standard Loan Thurles Library Main Collection 005.7565 ADR Available R07345KRCT

Enhanced descriptions from Syndetics:

Data mining deals with discovering hidden knowledge and unexpected patterns. It is currently regarded as the key element of a much more involved process called knowledge discovery. Setting up a data mining environment is not a trivial task. This book aims to provide essential insights and guidelines to help you make the right decisions when setting up a data mining environment. It deals with the following questions:

What is Data Mining?
Which techniques are suitable for my data?
How do I set up a data mining environment?
How do I justify the costs?

Includes bibliographical references (p. 141-144) and index.

Table of contents provided by Syndetics

  • Introduction
  • What is learning?
  • Data mining and the data warehouse
  • The knowledge discovery process
  • Setting up a KDD environment
  • Some real-life applications
  • Some formal aspects of learning algorithms

Excerpt provided by Syndetics

Data mining deals with the discovery of hidden knowledge, unexpected patterns and new rules from large databases. It is currently regarded as the key element of a much more elaborate process called knowledge discovery in databases (KDD), which is closely linked to another important development: data warehousing. A data warehouse is a central store of data that has been extracted from operational data. The information in a data warehouse is subject-oriented, non-volatile, and of an historic nature, so data warehouses tend to contain extremely large data sets. The combination of data warehousing, decision support, and data mining indicates an innovative and totally new approach to information management. Until now, information systems have been built and operated mainly to support the operational processes of an organization. KDD and data warehousing view the information in an organization in an entirely new way: as a strategic source of opportunity. KDD is the first practical step towards realizing information as a production factor.
There are many books already available on data warehousing, also some on machine learning and databases, and a few on data mining and knowledge discovery in databases. What is lacking, however, is a comprehensive overview for management. This book attempts to provide such an overview, and is aimed at anyone who wants to get the most out of large databases: general, marketing and IT management, or any professional who wants a high-level overview of the process. We have tried to make the book as accessible as possible. It contains no complicated mathematics, and we have used many examples from our daily practice.
Data mining is important for all organizations that utilize large data sets; any organization with large volumes of financial data, huge customer databases, or helpdesk service records can benefit from this newly emerging field.
This book offers a comprehensive introduction to data mining and provides clear answers to questions such as: What is data mining? Which techniques are suitable for my data? How do I set up a data mining environment? How do I justify the costs? The whole KDD process, including data selection, cleaning, coding, using different pattern recognition techniques, and reporting, is illustrated by means of extensive case histories and numerous examples.
Setting up a data mining environment is not a trivial task. The long-term aim is to create a self-learning organization that makes optimal use of the information it generates. This book aims to provide essential insights and guidelines to help you make the right decisions when you are setting up such an environment.
Syllogic is one of the world's leading companies in data and systems management, and has extensive experience of pattern recognition in databases. In 1991, Syllogic created CAPTAINS at the request of KLM (the Royal Dutch Airline). This was one of the first commercial data mining applications. In writing this book, we have drawn on our extensive experience of setting up client/server, data warehouse, KDD and data mining environments for our customers.
Overview of the book
Chapter 1 provides a broad introduction to the area of KDD: basic definitions are given, the importance of the development for modern organizations is pointed out, and some hints for setting up a data mining environment are given.
In Chapter 2, we deal with self-learning computer systems. After briefly discussing the somewhat more abstract or philosophical aspects of learning, we illustrate the relationship between machine learning and the methodology of science. The main aim of this chapter is to give the reader a general feeling of the difficulties and risks of using pattern recognition and machine-learning algorithms. Without a deeper understanding of the methodological issues, it is too easy to draw incorrect conclusions on the basis of the output of a learning algorithm.
In Chapter 3, the relationship between data mining and the data warehouse is discussed. Data is obviously needed for the data mining process, and a data warehouse is the best structure for providing this. A KDD environment must also be integrated with a decision support system, the design of databases for decision support being an art in itself. Cost justification is also briefly discussed.
Chapter 4 describes the complete KDD process on the basis of an extensive example drawn from the marketing domain. All the various stages of the data mining process as we see it are considered, from the specification of an information requirement via data selection, enrichment and coding, to discovery and reporting. Much attention is paid to the issue of data cleaning, as this is particularly important in current data mining projects. The discovery stage is complex since one can use many techniques. In the chapter we apply different processes to the same sample data set, to give a good picture of the possibilities of hybrid learning, that is, learning by means of a range of techniques.
After the extensive discussion of the KDD process in Chapter 4, we devote Chapter 5 to the process of setting up a KDD environment. What do we need to consider when we want to start a data mining project? What are the necessary steps? At the end of the chapter, we formulate ten golden rules for setting up a KDD environment, which encapsulate the experience that we have built up over the past several years.
In Chapter 6 we describe some real-life applications from daily experience at Syllogic: customer profiling for a large bank, embedded learning in a system that predicts pilot bid behavior (career intentions), and a more technical case, on the reverse engineering of databases. These three cases give a neat illustration of the broad possibilities for application of data mining techniques.
Chapter 7 describes some formal aspects of machine learning and relational theory related to data mining: complexity theory, fuzzy databases, and database primitives for data mining. This chapter is not essential reading for those who want to apply data mining techniques but do not need a deeper knowledge of its technical underpinnings.
We conclude the book with a summary, an extensive glossary, and a subject-oriented list of further reading.
Acknowledgments
Many people have contributed to this book in one way or another. It is impossible to thank them all here, but there are some names that deserve a special mention. Firstly, we would like to thank our colleagues at Syllogic. Arno Knobbe implemented many of the machine-learning algorithms that are in operation in Syllogic at the moment, and designed the principal case study in Chapter 4. He, together with Marc-Paul van der Hulst and Ronan Waldron, has been responsible for many of the case studies that we have been working on at Syllogic. Some portions of this book have been previously published in the Automatisering Gids, and we thank Henk Ester for his continued support. We owe thanks to Lisa Birthistle and Thea van Breenen for typing the text and drawing many of the figures. Lisa is especially acknowledged for editing the manuscript and correcting the English. Our thanks also go to Evangelos Simoudis of IBM for his many valuable comments on the manuscript. We would also like to mention the staff of the Tandem high performance research center, especially Wouter Senf, whose ideas on adapting the relational model for data mining have been very valuable. Other people who have contributed in some way to the book are Charles Gooda, Evert Jan van Hasselt, Gusti Eiben, Karen Mosman of Addison Wesley Longman and Vassilis Moustakis of the Heraklion University in Greece.
The publishers are grateful to KLM for permission to feature the CAPTAINS case study. Finally we would like to thank Rini and Marion for their continued support. It is well known that sharing your life with somebody who runs a company is difficult, but sharing your life with somebody who runs a company and also insists on publishing books is particularly hard.
Excerpted from Data Mining by Pieter Adriaans, Dolf Zantinge. All rights reserved by the original copyright owners. Excerpts are provided for display purposes only and may not be reproduced, reprinted or distributed without the written permission of the publisher.
