A top-down approach for creating and implementing data mining solutions
Laurinen, Perttu (2006-06-13)
Avaa tiedosto
https://urn.fi/URN:ISBN:9514281268
Kuvaus
Tiivistelmä
Abstract
The information age is characterized by ever-growing amounts of data surrounding us. By reproducing this data into usable knowledge we can start moving toward the knowledge age. Data mining is the science of transforming measurable information into usable knowledge. During the data mining process, the measurements pass through a chain of sophisticated transformations in order to acquire knowledge. Furthermore, in some applications the results are implemented as software solutions so that they can be continuously utilized. It is evident that the quality and amount of the knowledge formed is highly dependent on the transformations and the process applied. This thesis presents an application independent concept that can be used for managing the data mining process and implementing the acquired results as software applications.
The developed concept is divided into two parts — solution formation and solution implementation. The first part presents a systematic way for finding a data mining solution from a set of measurement data. The developed approach allows for easier application of a variety of algorithms to the data, manages the work chain, and differentiates between the data mining tasks. The method is based on storage of the data between the main stages of the data mining process, where the different stages of the process are defined on the basis of the type of algorithms applied to the data. The efficiency of the process is demonstrated with a case study presenting new solutions for resistance spot welding quality control.
The second part of the concept presents a component-based data mining application framework, called Smart Archive, designed for implementing the solution. The framework provides functionality that is common to most data mining applications and is especially suitable for implementing applications that process continuously acquired measurements. The work also proposes an efficient algorithm for utilizing cumulative measurement data in the history component of the framework. Using the framework, it is possible to build high-quality data mining applications with shorter development times by configuring the framework to process application-specific data. The efficiency of the framework is illustrated using a case study presenting the results and implementation principles of an application developed for predicting steel slab temperatures in a hot strip mill.
In conclusion, this thesis presents a concept that proposes solutions for two fundamental issues of data mining, the creation of a working data mining solution from a set of measurement data and the implementation of it as a stand-alone application.
Kokoelmat
- Avoin saatavuus [34589]