Oracle Data Mining Concepts 10g Release 1 (10.1) Part Number B10698-01
This chapter describes what data mining is and what Oracle Data Mining is, and outlines the data mining process.
Too much data and not enough information -- this is a problem facing many businesses and industries.
Data mining offers a solution. Most businesses have an enormous amount of data with a great deal of information hidden within it, and "hidden" is the operative word: so much data exists that it overwhelms traditional methods of data analysis.
Data mining provides a way to get at the information buried in the data. Data mining finds hidden patterns in large, complex collections of data, patterns that elude traditional statistical approaches to analysis.
Oracle Data Mining (ODM) embeds data mining within the Oracle database. There is no need to move data out of the database into files for analysis and then back from files into the database for storing. The data never leaves the database -- the data, data preparation, model building, and model scoring results all remain in the database. This enables Oracle to provide an infrastructure for application developers to integrate data mining seamlessly with database applications.
ODM is designed to support production data mining in the Oracle database. Production data mining is most appropriate for creating applications that solve problems such as customer relationship management and churn; that is, any data mining problem for which you want to develop an application.
ODM provides single-user multi-session access to models. Model building is either synchronous in the PL/SQL interface or asynchronous in the Java interface.
ODM integrates data mining with the Oracle database and exposes data mining through the following interfaces:
The ODM Java interface and DBMS_DATA_MINING have similar, but not identical, capabilities. For a comparison of the interfaces, see Appendix A.
Data mining functions are based on two kinds of learning: supervised (directed) and unsupervised (undirected).
Supervised learning functions are typically used to predict a value, and are sometimes referred to as predictive models. Unsupervised learning functions are typically used to find the intrinsic structure, relations, or affinities in data; no classes or labels are assigned a priori. These are sometimes referred to as descriptive models.
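The difference between the two kinds of learning can be illustrated with a minimal sketch in plain Python. This is not ODM code and does not use any ODM interface; the toy data, the function names (`train_centroids`, `predict`, `two_means`), and the choice of a nearest-centroid classifier and a two-cluster k-means are all assumptions made for illustration. The supervised function is given labels and learns to predict them; the unsupervised function is given only values and finds groups on its own.

```python
from statistics import mean

# Hypothetical toy data: (monthly_spend, label) pairs, invented for illustration.
labeled = [(10.0, "low"), (12.0, "low"), (50.0, "high"), (55.0, "high")]


def train_centroids(rows):
    """Supervised: use the known labels to compute one centroid per class."""
    by_label = {}
    for value, label in rows:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Predict the label whose centroid lies closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2)."""
    low, high = min(values), max(values)  # initial cluster centers
    clusters = ([], [])
    for _ in range(iterations):
        clusters = ([], [])
        for v in values:
            # Assign each value to the nearer of the two centers.
            clusters[0 if abs(v - low) <= abs(v - high) else 1].append(v)
        # Move each center to the mean of its assigned values.
        if clusters[0]:
            low = mean(clusters[0])
        if clusters[1]:
            high = mean(clusters[1])
    return clusters


centroids = train_centroids(labeled)
print(predict(centroids, 14.0))        # predictive: assigns a known label

unlabeled = [value for value, _ in labeled]
print(two_means(unlabeled))            # descriptive: groups found without labels
```

In the supervised case the output is one of the labels supplied during training. In the unsupervised case the output is only a grouping; interpreting what each group means is left to the analyst.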
Oracle Data Mining supports the following data mining functions: