Oracle9i Data Mining Concepts Release 9.2.0.2 Part Number A95961-02
This chapter discusses two major topics:
For an example of ODM basic usage, see Chapter 3.
This chapter provides an overview of the steps required to perform basic ODM tasks. For detailed examples of how to perform these tasks, see the ODM sample programs. The ODM sample programs are distributed with the ODM documentation. For an overview of the ODM sample programs, see Appendix A.
This chapter does not include a detailed description of any of the ODM API classes and methods. For detailed information about the ODM API, see the ODM Javadoc in the directory $ORACLE_HOME/dm/doc on any system where ODM is installed.
ODM depends on the following Oracle9i Java Archive (.jar) files:
$ORACLE_HOME/jdbc/lib/classes12.jar
$ORACLE_HOME/lib/xmlparserv2.jar
$ORACLE_HOME/rdbms/jlib/jmscommon.jar
$ORACLE_HOME/rdbms/jlib/aqapi.jar
$ORACLE_HOME/rdbms/jlib/xsu12.jar
$ORACLE_HOME/dm/lib/odmapi.jar
These files must be in your CLASSPATH to compile and execute ODM programs.
If you use a database character set that is not US7ASCII, WE8DEC, WE8ISO8859P1, or UTF8, you must also include the following in your CLASSPATH:
$ORACLE_HOME/jdbc/lib/nls_charset12.zip
If you do not include nls_charset12.zip in your CLASSPATH, an ODM program will fail with the following error:
oracle.jms.AQjmsException: Non supported character set:oracle-character-set-178
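The CLASSPATH rule above can be sketched as a small helper. This is only an illustration: the class name OdmClasspath, the method build, and the sample character set name JA16SJIS are assumptions made for the example and are not part of ODM. The sketch compares character set names exactly as written, joins paths with a forward slash, and does not check that the files exist.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ODM: assembles a CLASSPATH value
// from the files listed in this section.
public class OdmClasspath {
    // Character sets that, per the text above, do not need nls_charset12.zip.
    private static final List<String> NO_NLS_ZIP_NEEDED =
            Arrays.asList("US7ASCII", "WE8DEC", "WE8ISO8859P1", "UTF8");

    // Paths relative to ORACLE_HOME, as listed above.
    private static final List<String> REQUIRED_JARS = Arrays.asList(
            "jdbc/lib/classes12.jar",
            "lib/xmlparserv2.jar",
            "rdbms/jlib/jmscommon.jar",
            "rdbms/jlib/aqapi.jar",
            "rdbms/jlib/xsu12.jar",
            "dm/lib/odmapi.jar");

    /** Builds a CLASSPATH string for the given ORACLE_HOME and database character set. */
    public static String build(String oracleHome, String dbCharset) {
        List<String> entries = new ArrayList<>();
        for (String jar : REQUIRED_JARS) {
            entries.add(oracleHome + "/" + jar);
        }
        // Any other character set also needs the NLS archive.
        if (!NO_NLS_ZIP_NEEDED.contains(dbCharset)) {
            entries.add(oracleHome + "/jdbc/lib/nls_charset12.zip");
        }
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        String home = System.getenv("ORACLE_HOME");
        if (home == null) {
            home = "/path/to/oracle_home"; // placeholder, not a real location
        }
        String charset = args.length > 0 ? args[0] : "US7ASCII";
        System.out.println(build(home, charset));
    }
}
```

Running the program with a character set name as its argument, for example JA16SJIS, prints a CLASSPATH value that ends with the nls_charset12.zip entry.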
This section describes the steps required to perform several common data mining tasks using ODM.
All work in ODM is done using MiningTask objects.
This section summarizes the steps required to build a model.
MiningFunctionSettings object.
MiningBuildTask object.
Use the getCurrentStatus method to get the status of the task. Alternatively, use the waitForCompletion method to wait until all asynchronous activity for the task completes.
After successful completion of the task, a build results object exists.
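The two ways of tracking a task described above, checking its status periodically or blocking until it completes, can be sketched as follows. This sketch does not use the ODM API: StandInTask, Status, and the timing parameters are hypothetical stand-ins built on java.util.concurrent, and the two methods only borrow the names getCurrentStatus and waitForCompletion to show the pattern. For the actual signatures, see the ODM Javadoc.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TaskStatusSketch {
    public enum Status { EXECUTING, SUCCESS, FAILED }

    /** Hypothetical stand-in for an asynchronous task; NOT the ODM MiningTask class. */
    static class StandInTask {
        private final Future<String> work;

        StandInTask(ExecutorService executor, long workMillis) {
            this.work = executor.submit(() -> {
                Thread.sleep(workMillis);   // simulated asynchronous work
                return "build-result";
            });
        }

        /** Non-blocking status check. */
        Status getCurrentStatus() {
            return work.isDone() ? outcome() : Status.EXECUTING;
        }

        /** Blocks until the simulated work has finished. */
        Status waitForCompletion() {
            return outcome();
        }

        private Status outcome() {
            try {
                work.get();
                return Status.SUCCESS;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Status.FAILED;
            } catch (ExecutionException e) {
                return Status.FAILED;
            }
        }
    }

    /** Option 1: check the status periodically until the task is no longer executing. */
    public static Status runWithPolling(long workMillis, long pollMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            StandInTask task = new StandInTask(executor, workMillis);
            Status status = task.getCurrentStatus();
            while (status == Status.EXECUTING) {
                Thread.sleep(pollMillis);
                status = task.getCurrentStatus();
            }
            return status;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.FAILED;
        } finally {
            executor.shutdown();
        }
    }

    /** Option 2: block until the task completes. */
    public static Status runBlocking(long workMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            return new StandInTask(executor, workMillis).waitForCompletion();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("polling:  " + runWithPolling(200, 50));
        System.out.println("blocking: " + runBlocking(200));
    }
}
```

Polling leaves the calling thread free to do other work between checks; blocking is simpler when nothing else can proceed until the task finishes.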
The following sample programs illustrate building ODM models:
Sample_AdaptiveBayesNetworkBuild.java
Sample_NaiveBayesBuild.java
Sample_AssociationRulesBuild.java
Sample_ClusteringBuild.java
Data mining tasks are usually performed in sequence. The following sequence of tasks is typical:
To implement a sequence of dependent task executions, you may periodically check the asynchronous task execution status using the getCurrentStatus method or block for completion using the waitForCompletion method. You can then perform the dependent task after completion of the previous task.
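A minimal sketch of this sequencing pattern follows. It does not call ODM: each step is a hypothetical java.util.concurrent.Callable that reports success or failure, and the class and method names are invented for the example. The point is only that a dependent step starts after the previous step has completed successfully.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TaskSequenceSketch {
    /**
     * Runs the steps in insertion order. Each step is started asynchronously,
     * then the caller blocks until it finishes. Returns the names of the steps
     * that completed successfully; a failed step stops the sequence.
     */
    public static List<String> runInOrder(Map<String, Callable<Boolean>> steps) {
        List<String> completed = new ArrayList<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            for (Map.Entry<String, Callable<Boolean>> step : steps.entrySet()) {
                Future<Boolean> running = executor.submit(step.getValue());
                boolean ok;
                try {
                    ok = Boolean.TRUE.equals(running.get()); // block for completion
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    ok = false;
                } catch (ExecutionException e) {
                    ok = false;
                }
                if (!ok) {
                    break; // do not start the dependent steps
                }
                completed.add(step.getKey());
            }
        } finally {
            executor.shutdown();
        }
        return completed;
    }

    public static void main(String[] args) {
        // Stand-in steps; a real program would execute ODM tasks here.
        Map<String, Callable<Boolean>> steps = new LinkedHashMap<>();
        steps.put("build", () -> true);
        steps.put("test", () -> true);
        steps.put("compute lift", () -> true);
        System.out.println(runInOrder(steps));
    }
}
```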
For example, follow these steps to perform the build, test, and compute lift sequence:
MiningTestTask object. Either periodically check the status of the test operation or block until the task completes.
MiningComputeLiftTask object.
Model Seeker builds multiple models; it then evaluates and compares the models to find a "best" model.
Follow these steps to use Model Seeker:
ModelSeekerTask (MST) instance to hold the information needed to specify the models to build. The required information is defined in subclasses of the MiningFunctionSettings (MFS) and MiningAlgorithmSettings (MAS) classes.
You can specify a combination of as many instances of the following as desired:
NaiveBayesAlgorithmSettings
CombinationNaiveBayesSettings
AdaptiveBayesNetworkAlgorithmSettings
CombinationAdaptiveBayesNetSettings
(You cannot specify clustering models or Association Rules models.)
Use the getCurrentStatus method to get the status of the task, using the task name. Alternatively, use the waitForCompletion method to wait until all asynchronous activity for the required work completes.
getResults method to view the summary information and the best model. Model Seeker discards all models that it builds except the best one.
The sample program Sample_ModelSeeker.java illustrates how to use Model Seeker.
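The idea behind Model Seeker (build several candidate models, compare them, and keep only the best) can be sketched in a few lines. This is a conceptual illustration, not the Model Seeker implementation: the selectBest method, the generic settings type, and the scoring function are all assumptions, and this chapter does not state how Model Seeker measures which model is best. The scores in main are made-up values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

public class BestModelSketch {
    /**
     * Builds and evaluates every candidate through the supplied function and
     * returns the name of the highest-scoring one (null if there are none).
     * Only the best candidate is remembered; the others are discarded.
     */
    public static <S> String selectBest(Map<String, S> candidates,
                                        ToDoubleFunction<S> buildAndEvaluate) {
        String bestName = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, S> candidate : candidates.entrySet()) {
            double score = buildAndEvaluate.applyAsDouble(candidate.getValue());
            if (score > bestScore) {
                bestScore = score;
                bestName = candidate.getKey();
            }
        }
        return bestName;
    }

    public static void main(String[] args) {
        // Made-up scores standing in for real build-and-evaluate results.
        Map<String, Double> pretendScores = new LinkedHashMap<>();
        pretendScores.put("naive-bayes-settings-1", 0.71);
        pretendScores.put("naive-bayes-settings-2", 0.78);
        pretendScores.put("adaptive-bayes-settings-1", 0.74);
        System.out.println(selectBest(pretendScores, Double::doubleValue));
    }
}
```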
Models based on data sets with a large number of attributes can have very long build times. To minimize build time, you can use ODM Attribute Importance to identify the critical attributes and then build a model using these attributes only.
Identify the most important attributes by building an Attribute Importance model as follows:
The sample program Sample_AttributeImportanceBuild.java illustrates how to build an attribute importance model.
After identifying the important attributes, build a model using the selected attributes as follows:
adjustAttributeUsage defined on MiningFunctionSettings. Only the attributes returned by Attribute Importance will be active for model building.
The sample program Sample_AttributeImportanceUsage.java illustrates how to build a model using the important attributes.
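The selection step can be pictured as ranking attributes and keeping only the top few active. The sketch below is an assumption-laden illustration, not ODM code: it supposes that each attribute has a numeric importance value, and the class name, method name, attribute names, and values are all invented for the example. In ODM itself, the restriction is applied through adjustAttributeUsage, as described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ImportantAttributesSketch {
    /** Returns the names of the topN attributes, ordered by descending importance. */
    public static List<String> topAttributes(Map<String, Double> importance, int topN) {
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(importance.entrySet());
        ranked.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        List<String> active = new ArrayList<>();
        for (int i = 0; i < Math.min(topN, ranked.size()); i++) {
            active.add(ranked.get(i).getKey());
        }
        return active;
    }

    public static void main(String[] args) {
        // Made-up attribute names and importance values, for illustration only.
        Map<String, Double> importance = new LinkedHashMap<>();
        importance.put("AGE", 0.4);
        importance.put("INCOME", 0.9);
        importance.put("ZIP_CODE", 0.1);
        importance.put("TENURE", 0.6);
        // Only these attributes would be left active for the model build.
        System.out.println(topAttributes(importance, 2));
    }
}
```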
You make predictions by applying a model to new data, that is, by scoring the data.
Any table that you score (apply a model to) must have the same format as the table used to build the model. If you build a model using a table that is in transactional format, any table that you apply that model to must be in transactional format. Similarly, if the table used to build the model was in nontransactional format, any table to which you apply the model must be in nontransactional format.
Note that you can score a single record, which must also be in the same format as the table used to build the model.
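The format rule can be expressed as a simple guard that an application might run before applying a model. The TableFormat enum and canScore method are hypothetical names for this sketch; ODM has its own way of describing data formats, which is documented in the ODM Javadoc.

```java
public class ScoringFormatCheck {
    /** The two table formats named in this section. */
    public enum TableFormat { TRANSACTIONAL, NONTRANSACTIONAL }

    /** Returns true only if the data to score has the format used to build the model. */
    public static boolean canScore(TableFormat buildFormat, TableFormat applyFormat) {
        return buildFormat == applyFormat;
    }

    public static void main(String[] args) {
        TableFormat built = TableFormat.TRANSACTIONAL;
        TableFormat toScore = TableFormat.NONTRANSACTIONAL;
        if (!canScore(built, toScore)) {
            System.out.println("Cannot apply model: build data was " + built
                    + " but apply data is " + toScore);
        }
    }
}
```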