Oracle Ultra Search Online Documentation Release 9.2
1. Introduction
2. Sample Agent Files
3. Setting up the Sample Crawler Agent
Oracle Ultra Search provides a sample implementation of a user-defined crawler agent built on the Ultra Search agent API. Upon invocation, this sample agent connects to a specified Oracle database and retrieves the contents of a table for the crawler to collect and index.
The sample agent is fully functional and can be customized to adapt to other database-based data sources. This agent performs the following tasks:
- Reads data source parameters
- Connects to the database that contains the data source
- Initializes fetching document URL and attributes from the data source
- Fetches document URL and attributes from the data source
- Disconnects from the data source
For more information, see About the Ultra Search Crawler Agent API.
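The five tasks listed above can be pictured as a simple lifecycle. The following is only a minimal sketch under stated assumptions: the class and method names (AgentLifecycleSketch, readParameters, connect, startFetch, fetch, disconnect) are hypothetical and are not the Ultra Search agent API, and an in-memory list stands in for the database table so the example runs without a database. See SampleAgent.java for the actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical lifecycle sketch; this is NOT the Ultra Search agent API.
public class AgentLifecycleSketch {
    private String tableName;          // a data source parameter (assumed)
    private List<String> source;       // stands in for the database table
    private Iterator<String> cursor;   // stands in for an open result set

    // Task 1: read data source parameters.
    void readParameters(String tableName) { this.tableName = tableName; }

    // Task 2: connect to the database that contains the data source.
    void connect(List<String> rows) { this.source = rows; }

    // Task 3: initialize fetching (comparable to opening a cursor).
    void startFetch() { this.cursor = source.iterator(); }

    // Task 4: fetch the next document URL, or null when none remain.
    String fetch() { return cursor.hasNext() ? cursor.next() : null; }

    // Task 5: disconnect from the data source.
    void disconnect() { cursor = null; source = null; }

    // Runs the five tasks in order and returns the URLs that were fetched.
    public static List<String> run(List<String> rows) {
        AgentLifecycleSketch agent = new AgentLifecycleSketch();
        List<String> collected = new ArrayList<String>();
        agent.readParameters("NEWS");
        agent.connect(rows);
        agent.startFetch();
        String url;
        while ((url = agent.fetch()) != null) {
            collected.add(url);
        }
        agent.disconnect();
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("http://example.com/a", "http://example.com/b")));
    }
}
```

Keeping the initialization step separate from the fetch step mirrors the task list: the crawler can request documents one at a time after a single setup call.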
The sample agent files are located in the $ORACLE_HOME/ultrasearch/sample directory. You can directly view the sample agent source code using your preferred text editor.
The following table describes all sample agent files:
File | Description |
---|---|
sample_agent_readme.htm | This file |
SampleAgent.java | Sample crawler agent implementation using agent APIs |
3.1 Compile and Build Agent Jar File
The Java source code for the sample agent must first be compiled into class files and packaged into a jar file in the $ORACLE_HOME/ultrasearch/lib/agent/ directory. The classes needed for compilation are the JDK classes (classes.zip), the Oracle JDBC thin driver (classes12.zip), and ultrasearch.jar. For example:
javac -J-ms16m -J-mx96m -O -classpath /jdk1.2.2_05/lib/classes.zip:/lib/classes12.zip:$ORACLE_HOME/ultrasearch/lib/ultrasearch.jar SampleAgent.java
To build the sampleAgent.jar file:
/jdk1.2.2_05/bin/jar cv0f /oracle/ultrasearch/lib/agent/sampleAgent.jar SampleAgent.class 'SampleAgent$DocNode.class'
3.2 Create a Data Source Type
First, create a data source type that uses the sample agent, with the following values:
- Name: URL table type
- Description: Table with rows of URLs
- Agent Name: SampleAgent
- Agent Jar File: sampleagent
3.3 Define Data Source Parameters
Define the following parameters for the data source type:
- Database Connect String (DB connection)
- User Name (schema owner of the URL table)
- Password (schema owner password, encrypted)
- Table Name (URL table name)
- URL Column (Column holding doc URLs)
- Ignore Flag Column (1 for ignoring, 0 otherwise)
- Language Column (Document Language)
- Attribute List (List of columns for attributes)
It is in the following format: [column name/attribute name/data type][column name/attribute name/data type]... where the data type is 0 for number, 1 for string, and 2 for date. For example, if the document has 4 attributes (Company Name, Category, Revenue, and S&P Rating), the list is specified as: [Company Name/Company/1][Category/Classification/1][Revenue/Revenue/0][Rating/Analyst Rating/1]
- Log File Name (log file)
- Log Directory (Location of log file)
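To illustrate the Attribute List format, the following sketch parses a string of bracketed entries into column name, attribute name, and data type. It is an illustration only: AttributeListParser and its nested Attribute class are hypothetical names, and the parsing in SampleAgent.java may differ. The sketch assumes entries are written back to back with no separators, as in the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the Ultra Search agent API.
public class AttributeListParser {

    public static final class Attribute {
        public final String column;  // database column name
        public final String name;    // search attribute name
        public final int type;       // 0 = number, 1 = string, 2 = date

        Attribute(String column, String name, int type) {
            this.column = column;
            this.name = name;
            this.type = type;
        }
    }

    // One entry: [column/attribute/type], where type is 0, 1, or 2.
    private static final Pattern ENTRY =
        Pattern.compile("\\[([^/\\]]+)/([^/\\]]+)/([012])\\]");

    public static List<Attribute> parse(String spec) {
        String text = spec.trim();
        List<Attribute> result = new ArrayList<Attribute>();
        Matcher m = ENTRY.matcher(text);
        int end = 0;
        while (m.find()) {
            // Reject any text between or before entries.
            if (m.start() != end) {
                throw new IllegalArgumentException("Unexpected text at offset " + end);
            }
            result.add(new Attribute(m.group(1).trim(), m.group(2).trim(),
                                     Integer.parseInt(m.group(3))));
            end = m.end();
        }
        // Reject trailing text that is not a complete entry.
        if (end != text.length()) {
            throw new IllegalArgumentException("Unexpected text at offset " + end);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Attribute> attrs = parse("[Company Name/Company/1][Revenue/Revenue/0]");
        for (Attribute a : attrs) {
            System.out.println(a.column + " -> " + a.name + " (type " + a.type + ")");
        }
    }
}
```

Rejecting stray text between entries, rather than skipping it, makes a malformed parameter value fail early instead of silently dropping an attribute.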
3.4 Define a Data Source of this Type
Define a data source of this type by supplying values for the data source parameters. For example, the values below access a table with the following schema:
TABLE NEWS (
  ARTICLE_NO NUMBER,
  NEWS_URL VARCHAR2(740),
  TITLE VARCHAR2(200),
  AUTHOR VARCHAR2(100),
  PUB_DATE DATE default SYSDATE,
  PUBLISHER VARCHAR2(100),
  PRICE NUMBER,
  LANG VARCHAR2(10),
  IGNORE NUMBER DEFAULT 0,
  PRIMARY KEY (NEWS_URL)
);
- Database Connect String: dlsun1710:5521:search
- User Name: SCOTT
- Password: TIGER
- Table Name: NEWS
- URL Column: NEWS_URL
- Ignore Flag Column: IGNORE
- Language Column: LANG
- Attribute List: [ARTICLE_NO/Article Number/0][TITLE/Article Title/1][AUTHOR/Author/1][PUB_DATE/Report Date/2][PUBLISHER/Newspaper/1][PRICE/Download Cost/0]
- Log File Name: testagent.log
- Log Directory: /tmp/ultrasearch/
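As a rough illustration of how these parameter values could be combined, the sketch below builds an Oracle JDBC thin driver URL from the connect string and a SELECT statement from the table and column names. NewsQuerySketch and its methods are hypothetical, and the query that SampleAgent.java actually issues may differ. In particular, treating an Ignore Flag value of 0 as "fetch this row" is an assumption drawn from the parameter description.

```java
// Hypothetical helper; not part of the Ultra Search agent API.
public class NewsQuerySketch {

    // The thin driver accepts URLs of the form jdbc:oracle:thin:@host:port:SID.
    static String thinUrl(String connectString) {
        return "jdbc:oracle:thin:@" + connectString;
    }

    // Selects the URL, language, and attribute columns for rows that are
    // not flagged to be ignored (assumed: flag value 0 means "do not ignore").
    static String selectSql(String table, String urlColumn, String ignoreColumn,
                            String languageColumn, String[] attributeColumns) {
        StringBuilder sql = new StringBuilder("SELECT ");
        sql.append(urlColumn).append(", ").append(languageColumn);
        for (String column : attributeColumns) {
            sql.append(", ").append(column);
        }
        sql.append(" FROM ").append(table);
        sql.append(" WHERE ").append(ignoreColumn).append(" = 0");
        return sql.toString();
    }

    public static void main(String[] args) {
        String[] attributes = {
            "ARTICLE_NO", "TITLE", "AUTHOR", "PUB_DATE", "PUBLISHER", "PRICE"
        };
        System.out.println(thinUrl("dlsun1710:5521:search"));
        System.out.println(selectSql("NEWS", "NEWS_URL", "IGNORE", "LANG", attributes));
    }
}
```

Because the table and column names are concatenated directly into the SQL text, a sketch like this is only appropriate when those values come from trusted administrator-defined parameters.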
Copyright © 2002 Oracle Corporation. All Rights Reserved.