The DBMS_PARALLEL_EXECUTE
package enables incremental update of table data in parallel.
This chapter contains the following topics:
Overview
Security Model
Constants
Views
Exceptions
Examples
This package lets you incrementally update table data in parallel, in two high-level steps:
Group sets of rows in the table into smaller-sized chunks.
Run a user-specified statement on these chunks in parallel, and commit when finished processing each chunk.
This package introduces the notion of parallel execution task. This task groups the various steps associated with the parallel execution of a PL/SQL block, which is typically updating table data.
All of the package subroutines (except the GENERATE_TASK_NAME Function and the TASK_STATUS Procedure) perform a commit.
DBMS_PARALLEL_EXECUTE
is a SYS
-owned package which is granted to PUBLIC
.
Any user can create or operate parallel execution tasks and access the USER
view.
Users who have the ADM_PARALLEL_EXECUTE_TASK
role can perform administrative routines (qualified by the prefix ADM_
) and access the DBA view.
Apart from the administrative routines, all the subprograms refer to tasks owned by the current user.
To execute chunks in parallel, you must have CREATE
JOB
system privilege.
The CHUNK_BY_SQL
, RUN_TASK
, and RESUME_TASK
subprograms require a query, and are executed using DBMS_SQL
. Invokers of the DBMS_SQL
interface must ensure that no query contains SQL injection.
The DBMS_PARALLEL_EXECUTE
package uses the constants shown in following tables:
Table 107-1, "DBMS_PARALLEL_EXECUTE Constants - Chunk Status Value"
Table 107-2, "DBMS_PARALLEL_EXECUTE Constants - Task Status Value"
Table 107-1 DBMS_PARALLEL_EXECUTE Constants - Chunk Status Value
Constant | Type | Value | Description |
---|---|---|---|
|
|
1 |
Chunk has been assigned for processing |
|
|
2 |
Chunk has been processed successfully |
|
|
3 |
Chunk has been processed, but an error occurred during processing |
|
|
0 |
Chunk is unassigned |
Table 107-2 DBMS_PARALLEL_EXECUTE Constants - Task Status Value
Constant | Type | Value | Description |
---|---|---|---|
|
|
4 |
Table associated with the task has been chunked, but none of the chunk has been assigned for processing |
|
|
2 |
Table associated with the task is being chunked |
|
|
3 |
Chunking failed |
|
|
8 |
Only applicable if parallel execution is used, this occurs if a job slave crashes or if the database crashes during |
|
|
1 |
The task has been created by the |
|
|
6 |
All chunks processed without error |
|
|
7 |
All chunks processed, but with errors in some cases |
|
|
5 |
Part of the chunk assigned for processing, or which has been processed |
The following table lists the exceptions raised by DBMS_PARALLEL_EXECUTE
.
Table 107-3 Exceptions Raised by DBMS_PARALLEL_EXECUTE
Exception | Error Code | Description |
---|---|---|
|
29499 |
Specified chunk does not exist |
|
29497 |
Same task name has been used by an existing task |
|
29492 |
Attempts to chunk a table that is not in |
|
29495 |
Attempts to resume execution, but the task is not in |
|
29494 |
Attempts to execute the task that is not in |
|
29493 |
Attempts to set an invalid value to the chunk status |
|
29491 |
Attempts to chunk a table by rowid in cases in which the table is not a physical table, or the table is an IOT |
|
29490 |
User does not have the necessary |
|
29498 |
Specified |
The following examples run on the Human Resources (HR) schema of the Oracle Database Sample Schemas. It requires that the HR schema be created with the JOB
SYSTEM
privilege.
This example shows the most common usage of this package. After calling the RUN_TASK Procedure, it checks for errors and reruns in the case of error.
DECLARE l_sql_stmt VARCHAR2(1000); l_try NUMBER; l_status NUMBER; BEGIN -- Create the TASK DBMS_PARALLEL_EXECUTE.CREATE_TASK ('mytask'); -- Chunk the table by ROWID DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID('mytask', 'HR', 'EMPLOYEES', true, 100); -- Execute the DML in parallel l_sql_stmt := 'update EMPLOYEES e SET e.salary = e.salary + 10 WHERE rowid BETWEEN :start_id AND :end_id'; DBMS_PARALLEL_EXECUTE.RUN_TASK('mytask', l_sql_stmt, DBMS_SQL.NATIVE, parallel_level => 10); -- If there is an error, RESUME it for at most 2 times. L_try := 0; L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask'); WHILE(l_try < 2 and L_status != DBMS_PARALLEL_EXECUTE.FINISHED) LOOP L_try := l_try + 1; DBMS_PARALLEL_EXECUTE.RESUME_TASK('mytask'); L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask'); END LOOP; -- Done with processing; drop the task DBMS_PARALLEL_EXECUTE.DROP_TASK('mytask'); END; /
A user can specify a chunk algorithm by using the CREATE_CHUNKS_BY_SQL Procedure. This example shows that rows with the same manager_id
are grouped together and processed in one chunk.
DECLARE l_chunk_sql VARCHAR2(1000); l_sql_stmt VARCHAR2(1000); l_try NUMBER; l_status NUMBER; BEGIN -- Create the TASK DBMS_PARALLEL_EXECUTE.CREATE_TASK ('mytask'); -- Chunk the table by MANAGER_ID l_chunk_sql := 'SELECT distinct manager_id, manager_id FROM employees'; DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL('mytask', l_chunk_sql, false); -- Execute the DML in parallel -- the WHERE clause contain a condition on manager_id, which is the chunk -- column. In this case, grouping rows is by manager_id. l_sql_stmt := 'update EMPLOYEES e SET e.salary = e.salary + 10 WHERE manager_id between :start_id and :end_id'; DBMS_PARALLEL_EXECUTE.RUN_TASK('mytask', l_sql_stmt, DBMS_SQL.NATIVE, parallel_level => 10); -- If there is error, RESUME it for at most 2 times. L_try := 0; L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask'); WHILE(l_try < 2 and L_status != DBMS_PARALLEL_EXECUTE.FINISHED) Loop L_try := l_try + 1; DBMS_PARALLEL_EXECUTE.RESUME_TASK('mytask'); L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask'); END LOOP; -- Done with processing; drop the task DBMS_PARALLEL_EXECUTE.DROP_TASK('mytask'); end; /
Executing Chunks in an User-defined Framework
You can execute chunks in a self-defined framework without using the RUN_TASK Procedure. This example shows how to use GET_ROWID_CHUNK Procedure, EXECUTE
IMMEDIATE
, SET_CHUNK_STATUS Procedure to execute the chunks.
DECLARE l_sql_stmt varchar2(1000); l_try number; l_status number; l_chunk_id number; l_start_rowid rowid; l_end_rowid rowid; l_any_rows boolean; CURSOR c1 IS SELECT chunk_id FROM user_parallel_execute_chunks WHERE task_name = 'mytask' AND STATUS IN (DBMS_PARALLEL_EXECUTE.PROCESSED_WITH_ERROR, DBMS_PARALLEL_EXECUTE.ASSIGNED); BEGIN -- Create the Objects, task, and chunk by ROWID DBMS_PARALLEL_EXECUTE.CREATE_TASK ('mytask'); DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID('mytask', 'HR', 'EMPLOYEES', true, 100); l_sql_stmt := 'update EMPLOYEES e SET e.salary = e.salary + 10 WHERE rowid BETWEEN :start_id AND :end_id'; -- Execute the DML in his own framework -- -- Process each chunk and commit. -- After processing one chunk, repeat this process until -- all the chunks are processed. -- <<main_processing>> LOOP -- -- Get a chunk to process; if there is nothing to process, then exit the -- loop; -- DBMS_PARALLEL_EXECUTE.GET_ROWID_CHUNK('mytask', l_chunk_id, l_start_rowid, l_end_rowid, l_any_rows); IF (l_any_rows = false) THEN EXIT; END IF; -- -- The chunk is specified by start_id and end_id. -- Bind the start_id and end_id and then execute it -- -- If no error occured, set the chunk status to PROCESSED. -- -- Catch any exception. If an exception occured, store the error num/msg -- into the chunk table and then continue to process the next chunk. -- BEGIN EXECUTE IMMEDIATE l_sql_stmt using l_start_rowid, l_end_rowid; DBMS_PARALLEL_EXECUTE.SET_CHUNK_STATUS('mytask',l_chunk_id, DBMS_PARALLEL_EXECUTE.PROCESSED); EXCEPTION WHEN OTHERS THEN DBMS_PARALLEL_EXECUTE.SET_CHUNK_STATUS('mytask', l_chunk_id, DBMS_PARALLEL_EXECUTE.PROCESSED_WITH_ERROR, SQLCODE, SQLERRM); END; -- -- Finished processing one chunk; Commit here -- COMMIT; END LOOP;
Table 107-4 DBMS_PARALLEL_EXECUTE Package Subprograms
Subprogram | Description |
---|---|
Drops all chunks of the specified task owned by the specified owner |
|
Drops the task of the given user and all related chunks |
|
Returns the task status |
|
Stops the task of the given owner and related job slaves |
|
Creates a task for the current user |
|
Chunks the table associated with the given task by the specified column. |
|
Chunks the table associated with the given task by |
|
Chunks the table associated with the given task by means of a user-provided |
|
Drops the task and all related chunks |
|
Drops the task's chunks |
|
Returns a unique name for a task |
|
Picks an unassigned |
|
Picks an unassigned |
|
Deletes all the processed chunks whose status is |
|
Retries the given the task if the RUN_TASK Procedure finished with an error, or resumes the task if a crash occurred. |
|
Executes the specified SQL statement on the chunks in parallel |
|
Sets the status of the chunk |
|
Stops the task and related job slaves |
|
Returns the task status |
This procedure drops all chunks of the specified task owned by the specified owner.
This function returns the task status.
DBMS_PARALLEL_EXECUTE.ADM_TASK_STATUS ( task_owner IN VARCHAR2, task_name IN VARCHAR2) RETURN NUMBER;
This procedure stops the task of the specified owner and related job slaves.
This procedure creates a task for the current user. The pairing of task_name
and current_user
must be unique.
DBMS_PARALLEL_EXECUTE.CREATE_TASK ( task_name IN VARCHAR2, comment IN VARCHAR2 DEFAULT NULL);
This procedure chunks the table (associated with the specified task) by the specified column. The specified column must be a NUMBER
column. This procedure takes the MIN
and MAX
value of the column, and then divides the range evenly according to chunk_size
. The chunks are:
START_ID END_ID --------------------------- --------------------------- min_id_val min_id_val+1*chunk_size-1 min_id_val+1*chunk_size min_id_val+2*chunk_size-1 … … min_id_val+i*chunk_size max_id_val
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_NUMBER_COL ( task_name IN VARCHAR2, table_owner IN VARCHAR2, table_name IN VARCHAR2, table_column IN VARCHAR2, chunk_size IN NUMBER);
This procedure chunks the table (associated with the specified task) by ROWID
. num_row
and num_block
are approximate guidance for the size of each chunk. The table to be chunked must be a physical table with physical ROWID
having views and table functions. Index-organized tables are not allowed.
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID ( task_name IN VARCHAR2, table_owner IN VARCHAR2, table_name IN VARCHAR2, by_row IN BOOLEAN, chunk_size IN NUMBER);
Table 107-11 CREATE_CHUNKS_BY_ROWID Procedure Parameters
Parameter | Description |
---|---|
|
Name of the task |
|
Owner of the table |
|
Name of the table |
|
|
|
Approximate number of rows/blocks to process for each commit cycle |
This procedure chunks the table (associated with the specified task) by means of a user-provided SELECT
statement. The SELECT
statement that returns the range of each chunk must have two columns: start_id
and end_id
. If the task is to chunk by ROWID
, then the two columns must be of ROWID
type. If the task is to chunk the table by NUMBER
column, then the two columns must be of NUMBER
type. The procedure provides the flexibility to users who want to deploy user-defined chunk algorithms.
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL ( task_name IN VARCHAR2, sql_statement IN CLOB, by_rowid IN BOOLEAN);
This function returns a unique name for a task. The name is of the form prefixN where N is a number from a sequence. If no prefix is specified, the generated name is, by default, TASK$_1
, TASK$_2
, TASK$_3
, and so on. If 'SCOTT
' is specified as the prefix, the name is SCOTT1
, SCOTT2
, and so on.
DBMS_PARALLEL_EXECUTE.GENERATE_TASK_NAME ( prefix IN VARCHAR2 DEFAULT 'TASK$_') RETURN VARCHAR2;
This procedure picks an unassigned NUMBER
chunk and changes it to ASSIGNED
. If there are no more chunks to assign, any_rows
is set to FALSE
. Otherwise, the chunk_id
, start
, and end
_id
of the chunk are returned as OUT
parameters. The chunk info in DBMS_PARALLEL_EXECUTE_CHUNKS$
is updated as follows: STATUS
becomes ASSIGNED
; START_TIMESTAMP
records the current time; END_TIMESTAMP
is cleared.
See Also:
ViewsDBMS_PARALLEL_EXECUTE.GET_NUMBER_COL_CHUNK ( task_name IN VARCHAR2, chunk_id OUT NUMBER, start_rowid OUT ROWID, end_id OUT ROWID, any_rows OUT BOOLEAN);
This procedure picks an unassigned ROWID
chunk and changes it to ASSIGNED
. If there are no more chunks to assign, any_rows
is set to FALSE
. Otherwise, the chunk_id
, start
, and end
_id
of the chunk are returned as OUT
parameters. The chunk info in DBMS_PARALLEL_EXECUTE_CHUNKS$
is updated as follows: STATUS
becomes ASSIGNED
; START_TIMESTAMP
records the current time; END_TIMESTAMP
is cleared.
See Also:
ViewsDBMS_PARALLEL_EXECUTE.GET_ROWID_CHUNK ( task_name IN VARCHAR2, chunk_id OUT NUMBER, start_rowid OUT ROWID, end_id OUT ROWID, any_rows OUT BOOLEAN);
This procedure deletes all the processed chunks whose status is PROCESSED
or PROCESSED_WITH_ERROR
.
This procedure retries the specified the task if the RUN_TASK Procedure finished with an error, or resumes the task if a crash occurred. You can only invoke this procedure if the task is in a CRASHED
or FINISHED_WITH_ERROR
state. For a crashed serial execution, the state remains in processing. The FORCE
option allows you to resume any task in PROCESSING
state. However, it is your responsibility to determine that a crash has occurred. The procedure resumes processing the chunks which have not been processed. Also, chunks which are in PROCESSED_WITH_ERROR
or ASSIGNED
(due to crash) state are processed because those chunks did not commit. This procedure takes the same argument as the RUN_TASK Procedure. The overload which takes task_name
as the only input argument re-uses the arguments provided in the previous invoking of the RUN_TASK Procedure or RESUME_TASK Procedures.
DBMS_PARALLEL_EXECUTE.RESUME_TASK ( task_name IN VARCHAR2, sql_stmt IN CLOB, language_flag IN NUMBER, edition IN VARCHAR2 DEFAULT NULL, apply_crossedition_trigger IN VARCHAR2 DEFAULT NULL, fire_apply_trigger IN BOOLEAN DEFAULT TRUE, parallel_level IN NUMBER DEFAULT 0, job_class IN VARCHAR2 DEFAULT 'DEFAULT_JOB_CLASS', force IN BOOLEAN DEFAULT FALSE); DBMS_PARALLEL_EXECUTE.RESUME_TASK ( task_name IN VARCHAR2, force IN BOOLEAN DEFAULT FALSE);
Table 107-19 RESUME_TASK Procedure Parameters
Parameter | Description |
---|---|
|
Name of the task |
|
SQL statement; must have |
|
Determines how Oracle handles the SQL statement. The following options are recognized:
|
|
Specifies the edition in which to run the statement. Default is the current edition. |
|
Specifies the unqualified name of a forward crossedition trigger that is to be applied to the specified SQL. The name is resolved using the edition and |
|
Indicates whether the specified |
|
Number of parallel jobs; zero if run in serial; |
|
If running in parallel, the jobs all belong to the specified job class |
|
If |
Suppose the chunk table contains the following chunk ranges:
START_ID END_ID --------------------------- --------------------------- 1 10 11 20 21 30
And the specified SQL statement is:
UPDATE employees SET salary = salary + 10 WHERE e.employee_id BETWEEN :start_id AND :end_id
This procedure executes the following statements in parallel:
UPDATE employees SET salary =.salary + 10 WHERE employee_id BETWEEN 1 and 10; COMMIT; UPDATE employees SET salary =.salary + 10 WHERE employee_id between 11 and 20; COMMIT; UPDATE employees SET salary =.salary + 10 WHERE employee_id between 21 and 30; COMMIT;
This procedure executes the specified statement (sql_stmt
) on the chunks in parallel. It commits after processing each chunk. The specified statement must have two placeholders called start_id
and end_id,
respectively, which represent the range of the chunk to be processed. The type of each placeholder must be ROWID
where ROWID
-based chunking was used, or NUMBER
where NUMBER
-based chunking was used.
DBMS_PARALLEL_EXECUTE.RUN_TASK ( task_name IN VARCHAR2, sql_stmt IN CLOB, language_flag IN NUMBER, edition IN VARCHAR2 DEFAULT NULL, apply_crossedition_trigger IN VARCHAR2 DEFAULT NULL, fire_apply_trigger IN BOOLEAN DEFAULT TRUE, parallel_level IN NUMBER DEFAULT 0, job_class IN VARCHAR2 DEFAULT 'DEFAULT_JOB_CLASS');
Table 107-20 RUN_TASK Procedure Parameters
Parameter | Description |
---|---|
|
Name of the task |
|
SQL statement; must have |
|
Determines how Oracle handles the SQL statement. The following options are recognized:
|
|
Specifies the edition in which to run the statement. Default is the current edition. |
|
Specifies the unqualified name of a forward crossedition trigger that is to be applied to the specified SQL. The name is resolved using the edition and |
|
Indicates whether the specified |
|
Number of parallel jobs; zero if run in serial; |
|
If running in parallel, the jobs belong to the specified job class |
The SQL statement is executed as the current user.
Since this subprogram is subject to reexecution on error, you need to take great care in submitting a statement to RUN_TASK
that is not idempotent.
Chunks can be executed in parallel by DBMS_SCHEDULER
job slaves. Therefore, parallel execution requires CREATE
JOB
system privilege. The job slaves is created under the current user. The default number of job slaves is computed as the product of Oracle parameters cpu_count
and parallel_threads_per_cpu
. On a Real Application Clusters installation, the number of job slaves is the sum of individual settings on each node in the cluster. This procedure returns only when all the chunks are processed. In parallel cases, this procedure returns only when all the job slaves finished.
Suppose the chunk table contains the following chunk ranges:
START_ID END_ID --------------------------- --------------------------- 1 10 11 20 21 30
And the specified SQL statement is:
UPDATE employees SET salary = salary + 10 WHERE e.employee_id BETWEEN :start_id AND :end_id
This procedure executes the following statements in parallel:
UPDATE employees SET salary =.salary + 10 WHERE employee_id BETWEEN 1 and 10; COMMIT; UPDATE employees SET salary =.salary + 10 WHERE employee_id between 11 and 20; COMMIT; UPDATE employees SET salary =.salary + 10 WHERE employee_id between 21 and 30; COMMIT;
This procedure sets the status of the chunk. The START_TIMESTAMP
and END_TIMESTAMP
of the chunk is updated according to the new status:
Value of the new Status Side Effect --------------------------- --------------------------- UNASSIGNED START_TIMESTAMP and END_TIMESTAMP will be cleared ASSIGNED START_TIMESTAMP will be the current time and END_TIMESTAMP will be cleared. PROCESSED or PROCESSED_WITH_ERROR The current time will be recorded in END_TIMESTAMP
See Also:
ViewsDBMS_PARALLEL_EXECUTE.SET_CHUNK_STATUS ( task_name IN VARCHAR2, chunk_id OUT NUMBER, status IN NUMBER, err_num IN NUMBER DEFAULT NULL, err_msg IN VARCHAR2 DEFAULT NULL);
Table 107-21 SET_CHUNK_STATUS Procedure Parameters
Parameter | Description |
---|---|
|
Name of the task |
|
Chunk_id of the chunk |
|
Status of the chunk: |
|
Error code returned during the processing of the chunk |
|
Error message returned during the processing of the chunk |