OracleBulkCopy – Functionality and Performance Optimization

To summarize the specifics: we need to stage approximately 5 million rows into a vendor (Oracle) database. Everything works great for batches of 500k rows using OracleBulkCopy (ODP.NET), but when we try to scale up to 5M, performance slows to a crawl once the load hits the 1M mark, gets progressively slower as more rows are loaded, and eventually times out after about 3 hours.
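
For reference, the load is driven by code roughly like this (the table name and batch size are illustrative; OracleBulkCopy comes from the unmanaged ODP.NET assembly, Oracle.DataAccess.Client):

    using Oracle.DataAccess.Client;
    using System.Data;

    static void StageBatch(string connectionString, DataTable batch)
    {
        using (var connection = new OracleConnection(connectionString))
        {
            connection.Open();
            using (var bulkCopy = new OracleBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "STAGING_ROWS"; // illustrative name
                bulkCopy.BatchSize = 500000;     // rows per internal batch
                bulkCopy.BulkCopyTimeout = 600;  // seconds
                bulkCopy.WriteToServer(batch);   // the call that slows down past ~1M rows
            }
        }
    }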

I suspect it's related to a primary key on the table, but I've been trawling the Oracle forums and Stack Overflow for information and a lot of what I'm reading contradicts that (also, a lot of posts seem to contradict each other). I'm hoping that somebody can set the record straight on some closely-related questions about the process:

  1. Does the OracleBulkCopy class use conventional or direct-path loading? Is there some way I can confirm this, one way or another?

  2. Assuming it does use direct-path loading: Is it true that Oracle automatically sets all indexes to unusable during the load and puts them back online afterward? I've read several statements to this effect but again, cannot confirm it.

  3. If #2 is true, then should it make any difference what indexes are on the table before I initiate a bulk copy operation? If so, why?

  4. Related to #3, is there any practical difference, in general, between bulk loading with an unusable index vs. actually dropping the index before the load and recreating it afterward?

  5. If #2 is not correct, or if there are caveats I'm not understanding, then would it make any difference to explicitly mark the index unusable before the bulk load, and then explicitly rebuild it afterward? (A sketch of the DDL I have in mind follows this list.)

  6. Is there anything else, other than index builds, that could cause a bulk copy operation to grow progressively slower as more and more records are added? (Maybe something to do with logging, although I would expect that bulk operations are not logged?)

  7. If there really is no other way to get the performance up to snuff aside from dropping the PK/index first, what steps can I take to make sure that the index doesn't completely disappear, e.g. if the connection to the database is lost in the middle of the process?
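
To make #5 concrete, this is the kind of DDL I have in mind around the load, sketched with a hypothetical index name:

    using Oracle.DataAccess.Client;

    // Illustrative sketch for #5 (hypothetical index name MY_TABLE_PK).
    static void RunDdl(OracleConnection connection, string sql)
    {
        using (var cmd = connection.CreateCommand())
        {
            cmd.CommandText = sql;
            cmd.ExecuteNonQuery();
        }
    }

    // Before the load: RunDdl(connection, "ALTER INDEX MY_TABLE_PK UNUSABLE");
    // After the load:  RunDdl(connection, "ALTER INDEX MY_TABLE_PK REBUILD");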

Best Answer

After a few more days of reading and experimentation, I was able to (mostly) answer most of these:

  1. I found this buried in the ODP.NET documentation (ironically not in the OracleBulkCopy docs):

    The ODP.NET Bulk Copy feature uses a direct path load approach, which is similar to, but not the same as Oracle SQL*Loader. Using direct path load is faster than conventional loading (using conventional SQL INSERT statements).

    So it appears that it does use direct path.

  2. This I was able to verify by running a large bulk copy operation and checking the index properties in SQL Developer (the equivalent query appears in the first sketch after this list). The index did appear as UNUSABLE while the bulk copy was in progress. However, I've also discovered that OracleBulkCopy.WriteToServer will refuse to run if the index starts out in an UNUSABLE state, so clearly there's more going on here; if it were as simple as disabling and rebuilding the index, it shouldn't care about the initial state.

  3. It does make a difference, specifically if the index is also a constraint. I found this little gem in the documentation quoted in #1:

    Enabled Constraints
    During an Oracle bulk copy, the following constraints are automatically enabled by default:

    • NOT NULL
    • UNIQUE
    • PRIMARY KEY (unique-constraints on not-null columns)

    NOT NULL constraints are checked at column array build time. Any row that violates the NOT NULL constraint is rejected.

    UNIQUE constraints are verified when indexes are rebuilt at the end of the load. The index is left in an Index Unusable state if it violates a UNIQUE constraint.

    The documentation is a little hazy on what happens during the load, especially with primary keys, but one thing is absolutely certain: it behaves differently with a primary key than without one. Since OracleBulkCopy will happily allow you to violate index constraints (and punt the index into an UNUSABLE state when it's done), my hunch is that it builds the PK index during the bulk copy but simply doesn't validate it until afterward.

  4. I'm not sure whether any difference between the two comes from Oracle itself or is just a quirk of OracleBulkCopy. The jury's still out on this one.

  5. OracleBulkCopy will throw an exception if an index is initially in the UNUSABLE state, so it's really a moot point.

  6. If there are other factors, indexes (and especially PK indexes) are still by far the most important, as I found out via the workaround described in #7:

  7. Creating a global temporary table with the same schema (using CREATE TABLE ... AS SELECT), then bulk copying into the temporary table, and finally doing a plain old INSERT from the temp table into the real table (see the second sketch after this list). Since the temp table has no index, the bulk copy happens very fast, and the final INSERT is also fast because the data is already in a table (I haven't tried the APPEND hint yet, since a 5M row table-to-table copy already takes less than a minute).

    I'm not yet sure of the potential ramifications of (ab)using the temporary tablespace this way, but so far it hasn't given me any trouble, and it's much safer than the alternative of dropping the index first, since there's no window in which the rows or indexes could end up missing or corrupted.

    The success of this approach also pretty clearly demonstrates that the PK index is the problem, as that is the only practical difference between the temp table and the permanent table; both started with zero rows during the performance tests.
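
For #2, the check doesn't require SQL Developer; the same status is visible from a plain query against user_indexes, run on a second connection while the copy is in progress (index name is illustrative):

    using Oracle.DataAccess.Client;

    // Returns VALID, UNUSABLE, etc. for the named index in the current schema.
    static string GetIndexStatus(OracleConnection connection, string indexName)
    {
        using (var cmd = connection.CreateCommand())
        {
            cmd.CommandText =
                "SELECT status FROM user_indexes WHERE index_name = :name";
            cmd.Parameters.Add("name", indexName.ToUpper());
            return (string)cmd.ExecuteScalar();
        }
    }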

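And a sketch of the staging approach from #7, all on one connection so the temporary table's rows stay visible to the final INSERT (table names are illustrative, and the GTT would normally be created once, not per load):

    using Oracle.DataAccess.Client;
    using System.Data;

    static void LoadViaTempTable(OracleConnection connection, DataTable rows)
    {
        // One-time setup: same shape as the target, but no indexes.
        // ON COMMIT PRESERVE ROWS keeps rows across the bulk copy's commits.
        using (var ddl = connection.CreateCommand())
        {
            ddl.CommandText =
                "CREATE GLOBAL TEMPORARY TABLE STAGE_ROWS " +
                "ON COMMIT PRESERVE ROWS " +
                "AS SELECT * FROM TARGET_TABLE WHERE 1 = 0";
            ddl.ExecuteNonQuery();
        }

        // Fast: no index to maintain on the GTT during the direct-path load.
        using (var bulkCopy = new OracleBulkCopy(connection))
        {
            bulkCopy.DestinationTableName = "STAGE_ROWS";
            bulkCopy.WriteToServer(rows);
        }

        // Plain table-to-table copy into the indexed target; this took under
        // a minute for 5M rows in my tests, even without the APPEND hint.
        using (var insert = connection.CreateCommand())
        {
            insert.CommandText =
                "INSERT INTO TARGET_TABLE SELECT * FROM STAGE_ROWS";
            insert.ExecuteNonQuery();
        }
    }
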
Conclusion: Don't bother trying to bulk copy more than a few hundred thousand rows into an indexed Oracle table using ODP.NET. Either drop the index (if you don't actually need it during the load) or "preload" the data into a different, non-indexed table.