Oracle 11g R2 ORA-12516 Error – Max Utilization Not Reached

connection-poolingjdbcoracle-11g-r2

Our app began to raise ORA-12516, TNS:listener could not find available handler with matching protocol stack errors, and I'm trying to diagnose why.

The workload has not changed, so I think we might have introduced a bug in the application code, but I can't see why. The most obvious would be a connection leak, or unclosed idle transactions, but:

  1. we are using a jdbc pool (tomcat-jdbc-pool), and the max active value is 73, below the 150 processes Oracle is configured to use. So I would have thought the pool would throw an error before Oracle hits its own limits.

  2. do select * from gv$resource_limit; here is what I get:

NST_ID RESOURCE_NAME                  CURRENT_UTILIZATION MAX_UTILIZATION INITIAL_ALLOCATION LIMIT_VALUE
---------- ------------------------------ ------------------- --------------- ------------------ -----------
         1 processes                                       32              40        150                150  
         1 sessions                                        32              45        248                248  
...

So it seems max utilization never hits the limit.

Does anyone have a tip on how to track this down?

EDIT: I made some progress on this. It turns out that tomcat-jdbc-pool has a query to check connections health. This query was configured as select 1 from dual;. Notice the semi-column, as this is the culprit: removing it fixes the app. I still have to find out what really happens with the ; at the end of the check query. Any advice is welcome.

Best Answer

I think select 1 from dual; is a syntax error. Here's what happens on the wire:

With select 1 from dual a connection's lifecycle is:

  • Initial establishment
  • First query (validation does not yet happen)
  • Return to the pool
  • Validation: the pool sends select 1 from dual
  • The database answers: ORA-01403: no data found
  • Next query

and so on. The connection is normally not closed.

With select 1 from dual; however, the following happens:

  • Initial establishment
  • First query
  • Return to the pool
  • Validation: the pool sends select 1 from dual;
  • The database answers: ORA-00911: invalid character
  • The pool terminates the connection with a normal FIN packet.

The next query then begins with a freshly established connection. Now, as long as the load stays low, you might not even notice this. But under high load, all this reconnecting must exhaust some database resource. Which one, i don't know...