Why would Oracle read from disk

in-memory-database · oracle · oracle-12c

[Copied from ServerFault to what seems like a better-suited forum]

Our Oracle server (version 12c) is running on a Red Hat 6.7 machine with 128 GB of RAM.

The current size of the only database on it is about 60 GB.

Yet the iostat output shows plenty of disk reads (in blue) on the device where the database is stored. There are, of course, some writes (in yellow) too:

[chart of several columns from iostat output]

While the writes to permanent storage make sense, because we do modify the data sometimes, the reads do not: the entire DB can fit in memory twice over. Indeed, we should be able to store it on a USB stick and still get read times comparable to those of Redis, for example.

What parts of the Oracle configuration should we examine and tune to make the server use all of the available RAM?

We will look at the INMEMORY settings. But it would seem the inmemory_* parameters only come into play when the server, because of a RAM shortage, has to decide what to drop from memory and re-read later. In our case nothing should ever be dropped from memory, because there is room for everything. And yet some things are, evidently, being (re)read again and again…
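
For reference, the In-Memory settings can be listed from SQL*Plus (a sketch; an inmemory_size of 0 would mean the column store is disabled entirely):

show parameter inmemory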

Update: the output of free on the server currently reads:


             total       used       free     shared    buffers     cached
Mem:     132044132   99974992   32069140   89131408     292508   93060984
-/+ buffers/cache:    6621500  125422632
Swap:     33554428          0   33554428

The processes owned by the oracle user all seem to share the above-mentioned 85 GB piece of shared memory. Maybe we really are just constraining the server ourselves; the 30 GB of unused memory seems to suggest that. How do we tell the server to use more?
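
We will start by double-checking the instance's own memory limits (a sketch, listing the standard parameters from SQL*Plus):

show parameter sga_target
show parameter sga_max_size
show parameter pga_aggregate_target
show parameter memory_target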

Best Answer

Just a few possibilities, with somewhat simplified explanations.

Let's say you have 100+ GB SGA (with 100 GB buffer cache) and 20 GB PGA.

  1. Operations performing a sort (queries with ORDER BY) or a hash (HASH JOIN) use PGA memory. A single session can typically use up to 5% of the PGA target for these operations; if it needs more memory, it switches to the temp tablespace, where it continues the same work, but on disk instead of in memory. So if your query needs to sort an amount of data that is bigger than that, yet not even close to the available memory (1 GB+, say), it will sort it on disk.
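
For example, work areas that are spilling to temp right now can be spotted like this (a sketch; it assumes you can query the v$ views):

-- a non-NULL tempseg_size means the sort/hash runs (at least partly) on disk
select sid, operation_type, actual_mem_used, tempseg_size, number_passes
from v$sql_workarea_active
where tempseg_size is not null;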

  2. Big tables (= segments) are not cached when read with a full scan. By default, a "big" segment is one larger than about 2% of the buffer cache. This behaviour is controlled by the "_small_table_threshold" parameter, which specifies the threshold, in database blocks, for deciding whether a segment should be cached. For example: if you have a table over 2 GB and your query reads it with a full scan, the database will not even try to cache it; run that query multiple times and the database reads the table from disk over and over again.
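
If one particular big table really is full-scanned over and over, one option (a sketch; big_table is a placeholder name) is to set its CACHE attribute, which asks Oracle to treat its blocks as hot when it is full-scanned (how effective this is depends on the version and on direct path read decisions):

alter table big_table cache;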

  3. LOB columns are stored in separate segments. LOB segments are not cached by default (unless specified explicitly at creation time or later), so they are read from disk over and over again. Caching can be enabled like this:

alter table table1 modify lob (lob_column1) (cache);

or, to cache reads only:

alter table table1 modify lob (lob_column1) (cache reads);
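
Whether a LOB is cached can be verified afterwards in the *_LOBS dictionary views, e.g.:

-- cache is YES, NO or CACHEREADS
select table_name, column_name, cache
from user_lobs
where table_name = 'TABLE1';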

+1. Your SGA is configured bigger than your entire database (hopefully you have configured it that big at the instance level as well), so you can force full database caching:

ALTER DATABASE FORCE FULL DATABASE CACHING;
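
Note that this statement must be executed while the database is mounted but not open (the feature exists as of 12.1.0.2). A sketch of the whole sequence, with a check at the end:

SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE FORCE FULL DATABASE CACHING;
ALTER DATABASE OPEN;

-- should now report YES:
SELECT force_full_db_caching FROM v$database;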

+2. In-Memory is an extra-cost option for the database that needs to be licensed. Use it only if you have the appropriate license for it.
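
If you do have the license, enabling it looks roughly like this (a sketch; the size and table name are placeholders):

-- reserve part of the SGA for the In-Memory column store (needs a restart in 12.1)
alter system set inmemory_size = 16G scope=spfile;

-- after the restart, mark a table for population into the column store
alter table table1 inmemory;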