Does increased Data Dictionary size affects performance

data dictionaryoracleoracle-11gperformance

As I understand, Data Dictionary

  • Is a collection of tables, part of SYS schema (if not the whole SYS
    schema)
  • It is stored as 'regular' tables, so whatever is applied to
    database tables, also applies to data dictionary (e.g. indexes,
    statistics)
  • Its statistics should be updated
  • It is always cached

Can the the size and the complexity of oracle data dictionary affect the database performance?

When I add a new object on my database a new (or multiple) entries is/are added on data dictionary. A scenario I can think of has to do with partitioning. So if I have a table which have a partition and subpartitions, I will have an entry for the table, an entry for each partition and an entry of each partition subpartitions in data dictionary. Those entries are spawned among different 'tables'. Imagine know having tables with thousands of partitions and hundreds of subpartition. Again this number of records may not be something that a dbms can handle, but how it will affect its performance as it grows?

Note: I have read a lot of discussion on if and how partitioning affects performance in general. This is out of scope for this question. Partitioning is just an example I gave, to show how data dictionary volume can grow.

Best Answer

So a good starting point for this is Martin Widlake's post on the size of the 'data dictionary'. The term can be a bit woolly as there's a bunch of stuff that Oracle stores (such as executable code, job history etc) which may or may not be relevant.

https://mwidlake.wordpress.com/2009/08/03/why-is-my-system-tablespace-so-big/

He points out an example where he has 13 Gb in a data dictionary segment "C_OBJ#_INTCOL#". That is technically a cluster rather than a regular table, but it is really just a special table where the data is arranged so that related items tend to be stored in the same block on disk. That object is storing histogram information for table columns which tells you that, for a specific table/column, you might have 60% of values as US, 25% as Canada and 15% Mexico, or that you have twice as much data for October 2017 as you did for October 2016.

Whether that's a good use of space will depend on your situation. Histograms are basically used to determine the best access path. For example, given a query that has both a date and zip code paths available, which is going to involve less work.

Ultimately, it isn't about the size of a data dictionary, but a more precise question of whether storing additional information in there can help avoid mistakes. Even then, the downside is often the impact of gathering the additional information and keeping it up-to-date rather than the space required to store it.

Additional : A cluster tends to take up a bit more space than a regular table as trying to keep related data together needs to allow more space for the data to grow rather than moving it around when it doesn't fit.