After hours of misdirection from official Oracle support, I dove into this on my own and fixed it. I am documenting it here in case someone else has this problem.
To do any of this, you must be the oracle user:
$ su - oracle
Step 1: You need to look at the alert log. It isn't in /var/log as expected. You have to run an Oracle log reading program:
$ adrci
ADRCI: Release 11.2.0.1.0 - Production on Wed Sep 11 18:27:56 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
ADR base = "/u01/app/oracle"
adrci>
Notice the ADR base. That is not the install. You need to see the homes so you can connect to the one that you use.
adrci> show homes
ADR Homes:
diag/rdbms/cci/CCI
diag/tnslsnr/cci/listener
diag/tnslsnr/cci/start
diag/tnslsnr/cci/reload
diag/rdbms/cci/CCI is the database home (the tnslsnr entries belong to the listener). Set that.
adrci> set home diag/rdbms/cci/CCI
adrci>
Now, you can look at the alert logs. It would be very nice if they were in /var/log so you could easily parse the logs. Just stop wanting and deal with this interface. At least you can tail (and I hope you have a scrollback buffer):
adrci> show alert -tail 100
Scroll back until you see errors. You want the FIRST error. Any errors after the first error are likely being caused by the first error. In my case, the first error was:
ORA-19815: WARNING: db_recovery_file_dest_size of 53687091200 bytes is 100.00% used, and has 0 remaining bytes available.
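If scrolling back through the tail is awkward, the search for the first error can be done with a small filter. This is only a sketch: `first_ora_error` is a name made up for illustration, and the non-interactive `adrci exec="..."` call in the comment assumes your adrci version accepts commands that way and that the home path matches the one shown above.

```shell
# Sketch: print the first line on stdin that carries an ORA-nnnnn code.
# It reads plain text, so it also works on a saved copy of the alert output.
first_ora_error() {
    awk '/ORA-[0-9][0-9][0-9][0-9][0-9]/ { print; exit }'
}

# Assumed usage (run as the oracle user, not executed here):
#   adrci exec="set home diag/rdbms/cci/CCI; show alert -tail 100" | first_ora_error
```

For a tail like mine, this should print the ORA-19815 line and ignore the follow-on errors after it.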
This error is caused by heavy transaction volume. If you push a lot of data into Oracle, it saves transaction logs (archived redo logs), and those go into the recovery file area. Once that area is full (50GB in this case), Oracle can no longer save the logs and just dies. By design, Oracle stops rather than carry on without being able to write them.
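To tie the byte count in the error message to the 50GB figure, and to see how full the recovery area is, something like the following may help. It is a sketch: both function names are invented here, and the query assumes the `v$recovery_file_dest` view is readable from a sysdba session on an instance that is at least mounted.

```shell
# Convert a byte count to whole GiB (integer division by 1024^3).
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Hypothetical helper, not called here: report the recovery area limit,
# usage, and reclaimable space in GiB. Needs the oracle user's environment.
show_recovery_area_usage() {
    sqlplus -s "/ as sysdba" <<'EOF'
SELECT name,
       space_limit       / 1024 / 1024 / 1024 AS limit_gib,
       space_used        / 1024 / 1024 / 1024 AS used_gib,
       space_reclaimable / 1024 / 1024 / 1024 AS reclaimable_gib
  FROM v$recovery_file_dest;
EOF
}
```

Running `bytes_to_gib 53687091200` gives 50, which matches the limit reported in ORA-19815.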
There are two solutions, the proper one and the quick and dirty one. The quick and dirty one is to increase db_recovery_file_dest_size. First, exit adrci.
adrci> exit
Now, go into sqlplus without opening the database, just mounting it (you may be able to do this without mounting the database, but I mount it anyway).
$ sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on Wed Sep 11 18:40:25 2013
Copyright (c) 1982, 2009, Oracle. All rights reserved.
SQL> connect / as sysdba
Connected.
SQL> startup mount
Now, you can increase your current db_recovery_file_dest_size, increased to 75G in my case:
SQL> alter system set db_recovery_file_dest_size = 75G scope=both;
Now, you can shutdown and startup again and that previous error should be gone.
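The resize can also be scripted so the value is checked before it reaches SQL*Plus. This is a rough sketch, not a tested procedure: `resize_recovery_area_sql` and `apply_recovery_area_size` are hypothetical names, and `scope=both` assumes the instance was started from an spfile.

```shell
# Print the statements for a new recovery area size.
# Accepts values such as 75G or 76800M; anything else is rejected.
resize_recovery_area_sql() {
    if ! printf '%s\n' "$1" | grep -Eq '^[0-9]+[KMGT]?$'; then
        echo "invalid size: $1" >&2
        return 1
    fi
    printf 'ALTER SYSTEM SET db_recovery_file_dest_size = %s SCOPE=BOTH;\n' "$1"
    # Show the parameter afterwards to confirm the change took effect.
    printf 'SHOW PARAMETER db_recovery_file_dest_size\n'
}

# Hypothetical wrapper, not called here: feed the statements to SQL*Plus
# as sysdba. Requires the oracle user's environment.
apply_recovery_area_size() {
    sql=$(resize_recovery_area_sql "$1") || return 1
    printf '%s\n' "$sql" | sqlplus -s "/ as sysdba"
}
```

With this loaded, `apply_recovery_area_size 75G` would make the same change as the interactive session above.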
The proper fix is to get rid of the recovery files. You do that using RMAN, not SQLPLUS or ADRCI.
$ rman
Recovery Manager: Release 11.2.0.1.0 - Production on Wed Sep 11 18:45:11 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
RMAN> backup archivelog all delete input;
If you get RMAN-06171: not connected to target database, then start RMAN with rman target / instead of just rman.
Wait a long time and your archivelog (that was using up all that space) will be gone. So, you can shutdown/startup your database and be back in business.
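The RMAN step can be run without the interactive prompt. A minimal sketch, assuming rman reads commands from standard input and that OS authentication works for the oracle user; the function names are made up for illustration.

```shell
# Emit the RMAN command used above: back up every archived log,
# then delete the input files that were backed up.
archivelog_cleanup_commands() {
    cat <<'EOF'
backup archivelog all delete input;
EOF
}

# Hypothetical wrapper, not called here. "target /" connects to the local
# database, which avoids the RMAN-06171 "not connected" error.
run_archivelog_cleanup() {
    archivelog_cleanup_commands | rman target /
}
```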
If you don't have any backups, well... you're just flat down.
Nothing to be done.
You might as well drop the entire database and recreate it.
Reasons follow:
1.- Corrupt blocks can only be repaired by getting the original block from a full backup (at least of the datafile that has the corrupted blocks) and applying the archivelogs after the original block has been restored to the original location.
2.- No backups, No archivelogs and no exports means you lost your data. There is no way to get it back as you cannot find out the actual storage order of each row once it got corrupted.
3.- If the undo tablespace got corrupted as well, you can't even recover a datafile/block because the RDBMS won't find the undo data needed to make it consistent. In the best of cases you could apply forward changes from the redo logs, but you couldn't roll back the transactions left unfinished before the crash.
Next time, as soon as you have a running database, get it to work in archivelog mode, make a full backup every once in a while and, to be on the safe side, export the most important data you have there.
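The advice above could look roughly like this in practice. It is a sketch under assumptions: the function names are invented, switching modes needs a clean shutdown and a sysdba session, and the export step is left out because it depends on your schema names and credentials.

```shell
# SQL*Plus steps to switch the database into archivelog mode.
# The database must be mounted, not open, when the mode is changed.
enable_archivelog_sql() {
    cat <<'EOF'
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
SELECT log_mode FROM v$database;
EOF
}

# RMAN command for a full backup that also includes the archived logs.
full_backup_commands() {
    echo 'backup database plus archivelog;'
}

# Hypothetical wrapper, not called here: switch modes, then take a backup.
setup_backups() {
    enable_archivelog_sql | sqlplus -s "/ as sysdba" &&
    full_backup_commands | rman target /
}
```

The final SELECT is there so you can confirm the log mode after the switch.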
Best Answer
This is probably as complete a way of killing an Oracle database as you could wish for. The sys tables contain all the metadata about every object in the database -- objects, segments, extents ... so the database now contains no information on what user tables it stores, including the tables that store the data about that.
New database, I think.
And no more sys connection accidents.