The message "The database system is starting up." does not indicate an error. The reason it is at the FATAL level is so that it will always make it to the log, regardless of the setting of log_min_messages
:
http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHEN
After the rsync, did you really run what you show?:
pgsql -c "select pg_stop_backup();";
Since there is, so far as I know, no pgsql
executable, that would leave the backup uncompleted, and the slave would never come out of recovery mode. On the other hand, maybe you really did run psql
, because otherwise I don't see how the slave would have logged such success messages as:
Log: consistent recovery state reached at 0/BF0000B0
and:
Log: streaming replication successfully connected to primary
Did you try connecting to the slave at this point? What happened?
The "Success. You can now start..." message you mention is generated by initdb
, which shouldn't be run as part of setting up a slave; so I think you may be confused about something there. I'm also concerned about these apparently conflicting statements:
The only ways I have restarted Postgres is through the service
postgresql-9.1 restart or /etc/init.d/postgresql-9.1 restart commands.
After I receive this error, I kill all processes and again try to
restart the database...
Did you try to stop the service through the service script? What happened? It might help in understanding the logs if you prefixed lines with more information. We use:
log_line_prefix = '[%m] %p %q<%u %d %r> '
The recovery.conf
script looks odd. Are you copying from the master's pg_xlog directory, the slave's active pg_xlog directory, or an archive directory?
The very first thing to do is to make a copy of whatever data files you still have, and to keep it and any backups safe until long after your recovery effort is complete. Please read this (short) Wiki page:
http://wiki.postgresql.org/wiki/Corruption
Once you have done that, you can attempt various recovery strategies without fear that you will be worse off for the attempt, beyond the time required to try it. In general I recommend carefully following one of the techniques described in the documentation -- attempts to cut corners or to be creative often lead to corruption. Only a seasoned expert with a good understanding of PostgreSQL internals should attempt to deviate from the documented steps.
You didn't describe your backup strategy; details of what is available there may suggest alternatives you would not otherwise have.
Ultimately, if you have data of value which is not backed up, you may need to hand-edit the system tables to eliminate references to lost tablespace. This is not for the faint of heart. There are a number of companies with which you can contract for such services, many of whom have experience with recovery from catastrophic hardware failure like this.
http://www.postgresql.org/support/professional_support/
I am not affiliated with any of these companies.
Best Answer
User, which runs postgresql, often
postgres
, must be able to traverse to this/path/to/log/filesystem/
(often/var/log/postgresql
), specified bylog_directory
config option and be able to write files there. If this path does not exist, you will getNo such file or directory
error.You can try switching to this user (e.g.
su postgres
) and checkcd /path/to/log/filesystem
(whatever this path really is) and if that throws error, then create the directories (e.g.mkdir -p /path/to/your/log_directory
).There's also a possibility that service manager changes mount points for the service, like
systemd
is capable of.