The message "The database system is starting up." does not indicate an error. The reason it is at the FATAL level is so that it will always make it to the log, regardless of the setting of log_min_messages:
http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHEN
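Since the message is expected while the server is still starting, a client can simply retry until a connection succeeds. A minimal sketch of such a retry loop follows; wait_until_ready is a hypothetical helper, and the host and user in the usage comment are placeholders:

```shell
#!/bin/sh
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_until_ready ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until_ready() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Any command can serve as the probe; its output is discarded.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (hypothetical host and user):
# wait_until_ready 30 2 psql -h db.example.com -U postgres -c 'SELECT 1' \
#     && echo "server is accepting connections"
```

If the loop runs out of attempts, the server log is the place to look for why startup never finished.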
After the rsync, did you really run what you show?:
pgsql -c "select pg_stop_backup();";
Since there is, so far as I know, no pgsql executable, that would leave the backup uncompleted, and the slave would never come out of recovery mode. On the other hand, maybe you really did run psql, because otherwise I don't see how the slave would have logged such success messages as:
Log: consistent recovery state reached at 0/BF0000B0
and:
Log: streaming replication successfully connected to primary
Did you try connecting to the slave at this point? What happened?
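For comparison, here is a rough sketch of the backup step with psql spelled correctly, plus a quick check to run against the slave. The host names, data directory, and postgres user are assumptions, not taken from your setup. The check also assumes hot_standby = on on the slave; so far as I know, a 9.1 standby without it refuses connections with the same "starting up" message.

```shell
#!/bin/sh
# Sketch only: host names, the data directory, and the postgres user are assumptions.
MASTER_DATA=${MASTER_DATA:-/var/lib/pgsql/9.1/data}
SLAVE_HOST=${SLAVE_HOST:-slave.example.com}

# Run on the master: start backup mode, copy the data directory, end backup mode.
take_base_backup() {
    psql -U postgres -c "SELECT pg_start_backup('base_backup', true);" || return 1
    rsync_status=0
    rsync -a --exclude postmaster.pid "$MASTER_DATA/" "$SLAVE_HOST:$MASTER_DATA/" \
        || rsync_status=$?
    # Note psql, not pgsql. End backup mode even if rsync failed.
    psql -U postgres -c "SELECT pg_stop_backup();" || return 1
    return "$rsync_status"
}

# Prints "t" while the given server is in recovery, "f" otherwise.
is_in_recovery() {
    psql -h "$1" -U postgres -At -c "SELECT pg_is_in_recovery();"
}

# Example:
# take_base_backup && is_in_recovery "$SLAVE_HOST"
```

A streaming slave that is working normally should keep answering "t"; the useful information is whether the connection is accepted at all.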
The "Success. You can now start..." message you mention is generated by initdb, which shouldn't be run as part of setting up a slave; so I think you may be confused about something there. I'm also concerned about these apparently conflicting statements:
The only ways I have restarted Postgres is through the service
postgresql-9.1 restart or /etc/init.d/postgresql-9.1 restart commands.
After I receive this error, I kill all processes and again try to
restart the database...
Did you try to stop the service through the service script? What happened? It might help in understanding the logs if you prefixed lines with more information. We use:
log_line_prefix = '[%m] %p %q<%u %d %r> '
The recovery.conf script looks odd. Are you copying from the master's pg_xlog directory, the slave's active pg_xlog directory, or an archive directory?
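For reference, a conventional 9.1 streaming slave restores from an archive directory rather than from either pg_xlog. The sketch below writes such a recovery.conf; write_recovery_conf is a hypothetical helper, and the host name, replication user, and archive path are placeholders rather than anything taken from your file:

```shell
#!/bin/sh
# Write a minimal recovery.conf for a 9.1 streaming-replication slave.
# Usage: write_recovery_conf DATA_DIR MASTER_HOST ARCHIVE_DIR
write_recovery_conf() {
    data_dir=$1
    master_host=$2
    archive_dir=$3
    # %f and %p are filled in by PostgreSQL: the WAL file name and the
    # path to copy it to. The "replicator" user is an assumption.
    cat > "$data_dir/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$master_host port=5432 user=replicator'
restore_command = 'cp $archive_dir/%f %p'
EOF
}

# Example (hypothetical paths and host):
# write_recovery_conf /var/lib/pgsql/9.1/data master.example.com /mnt/wal_archive
```

With this layout the restore_command reads only from the archive, so it never touches the pg_xlog directory the slave is actively writing to.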
In general if you are seeing problems of this sort, it may be best to take them up on the pgsql-bugs list. People there can help figure out what information to gather to help determine what the scope of this misbehavior is and help fix it for you.
Also, a WAL restore from 8.4.11 to 8.4.12 should work just fine.
If this is only occasionally happening, I don't think your explanations get there. It sounds like something that really could use additional troubleshooting by people who can determine if a code fix is required.
Best Answer
The postmaster symlink appears to be historical. I didn't even know it existed until you pointed it out.
The other two are a typical pattern for UNIX daemon applications. postgres provides the server functionality, and pg_ctl provides a control client for it.
It would be possible to bundle pg_ctl's functionality into postgres, but doing so would mean postgres would have to run in two modes: as a DB server, or as a client to connect to a running DB server. That's a little bit unpleasant architecturally, and would add what's essentially "dead code" (unreachable and unusable) to running PostgreSQL server instances.
Similarly, it'd be possible to remove the ability to launch a PostgreSQL server from the postgres executable and require you to do it via pg_ctl, but that'd be cumbersome, and would really annoy people who manage distribution init scripts. They usually want to bypass application startup programs and launch the daemon directly, so they can prevent it from detaching to run in the background, trap exit codes, capture stderr, etc.
Essentially:
postgres is the low level server program.
pg_ctl is a user tool for starting/stopping the server and other management actions.
postmaster is an obsolete alias for postgres.
In general you use distribution service management (service, systemctl, update-rc.d, etc) if you installed PostgreSQL from packages, or pg_ctl if you installed from source or 3rd party binary installers that don't integrate with the operating system services system.
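To make the split concrete, here is a rough sketch of the typical invocations. The data directory is an assumption and the function names are made up for illustration; only the pg_ctl and postgres options themselves are real.

```shell
#!/bin/sh
# Sketch only: the data directory below is an assumption.
PGDATA=${PGDATA:-/usr/local/pgsql/data}

# Source or third-party install: pg_ctl launches postgres in the background,
# waits for startup to finish (-w), and sends server output to a log file (-l).
start_with_pg_ctl() {
    pg_ctl -D "$PGDATA" -w -l "$PGDATA/server.log" start
}

# pg_ctl also handles shutdown; "fast" mode disconnects clients and shuts down cleanly.
stop_with_pg_ctl() {
    pg_ctl -D "$PGDATA" -m fast stop
}

# What an init script might do instead: run the server program directly, in
# the foreground, so the caller sees its exit code and stderr.
run_in_foreground() {
    postgres -D "$PGDATA"
}

# Package install: use the distribution's service management instead, e.g.
#   service postgresql-9.1 restart
```

Either way the same postgres binary ends up running; the difference is only in what starts it and supervises it.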