PostgreSQL corrupted after running pg_resetxlog

barman postgresql postgresql-9.3

We are using PostgreSQL 9.3 on Ubuntu 14.04. This PostgreSQL server is shared among all our application servers (Odoo), so we run it in a separate environment.

On Saturday we found that the disk on this DB server was full. On further investigation, we found that the backup server (Barman) had gone down, so all the archive logs (WAL files) stayed on the database server and filled the entire disk. The backup server may have stopped working as long as a month ago.
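A backlog like this can usually be confirmed before touching anything in pg_xlog. The following is a minimal sketch, assuming the data directory path shown in this question; the function name count_pending_wal and the STATUS_DIR variable are made up for illustration. It counts the .ready marker files that PostgreSQL keeps in pg_xlog/archive_status for WAL segments that archive_command has not yet shipped successfully.

```shell
#!/bin/sh
# Count WAL segments still waiting to be archived.
# PostgreSQL creates "<segment>.ready" in pg_xlog/archive_status for each
# segment that must be archived and renames it to ".done" after
# archive_command succeeds.
count_pending_wal() {
    status_dir="$1"
    # find prints one line per marker file; wc -l counts the lines
    find "$status_dir" -maxdepth 1 -type f -name '*.ready' | wc -l | tr -d ' '
}

# Assumption: path taken from this question; adjust for other installations.
STATUS_DIR="${STATUS_DIR:-/var/lib/postgresql/9.3/main/pg_xlog/archive_status}"

if [ -d "$STATUS_DIR" ]; then
    echo "WAL segments waiting for archiving: $(count_pending_wal "$STATUS_DIR")"
    # Show how full the filesystem holding pg_xlog is
    df -h "$STATUS_DIR"
fi
```

A large and growing count suggests that archiving is failing, in which case the segments should be left in place until the archive destination works again.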

By googling we found a suggested solution: resetting the pg_xlog files. So we cleared the log files using the pg_resetxlog command. As the forum post said, the disk space was freed, and we rebooted the server. But no databases were found :-(. We listed them using the psql --list command. Nothing has worked so far. We are not able to restore the backup from Barman either. We then continued our investigation and found that all the data still appears to be in place under the base folder of the main Postgres data directory.

The steps we executed to reset the log were as follows.

  1. Stop the database server

    sudo service postgresql stop

  2. Log in as the postgres user

    sudo su - postgres

  3. Run the reset command.

    /usr/lib/postgresql/9.3/bin/pg_resetxlog -f /var/lib/postgresql/9.3/main/

  4. Disable the Barman configuration in the postgresql.conf file to stop the backup process for a while.

  5. Reboot the server

Directory sizes under /var/lib/postgresql/9.3/main/:

postgres@server2:~$ du -h 9.3/main/
12K     9.3/main/pg_notify
28M     9.3/main/base/2735749
73M     9.3/main/base/4172290
46M     9.3/main/base/4410494
81M     9.3/main/base/3002089
43M     9.3/main/base/4282962
47M     9.3/main/base/3377227
130M    9.3/main/base/4098067
44M     9.3/main/base/1682791
58M     9.3/main/base/3377231
4.0K    9.3/main/base/pgsql_tmp
6.1M    9.3/main/base/12030
41M     9.3/main/base/4280118
54M     9.3/main/base/3149391
45M     9.3/main/base/4202614
49M     9.3/main/base/3344071
45M     9.3/main/base/2985056
51M     9.3/main/base/2120822
18G     9.3/main/base/3655712
25M     9.3/main/base/2759574
40M     9.3/main/base/4388978
52M     9.3/main/base/2435773
53M     9.3/main/base/4236740
55M     9.3/main/base/3386464
6.2M    9.3/main/base/12035
201M    9.3/main/base/4112218
54M     9.3/main/base/1625789
635M    9.3/main/base/149656
40M     9.3/main/base/4190162
25M     9.3/main/base/4090019
150M    9.3/main/base/4338686
6.2M    9.3/main/base/1
86M     9.3/main/base/2101485
185M    9.3/main/base/3453985
48M     9.3/main/base/4244883
41M     9.3/main/base/4160039
47M     9.3/main/base/3377180
38M     9.3/main/base/4150310
8.9G    9.3/main/base/2926431
47M     9.3/main/base/1693701
28M     9.3/main/base/4153341
25M     9.3/main/base/2744130
74M     9.3/main/base/2023404
29M     9.3/main/base/3231291
28M     9.3/main/base/2749185
43M     9.3/main/base/4371923
47M     9.3/main/base/3410953
47M     9.3/main/base/4313961
50M     9.3/main/base/4399246
49M     9.3/main/base/3402258
84M     9.3/main/base/3379836
64M     9.3/main/base/2777796
30G     9.3/main/base
5.8M    9.3/main/global
88K     9.3/main/pg_multixact/offsets
256K    9.3/main/pg_multixact/members
348K    9.3/main/pg_multixact
4.0K    9.3/main/pg_xlog/archive_status
33M     9.3/main/pg_xlog
100K    9.3/main/pg_stat_tmp
4.0K    9.3/main/pg_serial
4.0M    9.3/main/pg_clog
4.0K    9.3/main/pg_stat
52K     9.3/main/pg_subtrans
4.0K    9.3/main/pg_tblspc
4.0K    9.3/main/pg_twophase
4.0K    9.3/main/pg_snapshots
30G     9.3/main/

Best Answer

I guess it is too late now. If you use pg_resetxlog, you need to be extremely careful and, most importantly, know what you are doing.

Thanks to PostgreSQL's robustness, all you had to do in that case was free space on the Barman server, for example by deleting the oldest backup in the catalogue. Once space was reclaimed, PostgreSQL would have resumed shipping WAL files and recovered automatically.
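As a rough sketch of what that could look like on the Barman host, run as the barman user: the server name main-db is hypothetical (use the name from your own Barman configuration), and the run and free_barman_space helpers and the DRY_RUN switch are my own additions so that nothing gets deleted by accident. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Assumption: "main-db" is a placeholder Barman server name.
SERVER="${SERVER:-main-db}"
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print or execute a command depending on DRY_RUN
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

free_barman_space() {
    run df -h                        # how full is the backup disk?
    run barman list-backup "$1"      # list the backups in the catalogue
    run barman delete "$1" oldest    # remove the oldest backup to reclaim space
    run barman check "$1"            # verify that Barman is healthy again
}

free_barman_space "$SERVER"
```

Once archive_command succeeds again, PostgreSQL ships the waiting segments on its own and removes or recycles them from pg_xlog at later checkpoints, so nothing in pg_xlog needs to be deleted by hand.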

I know it is too late for you, but I am hoping that my reply will be able to help somebody in the future and prevent them from running pg_resetxlog.