While using `select datname, xact_commit from pg_stat_database;` as suggested by @Jack Douglas doesn't quite work (apparently due to autovacuum), `select datname, tup_inserted, tup_updated, tup_deleted from pg_stat_database` does appear to work. Both DML and DDL changes alter the values of the `tup_*` columns, while a plain `vacuum` does not (`vacuum analyze`, on the other hand, does).
On the off chance that this may be useful to others, I'm including the backup script that I've put in place. This works for Pg 8.4.x but not for 8.2.x -- YMMV depending on the version of Pg used.
#!/usr/bin/env perl
=head1 Synopsis

 pg_backup -- selectively backup a postgresql database cluster

=head1 Description

Perform backups (pg_dump*) of postgresql databases in a cluster on an
as-needed basis.

For some database clusters, there may be databases that:

 a. are rarely updated/changed and therefore shouldn't require dumping
    as often as those databases that are frequently changed/updated.

 b. are large enough that dumping them without need is undesirable.

The global data is always dumped without regard to whether any
individual databases need backing up or not.
=head1 Usage

 pg_backup [OPTION]...

General options:

 -F, --format=c|t|p   output file format for data dumps
                      (custom, tar, plain text) (default is custom)
 -a, --all            backup (pg_dump) all databases in the cluster
                      (default is to only pg_dump databases that have
                      changed since the last backup)
 --backup-dir         directory to place backup files in
                      (default is ./backups)
 -v, --verbose        verbose mode
 --help               show this help, then exit

Connection options:

 -h, --host=HOSTNAME  database server host or socket directory
 -p, --port=PORT      database server port number
 -U, --username=NAME  connect as specified database user
 -d, --database=NAME  connect to database name for global data
=head1 Notes

This utility has been developed against PostgreSQL version 8.4.x. Older
versions of PostgreSQL may not work.

`vacuum` does not appear to trigger a backup unless there is actually
something to vacuum whereas `vacuum analyze` appears to always trigger a
backup.

=head1 Copyright and License

Copyright (C) 2011 by Gregory Siems

This library is free software; you can redistribute it and/or modify it
under the same terms as PostgreSQL itself, either PostgreSQL version
8.4 or, at your option, any later version of PostgreSQL you may have
available.

=cut
use strict;
use warnings;
use Getopt::Long;
use Data::Dumper;
use POSIX qw(strftime);
my %opts = get_options();

my $connect_options = '';
$connect_options .= "--$_=$opts{$_} " for (qw(username host port));

my $shared_dump_args = ($opts{verbose})
    ? $connect_options . ' --verbose '
    : $connect_options;

my $backup_prefix = (exists $opts{host} && $opts{host} ne 'localhost')
    ? $opts{backup_dir} . '/' . $opts{host} . '-'
    : $opts{backup_dir} . '/';
do_main();
########################################################################
sub do_main {
    backup_globals();

    my $last_stats_file = $backup_prefix . 'last_stats';

    # get the previous pg_stat_database data
    my %last_stats;
    if ( -f $last_stats_file ) {
        %last_stats = parse_stats( split "\n", slurp_file($last_stats_file) );
    }

    # get the current pg_stat_database data
    my $cmd = 'psql ' . $connect_options;
    $cmd .= " $opts{database} " if ( exists $opts{database} );
    $cmd .= "-Atc \"
        select date_trunc('minute', now()), datid, datname,
                xact_commit, tup_inserted, tup_updated, tup_deleted
            from pg_stat_database
            where datname not in ('template0','template1','postgres'); \"";
    $cmd =~ s/\n\s+/ /g;

    my @stats      = `$cmd`;
    my %curr_stats = parse_stats(@stats);

    # do a backup if needed
    foreach my $datname ( sort keys %curr_stats ) {
        my $needs_backup = 0;
        if ( $opts{all} ) {
            $needs_backup = 1;
        }
        elsif ( !exists $last_stats{$datname} ) {
            $needs_backup = 1;
            warn "no last stats for $datname\n" if ( $opts{debug} );
        }
        else {
            for (qw(tup_inserted tup_updated tup_deleted)) {
                if ( $last_stats{$datname}{$_} != $curr_stats{$datname}{$_} ) {
                    $needs_backup = 1;
                    warn "$_ stats do not match for $datname\n" if ( $opts{debug} );
                }
            }
        }
        if ($needs_backup) {
            backup_db($datname);
        }
        else {
            chitchat("Database \"$datname\" does not currently require backing up.");
        }
    }

    # update the pg_stat_database data
    open my $fh, '>', $last_stats_file
        or die "Could not open $last_stats_file for output. $!\n";
    print $fh @stats;
    close $fh;
}
sub parse_stats {
    my @in = @_;
    my %stats;
    chomp @in;
    foreach my $line (@in) {
        my @ary = split /\|/, $line;
        my $datname = $ary[2];
        next unless ($datname);
        foreach my $key (qw(tmsp datid datname xact_commit tup_inserted tup_updated tup_deleted)) {
            my $val = shift @ary;
            $stats{$datname}{$key} = $val;
        }
    }
    return %stats;
}
sub backup_globals {
    chitchat("Backing up the global data.");
    my $backup_file = $backup_prefix . 'globals-only.backup.gz';
    my $cmd = 'pg_dumpall --globals-only ' . $shared_dump_args;
    $cmd .= " --database=$opts{database} " if ( exists $opts{database} );
    do_dump( $backup_file, "$cmd | gzip" );
}
sub backup_db {
    my $database = shift;
    chitchat("Backing up database \"$database\".");
    my $backup_file = $backup_prefix . $database . '-schema-only.backup.gz';
    do_dump( $backup_file, "pg_dump --schema-only --create --format=plain $shared_dump_args $database | gzip" );
    $backup_file = $backup_prefix . $database . '.backup';
    do_dump( $backup_file, "pg_dump --format=" . $opts{format} . " $shared_dump_args $database" );
}
sub do_dump {
    my ( $backup_file, $cmd ) = @_;
    my $temp_file = $backup_file . '.new';
    warn "Command is: $cmd > $temp_file" if ( $opts{debug} );
    chitchat(`$cmd > $temp_file`);
    if ( -f $temp_file ) {
        chitchat(`mv $temp_file $backup_file`);
    }
}
sub chitchat {
    my @ary = @_;
    return unless (@ary);
    chomp @ary;
    my $first = shift @ary;
    my $now = strftime "%Y%m%d-%H:%M:%S", localtime;
    print +( join "\n ", "$now $first", @ary ), "\n";
}
sub get_options {
    Getopt::Long::Configure('bundling');

    my %opts = ();

    GetOptions(
        "a"            => \$opts{all},
        "all"          => \$opts{all},
        "p=s"          => \$opts{port},
        "port=s"       => \$opts{port},
        "U=s"          => \$opts{username},
        "username=s"   => \$opts{username},
        "h=s"          => \$opts{host},
        "host=s"       => \$opts{host},
        "F=s"          => \$opts{format},
        "format=s"     => \$opts{format},
        "d=s"          => \$opts{database},
        "database=s"   => \$opts{database},
        "backup-dir=s" => \$opts{backup_dir},
        "help"         => \$opts{help},
        "v"            => \$opts{verbose},
        "verbose"      => \$opts{verbose},
        "debug"        => \$opts{debug},
    );

    # Does the user need help?
    if ( $opts{help} ) {
        show_help();
    }

    $opts{host}       ||= $ENV{PGHOSTADDR} || $ENV{PGHOST} || 'localhost';
    $opts{port}       ||= $ENV{PGPORT} || '5432';
    $opts{username}   ||= $ENV{PGUSER} || $ENV{USER} || 'postgres';
    $opts{database}   ||= $ENV{PGDATABASE} || $opts{username};
    $opts{backup_dir} ||= './backups';

    my %formats = (
        c      => 'custom',
        custom => 'custom',
        t      => 'tar',
        tar    => 'tar',
        p      => 'plain',
        plain  => 'plain',
    );

    $opts{format} = ( defined $opts{format} )
        ? $formats{ $opts{format} } || 'custom'
        : 'custom';

    warn Dumper \%opts if ( $opts{debug} );

    return %opts;
}
sub show_help {
    print `perldoc -F $0`;
    exit;
}

sub slurp_file { local ( *ARGV, $/ ); @ARGV = shift; <> }
__END__
Update: the script has been put on GitHub here.
The best option IMO is to use a cron job to generate the logs for `'yesterday'::date`. You could also use triggers before insert/update/delete to keep the other table updated, even for the current day, but this adds complexity and overhead and gets pretty complicated. Generate your historical logs once the data won't change anymore.
In this case you write an SQL query and run it via psql and cron.
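As a sketch of that approach (the `orders` and `order_history` tables and their columns are made up for illustration), the nightly job could run a query like this:

```sql
-- Roll up yesterday's orders into the history table.
-- Run once per night, after the day has closed.
INSERT INTO order_history (order_date, order_count, order_total)
SELECT order_date, count(*), sum(amount)
  FROM orders
 WHERE order_date = 'yesterday'::date
 GROUP BY order_date;
```

Scheduled from cron, e.g. `15 2 * * * psql -d mydb -f roll_history.sql`, this writes each day's summary exactly once, after the data for that day has stopped changing.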
I would also add a trigger denying update or delete to records covered in your historical data if you can.
This gives you a few benefits:

- It is more obvious when it breaks
- It is simpler, with simpler failure cases
Now, as per your concerns:
You say you need rows for every day. This can be handled in a number of relatively easy ways in PostgreSQL (remember that dates support integer math, so you can take a base date and add a series to it to generate a date series). This is a pretty easy problem to get around if you are generating rows per day, per day of week, etc.
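Using the integer-math trick mentioned above, a row for every day of (say) January can be generated like this:

```sql
-- One row per day: base date plus an integer series.
SELECT '2011-01-01'::date + s AS day
  FROM generate_series(0, 30) AS s;
```

Joining such a series against the orders table (with a left join) yields a row for every day even when no orders exist for some days.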
You say you can't guarantee things won't change. The key question here is what your change window is; do your historical reports after this window has closed. For example, if the window is a month, you can generate reports for all dates in a given month one month after it ends (i.e. generate all dates in January during early March). You can then rely on a view to handle newer rows vs older rows on a live basis, and have a trigger which ensures that the date of an inserted row in the orders table is newer than the newest date in the other table.
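That insert-guard trigger might look something like the following (again, the `orders` and `order_history` names are hypothetical):

```sql
CREATE FUNCTION orders_check_closed() RETURNS trigger AS $$
BEGIN
    -- Reject rows dated in the already-summarized (closed) period.
    IF NEW.order_date <= (SELECT max(order_date) FROM order_history) THEN
        RAISE EXCEPTION 'order_date % falls in the closed period',
            NEW.order_date;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_check_closed
    BEFORE INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE PROCEDURE orders_check_closed();
```

With this in place, the summary table can never silently drift out of sync with the live table for closed dates.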
In my experience worrying about keeping this as a live summary usually isn't necessary. Small organizations (with small data sets) tend to close out books at least once a year, and live reporting is an option there. Larger organizations with larger data sets tend to close out receivables and payables (i.e. invoices) once a month or so, and so the only areas that have to be reported live (because they are subject to adjustment or revision) are open orders (which can be revised) and invoices which may need to be reviewed occasionally (and should never be revised but may have adjustments issued against them which might or might not need to be tracked in such a system).
Best Answer
Yes, it looks like `finance.fund_part_status` is an enum type. Assuming you're using the `psql` command-line interpreter, try this command:
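In recent versions of psql, the `\dT+` meta-command describes a type and, for an enum, lists its allowed values in the Elements column:

```
\dT+ finance.fund_part_status
```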