Mysql – pt-table-checksum says there is a difference but pt-table-sync with –print says nothing

MySQLpt-table-checksum

I am using pt-table-checksum to ensure master-slave is synchronized.

When I run pt-table-checksum, I see some differences on employee_profile.trigger_log but when I use pt-table-sync with –print, I see nothing.

I resolved some differences on some tables with pt-table-sync but when I re-test with pt-table-checksum, this tool still says there is a difference.

So what is the problem with pt-table-checksum ? Is is accuracy ?

[UPDATE]

I use

pt-table-checksum -h <masterip> -u <the_user> -p '<passwd of the user>' --recursion-method dsn=h=<slaveip>,P=3306,u=<the_userr>,p=<passwd of the user>,D=check_db,t=dsns --nocheck-binlog-format --set-vars innodb_lock_wait_timeout=50 > ~/stats_check_db_<slaveip>.txt

After that, I run command to filter:

cat ~/stats_check_db_<slaveip>.txt | awk '{print "*"$3"*" " -- " $8}' | grep -Fv  "*0*"

This is output:

*DIFFS* -- TABLE
*1* -- mysql.user
*1* -- employee_profile.entry
*1* -- employee_profile.ip
*1* -- employee_profile.thread

When I run pt-table-sync:

pt-table-sync  --print h=<slaveip>,P=3306,u=<the user>,p=<passwd of the user> --sync-to-master --databases employee_profile --tables entry

I see nothing. After a while, when I re-check with pt-table-checksum, I don't see difference on employee_profile.entry

For a resolved table, I still see some differences on next re-check. These are disappeared from result of pt-table-checksum on the third checksum.

[UPDATE 2]

I am using percona toolkit version 2.2.13 – lastest version on http://www.percona.com/

I reviewed all implementations of db server that I did checksum. These are old implementation of mysql or mariadb that someone built them.

I see the problem just appear when the implementation between master and slave is different such as master is MariaDB 5.5.34 and slave is MariaDB 5.5.37 (I see false positive error: Cannot nibble table because MySQL chose no index instead of the PRIMARY index – just ignore it and then execute pt-table-sync) or master is mysql 5.5.35 and slave is mysql 5.5.37 (I see pt-table-checksum says there is a difference but pt-table-sync –print says nothing – just re-test for sure).

I think it is a reason for false positive errors.

Thank you so much !

Best Answer

There were false positives in old versions of pt-table-checksum. Make sure you don't run ancient version.

To check what the differences are you could also use this script():

#./compare_table.sh <db> <table> <master> <slave>



#!/bin/bash

# Usage:
# $0 database table master slave
set -eux
db=$1
tbl=$2
master=$3
slave=$4
tmpdir=`mktemp -d`

checksum_table="percona.checksums"

chunks=`mysql -h $slave -NBe "select chunk from $checksum_table where (this_crc<>master_crc or this_cnt<>master_cnt) AND db='$db' AND tbl='$tbl'"`

for c in $chunks
do
    echo "# $db.$tbl, chunk $c"
    chunk_index=`mysql -h $slave -NBe "SELECT chunk_index FROM $checksum_table WHERE db='$db' AND tbl='$tbl' AND chunk = '$c'"`
    index_fields=`mysql -h $slave -NBe "SELECT COLUMN_NAME FROM information_schema.STATISTICS WHERE TABLE_SCHEMA='$db' AND TABLE_NAME='$tbl' AND INDEX_NAME='$chunk_index' ORDER BY SEQ_IN_INDEX"`
    index_field_last=`mysql -h $slave -NBe "SELECT COLUMN_NAME FROM information_schema.STATISTICS WHERE TABLE_SCHEMA='$db' AND TABLE_NAME='$tbl' AND INDEX_NAME='$chunk_index' ORDER BY SEQ_IN_INDEX DESC LIMIT 1"`
# EXAMPLE:
#
#WHERE 
#    (
#    (`user_id` > ?) 
#    OR 
#    (`user_id` = ? AND `activity_id` > ?) 
#    OR 
#    (`user_id` = ? AND `activity_id` = ? AND `activity_type_id` >= ?)
#    ) 
#AND 
#  (
#    (`user_id` < ?) 
#    OR 
#    (`user_id` < ?) 
#    OR 
#    (`user_id` = ? AND `activity_id` < ?) 
#    OR 
#    (`user_id` = ? AND `activity_id` = ? AND `activity_type_id` <= ?)
#    )
    if [ "$chunk_index" != "NULL" ]
    then
        where="WHERE"
        v_num=1
        lower_boundary=`mysql -h $slave -NBe "SELECT lower_boundary FROM $checksum_table WHERE db='$db' AND tbl='$tbl' AND chunk = '$c'"`
        upper_boundary=`mysql -h $slave -NBe "SELECT upper_boundary FROM $checksum_table WHERE db='$db' AND tbl='$tbl' AND chunk = '$c'"`
        clause_fields=""
        where="$where (0 "
        for f in $index_fields
        do 
            clause_fields="$f $clause_fields"
            op=""
            where="$where OR ( 1"
            for cf in $clause_fields
            do
                if test -z "$op"
                then   
                    if [ $cf == $index_field_last ]
                    then   
                        op=">="
                    else   
                        op=">"
                    fi
                else   
                    op="="
                fi
                v=`echo $lower_boundary | awk -F, "{ print \\\$$v_num}"`
                v_num=$(( $v_num + 1))
                where="$where AND \`$cf\` $op '$v'"
            done
            where="$where )"
        done
        where="$where )"

        v_num=1
        clause_fields=""
        where="$where AND ( 0"
        for f in $index_fields
        do 
            clause_fields="$f $clause_fields"
            op=""
            where="$where OR ( 1"
            for cf in $clause_fields
            do
                if test -z "$op"
                then   
                    if [ $cf == $index_field_last ]
                    then   
                        op="<="
                    else   
                        op="<"
                    fi
                else   
                    op="="
                fi
                v=`echo $upper_boundary | awk -F, "{ print \\\$$v_num}"`
                v_num=$(( $v_num + 1))
                where="$where AND \`$cf\` $op '$v'"
            done
            where="$where )"
        done
        where="$where )"
    else
        where="WHERE  1 "
    fi
    echo $where
    mysql -h $master -NBe "SELECT * FROM $db.$tbl $where"  >> "$tmpdir/master"
    mysql -h $slave -NBe "SELECT * FROM $db.$tbl $where"  >> "$tmpdir/slave"
    set +e
    diff -u "$tmpdir/master" "$tmpdir/slave" | less -S
    set -e
done
rm -r "$tmpdir"