If you want something quick-and-dirty, you may want to consider doing this:
STEP 01) Use mysqldump
mysqldump --no-data --all-databases > MySQLSchema.sql
STEP 02) Parse the text
Every CREATE TABLE statement marks the start of the next table
Every USE dbname marks the name of the next database
Write each CREATE TABLE description to its own text file
STEP 03) Do a LOAD DATA INFILE of each of the parsed files into a table of your choice (such as mydb.mytables)
STEP 04) Query for the structures
SELECT table_desc FROM mydb.mytables WHERE dbname='...' AND tbname='...';
Your table would look something like this:
CREATE TABLE mytables
(
dbname VARCHAR(64) NOT NULL,
tbname VARCHAR(64) NOT NULL,
table_desc TEXT DEFAULT NULL,
PRIMARY KEY (dbname, tbname)
);
Your mission is to parse the mysqldump I just mentioned and get each table description loaded into this table, along with the database name and table name.
Once you construct such a parsing program, you could do this process every hour and update your dynamic documentation.
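Such a parsing program might look something like the following rough, untested sketch in Python. The file names, the regular expressions, and the escaping for LOAD DATA INFILE's defaults are my own assumptions about the dump layout, so adjust them to whatever your mysqldump version actually emits:

import re

infile = "MySQLSchema.sql"   # output of: mysqldump --no-data --all-databases
outfile = "mytables.tsv"     # one tab-delimited row per table, for LOAD DATA INFILE

def escape(text):
    # Escape backslash, tab and newline so LOAD DATA INFILE's default
    # FIELDS ESCAPED BY '\\' restores them inside the table_desc column.
    return text.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")

dbname = None
current = None
rows = []

with open(infile) as dump:
    for line in dump:
        m = re.match(r"USE `?(\w+)`?", line)
        if m:                                  # USE dbname marks the next database
            dbname = m.group(1)
            continue
        m = re.match(r"CREATE TABLE `?(\w+)`?", line)
        if m:                                  # CREATE TABLE marks the next table
            current = {"db": dbname or "", "tb": m.group(1), "desc": [line]}
            continue
        if current is not None:
            current["desc"].append(line)
            if line.rstrip().endswith(";"):    # end of this CREATE TABLE statement
                rows.append(current)
                current = None

with open(outfile, "w") as out:
    for r in rows:
        out.write("\t".join([r["db"], r["tb"], escape("".join(r["desc"]))]) + "\n")

The generated file matches LOAD DATA INFILE's default field and line terminators, so STEP 03 becomes something like LOAD DATA INFILE '/path/to/mytables.tsv' INTO TABLE mydb.mytables; and the whole thing can be scheduled from cron.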
Give it a Try !!!
UPDATE 2012-04-01 00:39 EDT
Here is a modified suggestion: Given the mysqldump suggestion I made, you should load the MySQL schema obtained from the mysqldump into another DB server that contains no data. Set it up as a replication slave using replicate-do-db=mysql. That way, no data will collect on the slave. You can use this slave as the source for mysqldumping the DB schema. This separates the data from the schema. You can have your documentation system fetch the schema from the slave.
Welcome to DBA Stack Exchange!
My problem is that I don't know the 'speed' of SQL when it deals with
stats and maths, compared to other languages. I know that basic
functions (correlation, R^2, ...) are already implemented in SQL, but
I am using far more 'advanced' (I mean 'complex' here..) functions
(even if I have not represented it here).
As a rule of thumb, for basic aggregations (Grouping, Joining, Summation) on large datasets (millions of rows), SQL will perform better. I'd recommend you leverage SQL's strengths here and use a hybrid approach: prep your data with basic aggregations as much as possible, but leave the higher-level mathematics and analysis logic in Python.
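To make the hybrid idea concrete, here is a small sketch. The table and column names (measurements, sensor_id, ts, value) are invented, and I'm using Python's built-in sqlite3 plus numpy only for brevity; substitute your own server's driver. The database does the grouping and summation, and the regression stays in Python:

import sqlite3
import numpy as np

conn = sqlite3.connect("analytics.db")  # placeholder database

# Step 1: basic aggregation stays in SQL, where it is cheap on millions of rows.
daily = conn.execute("""
    SELECT sensor_id,
           CAST(strftime('%J', ts) AS REAL) AS day,  -- day as a Julian day number
           AVG(value) AS avg_value
    FROM measurements
    GROUP BY sensor_id, day
""").fetchall()

# Step 2: the 'advanced' mathematics stays in Python.
by_sensor = {}
for sensor_id, day, avg_value in daily:
    by_sensor.setdefault(sensor_id, []).append((day, avg_value))

for sensor_id, points in by_sensor.items():
    x, y = np.array(points).T
    slope, intercept = np.polyfit(x, y, 1)  # simple linear regression per sensor
    print(sensor_id, slope, intercept)

The same split works with any server-side driver; the point is simply that the GROUP BY runs where the data lives, and only the already-reduced rows cross over into Python.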
Would you think I should translate all my mathematical analysis from
Python to SQL, and create Triggers launching it whenever a new value
is inserted?
No.
As stated before, you can do basic aggregate functions, but I expect you will find it tedious, if not outright impossible, to model the more 'complex' statistical functions from a feature-rich language such as Python in a constrained language such as SQL. Here is a decent article exploring this concept. Even if we are successful in translating Python functions into SQL, our success is going to be highly dependent on whether the database is optimized to deliver a set-based representation of all the data inputs needed for those calculations at the time of insert.
The thing is that between the update of the values, and the update of
the regressions, the database is not in a 'stable' state: new values
have been inserted, but the corresponding regressions don't exist
yet.
Let's be sure we aren't dogmatically promoting database ACIDity at the potential cost of performance, and the assured cost of increased code complexity. I DO understand the importance of ACIDity, but from your description this sounds like an analytic workflow to me: a 3 AM bulk load of data, which is subsequently analyzed outside of SQL and then re-inserted into the database.
To summarize: yes, SQL will perform better on basic aggregations across large datasets, but higher-level statistical analysis is best left to Python. I'm not sure it's worth the risk of translating Python functions into SQL triggers for the sake of ACIDity.
P.S. I know this is a late answer to your question, so let us know what you eventually decided to implement!
Best Answer
There is no other way to establish the identity of a database table in PostgreSQL than through the OID (system column tableoid). Creation time is not stored in PostgreSQL. Relying on the table OID is not only proprietary to PostgreSQL, but also somewhat unsafe, since OIDs can get reused. I recommend that you keep an inventory of your table modifications in a separate database table.
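If it helps, here is a minimal sketch of that kind of inventory, meant to be run periodically (for example from cron) so that first_seen becomes a usable stand-in for creation time. The psycopg2 driver, the connection string, and the table_inventory layout are all my own assumptions, not something PostgreSQL provides, and ON CONFLICT requires PostgreSQL 9.5 or later:

import psycopg2

conn = psycopg2.connect("dbname=mydb user=me")  # placeholder connection string

with conn, conn.cursor() as cur:
    # One row per (schema, table, oid) ever observed; the OID distinguishes a
    # dropped-and-recreated table from an earlier table with the same name.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS table_inventory (
            schema_name text        NOT NULL,
            table_name  text        NOT NULL,
            table_oid   oid         NOT NULL,
            first_seen  timestamptz NOT NULL DEFAULT now(),
            PRIMARY KEY (schema_name, table_name, table_oid)
        )""")
    # Record every ordinary user table currently in pg_class; re-running this
    # picks up newly created tables without touching rows already recorded.
    cur.execute("""
        INSERT INTO table_inventory (schema_name, table_name, table_oid)
        SELECT n.nspname, c.relname, c.oid
        FROM pg_class c
        JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relkind = 'r'
          AND n.nspname NOT IN ('pg_catalog', 'information_schema')
        ON CONFLICT DO NOTHING""")

Because the OID is part of the key, a table that is dropped and recreated shows up as a new row, which is exactly the kind of modification history the built-in catalogs will not give you.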