Suggestions for SQL query repair tool or tips on building one

oracle

Not sure if this is ontopic so I figure I would ask here and if its not liked then we can close it.

Our hosting provider for a legacy application that I inherited is no longer supporting Oracle 10 and are requiring us to move to Oracle 11g. I didn't think this would be a problem until nearly half of the SQL queries started failing. Upon research I found that previous versions of Oracle had a bug that didn't throw an exception at malformed ANSI SQL queries, and that bug was fixed in 11g.

To make a long story short, the brilliant minds that came before me wrote some awful queries and only basic integration tests, and I now have a few weeks to fix about 1000 malformed queries or we are dead in the water.

I will literally set myself on fire if I have to manually fix all of these, and was wondering if you guys knew of some tools or techniques for hunting down these offending queries to make this slightly less painful. Perhaps a special code profiler, or even a brilliant regex that can help me locate column references without an alias?

Best Answer

Argh! I wanted to open a chat panel to ask you some more questions, but actually inviting doesn't work unless you are already chatting. Oh well.

You have indicated that these are java strings, which means that there are two problems.

  1. Parsing the SQL and making decisions (mark ambiguous items for user)
  2. Finding the SQL inside the java code

Hopefully the SQL should be fairly recognizable. It exists in a string, and sql follows a fairly ridged format. Also, your non-aliased items will follow a simple format too. They will be in columnname AS aliasname. Skip those, and only add aliases for columnnames not in that format. If you load in the db schema, you even get all valid columnnames to look for. Finally, if there is an ambiguous item, then write some context to stdout, and ask the user for input.

I would suggest using python and the PLY library. You get a full blown lex/yacc combo, with an easier language than C. Open a new question on SO if you want more help/or to scream at me. ;)