Mysql – Best approach on storing/displaying sanitized data

database-designinsertMySQLpostgresql

Surfing the web I find two approaches on storing data in databases.

Approach 1:
Store data in the database with the html-specialcharacters sanitized

$userdata = $_POST['field'];
store_in_db_via_PDO(htmlspecialchars($userdata), 'field');
display_on_page(get_from_db('field'));

Approach 2:
Store data as raw as possible in database and sanitize on display:

$userdata = $_POST['field'];
$store_in_db_via_PDO($userdata, 'field');
display_on_page(htmlspecialchars('field'));

BTW I'm not speaking about sqlinjection atacks, these are 'caught` by the PDO-driver

Is there an industry standard on storing data, which shouldn't include html-code, in a database?

Best Answer

I would generally store data in its purest form.

You should verify the data as you read it before pushing it out to the client anyway, in case a bug elsewhere has allowed data that is badly formed (either accidentally or maliciously) to be added to the database, so escaping the data on the way in does not save you any significant processing on the way out.

Storing the data pure also saves you a lot of hassle if you find a bug in the encoding routine or change your mind how you want things encoded (not likely an issue here as you are working to a pre-defined standard with HTML text, but good habits are easier to keep to if you use them religiously even when they make no difference). If you change you encoding standard then data that is stored pure doesn't need to be touched, but existing encoded data needs to be read, decoded, re-encoded using the new rules, and saved again.