PostgreSQL Update Query – Issues with Plus Symbol Output

postgresqlupdate

Simple but frustrating issue… I'm trying to do an update command to correct a misspelling with this field:

grouper=> select * from master_list WHERE group ~ 'JustAdopted';
       group       | activity | usefulness 
-------------------+----------+------------
 JustAdoptedAdvice+|          |           
                   |          | 
(1 row)

But postgres won't give me a straight answer on the field name…

grouper=> UPDATE master_list SET group = 'JustAdoptingAdvice' WHERE group = 'JustAdoptedAdvice';
UPDATE 0

grouper=> UPDATE master_list SET group = 'JustAdoptingAdvice' WHERE group = 'JustAdoptedAdvice+';
UPDATE 0

grouper=> select * from master_list WHERE group ~ '+';
ERROR:  invalid regular expression: quantifier operand invalid

What is going on with that field? Why is that plus symbol there and what does it represent?

And how do I reference it to change it, if the field won't update with or without the symbol being there?

Best Answer

This:

UPDATE master_list
SET group = 'JustAdoptingAdvice' 
WHERE group = 'JustAdoptedAdvice+';

should work perfectly fine if the field was really JustAdoptedAdvice+, but I don't think it is. psql is showing that + to mean continuation ... I suspect there's actually a newline there.

UPDATE master_list
SET group = 'JustAdoptingAdvice' 
WHERE group = E'JustAdoptedAdvice\n';

There could be whitespace after the newline too, so maybe even:

UPDATE master_list
SET group = 'JustAdoptingAdvice' 
WHERE group ~ 'JustAdoptedAdvice[[:blank:]\n]+';

which says "JustAdoptedAdvice followed by one or more of tab/space/newline".

Demo showing it's likely to be a newline:

regress=> CREATE TABLE t4(x text);
CREATE TABLE
regress=> INSERT INTO t4(x) VALUES (E'JustAdoptedAdvice\n');
INSERT 0 1
regress=> SELECT * FROM t4;
         x         
-------------------
 JustAdoptedAdvice+

(1 row)

To be sure, look at the hex representation of the utf-8 encoding of the string:

regress=> SELECT encode( convert_to(x, 'utf-8'), 'hex') FROM t4;
                encode                
--------------------------------------
 4a75737441646f707465644164766963650a
(1 row)

You'll see that it ends with 0x0a, which is a newline.