PostgreSQL – SELECT INTO with regexp_replace() doesn’t write changes into newly generated table

Tags: ddl, postgresql, postgresql-9.1, regular-expression

I have been using this query in Postgres 9.1 in order to remove '' from existing data.
Below is a sample of the data before running regexp_replace():

6''
6''
6''
20''
12''
18''
20''
8''
10''
''
''

Upon running:

select REGEXP_REPLACE(regexp_replace, $$^'*$$, '' , 'g')
from temp_4 order by id;

I receive this clean output:

6
6
6
20
12
18
20
8
10
( ) <- stand-in for " "
( )

However, when I then attempt to write these results into a table, say using:

select * into table_3 from (select REGEXP_REPLACE(tbl, $$^"''*,*$$, '' , 'g') from temp_2 order by id) as temp_3;

I receive a table where '' has been removed from all values, except where it was the only value present. I have attempted to whitelist all other values using [\w\s*] instead of blacklisting with the regex ^'*, but in both instances values of '' remain. I feel like I'm taking crazy pills.

How do I write my table where I can replace '' with 0 or null values?

Also, is my syntax terribly wrong in my attempts to write this data to other tables? Is there a better way to do this?

Best Answer

Your regexp_replace() statement is invalid. It would work like this:

SELECT regexp_replace(tbl, '('''')$', '' , 'g')
FROM (
 VALUES
  ('6''''')
 ,('6''''')
 ,('6''''')
 ,('20''''')
 ,('12''''')
 ,('18''''')
 ,('20''''')
 ,('8''''')
 ,('10''''')
 ,('''''')
 ,('''''')
) tbl(tbl)
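
To see why the anchor matters, the following sketch compares a start-anchored pattern like the one in the question with an end-anchored one. It is only an illustration: the two literal values are made up to mirror the sample data, and nothing beyond regexp_replace() is assumed.

```sql
-- Hypothetical inputs: 6'' and '' written as SQL string constants.
SELECT v                                   AS input
     , regexp_replace(v, '^''*', '', 'g')  AS start_anchored  -- pattern ^'* : strips leading quotes only
     , regexp_replace(v, '''+$', '')       AS end_anchored    -- pattern '+$ : strips trailing quotes
FROM  (VALUES ('6'''''), ('''''')) AS t(v);
```

For 6'' the start-anchored pattern matches only an empty string at the beginning, so the value comes back unchanged, while the end-anchored pattern returns 6. For the value that is nothing but '', both patterns return an empty string.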

Your SELECT INTO statement is invalid. It would look like this:

SELECT regexp_replace(tbl, '('''')$', '' , 'g')
INTO   temp table_4
FROM   temp_2
ORDER  BY id;

But I would use neither.

SELECT INTO is discouraged and only supported for historical reasons. Use CREATE TABLE AS instead, which is the SQL standard way. Per documentation:

CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO is not available in ECPG or PL/pgSQL, because they interpret the INTO clause differently. Furthermore, CREATE TABLE AS offers a superset of the functionality provided by SELECT INTO.

Bold emphasis mine.

For the presented examples, rtrim() is much simpler and faster:

CREATE TEMP TABLE table_4 AS 
SELECT rtrim(tbl, '''')     -- trim all trailing '
FROM   temp_2
ORDER  BY id;
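
The question also asks how to end up with NULL or 0 instead of an empty string. One possible way, not part of the original answer, is to wrap the trimmed value in NULLIF(). The sketch below uses a stand-in temp table demo_src in place of temp_2; that name, the result table table_5, and the sample rows are all assumptions made for illustration.

```sql
-- Stand-in for temp_2; table name, column names and rows are hypothetical.
CREATE TEMP TABLE demo_src (id int, tbl text);
INSERT INTO demo_src VALUES (1, '6'''''), (2, '20'''''), (3, '''''');

CREATE TEMP TABLE table_5 AS
SELECT id
     , NULLIF(rtrim(tbl, ''''), '') AS tbl   -- an empty string after trimming becomes NULL
FROM   demo_src
ORDER  BY id;

SELECT * FROM table_5;
```

If a 0 is preferred over NULL, the expression could be wrapped once more as COALESCE(NULLIF(rtrim(tbl, ''''), ''), '0'), which keeps the column type text.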

Related Solutions

PostgreSQL 9.1 Hot Backup Error: the database system is starting up

The message "The database system is starting up." does not indicate an error. The reason it is at the FATAL level is so that it will always make it to the log, regardless of the setting of log_min_messages:

http://www.postgresql.org/docs/9.1/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHEN

After the rsync, did you really run what you show?:

pgsql -c "select pg_stop_backup();";

Since there is, so far as I know, no pgsql executable, that would leave the backup uncompleted, and the slave would never come out of recovery mode. On the other hand, maybe you really did run psql, because otherwise I don't see how the slave would have logged such success messages as:

Log: consistent recovery state reached at 0/BF0000B0

and:

Log: streaming replication successfully connected to primary

Did you try connecting to the slave at this point? What happened?

The "Success. You can now start..." message you mention is generated by initdb, which shouldn't be run as part of setting up a slave; so I think you may be confused about something there. I'm also concerned about these apparently conflicting statements:

The only ways I have restarted Postgres is through the service postgresql-9.1 restart or /etc/init.d/postgresql-9.1 restart commands. After I receive this error, I kill all processes and again try to restart the database...

Did you try to stop the service through the service script? What happened? It might help in understanding the logs if you prefixed lines with more information. We use:

log_line_prefix = '[%m] %p %q<%u %d %r> '

The recovery.conf script looks odd. Are you copying from the master's pg_xlog directory, the slave's active pg_xlog directory, or an archive directory?

PostgreSQL insert into table (not origin) based on a condition on fields on different tables

You seem to be under the impression that some kind of automatic "row number" would exist. That is not the case. Unlike rows in a spreadsheet, tables in a relational database have no natural order.

This query should do the job, but it relies on the contents of name_1 and name_2 to make the connection. If you rely on a row number, you have to add an actual column for that.

INSERT INTO table2 (id1, id2, score1, score2, comment_string)
SELECT t1.id1, t1.id2, t1.score1, t1.score2
      ,CASE WHEN t3.name_1 IS NULL
            AND  t4.name_2 IS NULL THEN 'removed_because ...'
       END AS comment_string  -- no ELSE branch: NULL otherwise
FROM   table_1 t1
LEFT   JOIN table_3 t3 USING (name_1)
LEFT   JOIN table_4 t4 USING (name_2)
ORDER  BY id1; -- undeclared in Q

Based on the assumption that table_3.name_1 and table_4.name_2 are unique. Else, the query could create a "proxy cross join", possibly multiplying rows, if there are several matches.
More about this caveat in this related answer on SO.
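
The row-multiplication caveat can be reproduced with a few made-up rows. In this sketch the CTE names t1 and t3 and all values are hypothetical; t3 deliberately contains a duplicate name_1 to show what happens when the uniqueness assumption does not hold.

```sql
WITH t1(id1, name_1) AS (
   VALUES (1, 'a'), (2, 'b')
)
, t3(name_1) AS (
   VALUES ('a'), ('a')          -- duplicate: name_1 is not unique here
)
SELECT t1.id1, name_1
FROM   t1
LEFT   JOIN t3 USING (name_1)
ORDER  BY t1.id1;
```

The row with id1 = 1 appears twice in the result because it matches both rows in t3, while id1 = 2 appears once with no match. An INSERT built on such a join would write the duplicated row as well.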
