PostgreSQL – How to Return ID After Insert or Select

concurrencyinsertplpgsqlpostgresqlpostgresql-9.3

I want to build a function which will insert an email if the email value doesn't exist in the table and return the email_id of the row. How can I do this?
Also how can I return the id if the email was not inserted and it already exist in the DB? Do I need to perform another SELECT?

BEGIN;
  LOCK TABLE mailing_list IN SHARE ROW EXCLUSIVE MODE;
  INSERT INTO mailing_list (email)
  SELECT 'email'
   WHERE NOT EXISTS (
     SELECT * FROM mailing_list WHERE email='email'
   );
COMMIT;

I have tried adding the returning id but it doesn't work. I got:

query has no destination for result data

Sqlfiddle

Best Answer

@a_horse already explained how to avoid the error message you saw.

Here is a simple variant of the related version we have been referring to:

CREATE OR REPLACE FUNCTION f_email_insel(_email text, OUT email_id int) AS
$func$
BEGIN

LOOP
   BEGIN  -- start inner block inside loop to handle possible exception

   SELECT INTO email_id  m.email_id FROM mailing_list m WHERE m.email = _email;

   IF NOT FOUND THEN
      INSERT INTO mailing_list (email) VALUES (_email)
      RETURNING mailing_list.email_id INTO email_id;
   END IF;

   EXCEPTION WHEN UNIQUE_VIOLATION THEN     -- inserted in concurrent session.
      RAISE NOTICE 'It actually happened!'; -- hardly ever happens
   END;

   EXIT WHEN email_id IS NOT NULL;          -- else keep looping
END LOOP;

END
$func$ LANGUAGE plpgsql;

SQL Fiddle.

You only need the loop to deal with a possible race condition: If a concurrent transaction writes the same email value in between SELECT and INSERT, you would get a unique violation - which is handled properly here. This is assuming a UNIQUE constraint (or a UNIQUE index) on email, obviously.

The alternative with CTEs runs as one SQL statement. So here the overhead is slightly smaller (simpler queries), but the time frame for a race condition is slightly bigger. Especially if the row frequently already exists, this is a bit faster.
Read the detailed explanation over there and pick the approach that best fits your use case.