Postgresql – Speed up PostgreSQL logical decoding plugin

postgresqlpostgresql-11

I'm writing a new logical decoding output plugin for PostgreSQL inspired by wal2json. But mine produces JSON by using SPI to repeatedly invoke to_json(...) on column values (instead of wal2json's simpler JSON output that doesn't work well for some data types).

Testing shows that this is pretty CPU intensive. Is there a way to achieve the same result with better performance?

Could you somehow directly invoke row_to_json on the tuple given to the change callback?

Is there a way to directly invoke to_json on each column value without SPI?

(My attempts to call those functions directly with DirectFunctionCall1Coll crashed the server, so any pointers are welcome!)


Here's what the functioning, but CPU-intensive, code looks like:

spiArgType = columnInfo->atttypid;
/* Prepare the SQL */
spiPlan = SPI_prepare("SELECT to_json($1)::TEXT", 1, &spiArgType);
/* Execute it */
spiReturnValue = SPI_execute_plan(spiPlan, &columnValue, NULL, false, 1);
/* And if it succeeded, append to output string */
if (spiReturnValue < 0) {
    elog(ERROR, "SPI_execute_plan returned %d", spiReturnValue);
    appendStringInfo(out, "null");
} else {
  jsonDatum = SPI_getbinval(SPI_tuptable->vals[0], SPI_tuptable->tupdesc, 1, &isNull);
  appendStringInfo(out, "%s", TextDatumGetCString(jsonDatum));
}
SPI_freeplan(spiPlan);

A failed attempt using OidFunctionCall1Coll:

TupleDesc cols = ...;
HeapTuple tuple = ...;
Form_pg_attribute colInfo;
Datum colValue;
bool isNull;

for (i=0; i < cols->natts; ++i) {
  colInfo = TupleDescAttr(cols, i);

  // (Omitted checks for dropped or system column)

  colValue = heap_getattr(tuple, i+1, cols, &isNull);
  if (! isNull) {
    // The following call fails, and emits this error:
    // ERROR:  could not determine input data type
    jsonDatum = OidFunctionCall1Coll(F_TO_JSON, InvalidOid, colValue);
    // This is never reached:
    appendStringInfo(out, "%s", TextDatumGetCString(jsonDatum));
  }
  // ...
}

Best Answer

See this comment from fmgr.h:

/* These are for invocation of a function identified by OID with a
 * directly-computed parameter list.  Note that neither arguments nor result
 * are allowed to be NULL.  These are essentially fmgr_info() followed by
 * FunctionCallN().  If the same function is to be invoked repeatedly, do the
 * fmgr_info() once and then use FunctionCallN().
 */

extern Datum OidFunctionCall0Coll ...

This should be way cheaper than SPI.

Related Question