Postgresql – How to control how Pentaho DI formats integers coming from a text file

etl, pentaho, postgresql

I have a simple pipe-delimited text file with integers recorded as a sequence of numbers:

012345|00678 |various|text|fields

There are some leading zeros, and some trailing white space, but no decimals, commas, internal spaces, or other formatting. Should be simple, right?

Pentaho's field analysis concludes that the fields are of type "Number", format "#.#", and precision "0". That format string makes me wonder if Pentaho thinks there might be decimals in the columns, but when I run Preview Rows, I get nice clean integers, without decimals, scientific notation, or leading zeros, exactly as I want it.
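As far as I can tell, Kettle's Number format masks are java.text.DecimalFormat patterns, and "#.#" only prints a decimal point when there is a fractional part, which would explain why the preview still shows clean integers. A quick illustration (plain Java, not Pentaho code):

import java.text.DecimalFormat;

public class MaskDemo {
    public static void main(String[] args) {
        DecimalFormat mask = new DecimalFormat("#.#");
        // Integer-valued doubles come out without a decimal point...
        System.out.println(mask.format(12345.0));   // 12345
        System.out.println(mask.format(678.0));     // 678
        // ...and only fractional values show a decimal place.
        System.out.println(mask.format(12345.678)); // 12345.7
    }
}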

When I try to consume the data with a PostgreSQL Bulk Loader step, I get error messages like these:

Import Table.0 - ERROR {0} ERROR:  invalid input syntax for integer: "5.017645082E9"
Import Table.0 - ERROR {0} ERROR:  invalid input syntax for integer: "1.0"

It seems that Pentaho has taken it upon itself to format the integers with scientific notation and/or fixed decimals, and naturally Postgres balks at importing this.
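My guess is that the bulk loader step is converting the Number (double) fields to text without applying the format mask, because Java's default Double.toString produces exactly the kinds of strings in the error log: scientific notation for magnitudes of 10^7 or more, and a trailing ".0" otherwise. For example (plain Java, just to show where such strings could come from):

public class NotationDemo {
    public static void main(String[] args) {
        // Default double-to-string conversion, with no format mask applied.
        System.out.println(Double.toString(5017645082d)); // 5.017645082E9
        System.out.println(Double.toString(1d));          // 1.0
    }
}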

How can I stop Pentaho from formatting these integers?

I'm using Pentaho Data Integration v6.1 and PostgreSQL 9.4.9. The file uses Windows line breaks and ANSI text. Changing the field format to "#" or "0" had no effect. Using a Table Output step instead of the PostgreSQL Bulk Loader works, but it is much slower.

Best Answer

Try converting the values to the format you need before the output step. This can be done in a "Select values" step, on the "Meta-data" tab.
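In other words, change the field's metadata before the PostgreSQL Bulk Loader step: in the "Select values" step's "Meta-data" tab, set the field type to Integer (or keep it a Number but give it an integer mask such as "0"), so the value is rendered as plain digits. Roughly speaking, the conversion amounts to this (illustrative Java only, not Pentaho API code):

import java.text.DecimalFormat;

public class ConvertDemo {
    public static void main(String[] args) {
        double value = 5017645082d;
        // An integer mask renders the double as plain digits...
        System.out.println(new DecimalFormat("0").format(value)); // 5017645082
        // ...as does converting it to an integer type first.
        System.out.println(Long.toString((long) value));          // 5017645082
    }
}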