I am new to pgAdmin and Azure database. I have a huge csv file with around 220 columns and I want to create a table out of it in pgAdmin4 to push it to Azure.
However, I don't know how to automatically detect the column types.
Here is an example:
IDI GKID S01Q01 S02Q01_Gender ...
100093 enq030059569748fc89091fdd91cc337cac44eca90 Yes, I agree Female ...
I'm writing the script to create the table, but given the number of columns I'd like to automate generating it, so that I can load the csv into the database in pgAdmin4 for Microsoft Azure.
Then, after transposing the csv header, I get:
IDI
GKID
S01Q01
S02Q01_Gender
...
Given the number of columns, in the best case I would like to automate detecting each column's type so I can write it into the right part of the statement; in the worst case I would fall back to a generic type such as TEXT.
So far, I've tried:
# Append a type to every column name listed (one per line) in columns.txt,
# rewriting the file in place.
# NOTE: TINYTEXT is a MySQL type; the PostgreSQL equivalent of a generic
# string type is TEXT.
file_name = "columns.txt"
string_to_add = " TINYTEXT,"
with open(file_name, 'r') as f:
    file_lines = [''.join([x.strip(), string_to_add, '\n']) for x in f.readlines()]
with open(file_name, 'w') as f:
    f.writelines(file_lines)
It gives me back:
IDI TINYTEXT,
GKID TINYTEXT,
S01Q01 TINYTEXT,
S02Q01_Gender TINYTEXT,
...
And then I can do:
CREATE TABLE my_table (
IDI TINYTEXT,
GKID TINYTEXT,
S01Q01 TINYTEXT,
S02Q01_Gender TINYTEXT,
...
But I'm not sure this is enough to make a table that can receive my csv file.
Best Answer
There are command-line tools that do field type inferencing.
One is xsv: https://github.com/BurntSushi/xsv/
Its stats subcommand infers a type for every field; running it against this sample csv (https://gist.githubusercontent.com/aborruso/3b1af402f0d2ed49465f218d19be81d9/raw/c0e95b320924e9e49902633d16e7ab253046ca16/input.csv) reports the detected type of each column.
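For instance, a pipeline along these lines prints the field name and inferred type for every column (input.csv here is an assumed local copy of that file):
# compute per-column statistics, keep only the field name and inferred type,
# and render the result as an aligned table
xsv stats input.csv | xsv select field,type | xsv table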
Using csvkit (https://csvkit.readthedocs.io/en/latest/index.html), the csvsql command will generate a CREATE TABLE statement for the same file, with a type inferred for every column.
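For example (the -i flag selects the SQL dialect of the generated DDL; input.csv is again an assumed local copy of the file):
csvsql -i postgresql input.csv
The result is a complete CREATE TABLE statement, typically with DECIMAL for numeric columns and VARCHAR for text ones, which you can paste straight into the Query Tool in pgAdmin4.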
csvkit also gives you the option of skipping the generated script entirely and loading the csv straight into the database: it creates the table with the inferred types and inserts the rows in one step.
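A sketch, assuming a SQLAlchemy-style connection string for your Azure Database for PostgreSQL server (user, password, host and dbname are placeholders; Azure normally enforces SSL, hence sslmode=require):
csvsql --db "postgresql://user:password@host:5432/dbname?sslmode=require" --insert input.csv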