I have a large (5-10 GB) binary file on AWS S3 that will require custom parsing, probably in Python. It is essentially a sequential set of millions of dataframes, all with the same structure. What is the best way to get this data into a serverless/hosted AWS Aurora PostgreSQL instance? So far I have thought of:
1. I could write the data to a CSV file and use COPY, but the intermediate file would be astronomically large
2. I could send it over the wire in batches of rows
3. I could use AWS Glue, though I'm still learning about that.
PostgreSQL – the best way to upload custom-parsed data into AWS Aurora PostgreSQL
Tags: aws, postgresql
Best Answer
You could write the CSV data stream to a pipe rather than a file:
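A minimal sketch of the pipe approach: a writer thread serializes parsed rows as CSV into one end of an OS pipe while the reading end is handed to the database driver, so the full CSV never exists on disk. The `parse_rows` generator and the table/column names are placeholders for your real parser and schema; the `copy_expert` call (psycopg2) is shown commented out, with the reader consumed locally here for illustration.

```python
import csv
import os
import threading

def parse_rows():
    # Hypothetical stand-in for the custom binary parser:
    # yields one tuple per row of the fixed dataframe structure.
    for i in range(3):
        yield (i, "item%d" % i, i * 1.5)

def feed(write_fd):
    # Writer thread: serialize rows as CSV straight into the pipe.
    with os.fdopen(write_fd, "w", newline="") as w:
        csv.writer(w).writerows(parse_rows())

read_fd, write_fd = os.pipe()
t = threading.Thread(target=feed, args=(write_fd,))
t.start()

with os.fdopen(read_fd, "r") as reader:
    # In production, hand `reader` to the driver instead of reading it here:
    #   cur.copy_expert("COPY my_table FROM STDIN WITH (FORMAT csv)", reader)
    data = reader.read()
t.join()
```

Because COPY consumes the stream as it arrives, memory use stays bounded regardless of how many millions of rows the 5-10 GB file expands to.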
or hand a file-like object that renders the CSV on demand directly to the driver:
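As an alternative that avoids the OS pipe and the extra thread, you can wrap the row iterator in a minimal read-only file object and pass it straight to psycopg2's `copy_expert()`, which only pulls from it via `read(size)`. This is a sketch under that assumption; `RowStream` is a hypothetical helper, the join-based CSV rendering skips quoting (real code should use the `csv` module), and the `copy_expert` call is commented out.

```python
import io

class RowStream(io.TextIOBase):
    """Read-only file object that renders rows to CSV text on demand."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self._buf = ""

    def readable(self):
        return True

    def read(self, size=-1):
        # Pull rows lazily until the buffer can satisfy the request.
        while size < 0 or len(self._buf) < size:
            row = next(self._rows, None)
            if row is None:
                break
            # Naive rendering for the sketch; no quoting/escaping.
            self._buf += ",".join(map(str, row)) + "\n"
        if size < 0:
            out, self._buf = self._buf, ""
        else:
            out, self._buf = self._buf[:size], self._buf[size:]
        return out

stream = RowStream([(1, "a"), (2, "b")])
# In production (psycopg2):
#   cur.copy_expert("COPY my_table FROM STDIN WITH (FORMAT csv)", stream)
text = stream.read()
```

Either way, the key point is the same: COPY streams, so the parser and the load run concurrently and nothing the size of the source file is ever materialized.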