Conversion – Convert .xlsx to .csv with Semicolon Separated Fields

conversioncsvlibreofficeopenofficespreadsheet

I realize that this is not an entirely unix/linux related question. But since this is something I'll do on linux, I hope someone has an answer.

I have an online excel file (.xlsx) which gets updated periodically (by someone else). I want to write a script and put it in as a cronjob in order to to process that excel sheet. But to do that, I need to convert that into a text file (so a .csv) with semicolon separated columns. It can't be comma separated unfortunately since some columns have commas in them. Is it at all possible to do this conversion from shell? I have Open office installed and I can do this by using its GUI, but want to know if it is possible to do this from command line. Thanks!

PS: I have a Mac machine as well, so if some solution can work there, thats good as well. 🙂

Best Answer

OpenOffice comes with the unoconv program to perform format conversions on the command line.

unoconv -f csv filename.xlsx

For more complex requirements, you can parse XLSX files with Spreadsheet::XLSX in Perl or openpyxl in Python. For example, here's a quickie script to print out a worksheet as a semicolon-separated CSV file (warning: untested, typed directly in the browser):

perl -MSpreadsheet::XLSX -e '
    $\ = "\n"; $, = ";";
    my $workbook = Spreadsheet::XLSX->new()->parse($ARGV[0]);
    my $worksheet = ($workbook->worksheets())[0];
    my ($row_min, $row_max) = $worksheet->row_range();
    my ($col_min, $col_max) = $worksheet->col_range();
    for my $row ($row_min..$row_max) {
        print map {$worksheet->get_cell($row,$_)->value()} ($col_min..$col_max);
    }
' filename.xlsx >filename.csv
Related Question