How to Convert A .TXT File To A .CSV File Without Losing Any Of The Content Of That Document

I've been trying for about 4 days now to complete the seemingly simple activity of converting an OS X keychain text dump to a correctly formatted .csv with no progress, much less success. I posted a different question about my predicament yesterday, which has gone without comment or suggestion.

I am simply trying to convert a .txt file to a formatted .csv file, without losing any of the contents of the original document in the process.

What I have already tried:

(1) RWest's Gist and Ben-Oni and christophervalles variations thereof. The main ingredient of this script is a keychain.rb file and I think this would work if I only knew how to add in entry fields to match all the fields in the original document. However, I don't know the formula and have been unable to find a template. As such, running this script results in hundreds of missing entries because the inputs lack a corresponding output field.

#!/usr/bin/env ruby                                                                                                                                                                                                          
#                                                                                                                                                                                                                            
# Usage:                                                                                                                                                                                                                     
#   security dump-keychain -d login.keychain > keychain_logins.txt                                                                                                                                                           
#   # Lots of clicking 'Always Allow', or just 'Allow', until it's done...                                                                                                                                                   
#   curl -O curl -O https://raw.github.com/gist/1224792/06fff24412311714ad6534ab700a7d603c0a56c9/keychain.rb
#   chmod a+x ./keychain.rb                                                                                                                                                                                                  
#   ./keychain.rb keychain_logins.txt | sort > logins.csv                                                                                                                                                                    
#                                                                                                                                                                                                                            
# Then import logins.csv in 1Password using the format:                                                                                                                                                                      
# Title, URL/Location, Username, Password                                                                                                                                                                                    
# Remember to check 'Fields are quoted', and the Delimiter character of 'Comma'.                                                                                                                                             
require 'date'

class KeychainEntry
  attr_accessor :fields

  def initialize(keychain)
    last_key = nil
    @fields = {}
    data = nil
    aggregate = nil
    lines = keychain.split("\n")
    lines.each do |line|
      # Everything after the 'data:' statement is data.

      if data != nil
        data << line
      elsif aggregate != nil
        if ( line[0] == 32 || line[0] == " " )
          keyvalue = line.split('=', 2).collect { |kv| kv.strip }
          aggregate[keyvalue.first] = keyvalue.last
        else
          @fields[last_key] = aggregate
          aggregate = nil
        end
      end

      if aggregate == nil
        parts = line.split(':').collect { |piece| piece.strip }
        if parts.length > 1
          @fields[parts.first] = parts.last
        else
          last_key = parts.first
          data = [] if parts.first == "data"
          aggregate = {}
        end
      end
    end
    @fields["data"] = data.join(" ") if data
  end
end

def q(string)
  "\"#{string}\""
end

def process_entry(entry_string)
  entry = KeychainEntry.new(entry_string)


  if entry.fields['class'] == '"genp"'
    title = entry.fields['attributes']['0x00000007 <blob>'].gsub!('"', '')
    user = entry.fields['attributes']['"acct"<blob>'].gsub!('"', '')
    location = entry.fields['attributes']['"svce"<blob>'].gsub!('"', '')
    pass = entry.fields['data'][1..-2]
    puts "\"#{title}\"\t\"#{location}\"\t\"#{user}\"\t\"#{pass}\""
  end

  if entry.fields['class'] == '"inet"' && ['"form"', '"dflt"'].include?(entry.fields['attributes']['"atyp"<blob>'])
    site = entry.fields['attributes']['"srvr"<blob>'].gsub!('"', '')
    path = entry.fields['attributes']['"path"<blob>'].gsub!('"', '')
    proto= entry.fields['attributes']['"ptcl"<uint32>'].gsub!('"', '')
    proto.gsub!('htps', 'https');
    user = entry.fields['attributes']['"acct"<blob>'].gsub!('"', '')
    #user = entry.fields['attributes']['0x00000007 <blob>'].gsub!('"', '')
    date_string = entry.fields['attributes']['"mdat"<timedate>'].gsub(/0x[^ ]+[ ]+/, '').gsub!('"', '')
    date = DateTime.parse(date_string)
    pass = entry.fields['data'][1..-2]
    path = '' if path == '<NULL>'
    url = "#{proto}://#{site}#{path}"

    puts "#{site}\t#{url}\t#{user}\t#{pass}\t#{date}"
    #puts "#{user}\t #{pass}\t #{date}"
  end
end

accum = ''
ARGF.each_line do |line|
  if line =~ /^keychain: /
    unless accum.empty?
      process_entry(accum)
      accum = ''
    end
  end
  accum += line
end

(2) RMondello's Gist, which sort of worked for me once, though there were missing entries, but which I've been unable to get working at all again for no explicable reason.

set the logFile to ((path to desktop) as string) & "Passwords"
set keychainPath to "/Users/Dad/Desktop/dad.keychain"

-- write_to_file taken from http://www.macosxautomation.com/applescript/sbrt/sbrt-09.html
on write_to_file(this_data, target_file, append_data)
    try
        set the target_file to the target_file as string
        set the open_target_file to open for access file target_file with write permission
        if append_data is false then set eof of the open_target_file to 0
        write this_data to the open_target_file starting at eof
        close access the open_target_file
        return true
    on error
        try
            close access file target_file
        end try
        return false
    end try
end write_to_file

tell application "Usable Keychain Scripting"
    set keychainItems to get every keychain item of keychain keychainPath
    repeat with keychainItem in keychainItems
        set aServer to server in keychainItem
        set anAccount to account in keychainItem
        set aPassword to password in keychainItem

        set csvEntry to aServer & "," & anAccount & "," & aPassword & "
"

        my write_to_file(csvEntry, logFile, true)
    end repeat
end tell

(3) The AgileBits OS X Keychain To CSV Converter, which has resulted in .csv files with even more missing entries than RWest's Gist did.

# OS X Keychain text export converter
#
# Copyright 2014 Mike Cappella (mike@cappella.us)

package Converters::Keychain 1.02;

our @ISA    = qw(Exporter);
our @EXPORT     = qw(do_init do_import do_export);
our @EXPORT_OK  = qw();

use v5.14;
use utf8;
use strict;
use warnings;
#use diagnostics;

binmode STDOUT, ":utf8";
binmode STDERR, ":utf8";

use Encode;
use Utils::PIF;
use Utils::Utils qw(verbose debug bail pluralize myjoin unfold_and_chop print_record);
use Time::Local qw(timelocal);
use Time::Piece;

my $max_password_length = 50;

my %card_field_specs = (
    login =>            { textname => undef, fields => [
    [ 'username',       0, qr/^username$/ ],
    [ 'password',       0, qr/^password$/ ],
    [ 'url',        0, qr/^url$/ ],
    ]},
    note =>         { textname => undef, fields => [
    ]},
);

my (%entry, $itype);

# The following table drives transformations or actions for an entry's attributes, or the class or
# data section (all are collected into a single hash).  Each ruleset is evaluated in order, as are
# each of the rules within a set.  The key 'c' points to a code reference, which is passed the data
# value for the given type being tested.  It can transform the value in place, or simply test it and
# return a string (for debug output).  When the key 'action' is set to 'SKIP', the entry being tested
# will be rejected from consideration for export when the 'c' code reference returns a TRUE value.
# And in that case, the code ref pointed to by 'msg' will be run to produce debug output, used to
# indicate the reason for the rejection.
#
# The table facilitates adding new transformations and rejection rules, as necessary,
# through empirical discover based on user feedback.
my @rules = (
    CLASS => [
        { c => sub { $_[0] !~ /^inet|genp$/ }, action => 'SKIP', msg => sub { debug "\tskipping non-password class: ", $_[0] } },
    ],
    svce => [
        { c => sub { $_[0] =~ s/^0x([A-F\d]+)\s+".*"$/pack "H*", $1/ge } },
        { c => sub { $_[0] =~ s/^"(.*)"$/$1/ } },
        { c => sub { $_[0] =~ /^Apple Persistent State Encryption$/ or 
                 $_[0] =~ /^Preview Signature Privacy$/ or
                 $_[0] =~ /^Safari Session State Key$/ or
                 $_[0] =~ /^Call History User Data Key$/}, action => 'SKIP',
            msg => sub { debug "\t\tskipping non-password record: $entry{'CLASS'}: ", $_[0] } },
    ],
    srvr => [
        { c => sub { $_[0] =~ s/^"(.*)"$/$1/ } },
        { c => sub { $_[0] =~ s/\.((?:_afpovertcp|_smb)\._tcp\.)?local// } },
    ],
    path => [
        { c => sub { $_[0] =~ s/^"(.*)"$/$1/ } },
        { c => sub { $_[0] =~ s/^<NULL>$// } },
    ],
    ptcl => [
        { c => sub { $_[0] =~ s/htps/https/ } },
        { c => sub { $_[0] =~ s/^"(\S+)\s*"$/$1/ } },
    ],
    acct => [
        { c => sub { $_[0] =~ s/^0x([A-F\d]+)\s+".*"$/pack "H*", $1/ge } },
        { c => sub { $_[0] =~ s/^"(.*)"$/$1/ } },
    ],
    mdat => [
        { c => sub { $_[0] =~ s/^0x\S+\s+"(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})Z.+"$/$1-$2-$3 $4:$5:$6/g } },
    ],
    cdat => [
        { c => sub { $_[0] =~ s/^0x\S+\s+"(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})Z.+"$/$1-$2-$3 $4:$5:$6/g } },
    ],
    # desc must come before DATA so that 'secure note' type can be used as a condition in DATA below
    desc => [
        { c => sub { $_[0] =~ s/^"(.*)"$/$1/; $itype = 'note' if $_[0] eq 'secure note'; $_[0] } },
    ],
    DATA => [
        # secure note data, early terminates rule list testing
        { c => sub { $itype eq 'note' and $_[0] =~ s/^.*<key>NOTE<\/key>\\012\\011<string>(.+?)<\/string>.*$/$1/ }, action => 'BREAK',
            msg => sub { debug "\t\tskipping non-password record: $entry{'CLASS'}: ", $entry{'svce'} // $entry{'srvr'} } },

        { c => sub { $_[0] !~ s/^"(.+)"$/$1/ }, action => 'SKIP',
            msg => sub { debug "\t\tskipping non-password record: $entry{'CLASS'}: ", $entry{'svce'} // $entry{'srvr'} } },
        { c => sub { $_[0] =~ /^[A-Z\d]{8}-[A-Z\d]{4}-[A-Z\d]{4}-[A-Z\d]{4}-[A-Z\d]{12}$/ }, action => 'SKIP',
            msg => sub { debug "\t\tskipping non-password record: $entry{'CLASS'}: ", $entry{'svce'} // $entry{'srvr'} } },
        { c => sub { length $_[0] > $max_password_length }, action => 'SKIP',
            msg => sub { debug "\t\tskipping record with improbably long password: $entry{'CLASS'}: ", $entry{'svce'} // $entry{'srvr'} } },
        { c => sub { join '', "\trecord: class = $entry{'CLASS'}: ", $entry{'svce'} // $entry{'srvr'} } },  # debug output only
    ],
);

$DB::single = 1;                    # triggers breakpoint when debugging

sub do_init {
    return {
    'specs'     => \%card_field_specs,
    'imptypes'      => undef,
    'opts'      => [ [ q{-m or --modified           # set item's last modified date },
                   'modified|m' ],
               ],
    };
}

sub do_import {
    my ($file, $imptypes) = @_;
    my (%Cards, %dup_check);
    my $contents = $_;;

    {
    local $/;
    open my $fh, '<:encoding(utf8)', $file or bail "Unable to open file: $file\n$!";
    $contents = <$fh>;
    close $fh;
    }

    my ($n, $examined, $skipped, $duplicates) = (1, 0, 0, 0);
    my ($npre_explode, $npost_explode);

KEYCHAIN_ENTRY:
    while ($contents) {
    if ($contents =~ s/\Akeychain: (.*?)\n+(?=$|^keychain: ")//ms) {
        local $_ = $1; my $orig = $1;
        $itype = 'login';

        $examined++;
        debug "Entry ", $examined;

        s/\A"(.*?)"\n^(.+)/$2/ms;
        my $keychain = $1;
        #debug 'Keychain: ', $keychain;

        s/\Aclass: "?(.*?)"? ?\n//ms;
        my $class = $1;

        # attributes
        s/\Aattributes:\n(.*?)(?=^data:)//ms;
        %entry = map { clean_attr_name(split /=/, $_) } split /\n\s*/, $1 =~ s/^\s+//r;

        $entry{'CLASS'} = $class;

        # data
        s/\Adata:\n(.+)\z//ms;
        $entry{'DATA'}  = defined $1 ? $1 : '';

        # run the rules in the rule set above
        # for each set of rules for an entry key...
RULE:
        for (my $i = 0;  $i < @rules; $i += 2) {
        my ($key, $ruleset) = ($rules[$i], $rules[$i + 1]);

        debug "  considering rules for ", $key;
        next if not exists $entry{$key};

        # run the entry key's rules...
        my $rulenum = 1;
        for my $rule (@$ruleset) {
            debug "\t    rule $rulenum: called with ", unfold_and_chop $entry{$key};

            my $ret = ($rule->{'c'})->($entry{$key});

            debug "\t    rule $rulenum: returns ", $ret || 0, '   ', unfold_and_chop $entry{$key};

            if (exists $rule->{'action'}) {
            if ($ret) {
                if ($rule->{'action'} eq 'SKIP') {
                $skipped++;
                ($rule->{'msg'})->($entry{$key})        if exists $rule->{'msg'};
                next KEYCHAIN_ENTRY;
                }
                elsif ($rule->{'action'} eq 'BREAK') {
                debug "\t    breaking out of rule chain";
                next RULE;
                }
            }
            }

            $rulenum++;
        }
        }

        for (keys %entry) {
        debug sprintf "\t    %-12s : %s", $_, $entry{$_}    if exists $entry{$_};
        }

        #my $itype = find_card_type(\%entry);

        my %h;
        my ($notes, $card_modified);
        if ($itype eq 'login') {
        $h{'password'}  = $entry{'DATA'};
        $h{'username'}  = $entry{'acct'}                        if exists $entry{'acct'};
        $h{'url'}   = $entry{'ptcl'} . '://' . $entry{'srvr'} . $entry{'path'}  if exists $entry{'srvr'};
        }
        elsif ($itype eq 'note') {
        # convert ascii string DATA, which contains \### octal escapes, into UTF-8
        my $octets = encode("ascii", $entry{'DATA'});
        $octets =~ s/\\(\d{3})/"qq|\\$1|"/eeg;
        $notes = decode("UTF-8", $octets);
        }
        else {
        die "Unexpected itype: $itype";
        }

        # will be added to notes
        $h{'protocol'}  = $entry{'ptcl'}                    if exists $entry{'ptcl'} and $entry{'ptcl'} =~ /^afp|smb$/;
        $h{'created'}   = $entry{'cdat'}                    if exists $entry{'cdat'};
        if (exists $entry{'mdat'}) {
        if ($main::opts{'modified'}) {
            $card_modified = date2epoch($entry{'mdat'});
        }
        else {
            $h{'modified'}  = $entry{'mdat'};
        }
        }

        for (keys %h) {
        debug sprintf "\t    %-12s : %s", $_, $h{$_}                if exists $h{$_};
        }

        # don't set/use $sv before $entry{''svce'} is removed of _afp*, _smb*, and .local, since it defeats dup detection
        my $sv = $entry{'svce'} // $entry{'srvr'};

        my $s = join ':::', 'sv', $sv,
        map { exists $h{$_} ? "$_ => $h{$_}" : 'URL => none' } qw/url username password/;

        if (exists $dup_check{$s}) {
        debug "  *skipping duplicate entry for ", $sv;
        $duplicates++;
        next
        }
        $dup_check{$s}++;


        # From the card input, place it in the converter-normal format.
        # The card input will have matched fields removed, leaving only unmatched input to be processed later.
        my $normalized = normalize_card_data($itype, \%h, 
        { title     => $sv,
          tags      => undef,
          notes     => $notes,
          folder    => undef,
          modified  => $card_modified });

        # Returns list of 1 or more card/type hashes; one input card may explode into multiple output cards
        my $cardlist = explode_normalized($itype, $normalized);

        my @k = keys %$cardlist;
        if (@k > 1) {
        $npre_explode++; $npost_explode += @k;
        debug "\tcard type $itype expanded into ", scalar @k, " cards of type @k"
        }
        for (@k) {
        print_record($cardlist->{$_});
        push @{$Cards{$_}}, $cardlist->{$_};
        }
        $n++;
    }
    else {
        bail "Keychain parse failed, after entry $examined; unexpected: ", substr $contents, 0, 2000;
    }
    }

    $n--;
    verbose "Examined $examined record", pluralize($examined);
    verbose "Skipped $skipped non-login record", pluralize($skipped);
    verbose "Skipped $duplicates duplicate record", pluralize($duplicates);

    verbose "Imported $n record", pluralize($n) ,
    $npre_explode ? " ($npre_explode card" . pluralize($npre_explode) .  " expanded to $npost_explode cards)" : "";
    return \%Cards;
}

sub do_export {
    create_pif_file(@_);
}

sub find_card_type {
    my $eref = shift;

    my $type = (exists $eref->{'desc'} and $eref->{'desc'} eq 'secure note') ? 'note' : 'login';
    debug "\t\ttype set to '$type'";
    return $type;
}

# Places card data into a normalized internal form.
#
# Basic card data passed as $norm_cards hash ref:
#    title
#    notes
#    tags
#    folder
#    modified
# Per-field data hash {
#    inkey  => imported field name
#    value  => field value after callback processing
#    valueorig  => original field value
#    outkey => exported field name
#    outtype    => field's output type (may be different than card's output type)
#    keep   => keep inkey:valueorig pair can be placed in notes
#    to_title   => append title with a value from the narmalized card
# }
sub normalize_card_data {
    my ($type, $carddata, $norm_cards) = @_;

    for my $def (@{$card_field_specs{$type}{'fields'}}) {
    my $h = {};
    for my $key (keys %$carddata) {
        if ($key =~ /$def->[2]/) {
        next if not defined $carddata->{$key} or $carddata->{$key} eq '';
        my ($inkey, $value) = ($key, $carddata->{$key});
        my $origvalue = $value;

        if (exists $def->[3] and exists $def->[3]{'func'}) {
            #         callback(value, outkey)
            my $ret = ($def->[3]{'func'})->($value, $def->[0]);
            $value = $ret   if defined $ret;
        }
        $h->{'inkey'}       = $inkey;
        $h->{'value'}       = $value;
        $h->{'valueorig'}   = $origvalue;
        $h->{'outkey'}      = $def->[0];
        $h->{'outtype'}     = $def->[3]{'type_out'} || $card_field_specs{$type}{'type_out'} || $type; 
        $h->{'keep'}        = $def->[3]{'keep'} // 0;
        $h->{'to_title'}    = ' - ' . $h->{$def->[3]{'to_title'}}   if $def->[3]{'to_title'};
        push @{$norm_cards->{'fields'}}, $h;
        delete $carddata->{$key};
        }
    }
    }

    # map remaining keys to notes
    $norm_cards->{'notes'} .= "\n"        if defined $norm_cards->{'notes'} and length $norm_cards->{'notes'} > 0 and keys %$carddata;
    for my $key (keys %$carddata) {
    $norm_cards->{'notes'} .= "\n"  if defined $norm_cards->{'notes'} and length $norm_cards->{'notes'} > 0;
    $norm_cards->{'notes'} .= join ': ', $key, $carddata->{$key};
    }

    return $norm_cards;
}

# sort logins as the last to check
sub by_test_order {
    return  1 if $a eq 'login';
    return -1 if $b eq 'login';
    $a cmp $b;
}

sub clean_attr_name {
    return ($_[0] =~ /"?([^<"]+)"?<\w+>$/, $_[1]);
}

# Date converters
# LastModificationTime field:    yyyy-mm-dd hh:mm:ss
sub parse_date_string {
    local $_ = $_[0];
    my $when = $_[1] || 0;                  # -1 = past only, 0 = assume this century, 1 = future only, 2 = 50-yr moving window

    if (my $t = Time::Piece->strptime($_, "%Y-%m-%d %H:%M:%S")) {
    return $t;
    }

    return undef;
}

sub date2epoch {
    my $t = parse_date_string @_;
    return defined $t->year ? 0 + timelocal($t->sec, $t->minute, $t->hour, $t->mday, $t->mon - 1, $t->year): $_[0];
}

1;

(4) Swiss File Knife seemed promising as it has a built in command to convert tab separated text to comma separated text. However, running the command resulted in dozens and dozens and dozens of 'wrong format' errors. I assume my .txt file would need to be a .tsv file for the command to actually work and I have been unable to find any way whatsoever to convert a .txt to a .tsv.

So, this is where I am now, days later, in the same place I started. Can someone, please, tell me how to either alter the entry fields of the first script by RWest so that all content will pass through to the .csv or suggest an alternate method, script, application, rain dance, etc. that will allow me to simply have all the data in the text file appear in a csv file, with the proper columns and rows separating each entry?

EDIT: This is what you get when you just change the file extension:
enter image description here

Best Answer

I am simply trying to convert a .txt file to a .csv, without losing the contents of that document

You can just change the extension .txt into .csv.

'Comma separated value' files are just plain text files with the data columns separated by comma's (or ; or tab). 'txt' files are meant for text, and 'csv' files are ment for data storage, but it are both just text files.

Add on: The data looks organised like a dictionary type (or like a multidimensional array). A dictionary can not be converted to a 2D dataframe structure (like csv or excel) as it can have more dimensions than 2. What you want is not possible. But replacing csv with txt makes it readable though...

This is unreadable text: @1?舋H????A?!A?̃?-ƅm??I?؅?uҸ?

Best Answer

Related Solutions

How to convert .number file to any of the readble format including excel and pdf without having mac

How to convert an AI file to a CDR file

Related Question