Parse values from macOS extended file attributes

extended-attributes macos plist terminal

If you download a file from the internet using Safari, some extended attributes are added to the downloaded file, including com.apple.metadata:kMDItemWhereFroms, which contains the original URL of the download. In Finder > Get Info, the value of this key is displayed under Where from:.

/bin/ls -alh shows the presence of extended attributes with an @ in the mode column, and xattr -l filename.zip lists all the attributes.

According to the xattr man page, the value of an attribute can be printed with:

xattr -p com.apple.metadata:kMDItemWhereFroms filename.zip

# OUTPUT
# com.apple.metadata:kMDItemWhereFroms: bplist00�_=https://example.com/filename.zip

So although the URL itself is readable, the value is in a binary format that starts with the header bplist00�_.

I tried to parse it in the following way:

xattr -p com.apple.metadata:kMDItemWhereFroms filename.zip > url.plist

# checking the file format:
file url.plist

# OUTPUT:
# url.plist: Apple binary property list

# assuming this should work:
plutil -convert xml1 url.plist

# OUTPUT:
# url.plist: Property List error: Unexpected character b at line 1 / JSON error: 
# JSON text did not start with array or object and option to allow 
# fragments not set. around line 1, column 0.

Trying to parse the file with Python's built-in plistlib module throws an error too:

import plistlib

with open('url.plist', 'rb') as fi:
    plist = plistlib.load(fi)

# OUTPUT:
# plistlib.InvalidFileException: Invalid file

From the output it looks like the file is not a regular binary plist, even though file url.plist claims it is an 'Apple binary property list'. Any hints as to what the format is and how to parse it into a plain-text value?
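One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that the bytes were altered when the value was printed as text, while the leading bplist00 magic stayed intact. The Python sketch below simulates such a lossy round trip on a plist built in memory. The UTF-8 replacement step is a stand-in for whatever actually changed the bytes.

```python
import plistlib

# Build a small binary plist in memory (URL taken from the example above).
raw = plistlib.dumps(
    ["https://example.com/filename.zip"], fmt=plistlib.FMT_BINARY
)

# Assumption: simulate a lossy text round trip in which bytes that are not
# valid UTF-8 are turned into the replacement character U+FFFD.
damaged = raw.decode("utf-8", errors="replace").encode("utf-8")

print(raw[:8])                 # magic header of the intact plist
print(damaged[:8] == raw[:8])  # the header survives the round trip
print(damaged == raw)          # the rest of the data does not

# The intact bytes parse; the damaged copy is expected to be rejected,
# because the offsets stored in the trailer no longer match the data.
print(plistlib.loads(raw))
try:
    plistlib.loads(damaged)
    parse_failed = False
except plistlib.InvalidFileException:
    parse_failed = True
print("damaged copy rejected:", parse_failed)
```

A tool that only inspects the first few bytes would still recognise the damaged copy as a binary plist, which would be consistent with the file output shown above.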

Best Answer

You can print the attribute in hex with the -x flag and convert the hex back to raw bytes with xxd -r -p, like so:

xattr -x -p com.apple.metadata:kMDItemWhereFroms filename.zip | xxd -r -p | plutil -p - 

If you want a different output format, you can change the plutil invocation to something like:

plutil -convert json -o - -

Example output:

["https:\/\/another.example.com","https:\/\/example.com\/path\/"]
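The same hex round trip can also be sketched in Python, which may be handy if the URLs are needed inside a script. This is a minimal sketch: parse_hex_plist and where_froms are hypothetical helper names, and where_froms assumes the macOS xattr command is on the PATH.

```python
import plistlib
import subprocess

ATTR = "com.apple.metadata:kMDItemWhereFroms"


def parse_hex_plist(hex_dump):
    """Decode a hex dump, such as the output of `xattr -x -p`, into a Python object."""
    # bytes.fromhex() skips spaces and newlines between byte pairs
    # (all ASCII whitespace is ignored since Python 3.7).
    raw = bytes.fromhex(hex_dump)
    return plistlib.loads(raw)


def where_froms(path):
    """Hypothetical helper: read the attribute through the xattr CLI (macOS only)."""
    result = subprocess.run(
        ["xattr", "-x", "-p", ATTR, path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if xattr exits with an error
    )
    return parse_hex_plist(result.stdout)
```

Calling where_froms("filename.zip") on macOS should then return the stored URLs as a list of strings, provided the file carries the attribute.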