iCloud – Removing numerous ‘homepage’ entries from Contacts.app with AppleScript is extremely slow

applescript, contacts, icloud, ms office

A syncing issue with Outlook left me with hundreds, if not thousands, of duplicate contacts. After managing to merge the duplicates without Contacts crashing, I was left with 177 contacts, most of which had many repeated homepage entries. Rather than die of boredom removing these by hand, I put together some AppleScript to do it for me, thinking it would take a few minutes. It’s been a week now. The script starts well enough but soon slows down progressively and takes more and more memory from the system, until the spinning beachball of doom appears and the script halts. One issue is that I seem to be able to delete a contact’s URLs only one at a time, in sequence, instead of all at once.

So the question is: what have I got wrong that makes this script nearly useless? Could it have something to do with iCloud syncing? Or is AppleScript inherently inefficient? (The constant saving is there because the script would stop working at random times.)

tell application "Contacts"
    activate
    with timeout of 72000 seconds
        set myPeople to people
        set numPeople to (count of myPeople)
        repeat with i from 1 to numPeople
            set myGuy to item i of myPeople
            set myGuyName to get name of myGuy
            set personUrls to (the urls of myGuy whose value contains "outlook")
            set urlNum to count of personUrls
            if urlNum > 0 then
                repeat with j from urlNum to 1 by -1
                    log ((time string of (current date)) & " – [" & i & "/" & numPeople & "] " & myGuyName & " (" & j & "/" & urlNum & "): " & (the label of item j of personUrls))
                    delete item j of personUrls
                    save
                end repeat
            else
                log "No problematic URLs found for " & myGuyName
            end if
            if note of myGuy is not missing value then set note of myGuy to ""
        end repeat
        save
        log "Final save"
    end timeout
    return
end tell

Best Answer

It's bold to name AppleScript being inefficient (a vague term) as a cause. It could well be a factor, but the more direct explanation, awkward as it feels to say, is that your script is inefficient, and woefully so. I don't know whether that is the only reason the script runs slowly, but it's a good place to start making fixes, which I'll outline by extracting the problem lines from your code:

⚠️ set myPeople to people

Redundant. There is little point in assigning a value to a variable that you don't intend to use in a meaningful way (e.g. for manipulating data without changing the source, or, if you really need to, for making scripts easier to read or debug). Nowhere else in your script do you refer to myPeople, except in two lines that are themselves redundant (the count and the indexed lookup). Therefore, don't waste an operation (and potentially memory, though not really in this particular case) creating a variable you don't need.

⭕️ set numPeople to (count of myPeople)

Redundant in principle (I note that you do log the value of numPeople, though it's only used to give you an index reference, which you don't need to know; see next comment).

⚠️ repeat with i from 1 to numPeople
       set myGuy to item i of myPeople

Ignoring for a moment the log call that references numPeople, the entire purpose of declaring numPeople is to allow iteration through a list by way of a counter variable (i in your case) that is used to access each item through its index (position), i.e. item i of myPeople. There are many instances where this would be very appropriate, but it is slower than letting AppleScript worry about how it accesses list items, which it can take off your hands using this syntax: repeat with myGuy in people
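A minimal sketch of that loop form, for illustration only. It only reads and logs each name, so it changes nothing in the address book; the log line is there just to show that myGuy can be used like any other person reference.

```applescript
tell application "Contacts"
	-- Let AppleScript walk the collection; no counter or index lookup is written by hand.
	repeat with myGuy in people
		-- myGuy is a reference to the current person.
		log (get name of myGuy)
	end repeat
end tell
```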

⚠️ set urlNum to count of personUrls

Redundant, for the same reason as above. As an additional note, I would personally choose to evaluate the size of a list using the length property. This doesn't apply to nested lists of lists for which you want to include deeply-nested items in the final number, but that's not the case here.

As soon as a script has evaluated (retrieved) an object, its properties will have been retrieved as part of that evaluation. length is a property of a list object, and it's a simple, scalar value (an integer), so accessing that value will always be quick. count is a command. It performs some undisclosed operation(s) and returns a value. I don't know what those operations are, and they will be performed at the C-language level, so they probably (almost certainly) aren't slowing this script down at all. But, in principle, it's something to bear in mind, as there are other situations where a command and a property seemingly do the same thing, but the property is demonstrably faster.

Can't recall them right now.
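To make the property/command distinction concrete, here is a small sketch using a plain AppleScript list, so no application is involved. The list contents and variable names are made up for illustration.

```applescript
-- A plain AppleScript list (hypothetical stand-in for personUrls).
set sampleUrls to {"a", "b", "c"}

-- length is a property read from the list object.
set viaProperty to length of sampleUrls

-- count is a command that is sent to the list and returns a number.
set viaCommand to count of sampleUrls

-- Both report the same size for a flat list.
return viaProperty = viaCommand --> true
```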

⭕️ if urlNum > 0 then

Redundant, in principle. There is an else clause that you might insist on keeping, but the only thing it does is log the fact that nothing was done. If someone asked me how to intentionally slow a script down because it's just too efficient, this might be one of my answers.

⚠️ repeat with j from urlNum to 1 by -1

This is flagged both for the use of a counter variable, j, and for its position within the if block. If urlNum were 0, the repeat loop here would never be entered, and the script would simply continue with the code that follows it, so the if test adds nothing. But, as will become clear, the entire repeat block is redundant.

⚠️ log ... (the label of item j of personUrls))
   delete item j of personUrls

I'm questioning the necessity of this log command as a whole. It's certainly not as self-defeating as the one I mentioned earlier, but it does perform a current date command call, and a lookup in the personUrls list object.

  • In situations where you do require a counter variable to iterate through a list, do as you did above, and declare a variable to which you can assign the current list item's value, i.e. set hisURL to item j of personUrls. In crude terms, each time you ask AppleScript for [the value of] item j of..., it must access the list object and perform a lookup, which is a relatively expensive operation. Declaring a variable means the lookup is performed only once; the value (a copy of the original) is then stored in memory, from which retrieval is quick and easy in computational terms.
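The pattern in the bullet above can be sketched with a plain list. The names sampleLabels, thisLabel and seen are hypothetical, and the list stands in for real URL data.

```applescript
set sampleLabels to {"home", "work", "other"} -- hypothetical stand-in data
set seen to {}

repeat with j from (length of sampleLabels) to 1 by -1
	-- Look the item up once and keep it in a variable...
	set thisLabel to item j of sampleLabels
	-- ...then reuse the variable instead of repeating "item j of sampleLabels".
	log thisLabel
	set end of seen to thisLabel
end repeat

return seen --> {"other", "work", "home"}
```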

Returning to the value being logged, its worth seems negated by the immediate deletion of the URL data. I'm wondering if you might have just wanted a means of tracking where your script had reached in its run, which needn't be so involved. Using your counter variables, you could simply: log [j, i] (logging their upper bounds once is sufficient, since those values don't change during a loop).

The delete command, when viewed in the context of the repeat loop in which it is called, is going to be slowing things down a lot. You are iterating through every item in a collection in order to delete it...

⚠️ save

...then you save your changes. How would I intentionally make my script run as slowly as possible? I would perform a save operation on the entire address book once for every URL deleted, i.e. roughly numPeople * urlNum times. With 177 contacts that each carry many duplicate URLs, that is some multiple of 177. The total number of times you need to perform the save operation, I imagine, would be 1.
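As a rough illustration of that arithmetic, the sketch below totals the saves for a made-up set of per-contact URL counts. The numbers are invented for the example and are not taken from the asker's address book.

```applescript
-- Hypothetical counts of matching URLs for four contacts (not real data).
set urlCounts to {12, 0, 40, 7}

-- The original script saves once per deleted URL...
set savesInLoop to 0
repeat with n in urlCounts
	set savesInLoop to savesInLoop + (contents of n)
end repeat

-- ...plus one final save at the end, against the single save that is needed.
return {originalSaves:savesInLoop + 1, neededSaves:1} --> {originalSaves:60, neededSaves:1}
```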

The Knock-On Effects:

Now that we know iterating over personUrls was not necessary, the entire repeat block can be replaced with the line: delete every url of myGuy whose....

  • As something to be mindful of, any script that features nested repeat loops is going to be inefficient: the number of operations performed is a product of each list's size.
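An untested sketch of that intermediate step, keeping the outer loop but dropping the inner one. It assumes Contacts accepts delete on a filtered set of URLs, which is exactly the operation the asker reported trouble with, and it reuses the "outlook" filter from the question.

```applescript
tell application "Contacts"
	repeat with myGuy in people
		-- One delete command per contact instead of one per URL.
		delete (every url of myGuy whose value contains "outlook")
	end repeat
	-- A single save once every contact has been processed.
	save
end tell
```

Because this changes the address book, it would be sensible to try it on an exported copy of the contacts first.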

I actually note that you do mention attempting to delete a contact's URLs en masse, which didn't work for you, forcing you to do it iteratively. However, as you didn't supply any code showing the methods you tried to do mass-deletion, it's not possible to offer insight into why it failed for you.

Removing the repeat block has a cumulative benefit of negating the parent if block, irrespective of my earlier comments on it.

  • Conditionals can be expensive expressions to evaluate, particularly when 177 of them are performed that were never needed.

The Refactored Code:

Continuing back up through the script, the preceding variable declarations all become redundant, which leads to the eventual conclusion that your entire script is functionally equivalent to:

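use application "Contacts"

-- Address every person at once rather than looping over them.
tell (a reference to every person)
    -- Delete, in one command, each URL whose value contains "outlook".
    delete (its urls where the value contains "outlook")
    -- Clear every note.
    set its note to missing value
end tell

-- Save the address book once, after all changes.
save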

System info: AppleScript version: 2.7 System version: 10.13.6

What Now?

Was your ultimate goal to remove all URLs containing "outlook", or did you plan on retaining one "outlook" URL for contacts that have them?