MacOS – Glitches in OS X – memory or ssd

macos, memory, ssd

My laptop is a MacBook Pro 15", Early 2011, with an SSD – OCZ Agility 3 240GB. I also upgraded the memory to 16GB (Corsair, from Amazon).

I experience glitches:
[Screenshot: interface glitches]

Also Skype refuses to launch. I reinstalled it several times. Launched it from different locations.

I have issues with Safari too. Sometimes it fails. Sometimes it doesn't start. I ran Disk Utility, which found file system issues. I fixed them.

So now I'm thinking this comes from the memory or from the SSD. OCZ has no utility to check the drive status. I don't know how to check whether any files are corrupted or changed. I ran a memtest console tool to check the memory when I bought it – it was okay.

So now I'm looking for ways to diagnose where the problem is and fix it.

While testing, I had more chances to capture screenshots of the failure:

When Chrome is in focus:

[Screenshot: Chrome in focus]

When Chrome is not in focus:

[Screenshot: Chrome not in focus]

Why would it work when not in focus? It makes no sense.

One full screen screenshot.

Here are the steps I've taken to try and narrow down the cause:

  1. I have an external HDD, and I use Carbon Copy Cloner to clone my drive to it for backup purposes. I used it as a startup drive and the issues persisted: Skype refuses to start (the process takes 100% CPU), and Safari starts but doesn't work (100% CPU).

  2. I have two graphics cards – Intel HD Graphics 3000 and AMD Radeon HD6750. I have a tool that manually forces the use of only the first one – to save battery. Now I switched to the other one for a little while and the glitches stay the same.

  3. I still have my old RAM: the original 4GB (Hynix) and 8GB (Crucial). I changed back to the 8GB and Skype starts the first time, but if I quit it and start it again we're back to where we were. With the 4GB (which I'm currently using) it didn't even start. So now it looks like the problem might be with the SSD?

  4. I noticed App Store and Software Updates fail to connect to the internet.

  5. I went into Safe Boot (hold Shift during startup) – now everything works – Skype starts every time, Software Update works (I updated Lion to the most up-to-date version). I changed the memory back to the 16GB and tested again in Safe Mode – again everything works. Now everything points to a software problem.

  6. Most of the time Skype starts correctly once. When I restart it, it fails until I reboot. Could Skype be messing up my machine?

  7. Old version of Skype works – this is a temp solution.

  8. I saw a solution where Little Snitch was blocking Software Update and App Store. I stopped Little Snitch and they started working again. So these are fixed. The new Skype still fails (sometimes it works the first time). Safari fails too. The glitches in the interface don't show up anymore. Everything works in Safe Mode.

  9. I'm researching what is different in Safe Mode and cannot find a good answer.

  10. I removed all software that had kext files installed (Virtual Box, Little Snitch, an antivirus).

  11. Did I mention that I ran two free antivirus programs, which failed to find anything?

  12. I disabled all login items for my user (Dropbox, gfxCardStatus, Skype (old version)).

  13. /System/Library/StartupItems is empty, so nothing to do here.

  14. My TimeMachine had a glitch and said it needs to create a new backup like it was just started for a first time. Now I see only the backups for the last day even though the old backups seem to be there (same space used in Time Machine's hdd).

  15. I removed the HWNetMgr, HWPortDetect and StartOuc startup items from /Library/StartupItems. They were from a Huawei 3G modem. A blog post said this would speed up my shutdown time. Now shutdown has gone from 10 seconds down to 1 second.

  16. Startup time seems quite long for a machine like this: about 15 – 20 seconds.

  17. The only remaining item in /Library/StartupItems is Wireshark's ChmodBPF. I removed it too – it didn't help.

  18. I reset the PRAM or whatever that is (Cmd + Option + P + R). Nothing changed obviously.

  19. The laptop experienced two crashes in Safe Mode, which destroys all my previous assumptions that this is a software issue. I'm with the 16GB memory.

  20. I cross-checked all processes running in Safe Mode and in normal mode. These stick out: DashboardClient, iStatLocal, iStatLocalDaemon, kextcache, lsboxd, mdworker, postgres. I'll remove postgres and check what the kext cache is.

  21. Deleted Postgres. The rest of the /Library/LaunchDaemons/ gives

    • com.apple.remotepairtool.plist
    • com.bjango.istatlocaldaemon.plist
    • com.bombich.ccc.plist
    • com.bombich.ccc.scheduledtask.1112C49C-F614-4B14-A06F-0933DCA55B5A.plist
    • com.maintain.CocktailScheduler.plist
    • com.microsoft.office.licensing.helper.plist

  22. /System/Library/LaunchAgents gives a lot of com.apple stuff, the only non-apple things are org.x.startx.plist and org.openbsd.ssh-agent.plist. Full list.

  23. I used plutil to "check" all plist files. /Library/LaunchDaemons/*.plist, /Library/Preferences/*.plist and ~/Library/Preferences/*.plist are fine.

  24. Ran fsck -fy: it reported "Incorrect number of extended attributes" and "Invalid leaf record count". I fixed them. Disk Utility reported nothing.

  25. Ran fsck_hfs -f /dev/disk1s2 on the external drive. It reports errors while the drive is mounted (read-only check) and no errors while it is not mounted. So there are no real errors; I guess it reports them because it cannot read the drive properly while it is mounted.

  26. Google Drive's app starts its process, but there's no window and CPU is at 100%. The same behavior with Chrome (even though there's a window, WebProcess behaves the same way). The same with Skype. What do they have in common? An API?

  27. I launched from the cloned CD and ran fsck_hfs -f on the SSD. It found no errors.

  28. DiskWarrior found a lot of issues and fixed them all.

  29. I ran the Apple Hardware Test (AHT) from the original CD. No errors. I'll run the extensive test tonight. Still nothing.

  30. The external HDD (used for cloning) stopped working. Is this the perfect storm? Now I have to recopy everything. Together with the non-working Time Machine, could this mean that I lose my data? What's going on?

  31. After a restart, the external is somehow fixed. Frustration…

  32. Did an extensive AHT Test (Snow Leopard DVD2). No errors.

  33. Finally found where the logs reside (System Information). Errors:

    • com.skype.skype[868]: DVFreeThread – CFMachPortCreateWithPort
    • com.google.Chrome[0]: DVFreeThread – CFMachPortCreateWithPort
    • com.apple.Safari[477]: objc[484]: Object 0x79e43160 of class __NSCFDictionary autoreleased with no pool in place – just leaking – break on objc_autoreleaseNoPool() to debug

  34. I installed OS X clean on an external drive – all worked – Skype, Safari. All was fine.

  35. Finally progress. I reinstalled OS X Lion on top of the SSD. It turns out that the installer installs on top of the existing system, so I didn't have to migrate anything. Now Safari works, Skype 5.x works, Eclipse works. Interestingly when I started Eclipse it said Java is not installed and it installed it. I didn't know the installer worked this way. I'm using the 8GB ram to make sure all is fine. I'll monitor carefully what is going on. I'll stop writing here until a problem occurs.

  36. Problem again. I updated my machine with Software Update and now it fails again.

  37. I started getting BSODs. http://pastebin.com/k7Yrn4Nk. This was with the 8gb memory.

  38. I switched to the 16gb memory and now all works the old broken way.

  39. I reinstalled. Now Skype works, Safari works, VMware Fusion works, SkyDrive fails. There are a few glitches, but this is the best state my computer has been in for a while. I'm not going to update, because I think that is what does the damage.

  40. SOLUTION (SO FAR): Then again all failed again. At some point even the laptop stopped recognizing the SSD as a drive. Then it started recognizing it again. I was so tired of all this, that at the end I was thinking of buying the new Macbook, but it being not easy to upgrade and actually not being any faster than what I have now made me think twice. I finally reinstalled it from scratch. I thought it would take weeks before I reinstall all the apps I had, but it took 2 hours, which was amazing – I just put all the apps in a list and some custom configurations too. Now it works and I hope it stays that way. I'm with the 16gigs of memory and the ssd drive. I hope this horror will never occur again. It actually looks like it was an ugly software bug maybe caused by something else. I actually don't care and don't want to know. I just don't want to experience it again. I then reinstalled my other macbook too and my iPhone. All of them now work faster and better. I guess OS X (and iOS) reinstallation can be as fulfilling as Windows reinstallation (which is definitely not a compliment). I just hope everything stays working. If it doesn't I'll just buy a new laptop (and I am thinking of a non-apple one).
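The process cross-check from step 20 (Safe Mode versus normal boot) can be sketched in a few lines of Python. This is only an illustration: it assumes each process list was saved as text with one command path per line, roughly what `ps -axo comm` prints. The function names and the sample paths are made up for the example; only the process names come from the list above.

```python
def process_names(ps_output):
    """Collect the unique command names from one-path-per-line ps output."""
    names = set()
    for line in ps_output.splitlines():
        line = line.strip()
        # Skip blank lines and the header row.
        if not line or line in ("COMM", "COMMAND"):
            continue
        # Keep only the last path component, e.g. /usr/sbin/syslogd -> syslogd.
        names.add(line.rsplit("/", 1)[-1])
    return names


def only_in_normal_boot(normal_output, safe_output):
    """Return the names that run in a normal boot but not in Safe Mode."""
    return sorted(process_names(normal_output) - process_names(safe_output))


# Hypothetical sample data; the paths are placeholders, not real install locations.
normal_boot = """COMM
/usr/sbin/syslogd
/usr/sbin/kextcache
/hypothetical/bin/postgres
/hypothetical/path/iStatLocalDaemon
"""
safe_boot = """COMM
/usr/sbin/syslogd
"""

extras = only_in_normal_boot(normal_boot, safe_boot)
print(extras)
```

Anything in the resulting list is a candidate to disable one at a time, which is what steps 21 and onward do by hand.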

As an epilogue: I learned a lot about the system, but then again I bought the MacBook so that I wouldn't have to learn. I bought it for the it-just-works experience and I didn't get it. All of the diagnostics failed. Finally the solution was the Microsoft Windows one – a clean reinstall.

A year later: the laptop had major hardware issues which were unsolvable. Now I'm happy with a new Air.

Best Answer

The best approach to troubleshooting is to isolate the issue and keep good notes on when the issue appears and when it goes away.

Once you also understand how to reproduce the issue (in your case, is the glitch constant or does it come and go?), it is much easier to systematically isolate things.

In your case, switch the RAM to the opposite slots, run with only one stick, then the other. Try to find out whether it's the RAM slot, the motherboard, or the RAM itself. Then you can isolate the SSD by running for a while from an external drive.

Repair technicians are very familiar with this process and, due to the volume of work they do, have a better feel for what fails more often, have the tools to test booting your Mac from a clean OS externally, etc.

So even though you may not be as fast, skilled or familiar with troubleshooting by isolation, you can still use the same methodology to isolate this failure on your Mac.
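As a rough illustration of "keep good notes and change one thing at a time", here is a minimal Python sketch. It assumes you record each test as a configuration plus whether the problem appeared, and it only reports pairs of tests that differ in exactly one factor and in outcome. The factor names and the sample notes are hypothetical, loosely modeled on the question, not a record of what actually happened.

```python
def isolating_pairs(runs):
    """Find pairs of runs that differ in exactly one factor and in outcome.

    runs is a list of (config, failed) tuples, where config is a dict of
    factor name -> value. Returns (factor, passing_value, failing_value) tuples.
    """
    findings = []
    for i, (cfg_a, failed_a) in enumerate(runs):
        for cfg_b, failed_b in runs[i + 1:]:
            # Only pairs with different outcomes and the same set of factors are useful.
            if failed_a == failed_b or cfg_a.keys() != cfg_b.keys():
                continue
            changed = [key for key in cfg_a if cfg_a[key] != cfg_b[key]]
            if len(changed) == 1:
                factor = changed[0]
                good, bad = (cfg_b, cfg_a) if failed_a else (cfg_a, cfg_b)
                findings.append((factor, good[factor], bad[factor]))
    return findings


# Hypothetical notes: True means the problem appeared in that configuration.
notes = [
    ({"ram": "16GB", "boot": "SSD", "mode": "normal"}, True),
    ({"ram": "8GB", "boot": "SSD", "mode": "normal"}, True),
    ({"ram": "16GB", "boot": "SSD", "mode": "safe"}, False),
    ({"ram": "16GB", "boot": "clean external install", "mode": "normal"}, False),
]

findings = isolating_pairs(notes)
for factor, good, bad in findings:
    print(f"{factor}: works with {good!r}, fails with {bad!r}")
```

A pair that differs in two or more factors is ignored on purpose, because it cannot tell you which change mattered. With intermittent failures like the ones in the question, a single pass or fail per configuration is weak evidence, so repeat each test a few times before trusting it.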