Using EtreCheck to solve issues. An awesome troubleshooting tool. Lessons learned.
This is not a question but a lesson for everyone who is troubleshooting a very sticky problem.
Here is the issue, what appeared to trigger it, and how I solved it.
I attached a second display to my iMac using a third-party DisplayPort to HDMI dongle connected to one of the two Thunderbolt ports. As soon as I attached the monitor, I discovered that clicking on menu items would trigger a spinning beachball that would appear for half a second. As long as I stayed in the app the issue would not reappear. Once I switched to another app, the issue would return. In Photoshop, where I spend a good deal of time, brush tools would pause for split seconds at a time.
At first glance, some steps to take might be: try another dongle, try another video cable, disconnect all hardware from the Mac as an isolation step (to see if external hardware causes it), and test in a new user account.
Here are some things I did before I found a fix. Some of it, like reinstalling Mac OS, was only done because I was doing other things anyway and didn't think it would hurt, but a reinstall rarely fixes issues.
- Ran Disk Utility
- Ran all maintenance operations in OnyX (exceptions were made for the Spotlight index, Mail's mailboxes, and disk positions on the desktop)
- Moved all my user account preference files out of ~/Library/Preferences so that new plist files were forced to spawn with default configs
- Started in safe mode
- Even reset PRAM and SMC
- Reinstalled Mac OS in recovery mode (not an erase and install)
- Manually removed all ~/Library/Caches files
- Created a new user account and logged into it, but the issue was still reproducible
So at this point I tried some more things. As you can imagine, this is a very frustrating issue to cope with.
- I used the free AppCleaner to delete any app I had not used for more than a year.
- I moved ALL plist files inside of ~/Library/Preferences to a folder on the desktop
- I manually trashed ~/Library/Caches
- Disconnected external devices
The issue persisted.
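For anyone who would rather script the "move the plist files aside" step than drag them by hand, here is a minimal Python sketch. It is only an illustration, not what was actually run above: the function name `set_aside_plists` and the backup folder name are made up for the example, and it only touches top-level .plist files in the folder you give it. Quit your apps first, and keep the backup folder until you are sure you don't need to move anything back.

```python
import shutil
from pathlib import Path


def set_aside_plists(prefs_dir, backup_dir):
    """Move every top-level .plist file from prefs_dir into backup_dir.

    Returns the new locations so the move can be reviewed or undone.
    (Hypothetical helper for illustration; not part of any tool.)
    """
    prefs_dir = Path(prefs_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for plist in sorted(prefs_dir.glob("*.plist")):
        if not plist.is_file():
            continue
        target = backup_dir / plist.name
        if target.exists():
            # Never overwrite an earlier backup; leave the file where it is.
            continue
        shutil.move(str(plist), str(target))
        moved.append(target)
    return moved
```

On a Mac it could be called as `set_aside_plists(Path.home() / "Library/Preferences", Path.home() / "Desktop/Preferences-set-aside")`, where the Desktop folder name is just an example. Apps then recreate default plist files the next time they launch.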
At this point I thought I should visit the Apple Discussions Forum. Before doing so, knowing that some of the geeks there like to see an EtreCheck report (EtreCheck is a free app), I ran the tool, it generated a report, and I began to read through it.
I cannot tell you how valuable this tool is. It actually helped me solve this issue. How?
After running EtreCheck and looking at the report, files in these locations became suspect
Other file types were also of interest to me: Launch Agents, Launch Daemons, User Launch Agents, User Login Items, Internet Plug-ins, User Internet Plug-ins, Safari Extensions, 3rd party preference panes and so on.
EtreCheck lists all files by name. The conundrum here is that Spotlight will not find some of the files it lists, because Spotlight doesn't search those places, so you will need a tool like "Find Any File" from tempel.org to weed those out. You then might ask, which of the files that EtreCheck lists do I deal with? That will depend, but if you use some common sense, such as extrapolating from the file names, you might be on a good path to a solution. If you are still not sure how to resolve your particular issue, describe the issue here on this forum and attach your EtreCheck report, as it can help someone resolve an otherwise difficult issue.
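To get a rough feel for what sits in those kinds of places without relying on Spotlight, a short Python sketch like the one below can list the non-Apple items in a few standard macOS folders. This is not how EtreCheck works and it covers far less. The folder list is my own assumption of common locations, the function name `list_startup_items` is made up for the example, and skipping names that start with "com.apple." is only a crude filter.

```python
from pathlib import Path

# Common macOS folders for items that load automatically.
# Assumption: this is a partial list, not the set EtreCheck inspects.
STARTUP_DIRS = (
    "/Library/LaunchAgents",
    "/Library/LaunchDaemons",
    "~/Library/LaunchAgents",
    "/Library/Extensions",
    "/Library/Internet Plug-Ins",
    "~/Library/Internet Plug-Ins",
    "/Library/PreferencePanes",
    "~/Library/PreferencePanes",
)


def list_startup_items(dirs=STARTUP_DIRS, skip_prefix="com.apple."):
    """Map each readable folder to its non-hidden, non-Apple item names."""
    found = {}
    for raw in dirs:
        folder = Path(raw).expanduser()
        try:
            names = sorted(
                p.name
                for p in folder.iterdir()
                if not p.name.startswith((".", skip_prefix))
            )
        except OSError:
            # Folder is missing or unreadable; nothing to report for it.
            continue
        if names:
            found[str(folder)] = names
    return found
```

Printing the result of `list_startup_items()` gives a folder-by-folder list of names to look up before deciding whether anything is safe to remove. It only reads directory listings and changes nothing.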
I did eventually post on the Apple Discussions Forum (https://discussions.apple.com/thread/250418646?page=1), but I resolved my own issue before anyone had chimed in: I simply read the EtreCheck report and began to remove items that I understood I no longer used, and the issue was gone.
I did get something very valuable from the discussion on Apple's forum: get an external drive, install a clean OS on it, update it and add nothing to it. When trouble finds you, boot from this volume (partition, USB stick, whatever). If the issue persists, it's clear that there may be a hardware-related issue. That means cables, dongles, hard drives and other devices should all be suspect. Having an external drive with a clean install of Mac OS is a terrific way of quickly isolating issues.
Would love to get your take.
Screenshot of EtreCheck report. The app is free. There are mentions of money, but that is only if you want the EtreCheck team to help you resolve an issue. Generating the report is free. Attach it with your issue and it can really help others help you.
I should add the following bit of info: after going through the report that EtreCheck creates, I took a deep dive and deleted even more of these items. I can tell you that my computer now boots, restarts and shuts down at lightning speed.
I've tried a few of those automated cleanup tools and none of them can compete with this, because none that I've found tackle kext or plist files that load with the system. The devs of those utilities assume that if something is installed, it should be there and should be running.
A couple of takeaways:
- These kext and plist files may be running and consuming CPU cycles (especially of concern for portables, as they can drain the battery).
- On any machine, these can slow things down or just increase heat and power consumption.
I have to say, although I've played around with EtreCheck over the years and have known about it for a long time, this is an extraordinarily powerful tool. It does nothing other than provide information about the system it probes and provide a report that is incredibly useful for troubleshooting issues. If I were a consultant I would absolutely purchase the pro version even if it does nothing other than support the devs.
I for one will be asking folks here to include the EtreCheck report when they have an especially sticky issue that no steps appear to resolve.
Hi guys and gals, just a quick update. Since posting the original message the issue kept returning despite whatever I threw at it. The original issue was an intermittent but persistent millisecond worth of delay that manifested itself in the form of a spinning beachball on High Sierra.
I finally bit the bullet and decided to migrate (manually) to a new user account. Long story short, the issue finally appears to have receded, but I am still carefully observing for any negative changes.
I am still in the process of migrating manually, returning items little by little to the user account and waiting hours before activating more services. So far the experience has yielded a very responsive Mac: startups are faster and apps scream again.
It would be interesting to learn what the actual source of the conflict is. If I find it I will post back.
For whoever is paying attention, read on: the saga continued, but it's finally really solved. BUT, I had to completely wipe my internal drives. One might ask, well, what can I learn here? You wiped everything, so of course everything works. Yes and yes, but there are some caveats.
Firstly, let's recap the main configuration. This iMac actually has 2 SATA3 connectors internally, only one of which was used by Apple as shipped. So years back, I used one of those SATA connectors for a small SSD and a 4TB HDD, leaving the optical drive still in service. I also had an external drive for Time Machine.
So, I took a hammer to the issue and wiped everything, all three drives. First I wiped the Time Machine drive (I felt something fishy might be there; I don't recall why, maybe just some weird instinct). Then I manually backed up my data from the internal HDD to the external drive.
Next I booted into recovery and staged what a Time Machine restore might look like in my case. I liked what I saw, so I wiped the internal HDD, restarted and backed up the SSD to the internal HDD.
So now the internal HDD is my Time Machine drive and the internal SSD holds the system and user accounts.
I then went back into recovery mode, ran Disk Utility on the SSD (see more on this below), then wiped the SSD, followed by a clean OS install and finally a restore using Time Machine from the internal HDD, all in recovery mode.
Why did I have to wipe everything?
Simply put, at this point I wasn't sure what was wrong. The main issue, an intermittent spinning beachball (occurring about 3 times a minute), persisted no matter what I tried.
Note: When I ran Disk Utility on the SSD (APFS), I found a weird error that I looked up using Safari while in recovery mode. It has to do with some overlap of data, which was reported as being present at two addresses. Research told me that the only way to repair this was by erasing the APFS SSD. The disappointment here is that Apple, in my view, released APFS prematurely, without having their programmers work with the documentation writers to publish full developer docs on APFS. As a result no third-party developers had a fix for the issue I had, and not even Disk Utility in High Sierra recovery mode could offer a repair; it only reported it. In fact there was no suggestion on what to do or what it meant, just that an error was present on the drive.
It should also be understood that I installed the DriveDx demo prior to all this and the SSD was reported as like new: it had very little wear and no red flags were present. This is an OWC drive, highly rated, and I trust it still, as I trust DriveDx.
There is no question in my mind that the spinning beachball was a result of this oddball error reported in Disk Utility. How do I know? The issue absolutely disappeared and the machine has now been in use for over 4 days, with several restarts, shutdowns and sleeps. The thing is absolutely perfect.
Here are some issues that completely disappeared.
• After waking the machine, I used to see hundreds of notifications telling me that a directly connected external USB drive had been ejected incorrectly. This still appears if the machine sleeps (this is due to the drive enclosure), but only as a single or a twin notification, and when I dismiss it they all go away at once. I don't have to sit there clicking away for a minute or two to dismiss so many notifications.
• The machine no longer beachballs intermittently, this was the main issue.
Some other Lessons
So it is important to repeat that at this moment, among other external drives, the main configuration is now:
An internal 4TB HDD and a 120GB SSD. The internal HDD is now my Time Machine drive. Data is stored on an external FireWire 800 4TB drive. Why is this worthy of mention?
Firstly, the internal HDD used to house data and my main user account (done through the Users & Groups Advanced Options).
This experience actually made me recognise that this setup was particularly ignorant of me, because the internal HDD was under more pressure: it had to serve all user-based file activity, caching, and so on, which meant more heat inside the computer. This internal HDD is now a Time Machine backup only, so it is triggered only briefly, generating much less heat inside the machine. Less heat is good for the longevity and performance of electronic components. I should have considered this early on, but DriveDx does report that the drive spent only a few hours over 50°C, the drive's threshold. The reason this was rarely exceeded is that I have been running Macs Fan Control for as long as I can remember. iMacs are quiet, but mainly because the SMC sets fans to run at lower RPMs at the cost of more heat. This iMac, anyway, can get rather hot to the touch, so I prefer to keep it cool.
On Time Machine recovery, a little surprise, a concern?
Now, something a little weird but expected, and it kind of ticks me off actually. So what did I migrate inside recovery mode? My user account, apps and really that's it. Now, it turns out that a lot of the Sharing & Permissions set for apps are all over the place, with a lot of "Fetching…" showing under Name, probably due to the old user account being absent. My new user account was not listed with read/write access, which in one case prevented me from installing a plugin. I had to set the permissions manually using the Get Info window. My beef here is why Apple's Time Machine restore process doesn't at a minimum add the new user (501) to the apps' Sharing & Permissions. It seems like an oversight, but I admit that chown and chmod commands, ACLs, permissions and these things are just really frail in my pool of knowledge. I am not a Unix server guy, so no wonder.
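For readers as unsure about chown and chmod as I am, a read-only check is a safer first step than changing anything. The Python sketch below walks a folder and reports every item whose owner is not the user ID you expect. It is an illustration under assumptions: the function name `find_foreign_owned` is made up, it looks at plain Unix ownership only (not ACLs), and an item owned by someone else is not necessarily a problem, since many apps are legitimately owned by root.

```python
import os


def find_foreign_owned(root, expected_uid):
    """Return sorted paths under root whose owner uid differs from expected_uid.

    Read-only: uses lstat, so symlinks are reported, not followed.
    (Hypothetical helper for illustration.)
    """
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                uid = os.lstat(path).st_uid
            except OSError:
                # Item vanished or is unreadable; skip it.
                continue
            if uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)
```

Calling `find_foreign_owned("/Applications/Some.app", os.getuid())` (the app path is a placeholder) lists what the logged-in user does not own inside that app, which can help show whether a failed plugin install is an ownership issue before touching anything in Get Info.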
I hope this report, which I didn't proofread, offers some insight.
Addendum: after writing this I believe I found a screenshot of the error I had on my SSD, only the addresses differed. I am not 100% sure that this was the same error reference though.