MCH_SSKPD warning in dmesg

dmesg drm kernel

While I was reading the dmesg log to check that everything was fine, I came across this:

[   18.956187] [drm] Wrong MCH_SSKPD value: 0x16040307
[   18.956190] [drm] This can cause pipe underruns and display issues.
[   18.956192] [drm] Please upgrade your BIOS to fix this.

It does not seem to cause problems on my laptop, but what does this message mean? What can it cause? Where can I read more about MCH_SSKPD?

Best Answer

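Dissecting the acronym, MCH stands for 'Memory Controller Hub', which is Intel's older name for the northbridge. On older chipsets it was a separate chip that sat alongside the I/O controller hub (the southbridge); on newer Intel platforms its functions are integrated into the processor.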

As for SSKPD, I can't find much information beyond what is in various Intel manuals. Here is a snippet from one of them:

SSKPD — Sticky Scratchpad Data Register

This register holds 64 writable bits with no functionality behind them. It is for the convenience of BIOS and graphics drivers.

Unfortunately, this doesn't say much about what the register is actually used for. According to Wikipedia, a scratchpad is a "special high-speed memory circuit used to hold small items of data for rapid retrieval."

Another piece of information is the log from the commit that added the warning:

drm/i915: detect wrong MCH watermark values

Some early bios versions seem to ship with the wrong tuning values for the MCH, possible resulting in pipe underruns under load. Especially on DP outputs this can lead to black screen, since DP really doesn't like an occasional whack from an underrun.

Unfortunately the registers seem to be locked after boot, so the only thing we can do is politely point out issues and suggest a BIOS upgrade.

Arthur Runyan pointed us at this issue while discussion DP bugs - thus far no confirmation from a bug report yet that it helps. But at least some of my machines here have wrong values, so this might be useful in understanding bug reports.

v2: After a bit more discussion with Art and Ben we've decided to only the check the watermark values, since the OREF ones could be be a notch more aggressive on certain machines.

So the value of this register apparently does carry meaning on some processors: judging by the commit message, the BIOS leaves memory-controller watermark tuning values in it, and the driver checks them. I can't find anything online at this time that explains exactly what goes wrong when the value is off, but the commit message gives a good overall idea.
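To make the check described in the commit message more concrete, here is a minimal Python sketch that pulls the reported value out of a dmesg line and compares its low bits against an expected watermark value. The mask `0x3f` and expected value `0xc` are assumptions based on my recollection of the i915 register definitions; they do not come from the quoted commit message, so verify them against your kernel source. The function names are hypothetical and exist only for this illustration.

```python
import re

# ASSUMPTION: these two constants are recalled from the i915 driver's
# register definitions, not taken from the quoted commit message.
# Check them against your own kernel tree before relying on them.
MCH_SSKPD_WM0_MASK = 0x3F
MCH_SSKPD_WM0_VAL = 0x0C

# Matches the warning line shown in the question.
LINE_RE = re.compile(r"Wrong MCH_SSKPD value: (0x[0-9a-fA-F]+)")


def parse_sskpd(dmesg_text):
    """Return the MCH_SSKPD value reported in dmesg output, or None."""
    match = LINE_RE.search(dmesg_text)
    return int(match.group(1), 16) if match else None


def wm0_field(value, mask=MCH_SSKPD_WM0_MASK):
    """Extract the (assumed) watermark field from the register value."""
    return value & mask


def wm0_is_expected(value):
    """True if the watermark field equals the (assumed) expected value."""
    return wm0_field(value) == MCH_SSKPD_WM0_VAL


if __name__ == "__main__":
    sample = "[   18.956187] [drm] Wrong MCH_SSKPD value: 0x16040307"
    value = parse_sskpd(sample)
    print(hex(value), hex(wm0_field(value)), wm0_is_expected(value))
```

Under those assumptions, the value from the question has `0x07` in the masked field rather than `0x0c`, which would be consistent with the warning being printed.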

If you really want to dig further, you could email the author of the commit or one of the people who reviewed it.
