First, if your BIOS/UEFI does not detect your RAM correctly, your OS won't do any better. There's no need to go any further if your BIOS displays incorrect information about your setup.
=> You probably have at least one hardware problem.
EDIT: From your dmesg | grep memory, it seems that you do in fact have a hardware problem, located in your embedded BIOS. At least Linux has detected it and warns you about it:
WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 13295MB of RAM
It also seems that one of your 4 RAM modules is incorrectly recognised or inserted.
You can either report it to your manufacturer, upgrade your BIOS, or change your motherboard. There is a good chance that with less RAM installed you won't encounter this bug.
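To check a saved dmesg log for this warning and see how much RAM the kernel says it is losing, a minimal sketch could look like the following. The pattern is copied from the single message quoted above, so treating it as stable across kernel versions is an assumption, and lost_ram_mb is a hypothetical helper name.

```python
import re

# Pattern built from the one warning quoted above; other kernel versions
# may word it differently (assumption).
MTRR_WARNING = re.compile(
    r"BIOS bug: CPU MTRRs don't cover all of memory, losing (\d+)MB of RAM"
)


def lost_ram_mb(dmesg_text):
    """Return the number of MB reported as lost, or None if no warning is found."""
    match = MTRR_WARNING.search(dmesg_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    sample = ("WARNING: BIOS bug: CPU MTRRs don't cover all of memory, "
              "losing 13295MB of RAM")
    print(lost_ram_mb(sample))
```

On a live system you would feed it the text of dmesg instead of the sample string.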
As a side note, you may agree with this famous quote from Linus Torvalds about BIOS makers:
BIOS writers are invariably totally incompetent crack-addicted monkeys
Second, once your BIOS agrees with what you really have on your motherboard, you can take a look at /proc/meminfo on Linux. It's often very clear about what your Linux system knows about your memory and what it does with it. Here is what I have on my 64-bit machine with 8 GB of RAM:
$ cat /proc/meminfo
MemTotal: 8175652 kB
MemFree: 5476336 kB
Buffers: 63924 kB
Cached: 1943460 kB
SwapCached: 0 kB
[...]
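If you want these numbers in a script rather than by eye, here is a minimal sketch that parses text in the /proc/meminfo format shown above. It assumes every useful line looks like "Name: value kB"; parse_meminfo and kb_to_gib are hypothetical helper names.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: size in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not "Name: <number> ..." (e.g. the "[...]" elision).
        if not sep or not parts or not parts[0].isdigit():
            continue
        fields[name.strip()] = int(parts[0])
    return fields


def kb_to_gib(kb):
    """Convert a size in kB (as printed by the kernel) to GiB."""
    return kb / (1024 * 1024)


SAMPLE = """\
MemTotal: 8175652 kB
MemFree: 5476336 kB
Buffers: 63924 kB
Cached: 1943460 kB
SwapCached: 0 kB
[...]
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(f"MemTotal: {kb_to_gib(info['MemTotal']):.2f} GiB")
```

On a real machine, replace SAMPLE with open("/proc/meminfo").read(). MemTotal comes out a little under 8 GiB because the kernel keeps part of the RAM for itself.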
As for the boot process and what is used/freed by the Linux kernel, you can grep it from dmesg:
$ dmesg | grep Memory
[ 0.000000] Memory: 8157672k/8904704k available (6138k kernel code, 534168k absent, 212864k reserved, 6896k data, 988k init)
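The numbers on that line add up: available = total - absent - reserved. A small sketch that checks this, assuming the line format shown above (newer kernels print this line differently, so the regular expression is tied to this example):

```python
import re

# Matches the "Memory:" boot line in the format quoted above (assumption:
# this exact layout; other kernel versions use different fields).
MEMORY_LINE = re.compile(
    r"Memory: (?P<available>\d+)k/(?P<total>\d+)k available "
    r"\(.*?(?P<absent>\d+)k absent, (?P<reserved>\d+)k reserved"
)


def parse_memory_line(line):
    """Return the sizes (in kB) found on the line, or None if it does not match."""
    match = MEMORY_LINE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


if __name__ == "__main__":
    line = ("[ 0.000000] Memory: 8157672k/8904704k available (6138k kernel code, "
            "534168k absent, 212864k reserved, 6896k data, 988k init)")
    sizes = parse_memory_line(line)
    computed = sizes["total"] - sizes["absent"] - sizes["reserved"]
    print(computed, computed == sizes["available"])
```

Here 8904704 - 534168 - 212864 = 8157672, which is exactly the "available" figure.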
EDIT: As Gilles said, dmidecode --type memory gives you details about your hardware configuration. It looks like this for a 4x2 GB system:
$ sudo dmidecode --type memory
# dmidecode 2.9
SMBIOS 2.6 present.
Handle 0x0020, DMI type 16, 15 bytes
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: None
Maximum Capacity: 32 GB
Error Information Handle: Not Provided
Number Of Devices: 4
Handle 0x0022, DMI type 17, 28 bytes
Memory Device
Array Handle: 0x0020
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 2048 MB
[...]
[This block is repeated for each module]
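To compare what the firmware reports with what the OS sees, you can add up the "Size:" lines of all Memory Device blocks. A minimal sketch, assuming sizes are printed as "<number> MB" or "<number> GB" as in the output above (module_sizes_mb is a hypothetical helper, and the sample text is a shortened stand-in for real output):

```python
import re

# Only lines that start with "Size:" followed by a number and a unit count;
# anything else (for example a slot without a numeric size) is ignored.
SIZE_LINE = re.compile(r"^\s*Size: (\d+) (MB|GB)\s*$")


def module_sizes_mb(dmidecode_text):
    """Return the size in MB of every memory device that reports a numeric size."""
    sizes = []
    for line in dmidecode_text.splitlines():
        match = SIZE_LINE.match(line)
        if match:
            value, unit = int(match.group(1)), match.group(2)
            sizes.append(value * 1024 if unit == "GB" else value)
    return sizes


if __name__ == "__main__":
    # Four repeated blocks, reduced to the lines that matter here.
    sample = "\n".join(["Memory Device", "Size: 2048 MB"] * 4)
    sizes = module_sizes_mb(sample)
    print(len(sizes), "modules,", sum(sizes), "MB in total")
```

If the total is lower than what you physically installed, the problem is below the OS, as described at the top of this answer.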
I don't have a precise answer, but some of this is familiar. I don't know what a Probe Filter directory is, but CptSupermrkt explained that above.
In a PCI system, the northbridge connects memory and the processor. ECC errors are associated with DRAM. Error Correcting Code bits are stored along with each word: on reads they are checked, on writes they are updated. ECC errors are either correctable or uncorrectable, which indicates whether the error could be corrected using the stored bits. An uncorrectable error does not necessarily indicate a permanent hardware error. These errors can happen when DRAM starts to fail.
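To make "correctable" versus "uncorrectable" concrete, here is a toy sketch of a single-error-correcting, double-error-detecting code: Hamming(7,4) plus one overall parity bit. This is only an illustration with 4 data bits; it is not the code your memory controller uses, and encode/decode are hypothetical names.

```python
def encode(data_bits):
    """Encode 4 data bits into an 8-bit codeword; list index = bit position."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # checks positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 5, 6, 7
    word = [0, p1, p2, d1, p4, d2, d3, d4]
    word[0] = sum(word) % 2  # overall parity bit: total number of 1s is even
    return word


def decode(word):
    """Return (data_bits, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    syndrome = 0
    for position in range(1, 8):
        if word[position]:
            syndrome ^= position  # XOR of set positions is 0 for a valid codeword
    parity_broken = sum(word) % 2 == 1
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        # Odd number of flips: treat as one flipped bit and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        word = list(word)
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Parity looks fine but the syndrome is non-zero: two bits flipped.
        return None, "uncorrectable"
    return [word[3], word[5], word[6], word[7]], status


if __name__ == "__main__":
    stored = encode([1, 0, 1, 1])
    one_flip = list(stored)
    one_flip[5] ^= 1
    two_flips = list(one_flip)
    two_flips[2] ^= 1
    print(decode(stored))
    print(decode(one_flip))
    print(decode(two_flips))
```

One flipped bit is repaired on read, two flipped bits are only detected. Neither case says whether the flip was a one-off event or a failing chip.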
Given all that, this looks like a transient error. You might try a complete memory test, but that's not likely to find anything. If the DRAM has failed, your only corrective action is to replace it.
Best Answer
It appears that there is no surefire way to tell, however various approaches can get you some sort of answer. Apparently you pretty much have to try the different ones until you find one that tells you ECC is working.
In my case memtest86+ 4.20 couldn't be coaxed into realizing it was dealing with ECC RAM; even if I configured it for ECC On, it still reported
ECC: Disabled
on the IMC line. I haven't yet tried with a newer version. However (possibly after installing edac-utils; unfortunately I did both essentially at the same time), Linux reports in the boot logs (interspersed with some other entries):
[...]
which is a pretty good indication. Manually doing
/etc/init.d/edac restart
does not create similar log entries, and looking at an older log from a few reboots ago, I see:
[...]
dmidecode --type memory
also gives two pretty strong indications: the physical memory array's "error correction type" property (which, however, for some reason showed the same on non-ECC RAM, so this may be related to the motherboard's support rather than the memory's capabilities), and each memory device's total width and data width, respectively (the additional bits being those used for the ECC):
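The width comparison described here can be scripted. The sketch below pairs each "Total Width" with the "Data Width" that follows it and flags devices where the total is wider. The sample text is illustrative only (72/64 is the usual layout of an ECC module, 64/64 of a non-ECC one) and is not output from the system discussed here; device_widths is a hypothetical helper.

```python
import re

WIDTH_LINE = re.compile(r"^\s*(Total|Data) Width: (\d+) bits\s*$")


def device_widths(dmidecode_text):
    """Return (total_width, data_width) for each device reporting both as numbers."""
    pairs = []
    total = None
    for line in dmidecode_text.splitlines():
        if line.strip() == "Memory Device":
            total = None  # new block: forget any width left over from the last one
        match = WIDTH_LINE.match(line)
        if not match:
            continue
        kind, bits = match.group(1), int(match.group(2))
        if kind == "Total":
            total = bits
        elif total is not None:
            pairs.append((total, bits))
            total = None
    return pairs


if __name__ == "__main__":
    # Illustrative sample, not taken from a real machine.
    sample = """\
Memory Device
Total Width: 72 bits
Data Width: 64 bits
Memory Device
Total Width: 64 bits
Data Width: 64 bits
"""
    for total, data in device_widths(sample):
        extra = total - data
        print(f"{total}/{data}:", "extra bits for ECC" if extra > 0 else "no extra bits")
```

As with the "error correction type" property, treat this as an indication rather than proof that ECC is actually enabled.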