Linux – 64-bit Linux doesn’t recognize the RAM between 3 and 32 GB

64bitlinuxram

My problems were caused by a faulty memory module and quite possibly a broken kernel binary.


I just now booted my PC with basically brand new hardware. I've been running Debian 6.0 AMD64 before, and no change there (literally; I just unplugged the hard disks from the old motherboard and reconnected them to the new one), but found something curious:

  • I have physically installed 4 x 8 GB of RAM
  • UEFI/BIOS setup reports 16383 MB of RAM
  • Linux free -m reports 2985 MB of RAM

2985 MB seems too close to the magical 3 GB mark for it to be purely coincidence, but uname -r prints 2.6.32-5-amd64; clearly a 64-bit kernel, which is all that has ever been installed on the system drive I'm using. The new motherboard is an Asus M5A97 Pro, which has four DDR3 slots supposedly supporting 8 GB modules. The memory modules themselves are identical, four Corsair XMS3 PC12800 8 GB, purchased together.

I haven't looked around the UEFI setup in detail, but did browse through it and saw nothing that seemed like it would need changing to enable large amounts of RAM.

Edit: Further confirmation that I really am running 64-bit:

# file `which free`
/usr/bin/free: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
#

What's up with this, and what can I do about it?

Edit 2: dmesg, dmidecode and meminfo, as requested. I don't have physical access to the system right now, so will have to wait until tonight to pull out some modules and see what that does. (Note that dmidecode reports 3 x 8GB plus one empty DIMM slot. Also note the MTRR mismatch message from the kernel, leading to a loss of 13 GB, which at least adds up with what the motherboard itself is reporting.)

# dmidecode --type memory
# dmidecode 2.9
SMBIOS 2.7 present.

Handle 0x0026, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Multi-bit ECC
        Maximum Capacity: 32 GB
        Error Information Handle: Not Provided
        Number Of Devices: 4

Handle 0x0028, DMI type 17, 34 bytes
Memory Device
        Array Handle: 0x0026
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 8192 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM0
        Bank Locator: BANK0
        Type: <OUT OF SPEC>
        Type Detail: Synchronous
        Speed: 1333 MHz (0.8 ns)
        Manufacturer: Manufacturer0
        Serial Number: SerNum0
        Asset Tag: AssetTagNum0
        Part Number: Array1_PartNumber0

Handle 0x002A, DMI type 17, 34 bytes
Memory Device
        Array Handle: 0x0026
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 8192 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM1
        Bank Locator: BANK1
        Type: <OUT OF SPEC>
        Type Detail: Synchronous
        Speed: 1333 MHz (0.8 ns)
        Manufacturer: Manufacturer1
        Serial Number: SerNum1
        Asset Tag: AssetTagNum1
        Part Number: Array1_PartNumber1

Handle 0x002C, DMI type 17, 34 bytes
Memory Device
        Array Handle: 0x0026
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 8192 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM2
        Bank Locator: BANK2
        Type: <OUT OF SPEC>
        Type Detail: Synchronous
        Speed: 1333 MHz (0.8 ns)
        Manufacturer: Manufacturer2
        Serial Number: SerNum2
        Asset Tag: AssetTagNum2
        Part Number: Array1_PartNumber2

Handle 0x002E, DMI type 17, 34 bytes
Memory Device
        Array Handle: 0x0026
        Error Information Handle: Not Provided
        Total Width: Unknown
        Data Width: 64 bits
        Size: No Module Installed
        Form Factor: DIMM
        Set: None
        Locator: DIMM3
        Bank Locator: BANK3
        Type: Unknown
        Type Detail: Synchronous
        Speed: Unknown
        Manufacturer: Manufacturer3
        Serial Number: SerNum3
        Asset Tag: AssetTagNum3
        Part Number: Array1_PartNumber3
#
======================================================================
# cat /proc/meminfo
MemTotal:        3056820 kB
MemFree:         1470820 kB
Buffers:          390204 kB
Cached:           194660 kB
SwapCached:            0 kB
Active:           488024 kB
Inactive:         419096 kB
Active(anon):     231112 kB
Inactive(anon):    96660 kB
Active(file):     256912 kB
Inactive(file):   322436 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 8 kB
Writeback:             0 kB
AnonPages:        322320 kB
Mapped:            33012 kB
Shmem:              5472 kB
Slab:             613952 kB
SReclaimable:     597404 kB
SUnreclaim:        16548 kB
KernelStack:        2384 kB
PageTables:        19472 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1528408 kB
Committed_AS:     621464 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      294484 kB
VmallocChunk:   34359429080 kB
HardwareCorrupted:     0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        9216 kB
DirectMap2M:     2054144 kB
DirectMap1G:     1048576 kB
#
======================================================================
# dmesg | grep -i memory
[    0.000000] WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 13295MB of RAM.
[    0.000000] WARNING: at /tmp/buildd/linux-2.6-2.6.32/debian/build/source_amd64_none/arch/x86/kernel/cpu/mtrr/cleanup.c:1092 mtrr_trim_uncached_memory+0x2e6/0x311()
[    0.000000]  [<ffffffff814f7f1e>] ? mtrr_trim_uncached_memory+0x2e6/0x311
[    0.000000]  [<ffffffff814f7f1e>] ? mtrr_trim_uncached_memory+0x2e6/0x311
[    0.000000]  [<ffffffff814f7f1e>] ? mtrr_trim_uncached_memory+0x2e6/0x311
[    0.000000] initial memory mapped : 0 - 20000000
[    0.000000] init_memory_mapping: 0000000000000000-00000000bdf00000
[    0.000000] PM: Registered nosave memory: 000000000009d000 - 000000000009e000
[    0.000000] PM: Registered nosave memory: 000000000009e000 - 00000000000a0000
[    0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
[    0.000000] PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
[    0.000000] PM: Registered nosave memory: 00000000bd94d000 - 00000000bd99c000
[    0.000000] PM: Registered nosave memory: 00000000bd99c000 - 00000000bd9a6000
[    0.000000] PM: Registered nosave memory: 00000000bd9a6000 - 00000000bdade000
[    0.000000] PM: Registered nosave memory: 00000000bdade000 - 00000000bdaef000
[    0.000000] PM: Registered nosave memory: 00000000bdaef000 - 00000000bdb02000
[    0.000000] PM: Registered nosave memory: 00000000bdb02000 - 00000000bdb04000
[    0.000000] PM: Registered nosave memory: 00000000bdb04000 - 00000000bdb0d000
[    0.000000] PM: Registered nosave memory: 00000000bdb0d000 - 00000000bdb13000
[    0.000000] PM: Registered nosave memory: 00000000bdb13000 - 00000000bdb75000
[    0.000000] PM: Registered nosave memory: 00000000bdb75000 - 00000000bdd78000
[    0.000000] Memory: 3046732k/3111936k available (3075k kernel code, 4728k absent, 60476k reserved, 1879k data, 584k init)
[    1.636730] Freeing initrd memory: 9501k freed
[    1.647370] Freeing unused kernel memory: 584k freed
[    4.876602] [TTM] Zone  kernel: Available graphics memory: 1528410 kiB.
[    4.876615] [drm] radeon: 256M of VRAM memory ready
[    4.876617] [drm] radeon: 512M of GTT memory ready.
[   25.571018] VBoxDrv: dbg - g_abExecMemory=ffffffffa051d6c0
#

Grepping for e820 shows a bunch of ranges, topping out with e820 update range: 00000000bdf00000 - 000000043f000000 (usable) ==> (reserved). 43f000000 is 16 GiB, bdf00000 is 3039 MiB. I do not see that being coincidental.

# dmesg | grep -i e820
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009d800 (usable)
[    0.000000]  BIOS-e820: 000000000009d800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000bd94d000 (usable)
[    0.000000]  BIOS-e820: 00000000bd94d000 - 00000000bd99c000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000bd99c000 - 00000000bd9a6000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000bd9a6000 - 00000000bdade000 (reserved)
[    0.000000]  BIOS-e820: 00000000bdade000 - 00000000bdaef000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000bdaef000 - 00000000bdb02000 (reserved)
[    0.000000]  BIOS-e820: 00000000bdb02000 - 00000000bdb04000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000bdb04000 - 00000000bdb0d000 (reserved)
[    0.000000]  BIOS-e820: 00000000bdb0d000 - 00000000bdb13000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000bdb13000 - 00000000bdb75000 (reserved)
[    0.000000]  BIOS-e820: 00000000bdb75000 - 00000000bdd78000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000bdd78000 - 00000000bdf00000 (usable)
[    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec10000 - 00000000fec11000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec20000 - 00000000fec21000 (reserved)
[    0.000000]  BIOS-e820: 00000000fed00000 - 00000000fed01000 (reserved)
[    0.000000]  BIOS-e820: 00000000fed61000 - 00000000fed71000 (reserved)
[    0.000000]  BIOS-e820: 00000000fed80000 - 00000000fed90000 (reserved)
[    0.000000]  BIOS-e820: 00000000fef00000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100001000 - 000000043f000000 (usable)
[    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
[    0.000000] e820 update range: 00000000bdf00000 - 000000043f000000 (usable) ==> (reserved)
[    0.000000] update e820 for mtrr
# 

EDIT 3/4 — partial success:

  • Upgrading the UEFI BIOS from version 0705 x64 08/23/2011 to 1007 02/10/2012 did not help: the exact same problem remained.
  • Removing one DIMM module (I took a lucky guess at which slot was #4: the one farthest from the CPU) allowed the BIOS to detect and use the remaining 24 GB, although a three-DIMM configuration is not "recommended" according to the diagram in the user's manual. Notably, seating one of the remaining DIMMs in slot #4 still allowed it to be used, so the slot is fine. Reseating the "original" DIMM into that slot dropped me back at my starting point.
  • Booting from the Debian 6.0.3 AMD64 installation CD into a rescue environment and checking its dmesg output shows no similar MTRR errors. Also, in that environment, with 3 x 8GB installed, 24 GB (plus or minus epsilon times pi or thereabouts; I didn't do the exact math) shows up as usable according to free.
  • Upgrading/reinstalling the kernel (there was a minor upgrade available) seems to have fixed the MTRR issues as well. dmesg now reports 26198016 KB total, and no MTRR errors, which is in line with what I would expect with 3 x 8GB installed. free -m now reports 24114 MB total RAM, which quite frankly is close enough for me.

This smells like a barfed DIMM, plus a kernel that for whatever reason was damaged; that latter may have happened during the power outage (though I must say that's an odd way for the kernel to break!). The non-working DIMM will go back to the reseller as soon as I talk to them (hopefully tomorrow).

(hopefully) FINAL EDIT

I RMA'd one of the two pairs of DIMMs, it was accepted by the reseller as damaged and they sent me a new pair, which seems to work just fine. So I'm now basically at where I originally intended nearly a month ago (although a large fraction of that time was not really due to the reseller), with 32 GB RAM usable; free -m reports 32194 MB total memory, and the kernel reports 34586624k RAM on initialization, both of which are well in line with my expectations.

Best Answer

First, if your BIOS/UEFI does not detect correctly your RAM, then your OS won't do any better. There's no need to go any further if your BIOS display incorrect information about your setup.

=> You probably have at least an hardware problem.

EDIT: From your dmesg | grep memory, it seems that you have in fact an hardware problem, located in your embedded bios. At least, Linux has detected it and warns you about it : WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 13295MB of RAM. It also seems that one of your 4 ram module is incorrectly recognised or inserted.

You can either report it to your manufacturer, upgrade your bios and change your motherboard. There's many chance that with less RAM, you won't encounter this bug.

As a side note, you may agree with this famous quote from Linus Torvalds about BIOS makers :

BIOS writers are invariably totally incompetent crack-addicted monkeys

Second, when your BIOS is OK with what you really have on your motherboard, you can take a look on Linux at /proc/meminfo. It's often very clear about what your linux system know and do with your memory. Here is what I have on my 64bit / 8 Gb of RAM :

$ cat /proc/meminfo 
MemTotal:        8175652 kB
MemFree:         5476336 kB
Buffers:           63924 kB
Cached:          1943460 kB
SwapCached:            0 kB
[...]

About the boot process and what is used/freed by linux kernel, you can grep it from dmesg :

$ dmesg | grep Memory
[    0.000000] Memory: 8157672k/8904704k available (6138k kernel code, 534168k absent, 212864k reserved, 6896k data, 988k init)

EDIT : As Gilles said, with dmidecode --type memory, you can have details about your hardware configuration. It looks like this for a 4x2Gb system :

$ sudo dmidecode --type memory
# dmidecode 2.9
SMBIOS 2.6 present.

Handle 0x0020, DMI type 16, 15 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: None
    Maximum Capacity: 32 GB
    Error Information Handle: Not Provided
    Number Of Devices: 4

Handle 0x0022, DMI type 17, 28 bytes
Memory Device
    Array Handle: 0x0020
    Error Information Handle: Not Provided
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 2048 MB
    [...]
[This block is repeated for each module]
Related Question