USB thumbdrive raw performance
For starters, I'd take a look at the USB thumb drive itself to make sure that it's operating at an acceptable level. You can use the following two commands to measure its performance:
$ hdparm -t /dev/sdb
$ dd count=100 bs=1M if=/dev/zero of=/media/disk/test oflag=sync
Substitute the appropriate device (/dev/sdb above) and mount point (/media/disk above) for your setup.
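If you'd rather script the write test, here is a minimal Python sketch of roughly what the dd command above measures: synchronous writes of zero-filled 1 MiB blocks. The function name measure_sync_write and its parameters are my own, not part of any tool, and it assumes a Linux system where os.O_SYNC is available.

```python
import os
import time


def measure_sync_write(path, count=100, block_size=1024 * 1024):
    """Write `count` zero-filled blocks with O_SYNC and return MiB/s.

    Hypothetical helper, loosely mirroring
    `dd count=100 bs=1M if=/dev/zero of=PATH oflag=sync`.
    """
    block = bytes(block_size)  # zero-filled buffer, like /dev/zero
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    start = time.perf_counter()
    fd = os.open(path, flags, 0o600)
    try:
        for _ in range(count):
            written = 0
            # os.write may write fewer bytes than requested, so loop.
            while written < block_size:
                written += os.write(fd, block[written:])
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (count * block_size) / (1024 * 1024) / elapsed
```

Call it with a path on the mounted drive, e.g. measure_sync_write("/media/disk/test"), and delete the test file afterwards.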
File system
Also take a look at this article on increasing performance of USB thumbdrives from the Ubuntu help site.
Not really an answer, but too long for a comment.
Linux's memory management has been carefully tuned over its long lifetime by some very smart people and it normally does a pretty good job of making the right decision when choosing what to keep in memory and what to drop.
Unfortunately, it looks like your workload isn't very compatible with its decisions :-( Still, I am quite surprised by your report that it is actually frequently preferring to force dirty memory out to swap rather than drop something from cache. A decision between dropping thing A from cache versus dropping thing B from cache might be a toss-up, but a decision between dropping thing C from cache versus writing out dirty memory D to swap should be heavily weighted toward dropping thing C because that's much less expensive!
There is a way to advise Linux that some piece of cached memory won't be needed in the future, and that is the madvise() system call with MADV_DONTNEED, but I think it would be difficult for you to invoke that system call from a MATLAB script...
In any case, I don't think that reducing the size of the file cache is really what you want to do here. Remember that the executable files and libraries and run time files and plugins etc... for MATLAB itself, your script, your GUI environment, and other system software, also all live in the file cache, and you'd be forcing those things out of the file cache (in favour of either other contents of the file cache or in favour of heap memory and other non-file-backed mappings). That will cause your system to become slow just as surely as swapping will.
Best Answer
Potential Method #1 - F_DROP_CACHES
I found a method from 2012 that discusses a proposed patch to the Linux kernel in this mail thread titled: Re: [RFC Patch] fs: implement per-file drop caches.
The thread includes both a testcase and the actual patch to several files within the Linux kernel, which adds an additional function to fs/drop_caches.c called drop_pagecache_file(struct file *filp). This function is then accessible through fcntl.c via the command F_DROP_CACHES. The case for this command calls that function, which handles the dropping of all the caches associated with the given file. The patch also touches include/linux/mm.h.
So this can be used?
I found no evidence that this patch ever made its way into the main Linux kernel code repository, so this option would appear to be available only if you're willing to recompile the Linux kernel yourself.
Potential Method #2 - Using dd
In that same thread, another user mentions a completely different methodology that makes use of dd.
Testing it out
I wasn't 100% positive how to test this out, but I came up with the following approach.
1. Make a 100MB file.
2. Trace file accesses using fatrace.
3. Run top so we can monitor memory usage; note the amount free.
4. Open the file and note the amount of free memory now. Note the fatrace output for the file sample.txt.
5. Drop the file from memory and note the amount of free memory now. Note the output of fatrace.
Example
In terminal #1:
In terminal #2:
In terminal #3:
Now open the file, sample.txt, and note the amount of RAM. In terminal #1:
In terminal #2:
Notice the output of fatrace in terminal #3:
Now remove the file from RAM, in terminal #4:
Note the output of fatrace in terminal #2:
Note the RAM in terminal #3:
So it would seem that all of the memory that was consumed by the file in RAM is freed.
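The "note the amount of free memory" steps can also be scripted rather than read off top. Below is a minimal sketch that assumes a Linux system exposing /proc/meminfo; the helper names (parse_meminfo, read_meminfo, cache_delta_mib) are hypothetical. Other activity on the machine also moves these numbers, so treat any difference as a rough indication only.

```python
def parse_meminfo(text):
    """Return {field: value} from /proc/meminfo-formatted text.

    Values are in kB for the fields that carry a unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def read_meminfo(path="/proc/meminfo"):
    """Take a snapshot of the current memory counters."""
    with open(path) as f:
        return parse_meminfo(f.read())


def cache_delta_mib(before, after):
    """Change in the 'Cached' counter between two snapshots, in MiB."""
    return (after["Cached"] - before["Cached"]) / 1024
```

Take one snapshot with read_meminfo() after reading the file and another after dropping it; for a 100MB file, cache_delta_mib(before, after) should come out near -100 if the pages really were released.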
Potential Method #3 - python-fadvise
Thanks to a comment by @frostchutz, there's another tool, a Python script, named pyadvise, which provides a much simpler interface than the above dd methods. This script makes use of the same posix_fadvise(2) interface.
Example
When I repeated the above test using pyadvise in place of dd, I noticed the same drop in the RAM being consumed as before.
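For reference, the posix_fadvise(2) call can also be made directly from Python's standard library. This is only a sketch of the underlying call, not pyadvise's actual code; the function name drop_file_cache is hypothetical. Keep in mind that POSIX_FADV_DONTNEED is advice: the kernel may keep pages that are still dirty or mapped by another process.

```python
import os


def drop_file_cache(path):
    """Ask the kernel to drop the cached pages of `path` (Linux)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be dropped, so flush them to disk first.
        os.fsync(fd)
        # offset=0 with length=0 means "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The file's contents are untouched; only the cached copy in RAM is affected, so the next read goes back to the disk.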