After understanding your use case better, I think the real answer is: you have no need to REINDEX.
REINDEX recreates the entire index from scratch. But indexes are maintained incrementally anyway: as rows are inserted, deleted, or updated with new key values, the index is updated to reflect each change, while still carrying entries that point at prior row versions.
Because of the way MVCC works, when you delete a row, PostgreSQL can't actually remove the row or its index entry immediately, since other transactions may still have visibility of that row version, so it just marks it deleted. VACUUM runs periodically to garbage-collect dead tuples, but indexes can still retain bloat where dead entries point at old row versions. So frequent updates to indexed columns, or massive deletes, can produce many dead index entries, which lower the key density of index blocks and make index operations less efficient. This is called "index bloat".
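As a rough, hedged check, the statistics collector already tracks an estimate of dead tuples per table, so you can see whether they are piling up before reaching for anything heavier:

```sql
-- n_dead_tup is an estimate maintained by the statistics collector,
-- not an exact count; large, steadily growing values suggest VACUUM
-- isn't keeping up.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```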
You can detect whether you're suffering from it using scripts that inspect the catalog views and estimate bloat (available through the prior link). If you do incur index bloat, periodic REINDEX operations may be called for. In your use case, you have several million rows and, on a monthly basis, delete or modify on the order of a few hundred of them. Over the course of a year, this access pattern is likely to touch only a small fraction of a percent of the total rows, so any resulting index bloat will be minuscule and won't justify the expense of a REINDEX operation.
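If you want a direct measurement rather than an estimate from those scripts, the pgstattuple contrib extension can report leaf density for a given index; a sketch, where my_index is a stand-in for one of your index names:

```sql
-- Requires the pgstattuple contrib extension. pgstatindex() scans the
-- whole index, so run it off-peak on large indexes.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- avg_leaf_density well below the B-tree default fillfactor of 90
-- suggests bloat worth reclaiming.
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('my_index');
```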
First of all, keep in mind that work_mem is allocated per operation, so total memory use can get excessive pretty quickly. In general, if you are not having trouble with slow sorts, I would leave work_mem alone until you need it.
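One way to tell whether sorts are in fact spilling to disk, before touching work_mem at all, is to look at the per-database temp-file counters in the standard statistics views:

```sql
-- temp_files / temp_bytes count data spilled to disk by sorts, hashes,
-- and similar operations since the stats were last reset; steady growth
-- hints that work_mem is too small for some queries.
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_spilled
FROM pg_stat_database
WHERE datname IS NOT NULL;
```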
Looking at your query plans, one thing that strikes me is that the buffer hits look very different between the two plans, and that even the sequential scans are slower. I suspect the issue has to do with read-ahead caching and having less space for it: in effect, you are biasing memory toward re-use of indexes and against reading tables from disk.
My understanding is that PostgreSQL will look to shared_buffers for a page before reading it from disk, because it doesn't really know whether the OS cache contains that page. Because pages then stay in that cache, and because that cache is slower than the OS cache, this changes which sorts of queries are fast and which are slow. In fact, reading the plans, aside from work_mem issues, it looks like all of your query data comes from cache; the question is which cache.
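You can see which cache a query was served from with the BUFFERS option to EXPLAIN: "shared hit" pages came from shared_buffers, while "read" pages came from the OS, whether from its cache or from disk PostgreSQL can't say. The table and predicate here are hypothetical:

```sql
-- Look for "Buffers: shared hit=N read=M" in the output: hits were
-- found in shared_buffers, reads were requested from the OS.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM orders
WHERE created_at > now() - interval '30 days';
```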
work_mem: how much memory we can allocate for a sort or related join operation. This is per operation, not per statement or per backend, so a single complex query can use many times this amount of memory. It isn't clear that you are hitting this limit, but it is worth noting and being aware of. If you increase this too far, you lose memory that might otherwise be available for the read cache and the shared buffers.
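Because it is per operation, a safer pattern than raising it server-wide is to raise it only for the transaction that needs the big sort; a sketch with illustrative table names and values:

```sql
-- SET LOCAL scopes the increase to this transaction, so one reporting
-- query can sort in memory without raising the limit for every backend.
BEGIN;
SET LOCAL work_mem = '256MB';
SELECT customer_id, sum(amount) AS total
FROM orders
GROUP BY customer_id
ORDER BY total DESC;
COMMIT;
```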
shared_buffers: how much memory to allocate to the actual PostgreSQL page queue. Ideally, the interesting set of your database will stay cached here and in the read buffers. However, what this setting does is ensure that the most frequently used information across all backends gets cached and not flushed to disk. On Linux this cache is significantly slower than the OS disk cache, but it offers guarantees the OS disk cache does not, and it is transparent to PostgreSQL. This is pretty clearly where your problem is.
So what happens is that when we have a request, we check the shared buffers first, since PostgreSQL has deep knowledge of that cache, and look for the pages there. If they are not there, we ask the OS to open them from the file; if the OS has cached the result, it returns the cached copy (this is faster than the shared buffers, but PostgreSQL can't tell whether the page is cached or on disk, and disk is much slower, so PostgreSQL typically will not take that chance). Keep in mind this affects random vs. sequential page access as well. So you may get better performance with lower shared_buffers settings.
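If you want to see what is actually occupying shared_buffers while you experiment with its size, the pg_buffercache contrib extension exposes the page queue; a sketch along the lines of the documented example:

```sql
-- Requires the pg_buffercache contrib extension.
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Top relations in the current database by number of 8kB buffers cached.
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database())
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```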
My gut sense is that you probably get better, or at least more consistent, performance in high-concurrency environments with larger shared_buffers settings. Also keep in mind that PostgreSQL grabs this memory and holds onto it, so if you have other things running on the system, the read buffers will also hold files read by other processes. It's a very large and complex topic. Larger shared_buffers settings provide better guarantees of performance, but may deliver less performance in some cases.
There is probably little if any benefit. If all the other indexes are rarely used and this one is used intensively, then maybe there will be. The gold standard is to try it and see. But unless all your scans are index-only scans, the index and the pages of the much larger table it points at will be fighting over shared_buffers, as they certainly won't both fit.
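Before trying it, you can at least confirm how intensively each index is actually used from the statistics views; a sketch, with 'orders' as a hypothetical table name:

```sql
-- idx_scan counts scans of each index since the stats were last reset;
-- rarely-scanned indexes are cheap to evict, heavily-scanned ones are
-- the real contenders for cache space.
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY idx_scan DESC;
```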
Just because something will fit in shared_buffers doesn't mean it will get loaded into shared_buffers (you could force that with pg_prewarm), or that it will then stay there forever (PostgreSQL doesn't provide a way to "nail" things into shared_buffers). And the buffer-eviction algorithm is not all that clever, so don't count on pages staying in shared_buffers just because it seems obvious to you that they should.
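If you do want to load the index right after a restart, pg_prewarm is the contrib module mentioned above; a sketch, with my_index as a stand-in name:

```sql
-- Requires the pg_prewarm contrib extension; returns the number of
-- blocks read into shared_buffers. This is a one-shot load, not a pin:
-- the pages can still be evicted afterwards.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('my_index');
```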
So why worry about it then?