Once you add a user to the individual shards, as you indicate you have done for MMS, you must have valid credentials to connect for any purpose, including mongodump. Until you added that user for MMS, the shards were running with authentication enabled but with no users populated, so you were able to connect without credentials. (This only happens when all your users are in the admin database and you are using delegated auth for the other databases; otherwise, with 2.4 and below, you would have at least one shard with users for each database - 2.6+ changes this behavior.)
Essentially, this is a loophole left open so that you don't accidentally lock yourself out of your instances when you turn on auth with no users (and one that would probably have stopped working at some point anyway, as default security is tightened).
The bottom line is that you will need to add a user for use with mongodump, and it's a good idea to do so anyway rather than allowing unauthenticated users free access to your instances. If you are running 2.6 or later, the built-in backup role exists precisely for this purpose; if you are on 2.4 or earlier, the description of that role gives you a good outline of what is needed to back up successfully (in particular, if you want to back up the users themselves).
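As a rough sketch of what that looks like on 2.6+ (the hostname, username, password, and output path below are placeholders, not values from your deployment):

```shell
# Create a user with the built-in backup role in the admin database
# (2.6+ syntax; "backupUser" and the password are placeholders).
mongo admin --eval 'db.createUser({
    user: "backupUser",
    pwd: "CHANGE_ME",
    roles: [ { role: "backup", db: "admin" } ]
})'

# Then supply those credentials to mongodump:
mongodump --host shard1.example.com --port 27017 \
    --username backupUser --password CHANGE_ME \
    --authenticationDatabase admin \
    --out /backups/$(date +%F)
```

On 2.4 you would instead create a user with the individual privileges the backup role's description lists, since the named role is not available there.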
Best Answer
As the page you linked implies, any point-in-time snapshot technique that includes the data files and the journal will suffice; LVM is just one option. EBS snapshots in Amazon EC2 will also work, as will similar snapshot solutions on SAN, NAS, etc. You are not limited to LVM, but it is generally a solution people can implement themselves.
In terms of whether you can copy files to perform a backup, the answer is yes, but only if you stop all writes to the node you are backing up (thereby guaranteeing no changes to the files during the copy). You can do this in a couple of ways:
The most straightforward way is to shut the node down (this should be a secondary), copy the files, then start the node back up and let it catch up to the primary (check the optime using rs.status()). Rinse and repeat, if you wish, to cycle through all the nodes in the set, though the nodes are all identical, so one copy should generally be enough.

The second way (mentioned by sysadmin1138) is to fsync (flush data to disk) and lock (prevent writes) the node but leave it running, using the fsyncLock command (again, this should be a secondary). Once you have completed the copy, you unlock the database using the fsyncUnlock command. There are dangers inherent in this technique - for example (and particularly if you are using authentication), you should always lock and unlock on the same connection, otherwise you risk locking yourself out of the database and having to kill the process to recover.

As for other risks, in either case it is common to use a hidden node for backups, which prevents accidentally attempting reads from the node while it is behind and/or while it is locked (depending on your method).
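The lock-and-copy approach might look something like the following transcript (the hostname and paths are placeholders; the key point is that the lock and unlock happen in the same shell session):

```shell
# Open ONE mongo shell against the secondary and keep it open for the
# whole backup, so fsyncLock and fsyncUnlock run on the same connection.
mongo secondary.example.com/admin
> db.fsyncLock()      // flush data files to disk and block further writes
# ...leave this shell open; copy the dbpath from ANOTHER terminal, e.g.:
#    cp -a /var/lib/mongodb /backups/mongodb-$(date +%F)
> db.fsyncUnlock()    // release the lock on the same connection
```

If the shell holding the lock dies before you unlock, you may have to kill the mongod process to recover, which is exactly the failure mode described above.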
Finally, there is one further option - MMS Backup. This paid service will essentially do all of this for you and gives you extras like point-in-time recovery and more. Note: I work for MongoDB, so I won't give you the hard sell here, but feel free to evaluate it yourself.