"It looks like something is updating the same row at both servers with different content and made merge agent crash?" This is handled by merge conflict tables, and would not cause the issues you are describing. These conflict tables are located on the publisher database, and are named like: MSMerge_conflict__.
To answer your question about what reinitialization does: by default, it takes a snapshot of your published articles, drops the articles on the subscriber side, recreates them there, and then bulk loads the data from the snapshot into the subscriber articles. Since this is a production environment and those articles need to stay available on the subscriber side, reinitialization should only be used as a last resort.
What you can do is query the MSrepl_errors table in the distribution database. This will give you a command_id and an xact_seqno, which you can use as inputs to the sys.sp_browsereplcmds stored procedure. That returns the text of the command that is actually failing, so you can better understand the nature of the failure. If a particular row cannot be inserted at the subscriber, you may have to delete the existing row to allow the insert; if a row cannot be deleted, you may have to insert a dummy row to allow the delete.
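As a rough sketch of that lookup, assuming the distribution database has the default name distribution. The sequence number in step 2 is a placeholder, not a real value; substitute the xact_seqno returned by step 1.

```sql
USE distribution;  -- assumption: default distribution database name

-- Step 1: most recent replication errors, with the identifiers needed for step 2.
SELECT TOP (20)
       [time],
       error_code,
       error_text,
       xact_seqno,
       command_id
FROM dbo.MSrepl_errors
ORDER BY [time] DESC;

-- Step 2: list the commands that belong to the failing transaction.
-- Placeholder value below: replace it with the xact_seqno from step 1.
EXEC sys.sp_browsereplcmds
     @xact_seqno_start = N'0x00000000000000000000',
     @xact_seqno_end   = N'0x00000000000000000000';
```

In the output of step 2, match the command_id column against the command_id from step 1 to find the specific command that failed.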
I hope this information helps,
Matt
I now have merge replication set up with 666 table articles in production, and I didn't get any warnings or errors regarding the article limit.
Update:
Starting with SQL Server 2016, the article limit is 2048.
Best Answer
Merge replication supports bidirectional subscriptions, with changes propagated from the subscribers to the publisher.
Other types of replication also support this scenario: Transactional Replication with updatable subscriptions (deprecated) and Peer-to-Peer Transactional Replication (more complicated to maintain, requires Enterprise Edition).
Keep in mind that Merge Replication tracks changes using triggers on each published table (article), which can add substantial overhead to all write operations against those tables. Merge Replication also requires a unique uniqueidentifier column on each article: if you don't have one, it will add it for you (more space needed, heavy fragmentation after adding the column).
Make sure you test performance before going down this road.
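To see in advance which tables would get a new column, something like the following can help. It is a sketch that assumes you run it in the database you plan to publish; it lists user tables that have no column flagged ROWGUIDCOL, which is the property merge replication looks for.

```sql
-- Run in the database you intend to publish.
-- Lists user tables without a ROWGUIDCOL column; merge replication
-- would add a uniqueidentifier column to each of these.
SELECT s.name AS schema_name,
       t.name AS table_name
FROM sys.tables  AS t
JOIN sys.schemas AS s
  ON s.schema_id = t.schema_id
WHERE t.is_ms_shipped = 0
  AND NOT EXISTS (
        SELECT 1
        FROM sys.columns AS c
        WHERE c.object_id = t.object_id
          AND c.is_rowguidcol = 1
      )
ORDER BY s.name, t.name;
```

The larger the tables on that list, the more the extra space and fragmentation are worth measuring in a test environment first.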