Linux – Alternative to xmllint to check xml validity

linux xml

Sometimes I have to validate large XML files against an XSD file.
The biggest XML file I have received was close to 1.5 GB.
xmllint used all my RAM and almost all the swap space, for a total memory usage of 18 GB.
As a result, the validation took 24 hours.

My question: Is there an alternative to xmllint --schema that consumes less memory, perhaps by streaming the document instead of loading the whole file into memory?

Best Answer

I have not tested these validators, but off the top of my head and after a quick search:

  1. XMLStarlet - can be used for other things as well
  2. MSV - the Sun Multi-Schema Validator
  3. HaXml - Haskell XML tools that include command-line utilities (one is a validator)
  4. xsltproc - should also check documents when it loads them

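There are plenty more options, as most XML utilities automatically check the document they are given, xsltproc for example, though that check is often only for well-formedness or against a DTD rather than full XSD validation.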
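
Since MSV is a Java tool, another route in the same spirit is the XSD validator that ships with the JDK (javax.xml.validation). The following is a minimal sketch, not something I have run against a 1.5 GB file: the class name StreamingXsdCheck and the method firstError are made up for illustration. As far as I know, when the validator is given a StreamSource and no result object, it reads the document incrementally instead of building a tree, so memory use should stay far below the file size, although schemas with identity constraints (xs:key, xs:unique) may still need to remember values.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; only the javax.xml.* and org.xml.sax.* types are real JDK API.
public class StreamingXsdCheck {

    /** Returns null if the document is valid, otherwise a description of the first error. */
    public static String firstError(Source xsd, Source xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(xsd);
            Validator validator = schema.newValidator();
            // No ErrorHandler is set, so the first validation error is thrown
            // as a SAXParseException. No Result is passed, so this code never
            // asks for an in-memory tree of the document.
            validator.validate(xml);
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (SAXException | IOException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java StreamingXsdCheck.java schema.xsd document.xml");
            System.exit(2);
        }
        String error = firstError(new StreamSource(new File(args[0])),
                                  new StreamSource(new File(args[1])));
        if (error == null) {
            System.out.println(args[1] + " validates");
        } else {
            System.err.println(args[1] + " fails to validate: " + error);
            System.exit(1);
        }
    }
}
```

With Java 11 or later this should run directly from the source file, for example java StreamingXsdCheck.java schema.xsd big.xml (the file names here are placeholders). It stops at the first error; to collect every error you would install your own ErrorHandler on the validator.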
