This brings back a memory of when I was given the only copy of an XML file that was a large data extract and needed to be restored. Unfortunately, somewhere along the line the XML file had been corrupted in a way that broke most XML parsers of the time.
In the end I had a Perl script that used a regex to extract each top-level element in the XML, which it could then attempt to parse. If the element parsed correctly it was put in a known-good file, and if it didn't parse it was put in its own separate file. Luckily there were only a handful of those invalid XML elements, which I could fix up by hand and then stitch back into the known-good XML file.
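For anyone facing something similar, here is a rough sketch of that split-and-sort idea. It is in Python rather than Perl and is not the original script. The function name `split_records` and the `record` tag are made up for illustration, and it assumes the top-level elements all share one tag name, are never nested inside themselves, and are never self-closing.

```python
import re
import xml.etree.ElementTree as ET


def split_records(raw, tag):
    """Sort the top-level <tag>...</tag> chunks of raw text into (good, bad) lists.

    Hypothetical helper: assumes elements named `tag` are not nested
    inside each other and are not written as self-closing tags.
    """
    t = re.escape(tag)
    # Non-greedy match from an opening <tag ...> to the nearest </tag>.
    # DOTALL lets a single element span multiple lines.
    pattern = re.compile(rf"<{t}\b[^>]*>.*?</{t}\s*>", re.DOTALL)

    good, bad = [], []
    for match in pattern.finditer(raw):
        chunk = match.group(0)
        try:
            # Parse each chunk on its own so one broken element
            # cannot take the rest of the file down with it.
            ET.fromstring(chunk)
        except ET.ParseError:
            bad.append(chunk)
        else:
            good.append(chunk)
    return good, bad
```

The `good` chunks can then be written out between a fresh pair of root tags, and each entry in `bad` saved to its own file for fixing by hand. Chunks that rely on namespace prefixes declared on the original root element would fail to parse in isolation and land in `bad`, so those would need separate handling.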