Automatically extract text, layout, and metadata from XML files of OCR-processed historical texts