Screenshot of exemplar machine-readable, NeXML-formatted output from our automated analysis of figure 1 of Park et al. 2008. Note that the genus, species, strain, and GenBank accession numbers are semantically distinguished where detected. Heuristic post-OCR autocorrections are also noted where they have been applied (e.g. the conversion of the letter 'Z' to the digit '2' in many GenBank accession numbers). A machine-readable version of this file is supplied as supplementary material (Suppl. material 2).
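
The kind of post-OCR autocorrection mentioned above can be illustrated with a minimal Python sketch. This is not the pipeline's actual code: the function name `autocorrect_accession`, the substitution table, and the assumed accession shapes (one letter plus five digits, or two letters plus six digits) are all hypothetical choices made for illustration. Only the 'Z' to '2' substitution comes from the caption itself.

```python
# Hypothetical sketch of a post-OCR autocorrection heuristic.
# Only the 'Z' -> '2' substitution is taken from the caption; the table
# could be extended, but any further entries would be assumptions.
OCR_DIGIT_FIXES = {"Z": "2"}

# Assumed accession shapes: (number of leading letters, number of digits).
ACCESSION_SHAPES = ((1, 5), (2, 6))


def autocorrect_accession(token):
    """Return (corrected_token, was_changed) for a single OCR'd token.

    A token is corrected only if, after substituting look-alike letters
    in its numeric part, it fits one of the assumed accession shapes.
    """
    for n_letters, n_digits in ACCESSION_SHAPES:
        if len(token) != n_letters + n_digits:
            continue
        prefix, tail = token[:n_letters], token[n_letters:]
        # The prefix must already be upper-case letters; it is never altered.
        if not (prefix.isalpha() and prefix.isupper()):
            continue
        # Substitute look-alike letters in the numeric part only.
        fixed_tail = "".join(OCR_DIGIT_FIXES.get(ch, ch) for ch in tail)
        if fixed_tail.isdigit():
            corrected = prefix + fixed_tail
            return corrected, corrected != token
    # The token does not look like an accession number; leave it untouched.
    return token, False
```

Returning a `was_changed` flag alongside the corrected string allows each applied correction to be recorded in the output, in the spirit of the noted autocorrections shown in the screenshot. Restricting substitutions to the numeric part avoids altering ordinary words, such as genus names, that happen to contain a 'Z'.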

 
  Part of: Mounce R, Murray-Rust P, Wills M (2017) A machine-compiled microbial supertree from figure-mining thousands of papers. Research Ideas and Outcomes 3: e13589. https://doi.org/10.3897/rio.3.e13589