Have you ever read scientific papers on an ebook reader? If so, you may have found it difficult. Unlike text-intensive documents, scientific papers are usually printed in a two-column format and typically contain tables and figures. If a table or figure appears in a different column from the text that refers to it, even on the same page, the reader has to scroll up and down to view it.
The author of this paper has developed a tool that reformats scientific papers into a single column of integrated text and graphics so that ebook users can read them more easily. The tool also allows the user to annotate the reformatted paper. While ebook readers have already been introduced for some products [1,2,3], they are primarily intended for text-intensive documents. This paper focuses specifically on the layout of figures and tables in scientific papers.
The proposed approach first analyzes the page layout and then rearranges the contents of the document into an equivalent single-column document. In the first step, it computes font differences using two median values: the median of black horizontal runs (MBR) and the median character height (MCH). The author also introduces a desired height, which sometimes results in annotations being repositioned incorrectly. Incorporating an adaptive framework for the height could solve this problem. The author tested the approach using only papers published in the Proceedings of DocEng ’10; verifying its effectiveness will require analyzing various types of papers. Furthermore, summarization based on the context of the scientific paper and search based on annotations would make the tool much more user-friendly.
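To make the two median values more concrete, here is a minimal Python sketch of how MBR and MCH might be computed. It is an illustration, not the paper's implementation: this review does not say how the values are measured, so the sketch assumes a binarized page given as rows of 0/1 pixels (1 = black) and character bounding boxes given as (top, bottom) pairs. The function names are hypothetical.

```python
from statistics import median


def black_run_lengths(row):
    """Return the lengths of consecutive runs of black pixels (1s) in one row."""
    runs, current = [], 0
    for pixel in row:
        if pixel:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # a run that reaches the right edge of the row
        runs.append(current)
    return runs


def median_black_run(image):
    """MBR (assumed definition): median length of all black horizontal runs."""
    runs = [r for row in image for r in black_run_lengths(row)]
    return median(runs) if runs else 0


def median_char_height(boxes):
    """MCH (assumed definition): median height of character bounding boxes."""
    heights = [bottom - top for (top, bottom) in boxes]
    return median(heights) if heights else 0


# Toy input: a 3-row binary image and three character boxes.
page = [
    [0, 1, 1, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
boxes = [(0, 10), (2, 14), (5, 13)]

mbr = median_black_run(page)
mch = median_char_height(boxes)
```

Medians are a natural choice here because they are robust to outliers such as long rules, figure strokes, or oversized heading characters, which would skew a mean.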