Cafebr - Citation Amender/Formatter for Biological Research

A reference list is an essential part of a manuscript for an academic article. When a manuscript is declined by one journal and submitted to another, it is often necessary to reformat a preformatted reference list (i.e., to change the order of pieces of article information such as authors, article title, publication year, and the journal where the article was published). EndNote, Zotero and Mendeley are examples of sophisticated reference management programs that help generate a reference list with fewer errors. However, their multifunctionality seems to have complicated the process of generating a reference list: they require many selection steps to obtain a final output, and even small changes in the output require manually editing a file outside the execution program.


Introduction
A reference list is an essential part of a manuscript for an academic article. Different journals require different formats for a reference list. As a result, it is often necessary to reformat a preformatted reference list if a manuscript is declined by one journal and submitted to another. Formatting a reference list manually is error-prone and time-consuming. To quickly generate a reference list with fewer errors, reference management software packages have been developed. EndNote, which was developed by Clarivate Analytics (previously Thomson Reuters), is an example of such packages. It allows users to search for articles of interest, to deposit them in a database, and to output information about selected articles in a style chosen from more than 2000 preset styles (https://endnote.com/product-details). However, it is a commercial, non-free program, and is incompatible with Linux operating systems. Zotero (developed by the Center for History and New Media at George Mason University in 2006; https://www.zotero.org) and Mendeley (developed by Mendeley Ltd., which was purchased by Elsevier; https://www.mendeley.com) are examples of free reference management programs that provide functions similar to those of EndNote and can run on Windows, Mac OS X, and Linux. All of these programs have many functions and user-friendly interfaces, and are still being improved. However, such multifunctionality seems to have come at the cost of simplicity: many selection steps (i.e., cursor movements and clicks) are required to obtain a final output, and even small changes in the output require users to look into and edit a file outside the main execution file. To simplify the process of generating a reference list for biological research, the author developed a new program, Cafebr (Citation Amender/Formatter for Biological Research).

Project description
Title: Cafebr - Citation Amender/Formatter for Biological Research
Study area description: Cafebr was developed to generate a reference list more simply. It uses only PubMed (managed by The United States National Library of Medicine (NLM) at the National Institutes of Health (NIH), https://www.ncbi.nlm.nih.gov/pubmed) as a search engine to find articles of interest, and is thus intended primarily for biological research. However, once a list (or database) of articles is provided, Cafebr can work on it regardless of whether the articles are on biological research or not.
Design description: Cafebr was originally written in Perl, and most of its functions were copied to a GUI version written in HTML/JavaScript. This GUI version, as an HTML file, runs in a web browser on any platform. To obtain an output reference list simply and flexibly, Cafebr focuses its functions on finding and replacing certain patterns of words rather than on managing (i.e., storing and displaying) articles.
To find articles, Cafebr can search the MEDLINE database with the PubMed search engine (managed by The United States National Library of Medicine (NLM) at the National Institutes of Health (NIH), https://www.ncbi.nlm.nih.gov/pubmed), which covers articles in a wide variety of biology fields and is therefore most commonly used for biological research. Cafebr then extracts pieces of information on articles from the search results, and displays them in a table for selecting articles for further processing (Fig. 1, upper part). Alternatively, a user can provide a list (or any text) of articles to Cafebr. Cafebr can precisely extract pieces of information from text in the PubMed XML format, the PubMed abstract (text) format, or the Cafebr database format, in which one line corresponds to one article record consisting of 12 tab-delimited data fields (Authors, Article Title, Publication Year, Journal name, Volume, Issue, Pages, PubMed ID (PMID), PubMed Central ID (PMCID), DOI, Attributes, Author Information (affiliation etc.)). Text in none of these formats can also be separated into pieces (fields) of information if delimiters are designated. For example, the reference "Tsugama D (2018) Cafebr development." can be separated into the three fields "Tsugama D", "2018" and "Cafebr development" if the delimiting pattern "F1 (F2) F3." is designated. In this "Delimiter" option, it is also possible to use a journal name as a delimiter if the journal name is available in PubMed. All journal names are listed in the program (cafebr.html) file itself. The journal names are major contributors to the large (~2-MB) file size of cafebr.html, but they allow it to be stand-alone (Fig. 2, upper part).
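As a rough illustration of the two text-splitting ideas described above, the following Python sketch parses one line in the 12-field tab-delimited layout and separates a free-form reference with a delimiting pattern such as "F1 (F2) F3.". It is not Cafebr's own code (Cafebr is written in Perl and HTML/JavaScript); the function names, the dictionary keys, and the choice to translate the pattern into a regular expression with non-greedy groups are all assumptions made for this example.

```python
import re

# Field order follows the 12 fields listed in the text; the key names
# themselves are hypothetical and chosen only for this sketch.
DB_FIELDS = [
    "authors", "title", "year", "journal", "volume", "issue", "pages",
    "pmid", "pmcid", "doi", "attributes", "author_info",
]


def parse_db_line(line):
    """Split one tab-delimited record line into a dict of the 12 fields."""
    values = line.rstrip("\n").split("\t")
    # Pad short lines so that every field name gets a (possibly empty) value.
    values += [""] * (len(DB_FIELDS) - len(values))
    return dict(zip(DB_FIELDS, values))


def pattern_to_regex(pattern):
    """Translate a delimiting pattern such as 'F1 (F2) F3.' into a regex.

    Each placeholder F<n> becomes a non-greedy named group; every other
    character in the pattern is matched literally.
    """
    regex = ""
    for part in re.split(r"(F\d+)", pattern):
        if re.fullmatch(r"F\d+", part):
            regex += f"(?P<{part}>.+?)"
        else:
            regex += re.escape(part)
    return re.compile("^" + regex + "$")


def split_reference(reference, pattern):
    """Return a dict of placeholder -> field text, or None if no match."""
    match = pattern_to_regex(pattern).match(reference.strip())
    return match.groupdict() if match else None


# The example from the text.
print(split_reference("Tsugama D (2018) Cafebr development.", "F1 (F2) F3."))
# {'F1': 'Tsugama D', 'F2': '2018', 'F3': 'Cafebr development'}
```

Because the groups are non-greedy, each field ends at the first place where the following literal delimiter fits, so a delimiter that also occurs inside a field (for example a parenthesis in an article title) can split the reference in the wrong place. How Cafebr itself resolves such cases is not described here.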
The extracted pieces of information, or fields, are formatted to generate a final reference list. Currently only four preset formats are available. However, all of them can be directly edited on the user interface, allowing users to change output formats quickly and flexibly (Fig. 1 and Fig. 2, lower parts). The article ordering options include not only those to order by author names, publication years and PubMed ID but also the "As in manuscript" option. This option outputs references in the order in which they are cited in the manuscript, and converts citations such as "(Tsugama et al., 2018)" and "(Mike et al., 2014)" to "[Ref1]" and "[Ref2]" in the manuscript. Some of the preset formats add HTML tags to specific fields of article information to display them in the italic, bold, or superscript style in a browser. Microsoft Word, which would usually be used for the final manual formatting of a whole manuscript, can maintain italic, bold and superscript styles if words in these styles are copied and pasted with the "Match Destination Formatting" option. All of these functions of Cafebr are available on its website (either http://stdtgm.itigo.jp/cafebr/cafebr.html (main) or http://studtsugama.s1006.xrea.com/cafebr/cafebr.xhtm (backup)) with the aid of a CGI program (cafebr.cgi or cafebr.xcg, respectively). A stand-alone version of Cafebr is available at these websites and Zenodo (10.5281/zenodo.1404887).
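The formatting step and the "As in manuscript" option can be pictured with a short Python sketch. It is only an illustration under stated assumptions, not Cafebr's implementation: the template syntax with curly braces, the function names, the dictionary keyed by citation text, and the placeholder manuscript and reference strings are all hypothetical.

```python
import re


def format_reference(record, template):
    """Fill an editable template, e.g. '{authors} ({year}) {title}.'."""
    return template.format(**record)


def renumber_citations(manuscript, references):
    """Replace author-year citations with [RefN] in order of first citation.

    `references` maps citation text such as 'Tsugama et al., 2018' to the
    formatted reference. Returns the rewritten manuscript and the list of
    references ordered as they are cited.
    """
    order = []

    def replace(match):
        key = match.group(1)
        if key not in references:
            return match.group(0)  # leave other parentheticals untouched
        if key not in order:
            order.append(key)
        return f"[Ref{order.index(key) + 1}]"

    text = re.sub(r"\(([^()]+)\)", replace, manuscript)
    return text, [references[key] for key in order]


# A template may carry HTML tags so that a browser shows a field in italics.
record = {"authors": "Tsugama D", "year": "2018", "title": "Cafebr development"}
print(format_reference(record, "{authors} ({year}) <i>{title}</i>."))

# Placeholder manuscript text and reference strings, for illustration only.
refs = {
    "Tsugama et al., 2018": "reference text for Tsugama et al., 2018",
    "Mike et al., 2014": "reference text for Mike et al., 2014",
}
text, ordered = renumber_citations(
    "First claim (Tsugama et al., 2018). Second claim (Mike et al., 2014).",
    refs,
)
print(text)
print(ordered)
```

In this sketch a citation that appears more than once keeps the number it received at its first occurrence, and parenthetical text that is not a known citation, such as "(Fig. 1)", is left unchanged.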

Figure 1. Example of generation of a reference list from a PubMed search on Cafebr.

Figure 2. Example of use of the "Delimiter" option on Cafebr.