Bibtex and Mendeley-Desktop


Today I came across some odd behavior in the BibTeX files generated by Mendeley Desktop while formatting references for my thesis. As my PhD field is zoology, I use italics a lot for scientific names, and formatting all of them in the references by hand is tedious. The manual fix means placing the \textit command wherever it is needed in the .bib file, running bibtex, and then compiling the whole document with pdflatex (actually with C-c C-c LaTeX RET, since I use emacs). However, every time Mendeley syncs its database, or I edit, add or remove a reference, it overwrites the previously generated .bib files and all hand-made changes are lost.

My first guess was that if I included the command in the reference title (e.g., Phractocephalus nassi as \textit{Phractocephalus nassi}), the compiler would read it as a LaTeX command, assuming that Mendeley Desktop was writing a plain-text version of the metadata. In fact, I changed a lot of references before even testing whether it would work (shame on me), only to find that the PDF output showed the LaTeX command too! After a quick search I found that the Mendeley user community has been asking the developers to support italics in metadata for eight years (see here). Later, during 2015, Mendeley stated that Mendeley Desktop (md from here on) does not support LaTeX/BibTeX. This issue has already been addressed by Kathy Lam; her solution was implemented in Python to automatically replace <i> with \textit{ and, in a second version, its escaped form {\textless}i{\textgreater}. At first I did a “manual” replacement in emacs with M-%, changing the tags to “\textit{” or “}”, but then I realized I would need to repeat it every single time I wanted to correct the BibTeX formatting. I took it as a good chance to learn a bit more bash, so that I could wrap the sed replacements in a bash script that accepts arguments, the latter being a mystery to me until then.

My implementation searches for both <i> and {\textless}i{\textgreater} tags (more on this later) and uses sed -i -e 's/tag1/tag2/g' file to replace the tags in place in the BibTeX files. I also wanted the script to handle multiple files, since md can create one BibTeX file for each collection of references in my library (my current setting), so I actually have several files to convert, not just one. The argument of my script is therefore the path to the directory where md creates these files. One of them (PhDThesis.bib) is already set as the references file in my main LaTeX document, so my thesis uses the file generated by md automatically; the conversion therefore needs to happen after md writes the .bib file and before the .tex file is compiled. It also won’t hurt to convert the remaining files for sharing or for compiling other documents.


#!/bin/bash

# name the path to the bibtex files (first argument) for ease of understanding
bibtexPath="$1"

# replace using sed with the in-place and expression arguments, looking for html tags mistranslated to latex ('<i>' = '{\textless}i{\textgreater}', '</i>' = '{\textless}/i{\textgreater}') as well as the plain '<i>' and '</i>' tags, and turning them into '\textit{' and '}' respectively
# Also, please note that find ... | will find only files in $bibtexPath and then feed them one by one to read, so that each file name is stored in the loop variable $FILE (quoted so that the name is taken literally, without breaking at spaces)
find "$bibtexPath" -type f | while IFS= read -r FILE
do
    echo Processing file "$FILE"
    sed -i -e 's/{\\textless}i{\\textgreater}/\\textit{/g' "$FILE"
    sed -i -e 's/{\\textless}\/i{\\textgreater}/}/g' "$FILE"
    sed -i -e 's/<i>/\\textit{/g' "$FILE"
    sed -i -e 's/<\/i>/}/g' "$FILE"
    echo Success!
done
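
To see what the substitutions do, here is a minimal sketch that applies them to a throwaway file. The file name sample.bib and the two entries in it are made up for illustration, and the -i flag is used the way GNU sed accepts it (BSD/macOS sed would need -i '' instead).

```shell
# Hypothetical sample file: one entry with escaped tags, one with plain HTML tags.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.bib"
cat > "$sample" <<'EOF'
@article{demo1,
  title = {A new species of {\textless}i{\textgreater}Phractocephalus{\textless}/i{\textgreater}},
}
@article{demo2,
  title = {Notes on <i>Phractocephalus nassi</i>},
}
EOF

# The four substitutions, combined here into a single sed call.
sed -i -e 's/{\\textless}i{\\textgreater}/\\textit{/g' \
       -e 's/{\\textless}\/i{\\textgreater}/}/g' \
       -e 's/<i>/\\textit{/g' \
       -e 's/<\/i>/}/g' "$sample"

# Keep the converted text, show it, and clean up the temporary directory.
result=$(cat "$sample")
printf '%s\n' "$result"
rm -r "$tmpdir"
```

Both titles should come out with the genus or species name wrapped in \textit{...}, whichever tag style the file started with.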

Two things are noteworthy here. First, the script prints the name of each file it visits, followed by a success message once the replacements have run. Second, it replaces both types of tags, since Mendeley can generate either one depending on whether the option “Escape LaTeX special characters” is active under ‘Bibtex’ in the options. Because of the latter, this script differs from Kathy Lam’s implementation in that it applies whether or not the file was created with special characters escaped. The script is housed in my general_scripts repository on GitHub. To use it, either run it from the directory where it lives, or make it executable and copy it to a directory in your PATH:

# Option 1: run it through bash (works even if the file is not yet executable)
user@computer:~/path/to/script$ sudo bash ./html2bibtex path/to/bibtex/files

# Option 2: make it executable and copy it to a directory in your PATH
user@computer:~/path/to/script$ sudo chmod +x html2bibtex # this makes the script executable
user@computer:~/path/to/script$ sudo cp html2bibtex path/to/executable/files/dir # such a path can be something like /usr/bin, for instance
user@computer:~$ html2bibtex path/to/bibtex/files # run the script with the path as argument

This solves the problem of HTML tags, but what if some reference titles already contain the \textit command (as was my case)? There are two options: convert them to properly formatted LaTeX, or deactivate the “Escape LaTeX…” option. The latter works directly at compile time, since the commands are not escaped when md generates the BibTeX file; however, this approach is dangerous if there is any special character in the reference title (e.g., %), as it will cause problems during compilation (so far I have not run into any, though). Activating the escaping option is therefore the safest way to create BibTeX files, but it requires either converting the escaped commands back to valid LaTeX or changing all italics in reference titles to HTML tags so that the html2bibtex script can deal with them. I suggest using HTML tags instead of LaTeX commands, since the LibreOffice plugin does not understand the latter but formats the HTML tags properly. So the most general setting is to use HTML tags in the metadata; you can then use either BibTeX (after converting the HTML tags to LaTeX commands with html2bibtex and before compiling the .tex file) or the LibreOffice plugin to generate bibliographies.
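
If you do switch the escaping option off, a rough check for the % problem could look like the sketch below. It is only an illustration: the file sample.bib and its entries are made up, and the pattern simply flags any % that is not preceded by a backslash, so it will also report lines where a bare % is harmless.

```shell
# Hypothetical sample file: one title with a bare %, one with an escaped \%.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.bib" <<'EOF'
@article{demo1,
  title = {Growth rates rose by 20% in captivity},
}
@article{demo2,
  title = {Growth rates rose by 20\% in the wild},
}
EOF

# List (with line numbers) every line holding a % not preceded by a backslash.
unescaped=$(grep -n -E '(^|[^\\])%' "$tmpdir/sample.bib")
printf '%s\n' "$unescaped"
rm -r "$tmpdir"
```

Only the line from the first entry should be listed; the escaped \% in the second entry is left alone.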

I hope this helps anyone having problems with BibTeX and Mendeley-generated files, especially in the biological sciences, where italicized terms are widespread.

Published by gaballench

Ichthyologist with an interest in computing and music, supporter of open source initiatives.
