Peter Robejsek very kindly sent me an email recently outlining a problem he had, and the solution he came up with: transferring a Word document with Endnote references to LaTeX using AppleScript.
Since at the time I had a fairly large (100+ pages) document already written in Word and wanted to transition to LaTeX, I faced the problem of having to replace all the citations. Obviously doing this manually would have been terrible, so I wrote a bit of AppleScript to automate the whole thing with the help of Excel.
I thought it might be useful for others as well, especially given the large number of Mac users these days. So if you would care to publish it on the blog, please do so (if not, that's also fine).
The code is not very polished, since I had never written any AppleScript before, and I am sure the whole strategy could be improved, but back when I worked it out I was looking for a quick rather than an elegant solution.
So, here is the description and the script below in plain text. Any comments, please let me know.
This document gives a step-by-step procedure for exporting a library from Endnote into JabRef and using that to generate BibTeX. In the second part (steps 3 & 4) we show how to take an existing Word document with an Endnote bibliography and automatically replace the references with LaTeX citation commands. It is assumed that the natbib package is used with LaTeX and that the user is on a Mac.
1. Endnote
The goal is to get the Endnote reference number, the first author's name, the date and the BibTeX key into the same Excel spreadsheet.
a) Start with Endnote. Create an export output style (e.g. go to the Finder, Usr/Applications/Endnote/Styles/BibTeXExport.ens, and create a copy; rename it to BibTeXExportToJabref.ens or something like that).
b) In Endnote go to Edit -> Output Styles -> Open Style Manager and select the newly created output style.
c) In Endnote apply the new output style to your current library and go to Edit -> Output Styles -> Edit "new output style".
d) For every entry in the Bibliography -> Templates category add another field like "| `endnotekey = `{Record Number}", as well as a "," after the preceding field entry. The result will look like this:
…
| `keywords = `{Keywords},
| `year = `{Year},
| `url = `{URL},
| `endnotekey = `{Record Number}
e) Export the library from Endnote as a .txt file using your newly created output style.
2. JabRef
a) Add an "endnotekey" field: go to Options -> Set General Fields and append the following to the line beginning with "general":
";endnotekey"
b) Make the field visible in JabRef: Options -> Preferences -> Entry Table Columns -> add field "endnotekey".
c) Autogenerate the BibTeX keys. The resulting library file will be used with LaTeX.
d) To keep the url and note fields from showing up in the bibliography, go to Tools -> Set/Clear/Rename Fields and set these fields to " " with overwrite active.
e) Create a copy of this library file, open the copy in JabRef and rename the "endnotekey" field to "note" (Tools -> Set/Clear/Rename Fields). This is necessary so that its contents can be exported to *.csv: renaming transfers the Endnote reference numbers to the note field. Some entries in the database may already have text in "note", however. In that case, clear it first: Tools -> Set/Clear/Rename Fields -> Clear Fields + overwrite existing values.
f) Export the .bib library as a .csv file.
3. Excel
a) Import the *.csv library into Excel. Delete all columns except for "Identifier" (= BibTeX key), "Author", "Year" and "note".
b) Assuming the remaining fields are in the order given above (columns A to D) with one header row, enter the following formula in E2:
=LEFT(B2;SEARCH(",";B2;1)-1)
This extracts the last name of the first author.
c) Then enter this formula into F2:
=CONCATENATE("{";E2;", ";C2;" #";D2;"}")
This gives the same format as an unformatted Endnote citation. Also enter into G2:
=CONCATENATE("{";E2;", ";C2;" #";D2;"@@author-year}")
to get the Endnote format for the author-year style. For "mass citations", i.e. several citations in one set of brackets, enter into J2, K2 and L2 respectively:
=CONCATENATE("{";E2;", ";C2;" #";D2;";")
=CONCATENATE(E2;", ";C2;" #";D2;";")
=CONCATENATE(E2;", ";C2;" #";D2;"}")
These match the first, a middle and the last citation inside the brackets.
d) Enter into H2 the LaTeX format that corresponds to F2 (i.e. author, comma, date in brackets):
=CONCATENATE("\citep{";A2;"}")
and into I2 the corresponding format for Author (date):
=CONCATENATE("\citet{";A2;"}")
Now fill down cells E to I. For the mass citation case, enter into M2, N2 and O2 respectively:
=CONCATENATE("\citep{";A2;",")
=CONCATENATE(A2;",")
=CONCATENATE(A2;"}")
and fill down columns J to O as well.
e) Then copy the cells in columns F to O and paste them as values (Paste Special -> Values) into columns P-Y, and make a note of the number of rows. (Note: if you notice any weird symbols, e.g. ş gets translated as Yue or something of the sort, you need to search and replace these before doing step e) to be sure that no citations get left behind.)
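As a cross-check on the formulas in 3.b) to 3.d), here is a minimal Python sketch of the same string building for a single spreadsheet row. It is not part of Peter's procedure; the function names and the sample row (key "Smith2010", record number 12) are made up for illustration.

```python
def first_author_last_name(author_field: str) -> str:
    """Mimic =LEFT(B2;SEARCH(",";B2;1)-1): the text before the first comma."""
    comma = author_field.find(",")
    if comma == -1:
        # Excel's SEARCH returns an error here too (no comma in the field).
        raise ValueError(f"no comma in author field: {author_field!r}")
    return author_field[:comma]


def citation_pairs(bibtex_key, author_field, year, record_number):
    """Return (Endnote text, LaTeX text) pairs for one row (hypothetical helper)."""
    name = first_author_last_name(author_field)
    core = f"{name}, {year} #{record_number}"
    return [
        ("{" + core + "}", "\\citep{" + bibtex_key + "}"),               # F -> H
        ("{" + core + "@@author-year}", "\\citet{" + bibtex_key + "}"),  # G -> I
        ("{" + core + ";", "\\citep{" + bibtex_key + ","),               # J -> M
        (core + ";", bibtex_key + ","),                                  # K -> N
        (core + "}", bibtex_key + "}"),                                  # L -> O
    ]


# Made-up sample row: Identifier, Author, Year, note (Endnote reference number)
pairs = citation_pairs("Smith2010", "Smith, John and Doe, Jane", "2010", "12")
for endnote_text, latex_text in pairs:
    print(endnote_text, "->", latex_text)
```

Printing the pairs for a handful of real rows and comparing them with columns F to O is a quick way to spot a mistyped formula before running the replacement in Word.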
4. Word
a) Go to Tools -> Endnote X5 -> Convert to Unformatted Citations. Then select the entire text in which you want the Endnote citations replaced (cmd+a).
b) Run this AppleScript, replacing the upper bound of the repeat loop (345 here) with the number of rows from 3.e):
repeat with theIncrementValue from 1 to 345
    -- Endnote-style search strings for this row (columns P, Q, T, V, U)
    tell application "Microsoft Excel"
        set rg1 to "P" & theIncrementValue
        set rg2 to "Q" & theIncrementValue
        set rg3 to "T" & theIncrementValue
        set rg4 to "V" & theIncrementValue
        set rg5 to "U" & theIncrementValue
        set EndForm1 to value of range rg1 as string
        set EndForm2 to value of range rg2 as string
        set EndForm3 to value of range rg3 as string
        set EndForm4 to value of range rg4 as string
        set EndForm5 to value of range rg5 as string
    end tell
    -- Matching LaTeX replacement strings (columns R, S, W, Y, X)
    tell application "Microsoft Excel"
        set rg6 to "R" & theIncrementValue
        set rg7 to "S" & theIncrementValue
        set rg8 to "W" & theIncrementValue
        set rg9 to "Y" & theIncrementValue
        set rg10 to "X" & theIncrementValue
        set TexForm1 to value of range rg6 as string
        set TexForm2 to value of range rg7 as string
        set TexForm3 to value of range rg8 as string
        set TexForm4 to value of range rg9 as string
        set TexForm5 to value of range rg10 as string
    end tell
    -- Replace every occurrence in the current Word selection
    tell application "Microsoft Word"
        set findRange to find object of selection
        tell findRange
            execute find find text EndForm1 replace with TexForm1 replace replace all
            execute find find text EndForm2 replace with TexForm2 replace replace all
            execute find find text EndForm3 replace with TexForm3 replace replace all
            execute find find text EndForm4 replace with TexForm4 replace replace all
            execute find find text EndForm5 replace with TexForm5 replace replace all
        end tell
    end tell
end repeat
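To see what the loop does to the text without touching Word or Excel, the same replacement order can be sketched in a few lines of Python. This is only an illustration with made-up data: the two rows, the sample sentence and the function name are assumptions, and plain string replacement stands in for Word's find and replace.

```python
def replace_citations(text, rows):
    """Apply each row's five replacements in the same order as the script."""
    for row in rows:
        for endnote_text, latex_text in row:
            text = text.replace(endnote_text, latex_text)
    return text


# Two made-up rows, each ordered like EndForm1..5 / TexForm1..5 in the script.
rows = [
    [
        ("{Smith, 2010 #12}", "\\citep{Smith2010}"),
        ("{Smith, 2010 #12@@author-year}", "\\citet{Smith2010}"),
        ("{Smith, 2010 #12;", "\\citep{Smith2010,"),
        ("Smith, 2010 #12}", "Smith2010}"),
        ("Smith, 2010 #12;", "Smith2010,"),
    ],
    [
        ("{Doe, 2011 #7}", "\\citep{Doe2011}"),
        ("{Doe, 2011 #7@@author-year}", "\\citet{Doe2011}"),
        ("{Doe, 2011 #7;", "\\citep{Doe2011,"),
        ("Doe, 2011 #7}", "Doe2011}"),
        ("Doe, 2011 #7;", "Doe2011,"),
    ],
]

text = ("Prices rose {Smith, 2010 #12;Doe, 2011 #7}, "
        "as {Doe, 2011 #7@@author-year} notes.")
result = replace_citations(text, rows)
print(result)
# Prices rose \citep{Smith2010,Doe2011}, as \citet{Doe2011} notes.
```

The mass citation is only complete once both rows have been processed: the Smith row rewrites the opening part of the bracket and the Doe row rewrites the closing part.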
Please note: The above procedure does not account for author names that are agencies (European Banking Authority etc.); however, the vast majority of citations should be easily taken care of in this way. Feel free to improve on the approach as desired.
Applying these steps worked well for me at the time of writing. However, I can give no warranty, explicit or implied, that the approach is fault free. The only application took place on Mac OS X 10.6.8. Always back up important files before manipulating them!