Discussion:
[pdftex] "fi" can not be copied correctly in a pdf generated by pdflatex
Peng Yu
2011-11-11 16:42:33 UTC
Hi,

I ran pdflatex on the following TeX file. In the generated PDF,
"fi" cannot be copied correctly (from, say, Acrobat). I then tried
latex -> dvipdf, and I can copy "fi" from the PDF generated that
way. Does anybody know what the problem with pdflatex is? How can I
make pdflatex generate a PDF from which "fi" can be copied?

\documentclass{report}
\begin{document}
Definition
\end{document}


My OS is Mac OS X 10.6.7. pdflatex is installed at the path below, but
I don't remember how it was installed.

/usr/local/texlive/2010/bin/x86_64-darwin/pdflatex
--
Regards,
Peng
Martin Schröder
2011-11-12 09:52:13 UTC
Post by Peng Yu
way. Does anybody know what the problem with pdflatex is? How can I
make pdflatex generate a PDF from which "fi" can be copied?
http://tex.stackexchange.com/q/33476/5763

Best
Martin
Eric Marsden
2011-11-12 09:02:00 UTC
py> I ran pdflatex on the following TeX file. In the generated PDF,
py> "fi" cannot be copied correctly (from, say, Acrobat). I then tried
py> latex -> dvipdf, and I can copy "fi" from the PDF generated that
py> way. Does anybody know what the problem with pdflatex is? How can I
py> make pdflatex generate a PDF from which "fi" can be copied?

The "fi" is included as a Unicode ligature in the PDF file. It's
really a bug in the PDF reader if it can't handle these characters
(evince on Linux works fine, for example).

You can work around this problem by using

\usepackage[resetfonts]{cmap}

in your document preamble.
--
Eric Marsden
Peng Yu
2011-11-14 04:40:19 UTC
Hi Eric,
\usepackage[resetfonts]{cmap}
The above code works for the word "Definition" in the main text but
not for the one in the table of contents. Is there anything special
about the table of contents? How can I fix it for the table of
contents as well?

\documentclass[11pt, a4paper, titlepage]{report}
\usepackage[resetfonts]{cmap}
\begin{document}
\tableofcontents
\chapter{Introduction}
\section{Definition}
\end{document}
--
Regards,
Peng
Ross Moore
2011-11-14 05:53:03 UTC
Hello Peng,
Post by Peng Yu
Hi Eric,
Post by Eric Marsden
\usepackage[resetfonts]{cmap}
The above code works for the word "Definition" in the main text but
not for the one in the table of contents.
Hmm. That's strange, and surely a bug in cmap.sty.
Post by Peng Yu
Is there anything special
about the table of contents? How can I fix it for the table of
contents as well?
Try

\usepackage[noTeX]{mmap}

instead of the line for \usepackage{cmap}.
(Not sure what the [resetfonts] is for.)

Does this now work for you?
It does for me, but I'm using an updated version of mmap.sty
that isn't generally released yet.

Hopefully the older version also works in this kind
of situation.
Post by Peng Yu
\documentclass[11pt, a4paper, titlepage]{report}
\usepackage[resetfonts]{cmap}
\begin{document}
\tableofcontents
\chapter{Introduction}
\section{Definition}
\end{document}
--
Regards,
Peng
All the best,

Ross

------------------------------------------------------------------------
Ross Moore ross.moore at mq.edu.au
Mathematics Department office: E7A-419
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia 2109 fax: +61 (0)2 9850 8114
------------------------------------------------------------------------
Robin Fairbairns
2011-11-14 09:33:36 UTC
Post by Ross Moore
Post by Peng Yu
Post by Eric Marsden
\usepackage[resetfonts]{cmap}
The above code works for the word "Definition" in the main text but
not for the one in the table of contents.
Hmm. That's strange, and surely a bug in cmap.sty.
doesn't happen for me, with that call of cmap, in my test file.
Post by Ross Moore
Post by Peng Yu
Is there anything special
about the table of contents? How can I fix it for the table of
contents as well?
Try
\usepackage[noTeX]{mmap}
instead of the line for \usepackage{cmap}.
(Not sure what the [resetfonts] is for.)
resetfonts option tells the package to remap the fonts built into the
format as well as those loaded subsequently. (presumably on the basis
that nobody uses ot1 nowadays, so all fonts are going to be loaded
later. obviously false, in this case.)

of course, it's not necessary if you're using anything other than ot1
encoding, or non-standard fonts.
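
A minimal sketch of that last point, assuming T1-encoded Type 1 fonts (for example cm-super) are installed. The load order, cmap before fontenc, follows my reading of the cmap documentation and should be treated as an assumption:

```latex
\documentclass{article}
% cmap is loaded first so that it sees every font loaded afterwards
\usepackage{cmap}
% with T1 as the default encoding, the text fonts are loaded after cmap
% rather than taken from the OT1 fonts preloaded in the format
\usepackage[T1]{fontenc}
\begin{document}
\tableofcontents
\section{Definition}
defining nothing: ff fi fl ffi ffl
\end{document}
```

Whether the ligatures then copy cleanly still depends on the PDF viewer, so the result is worth checking by copying the text out of the compiled file.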
Post by Ross Moore
Does this now work for you?
it works equally well for me, in my test file.
Post by Ross Moore
It does for me, but I'm using an updated version of mmap.sty
that isn't generally released yet.
Hopefully the older version also works in this kind
of situation.
Post by Peng Yu
\documentclass[11pt, a4paper, titlepage]{report}
\usepackage[resetfonts]{cmap}
\begin{document}
\tableofcontents
\chapter{Introduction}
\section{Definition}
\end{document}
does indeed fail (produces ^L in the toc for the fi ligature).

my trivial test doc:

\documentclass{article}
\usepackage[resetfonts]{cmap}
%\usepackage[noTeX]{mmap}
\begin{document}
\title{foo bar}
\tableofcontents
\section{definitions}
here we are again: defining nothing

\section{should be ok}
grumbly umbly
\end{document}

works fine with either cmap or mmap.

if i change it to \documentclass[11pt]{article}
the "fi" in "defining nothing" becomes ^L if i'm using cmap, but it's ok
with mmap.

all my tests were conducted on a (full) tl11 installation, last updated
on friday -- i've not installed anything special, that would be used in
this test (i do have one or two non-free fonts, which aren't in tl, but
they're unrelated to these tests.)

conclusion: there is indeed a bug in cmap, probably relating to
non-default sizes. it's easy to say that (i've got one instance), but
characterising the bug properly would take rather more effort.
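
One way to start characterising it might be a test file that sets the same word at every standard size, so that copying the text out of the PDF shows which sizes lose the ligature. This is only a sketch: the choice of size commands and of the word is arbitrary, and it makes no claim about which sizes will fail.

```latex
\documentclass{article}
\usepackage[resetfonts]{cmap}
\begin{document}
% Each line sets "definition" at a different size; the label in front
% of it makes the copied text easy to match to the size command used.
{\tiny tiny: definition}\par
{\scriptsize scriptsize: definition}\par
{\footnotesize footnotesize: definition}\par
{\small small: definition}\par
{\normalsize normalsize: definition}\par
{\large large: definition}\par
{\Large Large: definition}\par
{\LARGE LARGE: definition}\par
{\huge huge: definition}\par
{\Huge Huge: definition}\par
\end{document}
```

Compiling it once with \documentclass{article} and once with \documentclass[11pt]{article}, then selecting all text in the viewer and pasting it into an editor, should show whether the ^L appears only at particular sizes.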

i shall not recommend cmap.sty in the faq answer i'm writing on this
topic; i look forward to installing this shiny new mmap when it surfaces
;-)

robin
Peng Yu
2011-11-14 13:21:33 UTC
On Mon, Nov 14, 2011 at 3:33 AM, Robin Fairbairns wrote:
Post by Robin Fairbairns
conclusion: there is indeed a bug in cmap, probably relating to
non-default sizes. it's easy to say that (i've got one instance), but
characterising the bug properly would take rather more effort.
i shall not recommend cmap.sty in the faq answer i'm writing on this
topic; i look forward to installing this shiny new mmap when it surfaces
;-)
But latexmk -pdfdvi produces a PDF without any problem (without cmap).
If latex -> dvipdf can produce a PDF without problems, why can't
pdflatex do so? Isn't it more of a bug in pdflatex?

\documentclass[11pt, a4paper, titlepage]{report}
%\usepackage[resetfonts]{cmap}
\begin{document}


\tableofcontents

\chapter{Introduction}

\section{Definition}

\end{document}
--
Regards,
Peng
Ulrike Fischer
2011-11-17 10:49:54 UTC
Post by Peng Yu
Hi,
I ran pdflatex on the following TeX file. In the generated PDF,
"fi" cannot be copied correctly (from, say, Acrobat). I then tried
latex -> dvipdf, and I can copy "fi" from the PDF generated that
way.
On my system (MiKTeX 2.9) I can't copy the fi and the fl from a PDF
generated with latex + dvipdfmx (due to a curious interference of
the glyph lists used by dvipdfmx).
Post by Peng Yu
Does anybody know what the problem with pdflatex is? How can I make
pdflatex generate a PDF from which "fi" can be copied?
\documentclass{report}
\begin{document}
Definition
\end{document}
Another reason to use T1-encoding. It should work with

\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
ff fl ffi definition
\end{document}


Explanation: The cm-super fonts contain "Ligature" declarations,
which Adobe Reader can use, as you can see if you look in the
AFM file, e.g.

C 102 ; WX 305.481 ; N f ; B 33 1 358 706 ; L f ff ; L i fi ; L l fl ;

(The important part for the fi is the "L i fi").


If you insist on using the old CM fonts, you can also get around the
problem with this:

\documentclass{article}
\input{glyphtounicode}
\pdfgentounicode=1
\begin{document}
ff fl ffi definition
\end{document}
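
The same two lines can be tried on the report example with a table of contents from earlier in the thread. This is a sketch, not a tested fix: it assumes a pdfTeX recent enough to provide \pdfgentounicode and a TeX distribution that ships glyphtounicode.tex.

```latex
\documentclass[11pt, a4paper, titlepage]{report}
% load the glyph-name-to-Unicode table and ask pdfTeX to write
% ToUnicode maps for the fonts it embeds
\input{glyphtounicode}
\pdfgentounicode=1
\begin{document}
\tableofcontents
\chapter{Introduction}
\section{Definition}
\end{document}
```

Run pdflatex twice so that the table of contents is filled in, then copy "Definition" from both the contents page and the section heading to compare.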
--
Ulrike Fischer