[rescue] Is it kosher to post Craigslist links here?

velociraptor velociraptor at gmail.com
Wed Jul 5 13:24:48 CDT 2006


On 7/3/06, Joshua Boyd <jdboyd at jdboyd.net> wrote:
> On Mon, Jul 03, 2006 at 03:31:52PM -0400, der Mouse wrote:
> > >> Shrug.  I don't much care what other formats are available, as long
> > >> as plain text is.
> > > I do all my documentation in PDF's.  Looks respectable when printed.
> > > Easy to extract the text as needed.
> >
> > Actually, rather difficult to extract text from, in my experience.  But
> > perhaps that's just because I refuse to use closed-source tools like
> > Acrobat.  (Someday, when my collection of round tuits fills out, I'll
> > build a PDF picker-apart.  But the PDF doc is something like a thousand
> > pages - and is, of course, itself a PDF file, leading to an amusing
> > chicken-and-egg situation.)
>
> It depends on the PDF file, but for a properly constructed file, there
> should be a way to extract the text on linux.  In evince cutting and
> pasting sometimes works.  I don't recall if it also works in xpdf.  I
> seem to recall that it doesn't work in gv.
>
> Off the top of my head, I'm not certain how to extract the entire
> file from the command line, and I suspect that any such method would
> make no attempt at nice formatting.

pdf2ps | ps2ascii

or

pdftotext
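
For anyone who wants to script this, here is a rough sketch of wrapping
pdftotext from Python.  The function names are made up for illustration,
and it assumes the pdftotext binary from xpdf or poppler is on your PATH.
The "-layout" option and the trailing "-" (write to stdout) are standard
pdftotext arguments, but check your local man page before relying on them.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, keep_layout=False):
    """Build the argument list for pdftotext, sending text to stdout."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout
        cmd.append("-layout")
    # A trailing "-" writes the text to stdout instead of a .txt file
    cmd += [str(pdf_path), "-"]
    return cmd


def extract_text(pdf_path, keep_layout=False):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        pdftotext_command(pdf_path, keep_layout),
        check=True,            # raise CalledProcessError if pdftotext fails
        capture_output=True,   # collect stdout/stderr rather than printing
    )
    # Assumption: output is UTF-8; undecodable bytes are replaced
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("manual.pdf", keep_layout=True) should hand back the
text with columns roughly preserved, though how well that works depends
on how the PDF was built.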

There are plenty of ways to circumvent the "DRM" of PDF files as well;
whether it's legal where you live is another matter.  A brief Google
search also suggests that xpdf might actually disable DRM in PDFs as a
base function of the tool.

=Nadine=


