<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Google Book Search grants some PDF downloads</title>
	<atom:link href="http://www.stoa.org/archives/472/feed" rel="self" type="application/rss+xml" />
	<link>http://www.stoa.org/archives/472</link>
	<description>Serving news, projects, and links for digital classicists everywhere.</description>
	<lastBuildDate>Wed, 08 May 2013 16:09:24 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
	<item>
		<title>By: gs</title>
		<link>http://www.stoa.org/archives/472/comment-page-1#comment-86459</link>
		<dc:creator>gs</dc:creator>
		<pubDate>Fri, 28 Sep 2007 20:23:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.stoa.org/?p=472#comment-86459</guid>
		<description><![CDATA[Yeah, but how can blind people get text from the PDF? There&#039;s no way to OCR it once you download it; it won&#039;t let you.]]></description>
		<content:encoded><![CDATA[<p>Yeah, but how can blind people get text from the PDF? There&#8217;s no way to OCR it once you download it; it won&#8217;t let you.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Name (required)</title>
		<link>http://www.stoa.org/archives/472/comment-page-1#comment-21918</link>
		<dc:creator>Name (required)</dc:creator>
		<pubDate>Fri, 01 Sep 2006 19:05:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.stoa.org/?p=472#comment-21918</guid>
		<description><![CDATA[Planet PDF&#039;s criticisms are poorly thought out.  For example, the claim that the books &quot;are difficult to download&quot; is broad and generic; I certainly had no difficulty downloading Montaigne&#039;s &lt;i&gt;Essays&lt;/i&gt; by simply clicking the obvious &quot;download&quot; button.  But perhaps Planet PDF says this in relation to their statement, &quot;Clicking on a web link to a PDF file normally by default opens the document inside a Web browser.&quot;  This behavior exists in Internet Explorer on Windows with Adobe Acrobat installed, as it does in Safari on the Mac; Firefox, however, is polite enough to ask what a user wants to do with content that requires it to launch another program.

I do agree that PDF optimization is useful and should be standard practice -- pdfopt has been widely and freely available for ages as part of Ghostscript.  Optimization can, however, lead to larger file sizes, and Google may have been trying to avoid that.  

Text OCR would be nice, but there are projects for that already.  Google isn&#039;t into duplicating effort, generally.  Their product is highly legible and gives one a sense of the effort that once went into fine typesetting: compare the Montaigne download to Planet PDF&#039;s &lt;i&gt;Fables&lt;/i&gt; of Aesop, generated with Microsoft Word, and you&#039;ll see the difference between typesetting and bulk text chundering.  The former is an image of a work of skill; the latter is the McDonald&#039;s version.  If Google were trying to put Project Gutenberg out of business, perhaps Planet PDF&#039;s criticism would be relevant, but they fail to realize that each project has a separate goal.

Speaking of beauty in typesetting, and having scrubbed my eyes from looking at that Aesop translation generated by Word, the comment Planet PDF makes about the resolution of Google&#039;s texts being too low for easy legibility is pure bunk.  Blowing up the text so that &quot;hoc est&quot; fills my screen reveals jagged edges, but at a reasonable zoom level, the text passes for perfect.  Perhaps Planet PDF forgot to turn on antialiasing in their PDF renderer?  Perhaps they engage in hyperbole to attack a major project in order to gain attention for their parent company&#039;s commercial ventures (NitroPDF)?  If so, it has succeeded, as I for one had never heard of their site or their product until this post on Stoa.]]></description>
		<content:encoded><![CDATA[<p>Planet PDF&#8217;s criticisms are poorly thought out.  For example, the claim that the books &#8220;are difficult to download&#8221; is broad and generic; I certainly had no difficulty downloading Montaigne&#8217;s <i>Essays</i> by simply clicking the obvious &#8220;download&#8221; button.  But perhaps Planet PDF says this in relation to their statement, &#8220;Clicking on a web link to a PDF file normally by default opens the document inside a Web browser.&#8221;  This behavior exists in Internet Explorer on Windows with Adobe Acrobat installed, as it does in Safari on the Mac; Firefox, however, is polite enough to ask what a user wants to do with content that requires it to launch another program.</p>
<p>I do agree that PDF optimization is useful and should be standard practice &#8212; pdfopt has been widely and freely available for ages as part of Ghostscript.  Optimization can, however, lead to larger file sizes, and Google may have been trying to avoid that.  </p>
<p>Text OCR would be nice, but there are projects for that already.  Google isn&#8217;t into duplicating effort, generally.  Their product is highly legible and gives one a sense of the effort that once went into fine typesetting: compare the Montaigne download to Planet PDF&#8217;s <i>Fables</i> of Aesop, generated with Microsoft Word, and you&#8217;ll see the difference between typesetting and bulk text chundering.  The former is an image of a work of skill; the latter is the McDonald&#8217;s version.  If Google were trying to put Project Gutenberg out of business, perhaps Planet PDF&#8217;s criticism would be relevant, but they fail to realize that each project has a separate goal.</p>
<p>Speaking of beauty in typesetting, and having scrubbed my eyes from looking at that Aesop translation generated by Word, the comment Planet PDF makes about the resolution of Google&#8217;s texts being too low for easy legibility is pure bunk.  Blowing up the text so that &#8220;hoc est&#8221; fills my screen reveals jagged edges, but at a reasonable zoom level, the text passes for perfect.  Perhaps Planet PDF forgot to turn on antialiasing in their PDF renderer?  Perhaps they engage in hyperbole to attack a major project in order to gain attention for their parent company&#8217;s commercial ventures (NitroPDF)?  If so, it has succeeded, as I for one had never heard of their site or their product until this post on Stoa.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jack Mitchell</title>
		<link>http://www.stoa.org/archives/472/comment-page-1#comment-21917</link>
		<dc:creator>Jack Mitchell</dc:creator>
		<pubDate>Fri, 01 Sep 2006 18:40:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.stoa.org/?p=472#comment-21917</guid>
		<description><![CDATA[It&#039;s strange, because they&#039;re obviously searchable /by Google/, since the .pdf&#039;s show highlighted items.  So they are indeed OCR&#039;d.  I wonder what the blue lines through certain parts mean.  I hope half the stuff doesn&#039;t have to be rescanned; though, of course, that could be done.

It may not be everything, but it&#039;s a pretty hefty wedge in the door, I think.  And, frankly, given what I&#039;ve found so far, who /needs/ anything written in the 20th century?  : )]]></description>
		<content:encoded><![CDATA[<p>It&#8217;s strange, because they&#8217;re obviously searchable /by Google/, since the .pdf&#8217;s show highlighted items.  So they are indeed OCR&#8217;d.  I wonder what the blue lines through certain parts mean.  I hope half the stuff doesn&#8217;t have to be rescanned; though, of course, that could be done.</p>
<p>It may not be everything, but it&#8217;s a pretty hefty wedge in the door, I think.  And, frankly, given what I&#8217;ve found so far, who /needs/ anything written in the 20th century?  : )</p>
]]></content:encoded>
	</item>
</channel>
</rss>
