Nobody who has ever used Google Books would think that it's putting complete copies online. The results include pages with hits, plus pages around them. There are so many missing pages that books aren't readable.
I do - or more accurately - I did. I remember years back there were published techniques for tweaking the URL to trick Google into serving specific pages. I'm sure people wrote software to exploit that.
Pretty sure it limits you to a set number of pages from any given book per day or something. I'm sure with sufficient effort you could work around it but a casual user can't just work their way through a complete book.
It's different. When you do what you described in a public library, you're clearly breaking the library's rules while on their premises and can be caught, which is very different from having it done by software you run from your home. The former also doesn't scale to be economically profitable, whereas the latter can be written once and run to scrape thousands of books, basically mass illegal reproduction of books.
1: I'd suggest you check out the book, bring it home with you, and do the scanning at home.
2: This is not Google's first rodeo when it comes to scraping. I'd be surprised if you got more than a few days into the described process, so no, it certainly does not scale.
In my experience the books tend to have some pages which are always shown in the preview, some pages which appear never to be viewable in preview mode, and some pages which you may be able to preview if you have not reached your limit.
So after 100 days you'll probably still have 100 incomplete books.
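To illustrate why, here's a minimal sketch of that argument. All the numbers are made-up assumptions for illustration, not Google's actual limits: if some fraction of each book's pages is simply never served in preview mode, then no number of days under a daily quota ever yields a complete book.

```python
import random

# Toy model (assumptions, not Google's actual behaviour): each book has a
# slice of pages the preview never serves, while the remaining pages are
# gated behind a daily per-book quota.

NUM_BOOKS = 100          # assumed number of books you work through
PAGES_PER_BOOK = 300     # assumed book length
NEVER_SHOWN = 0.20       # assumed fraction never viewable in preview mode
DAILY_QUOTA = 30         # assumed per-book daily page limit
DAYS = 100

random.seed(0)
complete = 0
for _ in range(NUM_BOOKS):
    # pages that the preview will never serve for this book
    never = set(random.sample(range(PAGES_PER_BOOK),
                              int(NEVER_SHOWN * PAGES_PER_BOOK)))
    seen = set()
    for _ in range(DAYS):
        quota = DAILY_QUOTA
        for page in range(PAGES_PER_BOOK):
            if quota == 0:
                break
            if page in seen or page in never:
                continue
            seen.add(page)
            quota -= 1
    if len(seen) == PAGES_PER_BOOK:
        complete += 1

print(f"complete books after {DAYS} days: {complete} / {NUM_BOOKS}")
# -> complete books after 100 days: 0 / 100
```

Under these assumptions the quota stops mattering after about a week per book; it's the never-previewable pages that guarantee every book stays incomplete.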
You can't just thumb through all the pages. It's search based, so you'd have to know what phrases to search for. Meaning, you'd need to have the book already.
(Four years ago.) The Authors Guild eventually appealed to the Supreme Court but was denied a hearing:
> 15-849 AUTHORS GUILD, ET AL. V. GOOGLE, INC.
>
> The petition for a writ of certiorari is denied. Justice Kagan took no part in the consideration or decision of this petition.
It's frustrating how the opposition was so painfully naive. As the article says, it was so clearly a case of "perfect being the enemy of the good." The following paragraphs deconstruct the sorry state of affairs that resulted:
> The irony is that so many people opposed the settlement in ways that suggested they fundamentally believed in what Google was trying to do. One of Pamela Samuelson’s main objections was that Google was going to be able to sell books like hers, whereas she thought they should be made available for free. (The fact that she, like any author under the terms of the settlement, could set her own books’ price to zero was not consolation enough, because “orphan works” with un-findable authors would still be sold for a price.) In hindsight, it looks like the classic case of perfect being the enemy of the good: surely having the books made available at all would be better than keeping them locked up—even if the price for doing so was to offer orphan works for sale. In her paper concluding that the settlement went too far, Samuelson herself even wrote, “It would be a tragedy not to try to bring this vision to fruition, now that it is so evident that the vision is realizable.”
> Many of the objectors indeed thought that there would be some other way to get to the same outcome without any of the ickiness of a class action settlement. A refrain throughout the fairness hearing was that releasing the rights of out-of-print books for mass digitization was more properly “a matter for Congress.” When the settlement failed, they pointed to proposals by the U.S. Copyright Office recommending legislation that seemed in many ways inspired by it, and to similar efforts in the Nordic countries to open up out-of-print books, as evidence that Congress could succeed where the settlement had failed.
> Of course, nearly a decade later, nothing of the sort has actually happened. “It has got no traction,” Cunard said to me about the Copyright Office’s proposal, “and is not going to get a lot of traction now I don’t think.” Many of the people I spoke to who were in favor of the settlement said that the objectors simply weren’t practical-minded—they didn’t seem to understand how things actually get done in the world. “They felt that if not for us and this lawsuit, there was some other future where they could unlock all these books, because Congress would pass a law or something. And that future... as soon as the settlement with Guild, nobody gave a shit about this anymore,” Clancy said to me.
> It certainly seems unlikely that someone is going to spend political capital—especially today—trying to change the licensing regime for books, let alone old ones. “This is not important enough for the Congress to somehow adjust copyright law,” Clancy said. “It’s not going to get anyone elected. It’s not going to create a whole bunch of jobs.” It’s no coincidence that a class action against Google turned out to be perhaps the only plausible venue for this kind of reform: Google was the only one with the initiative, and the money, to make it happen. “If you want to look at this in a raw way,” Allan Adler, in-house counsel for the publishers, said to me, “a deep pocketed, private corporate actor was going to foot the bill for something that everyone wanted to see.” Google poured resources into the project, not just to scan the books but to dig up and digitize old copyright records, to negotiate with authors and publishers, to foot the bill for a Books Rights Registry. Years later, the Copyright Office has gotten nowhere with a proposal that re-treads much the same ground, but whose every component would have to be funded with Congressional appropriations.
They've pretty much stopped scanning new books, even new out of copyright manuscripts etc.
Google Books itself opened up loads of cool possibilities for new ways to use the data from those books. This lawsuit pretty much stopped all innovation, and all the good engineers left the project years ago.
The Internet Archive’s book scanning project is still in full swing. Yes, the indexing and presentation aren't at parity with Google Books, but I prefer a non-profit digital library to be the canonical reference instead of Google.
That's great for material that's public domain or out of copyright, but the Authors Guild settlement could have digitized and made accessible orphan works that are still under copyright. It would have complemented the public domain projects, not supplanted them.
But instead academic opponents of the deal seriously thought they would have better luck pursuing copyright reform in Congress (!), and helped kill the settlement. Of course, in reality Congress did no such thing, and so the chance to rescue orphan works was lost.
While a good step, this only makes up for a portion of what the settlement would have allowed. (Most obviously, it appears this only covers books from a 20 year period and it takes more work to ascertain that the books are not being sold.)
Moreover, this does not contradict the idea that the Authors Guild settlement could have complemented public domain efforts. Even today some of the books saved on the Internet Archive were retrieved via Google Books: https://archive.org/details/googlebooks&tab=about
Funnily enough, that's also how the original article described the opposition to the Authors Guild settlement. As it turned out, killing the Google Books project didn't really move us closer to copyright reform.
HathiTrust also has great content and indexing; it’s just a shame that it’s much slower than Google Books. But between it and archive.org they’re a fine replacement.