A.P. Moving to Halt Use of Newspaper Articles on Web Sites (nytimes.com)
24 points by senthil_rajasek on April 6, 2009 | 47 comments


Bizarre. It sounds as if they want to stop appearing in Google search results. Boy do these people not understand the web.


They want to stop appearing in Google search results as long as Google isn't paying them for the right to present those results, which I think is an important distinction to make.

Playing devil's advocate a bit, from the AP's perspective, how is Google really any different than any other distributor of AP content? It's just another company using their content to hang ads off of and generate revenue (like all of their other customers), except that Google doesn't pay for it.

AP makes money from volume, lots of papers subscribing to their feed. Google means that fewer papers are needed (it just needs to index one, doesn't it?), chopping the AP's customer base down and eventually making their current model completely obsolete. Seeing the writing on the wall and trying to prevent it from happening, that's not evidence they don't get it, that's evidence they get it all too well but just have no idea what to do about it.

Any thoughts on what to do about it? As far as I can tell, nobody has really figured it out yet, have they?


> Playing devil's advocate a bit, from the AP's perspective, how is Google really any different than any other distributor of AP content? It's just another company using their content to hang ads off of and generate revenue (like all of their other customers), except that Google doesn't pay for it.

Seriously? Google links out to the original copy. You see a headline and at most a couple sentences. If there's a fulltext copy it is licensed.


Yes, of course they do. But before you click through, there are a bunch of ads displayed off to the side. And Google just exposes the one newspaper that they're sending the traffic to, so the one with the best SEO wins while everyone else goes out of business, directly threatening the AP's bottom line.

So what's the AP going to do? They can restructure their rates so they capture more from the winners, and/or they can hit the whole ad stack (which Google is the first layer of, now). I mean, really, what else do you expect them to do? Just wither and die?

Another way to say it: when the distribution of a pile of money across a group of companies shifts from normal to power law, and you were getting paid by that group, you had better figure out how you're going to get more money out of the winners as the losers start dropping out.


"the one with the best SEO wins while everyone else goes out of business, directly threatening the AP's bottom line."

Versus what? The AP is somehow going to benevolently distribute all paid links equally amongst all of its various print clients to help them all reach each month's traffic goals?


Um, no. What did I say that suggested that? Did you read the article? They're going to try to change their model so that they get paid more by the companies that are shifting to the spiky end of the emerging power law distribution, and capturing revenue from the whole ad stack.


AP's headlines and first sentences are probably their most valuable product.


I've been wondering whether the sight of various big, classic newspapers going bankrupt was going to spark some confused and panicked moves.

This sure could be one of them.


There's a reason this is a coordinated effort. No single newspaper would want to lose the incoming traffic from Google News; people would just go read a competitor's article. If they all act together, then the A.P. thinks they might get some money out of Google. But these are the internets, baby: if you will not let Google index it, someone else will.

The reason these newspapers aren't going after, say, Drudgereport, is that they figure they can get some money out of Google. Fair use or not, why is Google rich and not the papers?

I'm having a hard time resisting the schadenfreude.


I doubt Google is even making much money off Google News. The wire services probably believe Google is making all the money they're losing, but that is just not the case: the pie is shrinking.


From the sounds of it they're still in the mindset where they want to do things their own way and not have to worry about the web. It's kind of sweet in a way, and I feel some sympathy for their operation. You spend however long doing things one way, it's comfortable, and you just want to lock out everything else.

The problem, of course, is that as old forms of media die, the rule is "shift or be shafted," and the AP is likely not powerful enough to take on indexing and last.


The article states that Google was not mentioned, and that legal licensees of the content (like Google is) would not be sued. Finally, the gist of the article is that publishers do not want their content appearing on unauthorized ad-supported sites, and Google does not appear to display full articles outside of the AP feed it licenses.


But what would happen if Google just removed them all from the indexes for a month or so. I for one would be giggling hysterically for quite a while :)

Then the papers (or what is left of them) will be demanding that Google index them as an issue of national importance. 'Do no evil' does not preclude showing them who's boss does it? Tough love and all that.


> But what would happen if Google just removed them all from the indexes for a month or so. I for one would be giggling hysterically for quite a while :)

I, for one, would be taken aback if Google started actively playing King-maker. It's sad enough that one company has the ability to do what you say, but to actively flaunt that ability would cause me to search for another search engine of choice, and hopefully I'm not alone.

Besides, playing games with index-censoring would open up space for competitors (competing on the basis of not censoring, while hopefully having comparable result quality), which is not exactly in Google's best interest.


It's effectively what the AP is asking though, or at least what they're insinuating that they're asking. Either to de-list, or to have Google list and index based on their stated (or paid, or bought) priorities. I wouldn't ever expect/want Google to actually just go for it, but if they did just flip the switch on the first bit, the results would be fairly amazing for that month. Google News would become even shittier than it is for that month, too.


You can block Google via robots.txt, though?
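
For reference, a minimal sketch of what that would look like, checked with Python's standard urllib.robotparser. The robots.txt contents and the news.example.com URL are hypothetical, not anything the AP or its member papers actually publish, and honoring the file is voluntary on the crawler's side:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a news site: shut out Googlebot, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical story URL, used only for illustration.
STORY_URL = "https://news.example.com/story/1"
googlebot_allowed = can_fetch("Googlebot", STORY_URL)
other_allowed = can_fetch("SomeOtherBot", STORY_URL)
```

Note that this blocks crawling of one site only. It does nothing about the same wire story appearing on every other subscriber's site.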


Well what would happen to all the aggregators during this month if they had no one to link to? Would they have to buy newspapers and hand type their entries to get their content?


"We can no longer stand by and watch others walk off with our work under misguided legal theories."

Is attacking fair-use now the official go-to play when you can't figure out how to deliver your content to your customers in a way they actually want?

I swear I'm having a flash-back to the RIAA five years ago with this press release.


There is a "share" widget next to the original article on the NY Times - you can repost the article to Facebook, etc, with the click of a button.

I am ... confused ... by what the newspaper industry wants us to do with their content.


> I am ... confused ... by what the newspaper industry wants us to do with their content.

They are, too. No one seems to have a good idea about how to deal with the transition. Especially with the NY Times, you can't really blame them. They've been very open about the situation they are in and have been experimenting heavily. The AP, on the other hand, has been pretty aggressive and seems to believe they can fight off progress.


I don't even use Google News much. I let somebody else decide if it's worth reading and follow links from other sites and blogs. Most "news" is useless, largely wrong, or at best seriously incomplete. Following the "news" is almost as bad as watching television.


Google's Response:

"Hey douchebags at the AP, if you could, oh I dunno get out of your own damn way and actually get your content online and make it more accessible there wouldn't be a need for 'aggregators' like us."

And since when is 'free advertising' like the AP is getting all over the net a bad thing? As long as people aren't claiming the content is 'theirs' I see no problem with effective and productive aggregation services like Google News.


Not surprisingly, the publisher of this article wanted me to register before viewing it.

Here's a link to text only:

http://tinypaste.com/pre.php?id=5c678


watch...you'll get sued for copyright infringement or some other BS like that


Before the web, the AP provided a valuable service for newspapers. They provided national/international articles to newspapers who might otherwise be unable to provide national/international news, due to the expense of having reporters in every major city in the US/world.

Post web: what value do they provide in selling articles for publication (on the web, specifically)? It's just as easy (if not easier) to link to an AP article (say, on the AP site itself) and not pay the AP subscription fee, provided a web-based newspaper run by a traditional print newspaper company can grasp that it's okay to lead people off their site. The AP model now falls apart.


Have you run a content site of your own?

What you say is the standard line of thought on the AP that lots of hackers and new media folks take, but it's not based on reality.

If you link to an AP article instead of publishing the whole thing, you lose out on Google hits from both organic search and Google News. Even if 100 other people run the same article on their sites, Google's algorithm will reward you more for publishing it for the 101st time than it will if you just link to it on someone else's site.

Sending people away is great if it's the only thing you do. But if your strategy includes racking up pageviews with ads on them, then you MUST keep them on your site. So you rewrite other people's stories (Gawker), you quote/steal big chunks of other people's copy (Business Insider) and you make attribution links as small as possible (Gothamist).

As long as you get more hits from running a full article over just linking to someone else's article, there's no incentive for publishers to do the sorts of things that seem so intuitive to us geeks.


Where do you draw the line between "unlawfully using" an AP story and linking to it?


I'm actually very near to launching an automated news aggregator and I'm wondering if this is going to affect me. My suspicion is that sites like Techmeme / HN or smaller will probably be left alone. What do you guys think?


I think Techmeme is the sort of site they want to shut down. Hacker News and Digg would be much, much harder.

The difference is that while Techmeme and Google News are only displaying the headlines and summaries, they are indexing the FULL article and making use of it to power their algorithms.

If they determine it that way, I don't see what the big deal is. If you want to use the Twitter API, you have to pay past a certain point. Maybe a year from now, if you want to index full news articles for your aggregator, you have to pay past a certain point as well. That wouldn't be too bad at all and would leave that act of linking safe and untouched.


That's a very interesting way to look at the problem - algorithms like mine and Techmeme's do indeed digest the full article whereas HN does not. I had thought the primary issue would be whether or not the site provided a summary / thumbnail (as I see WindyCitizen does), not whether or not it scanned the source's bits.

Having to pay for my algorithm to access this data would be a big deal to me - I'm operating on a shoestring budget and I don't want to do that.


Sure, it would make things hard for you, but again, if you were using any other sort of data, there'd be usage costs. The news folks are just figuring that part out now, while every tech-first company has that built into their business from the start.


Except for the fact that transmission/distribution costs are almost zero and 'news' is a broadly consumed information resource. It doesn't matter that there would have been usage costs in the past -- they would be silly now.


If every other API charges for usage, why shouldn't news sites?

If you want to index their stories, you can do 500 queries per day for free. After that, you pay, just like any other API.
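
A rough sketch of the metering I have in mind, assuming each crawler identifies itself with an API key. Only the 500 figure comes from the line above; the DailyQuota class and the key name are made up for illustration:

```python
from collections import defaultdict
from datetime import date

FREE_QUERIES_PER_DAY = 500  # the free tier proposed above


class DailyQuota:
    """Counts queries per (api_key, day) and reports how many exceed the free tier."""

    def __init__(self, free_per_day: int = FREE_QUERIES_PER_DAY) -> None:
        self.free_per_day = free_per_day
        self._counts: dict[tuple[str, date], int] = defaultdict(int)

    def record(self, api_key: str, day: date) -> bool:
        """Record one query; return True if this query falls outside the free tier."""
        self._counts[(api_key, day)] += 1
        return self._counts[(api_key, day)] > self.free_per_day

    def billable(self, api_key: str, day: date) -> int:
        """Number of queries made on `day` beyond the free allowance."""
        return max(0, self._counts[(api_key, day)] - self.free_per_day)


quota = DailyQuota()
today = date(2009, 4, 6)
# Hypothetical aggregator making 502 queries in one day.
flags = [quota.record("aggregator-key", today) for _ in range(502)]
extra = quota.billable("aggregator-key", today)
```

The counter resets implicitly because the day is part of the key. Pricing the extra queries is a separate question I'm leaving out.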


But the API is HTTP and the news stories are syndicated across a hundred different sites. How do you limit the crawlers under this scheme? It seems like any serious attempt to limit crawling will require major software redeployment, cooperation of crawlers, widespread authentication, or some combination of these. Is there actually a feasible way to do this without breaking the web?


These are all good points. I don't have answers to any of them. Feasibility is a whole other issue.

My point is that if you look at online newspapers as online services, then they should be able to charge people for programmatic access to their service, just like any other tech service does through its API.

If I want to build an app on the back of Yahoo BOSS, I have to pay Yahoo.

If I want to build an app on the back of the New York Times, maybe I should have to pay the New York Times.


What's the name of your aggregator? We're doing something similar. Wanna chat? peter [\at] omgponi.es


This move may not be as naive as it sounds. Newspapers have built a user base and loyalty over the years by presenting perspectives that suit their readership.

Google and other news aggregators break this ability of newspapers to "present a single perspective" and often present headlines from WSJ and nytimes side by side.

Imagine the advantage newspaper sites would have if you HAVE to go to online.wsj.com or nytimes.com to get your news instead of google.com/news or another aggregator.


When the NYT required you to register to read an article that a blogger was discussing, I did not bother reading the article. Lots of people did just the same. It is news; it's not like other news sources will not cover the same event. I just have to wait an hour or so and it will be on the BBC, where I can read it for free and without registering. The NYT does not have a lock on the events that it reports. Someone else will report the same event. If the NYT cannot command my attention by the quality of their writing, then putting up a paywall is hardly going to entice me.

And as for the user base, why do you think they are closing down? Because the user base is no longer there.

The more stuff like this I see, the more I am convinced that newspapers are heading for extinction.

Perhaps I should buy some so that I can sell them on eBay in a decade or so :)


You do have to go to those sites to get your news. All you get from Google News is a two sentence blurb and maybe a thumbnail image.


Not without reading or having been exposed to an alternate view point in the cluster of headlines presented by aggregators...


And what exactly is so pernicious about an alternate view point?


"usually headlines and a sentence or two is allowed under the legal doctrine of fair use. News organizations have been reluctant to test that idea in court"

Yeah, cause it almost certainly is fair use.

"There’s a bigger economic issue at stake here that we’re trying to tackle."

Yeah, your business model didn't scale with the Internet and you're too old/tired/scared to try new ones.


Can't they do a robots.txt or some other way of stopping Google from crawling their site? Or, do they want to let it happen so they can whine about it?

As others have pointed out, there are multiple sources of news (especially those in other countries).


The internet has made media non-excludable and the media aren't willing to accept this, for understandable reasons--a non-rival and non-excludable good doesn't make you much money! Short sell!


The approach of the AP here will be interesting to watch. I think there does need to be some more balance between the content creators and those that do nothing but have popular RSS feeds.


Lesson from the soon to be forgotten newspaper industry:

If you've dug yourself into a hole... dig faster and deeper, that's the way out!


I think Marc Andreessen said they need to play more offense and less defense. True here.



