Wolfram Alpha: First-Hand Impressions (thenoisychannel.com)
21 points by hzzk on April 1, 2009 | 8 comments


Without seeing a demo, it's hard to say anything factual - but in principle, the "problem" in any well-designed/near-perfect system is still human beings (context, language, HCI, "true", "news", etc.). So while it is possible that Wolfram Alpha has successfully modeled a completely new area of search/information modeling, it's more likely that WA is another novel way to parse language and information.

Also, Freebase and Wikipedia might be the competitors (I agree with Nova Spivack) - but Wolfram Alpha won't be a panacea until its data/technology is open source. The wisdom of the crowds is the only way to model... the wisdom of crowds.


Actually, they're not really about natural language--that's what I'm trying to get across in my post. They're more about a place to store and use structured data and access it formally. That they have an NLP interface right now is, in my mind, a kludge.

And they aren't taking an open source / wisdom-of-crowds approach. Rather, they are taking a very retro curated-data approach, more Britannica than Wikipedia. I'm a huge Wikipedia fan (see my recent post reacting harshly to Nick Carr's calling Wikipedia a "Potemkinpedia"), so my reaction was highly skeptical. But for objective data like the populations of countries and atomic weights of elements, curated works just fine. In fact, for structured data, I suspect curated is probably the way to go.


Am very curious to hear from others who have seen Wolfram Alpha in action. They did a lot to address my initial skepticism (which was extreme--see my earlier post), but it's no slam dunk by any means.


A biz dev guy from Wolfram gave me a walkthrough recently (no NDA required, which was nice) - he had a big text list of queries that he ran through it, and although I didn't try to stump him with my own suggestions, the results returned seemed reasonable, if you weren't expecting an exact replica of a Google results page, and particularly if you had a rough idea of what information sources it contained, and didn't ask questions outside those bounds.

Some examples seemed gimmicky and of questionable value (e.g. dividing the price of gold by the price of silver) - but given the right context, I'm sure it could return useful results for more realistic queries as well.

In one instance, he used the wrong tense ("is" instead of "was") in a query ("How old is Barack Obama in 1967" or something like that), and no results were returned - but the error message wasn't smart enough to suggest why it didn't work.

One big challenge will undoubtedly be to "curate," as they say, enough common data sources to make it useful nearly all of the time...rather than just some, or even most, of the time. I think there's a threshold of usefulness (measured in percentage of successful queries - perhaps 98% or more) that Wolfram will have to reach if it intends not to languish as a mere curiosity like Cuil. Given that they've already got 200 people doing curation of just their existing data sources, and they plan to add many more sources, that seems like a pretty significant financial barrier to cross - I don't know whose pocketbook got the project to this stage, or how they intend to continue before it starts generating income.

If Wolfram Alpha IS a success, it seems like it could change the behavior of web users in two ways:

1. People will start expecting a higher degree of authoritativeness from information they refer to on the web. For instance, if you're making a decision with a committee, and there's an email thread going around that sums up facts upon which the decision will be based, nowadays those facts might include links to "non-expert" sources like Wikipedia. If you have the option of linking to the same information in Wolfram Alpha, you'll probably choose to link to Wolfram instead, since you know the information has been vetted by paid experts. There also appeared to be a "References" link at the bottom of each results page that had an AMA-citation-style list of sources. Although Wikipedia also often links to sources, it's arguably more of a crap shoot as to whether you should trust the source, the editor referring to the source, or even that the reference was made in the right context.

2. People will start using Wolfram as the single (or at least the first) place that they go to for factual, authoritative information. For example, nowadays, if I want authoritative information about a stock, I can go to Yahoo Finance and pull up the financial details - and be reasonably certain that it's accurate and trustworthy. However, if I also need to refer to another type of data - for example, official population data for a region - I'd have to find another authoritative source - perhaps the CIA world factbook. Instead of having to remember multiple sources, it's much easier to go to a single, trusted source for all types of factual data.


Sounds like we may have talked to the same guy. I agree that breadth of data is a challenge, and I didn't have the chance to push its limits. Though I was surprised that it knew about acids but not about the pH levels. That might be ok--and certainly fixable if it matters to enough people--but it's a reminder that there will always be holes.

It is interesting to imagine an approach like this raising the bar for information quality / provenance on the web. That would be a delightful outcome wrt information accountability.


The is_a(china, country) example suddenly made Alpha "click" for me: it's the "general knowledge" database that you feed to semantic web programs to correlate their domain-specific knowledge to.
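To make the is_a(china, country) idea concrete, here is a minimal sketch of a curated fact store that other programs could query. It is only an illustration: the facts, the IS_A set, and the transitive lookup are all assumptions made up for this example, not anything Wolfram Alpha is known to expose.

```python
# Hypothetical curated "general knowledge" facts as (child, parent) pairs.
# These entries are illustrative only, not real Wolfram Alpha data.
IS_A = {
    ("china", "country"),
    ("country", "political_entity"),
    ("gold", "element"),
}


def is_a(entity, category):
    """Return True if entity reaches category through one or more is_a facts."""
    seen = set()
    frontier = [entity]
    while frontier:
        current = frontier.pop()
        for child, parent in IS_A:
            if child != current or parent in seen:
                continue
            if parent == category:
                return True
            # Remember the parent so cycles in the facts cannot loop forever.
            seen.add(parent)
            frontier.append(parent)
    return False


if __name__ == "__main__":
    print(is_a("china", "country"))
    print(is_a("china", "political_entity"))
    print(is_a("china", "element"))
```

A domain-specific program would keep its own data and call something like is_a() only to tie its entities back to the shared, curated categories.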


Yeah, that's at least how I saw it. And I think that message is being lost in the rush to see it as the next Powerset. I know Powerset and NLP generally get lots of hype, but I think that sort of positioning for them would be a death sentence.


i had no idea until about a month ago that these guys are all located down the street from me. cool to see them getting coverage considering we're in the middle of a corn field. </random>



