Regardless of whether PageRank is "bad math" (with the author as the arbiter of what's bad), it was never about being formally anal; it was about solving a problem -- making search much, much better than the then-competition.
PageRank solves the problem with flying colors. There is nothing wrong with having hidden constants that you tweak until you get the results you want. The alternative would be, instead of coding what has become Google, to attempt to find a more general solution. Maybe you'll find it. Maybe. And if you do, by the time you have, someone else will have come along and made Google instead of you. And for what? Mathematical purity? A phobia of constants?
I suppose the author also feels that much of physics is bad, since it's riddled with constants upon constants, all of them "ticking time bombs": http://en.wikipedia.org/wiki/Physical_constant
You, and many others, seem to imply that being "bad math" (mathematically unsound) is the same as being "bad" (in the broad sense). The author never said the method was useless, or anything similar. PageRank was obviously a great advance at the time, but that does not mean it cannot be improved.
Now, if you accept that it can be improved, the author has a great point: trying to understand the role that this made-up constant plays in the algorithm (and getting rid of it if possible) is a clear path to what might be an improvement.
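To make the constant's role concrete, here's a minimal sketch of the PageRank iteration (the toy graph, tolerance, and names are mine, and it assumes every page has at least one outgoing link); d = 0.85 is exactly the made-up constant in question:

```python
# Minimal PageRank power iteration. The damping factor d = 0.85 is the
# made-up constant under discussion; nothing in the model dictates its value.
# Assumes every page has at least one outgoing link (no dangling pages).
def pagerank(links, d=0.85, tol=1e-9):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    while True:
        # (1 - d) / n is the "random surfer teleports" term; the rest is
        # rank flowing in from every page q that links to p.
        new = {p: (1 - d) / n
                  + d * sum(rank[q] / len(links[q])
                            for q in pages if p in links[q])
               for p in pages}
        if max(abs(new[p] - rank[p]) for p in pages) < tol:
            return new
        rank = new

# Purely illustrative toy graph:
print(pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]}))
```

Every score depends on d, and nothing in the model tells you it should be 0.85 rather than 0.80 or 0.90 -- which is exactly the author's complaint.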
Oh, and "physical constants" are not even remotely similar to using a made-up constant. Physical constants are measured experimentally, meaning that they "exist" in the object that the mathematical model tries to explain. Now tell me how do you measure ".85" being an "existing" constant in the "ranking of website importance" object that pagerank tries to model...
> You, and many others, seem to imply that being "bad math" (mathematically unsound) is the same as being "bad" (in the broad sense). The author never said the method was useless, or anything similar. PageRank was obviously a great advance at the time, but that does not mean it cannot be improved.
I did not say the author said PageRank was bad. I merely said that it being "bad math" was irrelevant to the real world.
> Oh, and "physical constants" are not even remotely similar to using a made-up constant. Physical constants are measured experimentally, meaning that they "exist" in the object that the mathematical model tries to explain. Now tell me how do you measure ".85" being an "existing" constant in the "ranking of website importance" object that pagerank tries to model...
I was just showing how flawed his π analogy (he was comparing π to PageRank's constants) and his reasoning that π is a time bomb are.
You miss the point. The damping factor is arbitrary. Changing the damping factor changes the results of PageRank. So Google's search results are, in a sense, arbitrary. When Google controls the fate of so many Internet companies (by determining whether they get traffic), having their search results determined by an arbitrary parameter is galling. At least when Google blacklists certain domains you get the sense it is a conscious decision, and one that can be appealed against.
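To see just how arbitrary, here's a toy experiment (the graph and numbers are mine, using the simplified variant in which pages without outgoing links simply leak their rank): a page with three direct in-links and a page fed through an intermediate hub literally trade places depending on d:

```python
# Toy demo: the same link graph ranked with two damping factors.
# Simplified variant: dangling pages ("a", "b") just leak their rank.
def pagerank(links, d, iters=200):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        rank = {p: (1 - d) / n
                   + d * sum(rank[q] / len(links[q])
                             for q in pages if p in links[q])
                for p in pages}
    return rank

# "a" gets three direct in-links; "b" is fed through hub "m",
# which itself gets four in-links.
links = {"l1": ["a"], "l2": ["a"], "l3": ["a"],
         "m1": ["m"], "m2": ["m"], "m3": ["m"], "m4": ["m"],
         "m": ["b"], "a": [], "b": []}

for d in (0.85, 0.3):
    r = pagerank(links, d)
    print(d, "a outranks b:", r["a"] > r["b"])
# d = 0.85 puts "b" on top; d = 0.3 puts "a" on top.
```

On the real web graph the effect is subtler, but the point stands: the ordering is partly a function of a knob somebody picked.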
Google was successful then with PageRank, but maybe now they need something better? It is my personal opinion, but building something as huge as Google search is now on such a shaky foundation as PageRank is dangerous.
I don’t think physical constants are made up and they certainly aren’t hidden.
PageRank is one of more than 200 metrics Google uses in its search engine. Your presumption that it is the foundation, or even the most important metric, is unwarranted.
Well, how search generally works now was actually one of the first things I learned when I started working for Google, but my point was that you don't know and are just making things up.
OK, maybe you do know. However, it's odd that you bring up that you work at Google and then add nothing new to the discussion (everybody knows about the 200 metrics). How about: "You're wrong, PageRank has never been the foundation of Google search"?
I never said PageRank has never been the foundation of Google search or that PageRank isn't a part of search, just that there are other parts that you are not taking into account.
As to why I'm not saying exactly how Google search works -- I'm obviously under confidentiality agreements to not disclose that information.
It's not that physical constants are made up or hidden, it's that we know most of them with very limited precision, because they are determined via experiment.
The math in the original 1998 PageRank paper might not be 100% sound, but why would they need that in the first place? Do you really think you need a formal analysis before you build something? This is not academia, you know - if you needed a formal proof of everything you did, you'd never get anything done.
Besides, the paper you're referring to is 13 years old. Why drag it up now?
They found time to publish this in 1998 but have been too busy since? I don’t think so. The reason is that they decided to keep all new development secret. That’s why we have to speculate.
"But π=3.14159265358979 is a time bomb! Sooner or later it will fail you when it’s not accurate enough anymore."
Well, actually for almost everything humans do, this will never, ever fail you. In fact, I can't think of a single thing this will fail for outside of physics research or formal mathematics.
Well, sure. Google is a machine that turns small text queries into internet links. It does this remarkably well. If manually penalizing sites gets you solid improvements beyond what automated or clever methods give you, it is bad engineering not to do it. What's the argument against this? Sure, it makes engineers feel icky to have manual anything, and it's not nearly as scalable or reliable as some hypothetical automated solution. The issue is that the hypothetical automated solution doesn't exist. I'm sure they have 500 of the smartest, most capable people in the world working on this every single day; that they haven't found it is good evidence that the solution is not obvious.
Surely a fully automated, self-correcting algorithm exists to penalize spam sites and shitty content farms, and whoever comes up with it will be heavily rewarded financially, either because they will make major inroads in the search space or because there will be a bidding war for the technology. That nobody outside of Google has come up with it suggests it is likely beyond the engineering ability of humans at the moment.
> But π=3.14159265358979 is a time bomb! Sooner or later it will fail you when it’s not accurate enough anymore.
If you know the exact diameter of the sun and calculate the circumference with 3.14159265358979 as an estimate for pi, your error will be a few microns. Using a 14-digit estimate of pi is never going to be a time bomb for any practical task. If the earth were round to 14 significant digits, the highest mountains would rise less than a micron above the deepest valleys.
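For the curious, the arithmetic is easy to check (the solar diameter figure is my approximation):

```python
import math

SUN_DIAMETER_M = 1.3914e9   # mean solar diameter, roughly 1.39 million km
PI_14 = 3.14159265358979    # the 14-digit value from the article

# Circumference error from truncating pi at 14 digits:
error_m = SUN_DIAMETER_M * abs(math.pi - PI_14)
print(error_m)              # ~4e-06 m, i.e. a few microns
```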
But the sun isn't a perfect sphere. If you're using pi to calculate the circumference of the sun, and need any sort of accuracy, you're doing it wrong - you're answering a question with no relation to reality.