Heroku architecture (quora.com)
221 points by helwr on May 12, 2011 | 17 comments


I posted that answer. If you have any additional questions or insights let me know.


I've been wondering about the internals for a while, and only had a brief notion of what it was like. Thanks for doing all this research!


Well done, sir.


The routing mesh is less discriminating than you might think. There is no "global queue".


I thought it was terrific that you did this - I emailed the answer to myself so I can read it on the bus...

It looked like you did a lot of research into it based on the sources you posted.

How long did it take you to create that post?

What is your primary area of interest -- If I post a question about some other infrastructure, would you give such a detailed response, or was this a one-off?


It took maybe 4 hours, but it was fun and I learned a lot (not just about Heroku).

I've been using Heroku since near the beginning (we were in YC W08 together) so I was already vaguely familiar with the architecture. It was mostly filling in the details.

Infrastructure is a side interest that I'm trying to learn more about, so I can't guarantee I'll be able to answer your questions, but it's worth posting them anyway. Hopefully other people can answer as well (of course that's what I thought when I posted the Heroku question... I ended up answering it myself)

Rather than directing the question at me, just post a normal question on Quora, and post the link here or message me.


Might be worth noting that Heroku has released Doozer as an open source project, and linking to its home at: https://github.com/ha/doozer


It's already mentioned in the answer under "misc tech". I couldn't find a lot of info on how Doozer is actually used at Heroku though.


If this stuff sounds interesting to you, we're always looking for excellent engineers (and much more): http://jobs.heroku.com/


Take a look at the CloudFoundry project -- http://github.com/cloudfoundry/vcap

It's open source and maintained by an uber-capable group at VMware.


Surprisingly good link. While we're discussing Heroku, does anyone have any deep[er] insights into the differences between Zookeeper and Doozer?


Does this mean that 3 proxy servers serve all the traffic for ALL apps hosted on Heroku?


It looks like they have about 6 total. Plus some customers pay $100/month for their own front-end proxy server so they can use SSL on a custom domain name.

Of the domains I checked (a few hundred found through the "Find Subdomains" tool here: http://www.magic-net.info/), appname.heroku.com (including proxy.heroku.com) will return 3 of these:

    50.16.215.196
    50.16.232.130
    50.16.233.102
    75.101.145.87
    174.129.212.2
It's possible Heroku's DNS is returning different IPs based on the load of the reverse proxies, but when querying heroku.com's 4 nameservers directly I got different subsets of those 5 IPs. A random distribution of IPs probably gives good enough load distribution.
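
As a rough check on the "good enough" claim, here is a minimal simulation sketch. It assumes (none of this is confirmed by Heroku) that each DNS answer is a uniformly random 3-of-5 subset of the IPs listed above, and that a client always connects to the first record in the answer. The names `answer_query` and `simulate` are made up for illustration.

```python
import random
from collections import Counter

# The five front-end IPs observed above for appname.heroku.com.
PROXY_IPS = [
    "50.16.215.196",
    "50.16.232.130",
    "50.16.233.102",
    "75.101.145.87",
    "174.129.212.2",
]


def answer_query(pool, k, rng):
    """Return k distinct addresses in random order (assumed DNS behavior)."""
    return rng.sample(pool, k)


def simulate(pool, queries=10000, k=3, seed=0):
    """Count how often each proxy is picked across many simulated lookups."""
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(queries):
        # Assumption: the client uses the first record it receives.
        first = answer_query(pool, k, rng)[0]
        hits[first] += 1
    return hits


if __name__ == "__main__":
    for ip, count in sorted(simulate(PROXY_IPS).items()):
        print(f"{ip:>15}  {count}")
```

Under those assumptions each proxy ends up with roughly a fifth of the lookups, with no load feedback involved at all.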

And Heroku's documentation says to point A records to these three:

    75.101.163.44
    75.101.145.87
    174.129.212.2
It's also interesting to note that apps aren't tied to specific proxy servers. If you set the "Host" header to your app's subdomain, a request to any one of those IPs will work.

I'll add this info to the Quora answer.
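
The "Host" header observation above can be reproduced with something like the following sketch. It assumes plain HTTP on port 80; `build_request` and `fetch_via_proxy` are hypothetical helper names, and "appname.heroku.com" is a placeholder for a real app's subdomain.

```python
import socket


def build_request(app_host, path="/"):
    """Build a minimal HTTP/1.1 GET that names the app in the Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {app_host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_via_proxy(proxy_ip, app_host, port=80, timeout=5.0):
    """Connect to a specific proxy IP and return the raw response bytes."""
    with socket.create_connection((proxy_ip, port), timeout=timeout) as sock:
        sock.sendall(build_request(app_host))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Any of the proxy IPs listed above should do, if the claim holds.
    raw = fetch_via_proxy("75.101.145.87", "appname.heroku.com")
    print(raw.split(b"\r\n", 1)[0].decode("ascii", "replace"))
```

Running it against each IP in turn and comparing the status lines is a quick way to check that the app is not pinned to one proxy.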


Yup. Amazon has 3 too!

    $ dig amazon.com

    ; <<>> DiG 9.7.0-P1 <<>> amazon.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41998
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;amazon.com.      IN  A

    ;; ANSWER SECTION:
    amazon.com.   36  IN  A 72.21.194.1
    amazon.com.   36  IN  A 72.21.211.176
    amazon.com.   36  IN  A 72.21.214.128


Not exactly. The 3 IPs that are returned for proxy.heroku.com are not fixed. They are just a subset of the available front-end proxy servers.
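
One way to see this is to resolve the name repeatedly and take the union of the answers. A minimal sketch follows; `collect_ips` is a made-up helper, and it assumes the local resolver is not caching, so in practice you may need to query the authoritative nameservers directly instead.

```python
import socket


def collect_ips(hostname, attempts=20, resolve=socket.gethostbyname_ex):
    """Resolve hostname several times and return every address seen."""
    seen = set()
    for _ in range(attempts):
        # gethostbyname_ex returns (canonical_name, aliases, ip_addresses).
        _name, _aliases, addresses = resolve(hostname)
        seen.update(addresses)
    return seen


if __name__ == "__main__":
    ips = collect_ips("proxy.heroku.com")
    print(f"{len(ips)} distinct addresses: {sorted(ips)}")
```

If the union grows past 3 addresses, the individual answers are subsets of a larger pool. The `resolve` parameter is only there so a different lookup function can be swapped in.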


Exactly - it's common practice to use DNS to spread load across backend instances. For example, djbdns/tinydns by default "only" returns up to 8 IP addresses, so if you set up more than that, a random subset of those IPs will be returned.

http://cr.yp.to/djbdns/balance.html


This is really cool. Providing this as a service requires a very different approach than running open source code.



