> there's actually proportionally less failures in Product Hunts busiest period
This is a really interesting post! I think there's a little survivorship bias at play. As Product Hunt grew from 2015 to 2017, users posted old projects of theirs that were already popular and successful.
My guess would be that URLs for the categories eliminated after that period (e.g. Books and Podcasts) are more likely to remain stable and available, even if the product was a flop.
This is advice that seems reasonable but is actually pretty harmful.
Take a startup with a few users. The senior engineer decides they need pub/sub to ship a new feature. With Kafka, the team has to learn Kafka best practices, choose client libraries, and work around Kafka's quirks. They also need to spin up Kafka instances. They ship in a month.
With postgres, they’ve got an MVP in a day, and shipped within a week.
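The "MVP in a day" above is essentially a single jobs table plus an atomic claim query. A minimal sketch of that pattern, using sqlite3 as a stand-in so it runs standalone (on real Postgres you would claim jobs with `SELECT ... FOR UPDATE SKIP LOCKED`, which makes the claim safe under concurrent workers; the select-then-update below is not):

```python
import sqlite3

# Single-table job queue: the pattern behind a Postgres pub/sub MVP.
# sqlite3 stands in for Postgres so this sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE jobs (
        id      INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        status  TEXT NOT NULL DEFAULT 'pending'
    )
""")

def publish(payload):
    conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    conn.commit()

def claim_next():
    # Take the oldest pending job. On Postgres this SELECT would use
    # FOR UPDATE SKIP LOCKED so concurrent workers never double-claim.
    cur = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'pending' "
        "ORDER BY id LIMIT 1")
    row = cur.fetchone()
    if row is not None:
        conn.execute("UPDATE jobs SET status = 'taken' WHERE id = ?",
                     (row[0],))
        conn.commit()
    return row

publish("send-welcome-email")
publish("resize-avatar")
print(claim_next())  # (1, 'send-welcome-email')
```

The names (`jobs`, `publish`, `claim_next`) are illustrative, not from any particular library.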
I can set up an application to use AWS SQS or GCP PubSub in a day and it will scale without a second thought. I don't think it's productive to compare the worst case of scenario A and the best case of scenario B.
> How does any of this equally not apply to PostgreSQL ?
1. Postgres is easier to set up and run (than Kafka)
2. Most shops already have Postgres running (TFA is targeted to these shops)
3. Postgres is easier to adapt to changing access patterns (than Kafka).
----
> Is this some magical ...
Why must your adversary (Postgres) meet some mythical standard when your fighter (Kafka) doesn't meet even basic standards?
> With postgres, they’ve got an MVP in a day, and shipped within a week.
And the next week they realize they want reader processes to block until there is work to do. Oops, that's not supported out of the box. Now you have to code that feature yourself... and soon you're reinventing Kafka.
The article does a good job of explaining the difference between WHERE and HAVING. The simplest resource I've found for this is Julia Evans' "SQL queries run in this order" [0], which points out, for example, that SELECT is one of the last clauses to run in a query.
I've managed software teams and data engineering teams, and both get tripped up by even moderately complex SQL queries. To simplify, we encouraged teams to use a clearer subset of SQL. Most HAVING clauses can be replaced with a WHERE over a more readable, explicit subquery. Similarly, we got rid of most RIGHT JOINs.
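The HAVING-to-subquery rewrite can be shown in a few lines. A runnable sketch (sqlite3, with a made-up `orders` table): WHERE filters rows before grouping, HAVING filters groups after aggregation, and the second query moves the group filter into a plain WHERE over an explicit subquery:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('alice', 50), ('alice', 70), ('bob', 20), ('carol', 200);
""")

# Group filter expressed with HAVING.
having = conn.execute("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 100
    ORDER BY customer
""").fetchall()

# Same filter as a WHERE over an explicit subquery.
subquery = conn.execute("""
    SELECT customer, total
    FROM (SELECT customer, SUM(amount) AS total
          FROM orders GROUP BY customer)
    WHERE total > 100
    ORDER BY customer
""").fetchall()

print(having)    # [('alice', 120), ('carol', 200)]
print(subquery)  # same result
```

Both return the customers whose summed orders exceed 100; the subquery form makes the "aggregate first, then filter" order explicit.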
This is quite a bit of infrastructure for an app that hasn't launched yet. If it's not too late, consider simplifying by removing RabbitMQ or Redis. Perhaps even getting rid of both, and only using MySQL. Maybe your workers could become cron-jobs or threads.
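For the "workers as threads" option: at pre-launch scale, a background thread draining a stdlib `queue.Queue` can stand in for RabbitMQ plus dedicated worker processes. A minimal sketch (job names are illustrative):

```python
import queue
import threading

# In-process work queue: a stand-in for a broker + worker processes
# while traffic is small.
jobs = queue.Queue()
results = []

def worker():
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut the worker down
            break
        results.append(f"processed {job}")
        jobs.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

jobs.put("welcome-email:42")
jobs.put("thumbnail:43")
jobs.join()                      # block until in-flight jobs finish
jobs.put(None)                   # ask the worker to exit
t.join()
print(results)
```

The tradeoff versus a broker is durability: queued jobs die with the process, which is usually acceptable before launch and not after.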
For hosting, consider Heroku and Heroku add-ons for MySQL, Redis, and RabbitMQ. You could run workers in Heroku as well.
Running this entirely on a VPS would also be possible and fairly straightforward. I've also had success running the web app (with Postgres and Redis) on Heroku but the workers on a VPS.
Hmm, it may be tricky to redo the infrastructure to remove Redis. I use a three-tier storage setup of local cache => Redis => MySQL, where each layer falls back to the next on a miss. I also use Redis for things that are faster there than in SQL, like a real-time leaderboard.
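The fallback chain described above is a read-through cache. A sketch of the lookup path, with plain dicts standing in for the Redis and MySQL clients so it runs standalone (keys and values are made up):

```python
# Read-through chain: local cache -> Redis -> MySQL, backfilling the
# faster layers on a miss. Dicts stand in for the real backends.
local_cache = {}
redis_stub = {"user:1": "alice"}                   # warm Redis layer
mysql_stub = {"user:1": "alice", "user:2": "bob"}  # source of truth

def get(key):
    if key in local_cache:                # fastest layer first
        return local_cache[key]
    if key in redis_stub:                 # local miss: try Redis
        local_cache[key] = redis_stub[key]
        return redis_stub[key]
    value = mysql_stub.get(key)           # last resort: the database
    if value is not None:                 # backfill the faster layers
        redis_stub[key] = value
        local_cache[key] = value
    return value

print(get("user:2"))  # 'bob' (served from MySQL, then cached)
print(get("user:2"))  # 'bob' (now served from the local cache)
```

With real backends each lookup would be a client call (and you would add TTLs/invalidation), but the fallback-and-backfill shape is the same.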
RabbitMQ is also integrated heavily between the different services.
Looking at Heroku, it seems a bit overpriced for MySQL and Redis; the cheaper tiers don't include much RAM.
I might look at VPS/Bare metal to host all the infrastructure. Thanks!
https://postgraphs.com/ - it’s an old weekend project of mine that I’d like to finish up soon