bysin's comments | Hacker News

Online retail happened, and it's not just America [1].

[1] https://en.wikipedia.org/wiki/Dead_mall#Changes_in_the_retai...


And yet I don't see any of this in Canada: malls are packed, more malls are being built, and Toys R Us is still in all of them. What could be the difference between Canada and the USA?


I'd wager that Canada has/had far fewer malls per capita than the USA. In the US, even many small nowhere cities would have a mall.


https://upload.wikimedia.org/wikipedia/commons/0/0d/CrossIro...

That one was built outside the city, in the middle of nowhere, because there wasn't enough water and land to build it within the city. Yet it's still full, a Costco and other stores have since opened next to it, and a new mall was built across from it. There are 10+ malls within 20 km.

Something is different. Maybe the weather makes more people want to go somewhere warm in the winter? But then again, malls are doing well in many parts of the world.


In the Wikipedia entry you provided, there's an example in Rust, similar to the 9th code snippet, which it claims is variable shadowing:

https://en.wikipedia.org/wiki/Variable_shadowing#Rust


I'm glad you said this. There's an async cargo cult going on, where every service must be written in "performant" async code, without knowing the actual resource and load requirements of an application.

From the last benchmark I ran [1], async IO was insignificantly faster than thread-per-connection blocking IO in terms of latency, and marginally faster only after we hit a large number of clients.

Async IO doesn't necessarily make your code faster; it just makes it more difficult to read.

[1] http://byteworm.com/evidence-based-research/2017/03/04/compa...


A ~20% improvement in throughput and latency while using 50% less memory (which could allow more workers per-box) is not a "marginal" improvement in my book.


    const users = await getUsers();
    const tweets = await getTweets(users);
    console.log(tweets);

Is async code really harder to read?


JavaScript's async feels a bit more natural than Python's.

In Python, you've also got to run the event loop and pass the async function to it, which makes playing with async code in the interpreter more difficult. And don't forget that async is turtles all the way up (same as in JS): it'll infect any synchronous code that touches it.
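A small sketch of the "turtles all the way up" point in JS terms (all names here are hypothetical):

    // Once a low-level helper is async, every caller that needs its result
    // must await it, and therefore must itself be async, and so on upward.
    async function readSetting() {
      return 42;                      // stand-in for real awaited IO
    }

    async function buildConfig() {    // forced to become async
      return { value: await readSetting() };
    }

    async function main() {           // ...and so is its caller
      console.log(await buildConfig());
    }

    main();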

I've written a Tornado app which makes heavy use of asyncio, and while it's pretty efficient, I would reconsider writing it the same way if I had to go back in time.


It's not bad anymore with async/await and promises/futures, but that featureset is still bleeding-edge in most languages. Older-style async code was much more annoying.


In your example the async code doesn't really help anything, though: the next statement has to wait for the response from the previous one before continuing.

In your example you'd probably want to be using Promise.all to run two IO operations simultaneously.
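A minimal sketch of that, with hypothetical stand-ins for two independent requests so Promise.all can actually overlap them:

    // Hypothetical stand-ins for real IO calls.
    const getUsers = () => Promise.resolve(['alice', 'bob']);
    const getTrendingTopics = () => Promise.resolve(['#async']);

    // Both requests start immediately and are awaited together, so their IO
    // overlaps instead of running one after the other.
    async function loadDashboard() {
      const [users, trending] = await Promise.all([getUsers(), getTrendingTopics()]);
      console.log(users, trending);
    }

    loadDashboard();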


The next statement has to wait, but the runtime can yield to another waiting async task so you aren't blocking the total throughput of your program (assuming it's async-all-the-way-down).

The benefits are generally larger-scale than a single method.
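A toy sketch of that effect (hypothetical handlers, IO simulated with a timer): each handler awaits sequentially, but the event loop interleaves them, so the waits overlap.

    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

    async function handleRequest(id) {
      console.log(`request ${id}: start`);
      await sleep(100);   // simulated IO; control returns to the event loop here
      console.log(`request ${id}: done`);
    }

    // Total wall-clock time is ~100ms rather than ~200ms, because while one
    // handler is awaiting, the other one runs.
    Promise.all([handleRequest(1), handleRequest(2)]);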


That's hardly representative async code. You're awaiting the actual async operations, which still have to be dispatched asynchronously from the main thread in order to execute, and at that point it's the same speed as just doing sync operations inside an async operation.

Actual asynchrony, usually with event-based systems, gets very ugly very fast, because you end up building callback chains and queueing up your async work. There can be a real benefit to doing it, but it's going to be a lot less readable than most sync code, and sometimes not any faster, as in the case of Node.js and its community forcing the use of async functions in places where they don't need to be used.
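For comparison, a hedged sketch of the same three-step fetch written callback-style (the Node-style APIs here are hypothetical stubs): each step nests inside the previous one's callback, and the error check repeats at every level.

    // Hypothetical stand-ins for real Node-style (err, result) callback APIs.
    function getUsers(cb) { setTimeout(() => cb(null, ['alice', 'bob']), 10); }
    function getTweets(users, cb) { setTimeout(() => cb(null, users.map(u => `${u}: hi`)), 10); }

    getUsers((err, users) => {
      if (err) return console.error(err);
      getTweets(users, (err, tweets) => {
        if (err) return console.error(err);
        console.log(tweets);
      });
    });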


That code probably represents one function in an event-loop web server processing more than one request at a time. Non-blocking behavior is important for work involving UIs.


Throw an exception and look at the stacktrace.


This looks pretty readable to me

https://repl.it/H547/2


Sorcery. Why don't my JS stacktraces look nice? :(


Depends on your dev environment. Almost all of the browser dev tools should catch up eventually. The fun of an ecosystem with multiple competing implementations.


> OS-level multitasking won't be able to achieve the same level of concurrency

Do you have a source for this claim? I've seen it repeated many times, especially in the Node.js community, but I've yet to see any evidence to back it up. From what I've read, a synchronous threaded model can be just as fast as an event-based system [1].

[1] http://www.mailinator.com/tymaPaulMultithreaded.pdf


A big problem with one-thread-per-connection is that you open yourself to slowloris-type DoS attacks.[1] Normal load (and even extreme load) is fine, but a few malicious clients can use up all of your threads and take down your server.

This is touched upon in the slides you linked to. On slide 62 (SMTP server) a point says, "Server spends a lot of time waiting for the next command (like many milliseconds)." A malicious client could send bytes very slowly, using up a thread for a much longer period of time. If the client has an async architecture, it can open multiple slow connections with little overhead. The asymmetry in resource usage can be quite staggering.

1. http://en.wikipedia.org/wiki/Slowloris_(software)


You seem to be imagining a case where you only allocate a small fixed thread-pool and when it runs out you just stop and wait. I think the slide deck is advocating that you just keep allocating more threads.


I'm talking about hitting OS or resource limits. Let's say a server is configured to time-out requests after 2 minutes. A malicious client could do something like...

Every second:

1. Open 40 connections to the server.

2. For all open connections, send one byte.

Repeat indefinitely.

Steady state would be reached at 4,800 open connections. At 1 byte of actual data per second per connection, data plus TCP overhead would use around 200KB/s of bandwidth. The server would have to run 4,800 threads to handle this load. Depending on memory usage per thread, this could exhaust the server's RAM.

There are ways to mitigate this simple example attack, but the only way to defend against more sophisticated variants is to break the one-thread-per-connection relationship.
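For a sense of the other side of that tradeoff, here's a minimal event-loop sketch in Node.js (the port and timeout values are arbitrary): each slow connection is just a socket plus an idle timeout, not a parked thread.

    const net = require('net');

    const server = net.createServer((socket) => {
      socket.setTimeout(10000);                      // drop clients idle for >10s
      socket.on('timeout', () => socket.destroy());
      socket.on('error', () => socket.destroy());
      socket.on('data', (chunk) => {
        // bytes are handled as they trickle in; no thread sits blocked waiting
      });
    });

    server.listen(2525);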


What I am truly missing is a good benchmark comparing async vs. sync. Everyone seems to say that async is best, but I don't see much evidence. For example, how would 4,800 threads exhaust the server's RAM when the thread stack size can be as small as 48kB? That's roughly 230MB of memory.

I'm not saying that the threaded approach is better, just that almost everyone comes around with some theoretical statement and nobody seems to care to find hard evidence.


You are right to distrust these claims. The reality is that threads can be significantly faster than async -- async code has to do a lot of bookkeeping and that bookkeeping has overhead. OTOH, threads have their own kind of overhead that can also be bad.

The slide deck that bysin linked above is pretty good:

http://www.mailinator.com/tymaPaulMultithreaded.pdf

This is by Paul Tyma, who at the time worked on Google's Java infrastructure team with Josh Bloch and other people who know what they're doing. Apparently he found threads to be faster in a number of benchmarks.

Ultimately which is actually faster will always depend on your use case. Unfortunately this means that general benchmarks aren't all that useful; you need to benchmark your system. And you aren't going to write your whole system both ways in order to find out which is faster. So probably you should just choose the style you're more comfortable with.

Async is kind of like libertarianism: It works pretty well in some cases, pretty poorly in others, but it has a contingent of fans who think they've discovered some magic solution to all problems and if you disagree then you must just not understand and you need to be educated.

(Note: The code I've been writing lately is heavily async, FWIW.)


Why is 4800 threads a problem, and 4800 heap-allocated callbacks not a problem? Are you assuming a thread consumes significantly more memory than the state you'd need to allocate in the async case? This isn't necessarily true.


It's the design that has allowed tools like nginx and HA Proxy to scale so well. There's a lot of good material here:

http://www.kegel.com/c10k.html


To be fair, that link is almost 15 years old. Back then we had 32-bit address spaces, and that was the main limiting factor for threads (because you'd often allocate 2MB of address space for each stack). And we didn't have multi-core processors.

These days you could actually reasonably have 10k threads. In theory switching between threads shouldn't be much different performance-wise than switching between callbacks in an event loop (either way you take some cache misses), and the thread stack is probably more cache-friendly than scattering objects all over the heap (and certainly easier to use).

But now you have the problem that synchronization between threads (whether mutex locking or by lock-free algorithms) is complicated and surprisingly slow, specifically because you have to worry about all the ways simultaneous memory access might confuse the CPU or its caches. Whereas with single-threaded async each callback is effectively a transaction, without requiring any slow synchronization.

Of course if you're doing single-threaded async then you probably aren't fully utilizing even one core. You see, even if you think you are doing everything in a non-blocking way, that's not really the case all the way down the stack. If you try to access memory that is paged out, guess what? You are now blocked on disk I/O. And because you aren't using threads, the OS can't schedule any other work while you wait. And even if you're pretty sure you never touch memory that is paged out, you surely do sometimes touch memory that is not in the CPU cache, which also takes a while. If your CPU supports hyper-threading, it could be executing another thread in the meantime... but you don't have any other threads.

And then multicore. The previous paragraph was a lot more interesting before multicore, but now it's just obvious that you can't utilize your CPU with a single thread.

The heavy-duty high-scalability servers out there (like nginx and I'd guess HA Proxy) actually use both threads and async, but while this gets the best of both words, it also gets the worst: complicated synchronization and callback hell.

Basically, all concurrency models suck.

https://plus.google.com/+KentonVarda/posts/D95XKtB5DhK


It can be either TCP or UDP. Regardless, UDP has a checksum as well.

Bit flips are very uncommon. Even taking into account the volume of computers out there, the likelihood of a bit flip at the exact spot (in the domain name) and at the exact moment of a DNS lookup (before the checksum is calculated) is astronomically small. This article is garbage.


I wonder if there are more sources of error than just RAM bit-flipping. The bit could be flipped anywhere the data is stored or passes through. If that is the case, the error rate would be orders of magnitude higher.

