Concurrency is a Myth in Ruby (and Python) (igvita.com)
29 points by igrigorik on Nov 13, 2008 | 14 comments


The title is trollish: concurrency is not a "myth" in either Ruby or Python -- you just need to use the appropriate techniques (process-level parallelism, non-blocking I/O, etc.) to achieve it. The existence of the GIL is well-known.


GvR gave an interesting answer on Google's "Ask an Engineer" app, to a question about adapting Python to multi-core architectures:

http://moderator.appspot.com/#15/e=c9&t=ff&q=2f40...

His answer: threading is not worth it, and something Actor-like built on top of the new multiprocessing module should be written eventually. (Pythonic, no?)
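For anyone wondering what "Actor-like on top of multiprocessing" might look like, here is a minimal sketch, not anything GvR proposed in detail. The names `counter_actor` and `run_counter` and the message shapes are made up for illustration. It assumes a POSIX system, because it asks for the "fork" start method explicitly. The actor owns its state in a separate process and changes it only in response to messages from its inbox queue.

```python
import multiprocessing as mp


def counter_actor(inbox, outbox):
    """Hypothetical actor: keeps a private total, reacts only to messages."""
    total = 0
    while True:
        kind, value = inbox.get()
        if kind == "add":
            total += value
        elif kind == "report":
            outbox.put(total)
        elif kind == "stop":
            break


def run_counter(amounts):
    """Start the actor, send it one 'add' per amount, and ask for the total."""
    ctx = mp.get_context("fork")  # assumption: POSIX; "fork" is not available on Windows
    inbox, outbox = ctx.Queue(), ctx.Queue()
    actor = ctx.Process(target=counter_actor, args=(inbox, outbox))
    actor.start()
    for amount in amounts:
        inbox.put(("add", amount))
    inbox.put(("report", None))
    total = outbox.get()  # blocks until the actor replies
    inbox.put(("stop", None))
    actor.join()
    return total
```

Calling `run_counter([1, 2, 3])` sends three "add" messages and one "report". No memory is shared with the parent, so there is nothing to lock.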


Concurrent programming in Python: http://docs.python.org/library/multiprocessing.html


I have had less than stellar success with multiprocessing in non-trivial applications. There seems to be a lot of undesired "magic" going on, such as workers having access to global imports, which causes all sorts of issues. If I want a worker to have access to a module, I should have to import it in that worker. The irony in trying to make a multi-processing module stop sharing state isn't lost on me.

I didn't actually read this article because it basically sounds like "this is a myth because of the GIL" which is just downright moronic. I completely disagree with the idea that concurrency is a "myth". I do, however, believe that it can be more trouble to implement successfully than in other languages designed for concurrency, even if (theoretically) all the tools are there.


RE: multiprocessing - Did you file any bugs on it? I agree that there is slightly too much magic, and I'm looking at reducing it - but bug reports and/or patches help me. Right now, my focus is on fixing the docs for 2.6.1 and 3.0 final and further expanding the tests.


I didn't file a bug report because I'm not entirely sure what I would file it as: "Excessive use of magic"?

What happened in my case is that when using multiprocessing with Django, if a database connection is opened by the main process to make a query, that connection is kept open and workers end up using it instead of opening their own. If I manually close the connection, they all make their own. This isn't sufficient, however, because global imports related to transaction support still manage to step on each other when queries are run concurrently. This turns out to be a pretty rare race condition, but it does happen.

You can read more at the comment here, including my workarounds: http://ericholscher.com/blog/2008/nov/10/announcing-django-c...
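The inherited-connection behaviour described above can be reproduced without Django. This is only a sketch of the general mechanism, not Django's code and not the workaround from the linked post. `FakeConnection`, `get_connection` and `close_connection` are hypothetical stand-ins for a lazily created, module-level connection. It assumes a POSIX system and the "fork" start method, where a child process starts with a copy of the parent's module globals.

```python
import multiprocessing as mp
import os


class FakeConnection:
    """Hypothetical stand-in for a DB connection; remembers which process created it."""

    def __init__(self):
        self.owner_pid = os.getpid()


_connection = None  # module-level cache, copied into forked children


def get_connection():
    """Return the cached connection, creating it on first use."""
    global _connection
    if _connection is None:
        _connection = FakeConnection()
    return _connection


def close_connection():
    """Drop the cached connection so the next caller creates a fresh one."""
    global _connection
    _connection = None


def worker(results):
    # Report this worker's pid and the pid that created the connection it sees.
    results.put((os.getpid(), get_connection().owner_pid))


def run_worker():
    ctx = mp.get_context("fork")  # assumption: POSIX; children inherit module globals
    results = ctx.Queue()
    proc = ctx.Process(target=worker, args=(results,))
    proc.start()
    pids = results.get()
    proc.join()
    return pids  # (worker_pid, connection_owner_pid)
```

If the parent calls `get_connection()` before `run_worker()`, the worker reports a connection created by the parent. If the parent calls `close_connection()` first, the worker creates its own. That matches the first half of the report; the transaction-related race is a separate issue this sketch does not cover.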


Fair enough :) I'm going to make it a task to document, clean up and possibly remove some of the magic.


There are other options beyond using JRuby for doing true multi-threading in a dynamic language. Perl has had true threading support (perldoc threads, perldoc perlthrtut) for a while now.

I agree that for most cases non-blocking I/O and select()/epoll() make more sense, but there are cases where true concurrency (threading or multiple processes) is needed and either memory needs to be explicitly shared (IPC is too costly/complex or the memory requirement is high) or the cost of spawning a new process is prohibitive (you're dealing with an SLA).
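To make the non-blocking I/O option concrete, here is a small sketch using Python's standard `selectors` module, which picks epoll, kqueue or select depending on the platform. The function name `read_when_ready` is made up for illustration, and local socket pairs stand in for real network connections. A single thread waits on all sockets at once and reads only from the ones that are ready.

```python
import selectors
import socket


def read_when_ready(messages):
    """Send each non-empty bytes message over its own socket pair,
    then collect them all from a single thread using readiness events."""
    selector = selectors.DefaultSelector()
    pairs = []
    for index, message in enumerate(messages):
        reader, writer = socket.socketpair()
        reader.setblocking(False)
        # Attach the message index so we know which socket became readable.
        selector.register(reader, selectors.EVENT_READ, data=index)
        writer.sendall(message)
        pairs.append((reader, writer))

    received = {}
    while len(received) < len(messages):
        for key, _events in selector.select(timeout=1.0):
            received[key.data] = key.fileobj.recv(4096)
            selector.unregister(key.fileobj)

    selector.close()
    for reader, writer in pairs:
        reader.close()
        writer.close()
    return [received[i] for i in range(len(messages))]
```

No threads or extra processes are involved, which is why this style sidesteps the GIL question for I/O-bound work. It does nothing for CPU-bound work.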


I actually think this is a good thing.

Writing (correct) multi-threaded programs is actually really hard to do. The underlying philosophy of both Ruby and Python is that the programmer's time is more important than the computer's. Using multiple processes instead of threads may require slightly more memory and computation time, but the programmer will spend less time debugging.


Neither Ruby nor Python makes it impossible to write threaded programs, with all the ensuing dangers -- their standard implementations just don't implement threading very well. I think it's a stretch to say that this constitutes an intentional judgement about the downsides of using threads. For all their disadvantages, threads can sometimes be the most effective way to structure a concurrent program.


Stackless Python?


The right way to do concurrency for most applications is the CSP style. Look at Bell Labs' Limbo or Alef which have channels. And Stackless has picked up on this in some ways.
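For readers who haven't seen the CSP style, a rough Python approximation follows. It is only a sketch: it uses standard-library threads and bounded `queue.Queue` objects in place of real channels, which is not how Limbo, Alef or Stackless implement them. The names `producer`, `squarer` and `run_pipeline` are made up. Each stage communicates only through its channels and shares no other state.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a stream


def producer(out, items):
    """Stage 1: write every item to the output channel, then signal completion."""
    for item in items:
        out.put(item)
    out.put(DONE)


def squarer(inp, out):
    """Stage 2: read from one channel, write squares to the next."""
    while True:
        item = inp.get()
        if item is DONE:
            out.put(DONE)
            return
        out.put(item * item)


def run_pipeline(items):
    # maxsize=1 makes each put wait while the queue is full,
    # loosely imitating a channel with a one-slot buffer.
    numbers = queue.Queue(maxsize=1)
    squares = queue.Queue(maxsize=1)
    stages = [
        threading.Thread(target=producer, args=(numbers, items)),
        threading.Thread(target=squarer, args=(numbers, squares)),
    ]
    for stage in stages:
        stage.start()
    results = []
    while True:
        item = squares.get()
        if item is DONE:
            break
        results.append(item)
    for stage in stages:
        stage.join()
    return results
```

Each channel here has exactly one writer and one reader, so ordering is preserved and no locks appear in user code.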


Concurrency in JavaScript: http://www.mozilla.org/rhino/scopes.html


At least in Python, most C-coded modules release the lock (the GIL) around blocking calls, so threads can overlap their I/O.
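A quick way to see this with CPython threads is sketched below. `time.sleep` stands in for a blocking I/O call, since it releases the GIL while waiting. The function names `blocking_call` and `run_in_threads` are made up for illustration.

```python
import threading
import time


def blocking_call(seconds):
    """Stand-in for blocking I/O; time.sleep releases the GIL while it waits."""
    time.sleep(seconds)


def run_in_threads(thread_count, seconds):
    """Run the blocking call in several threads and return the elapsed wall time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(thread_count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start
```

Four threads that each wait 0.25 seconds should finish in roughly 0.25 seconds of wall time rather than a full second, because the waits overlap. Pure-Python CPU-bound loops would not overlap this way.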



