In the linked article, the author also included some videos of himself implementing some of the presented parts, e.g. exclusive Fibers [0], including writing the necessary C code using the Ruby C API.
Recording yourself writing code and then publishing it takes quite a lot of courage, even more so if it's a complex topic such as the one presented. Additionally, it's such a valuable resource: you can follow the whole implementation step by step.
That's super awesome. I haven't watched the whole thing yet, but he seems to comment extensively rather than staying quiet about his inner thoughts during the whole process, which is of course the most crucial part.
This is really cool. Do you have any recs for a similar video of someone hacking on the CPython C API? I want to gain a better understanding of how these langs interface with their implementation, but Ruby isn't really in my wheelhouse.
The internals aren't THAT different, so while I can't really answer your question (I'm not working on CPython), you might still find it interesting from the POV of how interpreters work.
It's fascinating how many event-driven IO libraries and associated web servers have come in and out of fashion in the Ruby community over the last decade. I hope this or another effort finally sticks and becomes part of the stdlib, so we can have a real lasting ecosystem built around it.
I've run into an issue where a Ruby library is technically thread-safe but there is some C library code that blocks (e.g. connect with TinyTds). This makes it practically unusable with threading in scenarios where your SQL servers are not guaranteed to be available.
Thread-safe just means it won't result in data corruption, race conditions or other similar issues when used from multiple threads. That's still quite different from being able to execute concurrently.
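To make that distinction concrete, here is a minimal sketch. `SafeCounter` is a made-up class, not taken from any library: every increment goes through a single `Mutex`, so no updates are lost (thread-safe), but the increments also run strictly one at a time (not concurrent).

```ruby
# Hypothetical example class: thread-safe because every increment takes one lock.
# The lock prevents lost updates, but it also serializes the work.
class SafeCounter
  attr_reader :value

  def initialize
    @value = 0
    @lock = Mutex.new
  end

  def increment
    # Only one thread at a time can be inside this block.
    @lock.synchronize { @value += 1 }
  end
end

counter = SafeCounter.new
threads = 4.times.map do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)

puts counter.value # prints 4000
```

Adding more threads here never loses an update, but it does not make the increments finish any faster either.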
Actually, completely sequential programs can have parallelism bugs due to the underlying hardware executing out of order and other such features. I know, maybe a bit of a stretch by your definitions, but I'm also not sure that "not being able to execute concurrently" implies "thread safety" in practice (even though theoretically pretty sound).
Yes, agreed, all I'm trying to say is, as software engineers we have some assumptions about processors (and compilers/interpreters) that they do things in the order we specify... but that's not always the case, and sometimes that introduces non-determinism.
Processors (and compilers) are not supposed to leak the implications of their non-determinism into user code... but unfortunately it does happen. Rewriting the order of instructions sometimes happens incorrectly (which is why we need memory barriers).
It's not my area, but two things that come to mind are out-of-order execution, where the CPU will re-order micro-ops, and speculative execution, where the CPU will execute one or more branches before the result of a conditional is known.
Both of these have led to bugs in the past and will continue to be a source of bugs in the future.
None of that has anything to do with software bugs, it doesn't change the behavior of a program. People can understand that or not, it doesn't affect the correctness of the software.
If you check my background section, you'll see even the sqlite3 gem for Ruby blocks and doesn't release the GVL, which is a big issue (https://github.com/sparklemotion/sqlite3-ruby/issues/287). So, completely agree that it's really tricky for users to understand what is going on and how to actually build scalable systems.
It's hard even for developers to fully grasp where and how execution might switch. Yes, releasing the GVL will allow other threads to run, but so could other calls which end up calling Ruby code for reasons you would never consider, and that code, or other threads may have effects you didn't consider. I sometimes imagine a warning poster saying, "Do you know whether your RARRAY_PTR is still valid?"
I've been thinking for a while about how we could allow individual C extensions to opt in to greater concurrency in TruffleRuby. At the moment we have a single mutex for all C extensions, but maybe we could allow an opt-out at compile time so an extension could be built that won't claim that mutex. There are some fiddly details to get right around this though, because I wouldn't want the existence of the opt-out to have a negative impact on performance.
I completely agree it is difficult and that's also my point: shared-state multithreading is very tricky to get right, and there is a significant benefit to isolated event loops (even if they are running on threads, they should be isolated).
I have used multi-(fiber/thread/process) extensively, and at the right place they are useful tools. But by far, threads which share data (especially opaque) are the most tricky to get right. The combinatorial explosion of program states is very hard to deal with in practice and "it works for me" is a very common testing strategy.
I am referring to that project, but the issue I mean is with connecting to multiple servers concurrently. If the servers have poor connectivity then you can only connect to one at a time despite using threads.
1) Set up 100-500 SQL servers.
2) Artificially increase the amount of time needed to finish the connect sequence (> 10 seconds). I have no idea how to do this but it's a likely scenario in my use case of servers on cellular connections.
3) Set up X number of threads to make a connection with TinyTds.
4) Those threads should return immediately but in my testing they did not. Only one connection could be in progress at any given time.
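A rough way to check the last step without hundreds of servers is to time a batch of threaded connects. This is only a sketch: `slow_connect` and `time_concurrent_connects` are hypothetical helpers, and `sleep` stands in for the real connect call, so as written it shows the overlapping case. Swapping in the real driver call should show whether that driver serializes.

```ruby
# Hypothetical stand-in for a slow connect. Replace the body with the real
# client connect call to test an actual driver.
def slow_connect(seconds)
  sleep(seconds) # Kernel#sleep lets other Ruby threads run while it waits
  :connected
end

# Starts `count` threads that each connect, waits for all of them,
# and returns the results together with the total wall-clock time.
def time_concurrent_connects(count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = count.times.map { Thread.new { slow_connect(seconds) } }
  results = threads.map(&:value) # Thread#value joins and returns the block result
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [results, elapsed]
end

results, elapsed = time_concurrent_connects(5, 0.2)
puts format("%d connects in %.2fs", results.size, elapsed)
```

If the connects overlap, the elapsed time stays close to a single connect (about 0.2s here). If the call holds the GVL while blocking, it approaches the sum of all of them (about 1s here).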
Yep, I almost inherited a Rails application which uses SQL Server. It's the first time I saw it paired with Ruby and I found many deadlocks in the logs, which are basically nonexistent with PostgreSQL and MySQL. I didn't investigate it deeply because hopefully other people will take care of that project (not a nice technical setup, not because of SQL Server) but I immediately thought of the database driver.
As you can imagine, deploying a new technology to production can have issues :p
I've been dogfooding falcon (https://github.com/socketry/falcon) for the past week, which is built on top of async/Ruby. The HN hug of death + Reddit hug of death is a really great traffic test.
I think actually it's been pretty solid, but something caused the instance to run out of swap space, even though it had plenty of free memory. It's something I'll have to try and reproduce so I can understand how it's happening.
I haven't touched Ruby in a while, but are there any common multithreaded use cases? It seemed like the direction was to go multi-process for web workloads (E.g. with Unicorn).
Both Puma (https://github.com/puma/puma, a popular server these days) and Passenger Enterprise (paid) provide multithreaded web support. Also on the background jobs side, Sidekiq (https://sidekiq.org) is very popular.
Those are both solid choices for servers. However, neither of them has a scalability model suitable for HTTP/2 or WebSockets. That's something I wanted to try and address.
I used EventMachine a lot about 8-10 years ago. I'm excited to see Ruby getting some concurrency love again. What are the goals and improvements of your underlying design in general, and especially those that make HTTP/2 and WebSockets work?
Concurrency is one of those core features which is hard to add after-the-fact, and so the initial design strongly determines the course of the language's life. It requires re-opening such fundamentals as what does it mean to call a function, or assign to a variable.
"Nobody is using Ruby for multithreading" is both cause and effect.
That's why I'm not terribly optimistic about projects like this (or the proposed Swift 6). That's not how these things work. Can you imagine a language which features good concurrency support today (like Erlang or Clojure) having been launched without it, and then announcing 5 (or 25) years later "We're going to address concurrency now"?
Completely agree with you and to me that's why it's an exciting challenge. I'm not expecting to solve every problem, but I'm trying to carve out a solution which I think works for these legacy issues. Even if we didn't have a solution for the last 25 years, no harm in adding one now! :)
Not a common use case, but a GUI with background processing is terrible without parallelization.
Even if the background process is I/O intensive (which is supposed to mean it spends most of its time waiting, therefore freeing the CPU for the foreground process), it doesn't mean it won't still end up blocking (I've experienced this with filesystem operations).
Actually, I was looking at how audio loops work, and it seems like the low context switching overhead of fibers could be really great for stacks of effects and filters. Because the overhead is very small and predictable, and the ergonomics of fibers are easier to deal with, it could make for a really nice interface.
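As a rough illustration of what such an interface might look like (a toy sketch, not a real audio API; `stage`, `build_chain` and `process` are made-up names), each effect can be a Fiber that takes a sample, yields the processed sample back, and keeps its state in ordinary local variables between calls:

```ruby
# Stateless stage: applies the given block to each incoming sample.
def stage(&transform)
  Fiber.new do |sample|
    loop { sample = Fiber.yield(transform.call(sample)) }
  end
end

# Builds a two-stage chain: halve the gain, then average with the previous sample.
def build_chain
  gain = stage { |s| s * 0.5 }

  smooth = Fiber.new do |sample|
    previous = 0.0 # filter state lives in a plain local variable
    loop do
      out = (sample + previous) / 2.0
      previous = sample
      sample = Fiber.yield(out) # hand back the result, wait for the next sample
    end
  end

  [gain, smooth]
end

# Pushes one sample through every stage in order.
def process(chain, sample)
  chain.reduce(sample) { |value, fiber| fiber.resume(value) }
end

chain = build_chain
p [1.0, 1.0, 0.0].map { |s| process(chain, s) } # prints [0.25, 0.5, 0.25]
```

Each `resume` is a cheap switch into the filter and back, and the filter code reads like a simple loop rather than a state machine.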
Is "adopting a native font stack" different in some way from simply not using web fonts and doing it the old-fashioned way instead?
Edit: Other than using the magic font names "-apple-system" and "BlinkMacSystemFont" it looks like it's just specifying the Windows, Android and Mac fonts in order. This is not going to use the native system font on other platforms or if the user happens to have Segoe UI installed on Android.
I noticed this, too. On Fedora KDE, it looks fine in Chromium, but poor in Firefox. It seems Firefox can't find the "PT Sans" font, even though I have it installed, and Chromium uses it correctly.
Working complex systems develop from working simple systems. This is trying to get a working complex system by creating another initial working complex system. If only Ruby didn't have that darned assignment operator...
I definitely agree with your first statement. However, I think the Fiber + Reactor approach is about as simple as it gets, taking into consideration actual practical, scalable concurrency. Every approach has trade-offs, but I think this design is pretty good. My goal was to build enough of the stack to prove that.
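For readers who haven't seen the pattern, here is a toy sketch of the Fiber + Reactor idea. It is not how async is implemented: `ToyReactor` is a made-up class that only handles timers, whereas a real reactor also waits on sockets and other IO. The point is that each task is a Fiber that parks itself when it would block, and a single loop resumes whichever task is ready.

```ruby
require "fiber" # needed for Fiber.current on older Rubies

# Toy reactor: tasks are Fibers, and "sleeping" parks a task until its wake time.
class ToyReactor
  def initialize
    @waiting = [] # pairs of [wake_time, fiber]
  end

  # Starts a task and runs it until it first sleeps or finishes.
  def async(&block)
    fiber = Fiber.new(&block)
    fiber.resume
    fiber
  end

  # Called from inside a task: park the current fiber and return control
  # to whoever resumed it (the async call or the run loop).
  def sleep(duration)
    @waiting << [now + duration, Fiber.current]
    Fiber.yield
  end

  # Event loop: wake parked tasks in order of their wake time.
  def run
    until @waiting.empty?
      @waiting.sort_by!(&:first)
      wake_time, fiber = @waiting.shift
      delay = wake_time - now
      Kernel.sleep(delay) if delay > 0
      fiber.resume
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

reactor = ToyReactor.new
log = []
reactor.async { log << "a1"; reactor.sleep(0.02); log << "a2" }
reactor.async { log << "b1"; reactor.sleep(0.01); log << "b2" }
reactor.run
p log # prints ["a1", "b1", "b2", "a2"]
```

Both tasks make progress on one thread, and control only switches at the explicit `sleep` calls, which is what keeps the model easy to reason about.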
- https://github.com/socketry/async "An awesome asynchronous event-driven reactor for Ruby"
- https://github.com/socketry/falcon "A high-performance web server for Ruby, supporting HTTP/1, HTTP/2 and TLS"
I can recommend following him on Twitter too: http://twitter.com/ioquatix