RabbitMQ has at-least-once semantics (unless requeue = false, in which case all bets are off: you could get a message multiple times due to cluster partitions^, or not at all due to a drop without requeue. Note that unlike Disque, requeue is set on a rejection/recovery operation, not on a message).
It has transactional publishing (which not all clients support, as the Confirm method is non-standard) and uses per-queue leader election, so all publishes go to the master queue, guaranteeing ordering. Requeued messages, however, have undefined order; messages with the requeued flag unset still have strict ordering relative to each other.
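To make the ordering caveat concrete, here's a toy simulation (my own model, not RabbitMQ client code) in which the broker happens to requeue a rejected message at the tail; messages that are never requeued keep their relative order, while the requeued one is delivered out of its original position:

```python
from collections import deque

# Toy model: a queue of messages; the consumer rejects message "b" once.
# In this sketch the broker requeues at the tail -- real brokers make no
# ordering promise for requeued messages, which is exactly the point.
queue = deque(["a", "b", "c"])
delivered = []       # successful deliveries, in order
rejected_once = set()

while queue:
    msg = queue.popleft()
    if msg == "b" and msg not in rejected_once:
        rejected_once.add(msg)
        queue.append(msg)    # requeue: goes back, order now differs
    else:
        delivered.append(msg)

print(delivered)  # ['a', 'c', 'b'] -- 'a' and 'c' kept their relative order
```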
Messages are only removed when acked. If I understand correctly, Disque will report back a message publish failure in an OOM condition, whereas RabbitMQ will simply stop reading from the connection and let TCP backpressure throttle the publisher, which is a massive hack that occasionally causes problems. I prefer Disque's method, especially since clients should never be deluded into thinking a publish can't fail for other reasons.
Overall, the main differences between this and RabbitMQ, from my reading, are:
* RabbitMQ has stronger ordering guarantees...until your message gets requeued, in which case things get more complex.
* Disque has a clearer clustering story, stating outright that it is AP. However, this means that messages may not always reach their destination, producing (as the grandparent comment calls it) "lonely widows". RabbitMQ can be CP or AP depending on configuration, but its CP mode is somewhat fraught with pitfalls and strange performance issues (as I have personally experienced in production).
* RabbitMQ has the advantage of the whole suite of AMQP features: exchanges, headers, routing rules, as well as some non-standard ones like federation and dead-letter queueing. Of course, complexity comes at a cost, and Disque is considerably simpler.
* RabbitMQ supports per-message durability and per-queue mirroring options. This reflects its history as a single-node application which had clustering added later. I prefer Disque's approach of always-clustered with the option to write to disk (especially having that option only apply on graceful restart).
Overall, I might use Disque for high-throughput applications where I don't mind messages rarely being lost, such as metrics. However, the AP semantics worry me. Dissatisfied as I am with Rabbit's CP mode, it's still my preferred option of these two in most use-cases.
^ (edit) As the sibling comment notes, the default mode for clustering is AP and messages can be lost (this is the same as Disque). A few times in my original post I accidentally defaulted to talking about its CP mode, which is the only mode I've used (and IMO the only mode that should be used).
Thanks for your comment. I think there is some misunderstanding about AP / CP and message semantics. Messages in a (proper) AP system are immutable, replicated multiple times, and never dropped if not acknowledged by a client, so a message with a replication level of N cannot fail to be delivered if just N-1 node failures occur. This is the story of a Disque message:
1. A client produces the message into some node with replication factor of N.
2. The node replicates to additional N-1 nodes (or to N other nodes under memory pressure, to avoid retaining a copy itself).
3. When the replication factor is achieved, the client gets notified; otherwise, after a timeout, an error is reported to the client and a best-effort cluster-wide deletion of the message is performed (contacting only nodes that may have a copy).
4. At this point the message is queued only in a single node, but N nodes have a copy.
5. The message is delivered, but let's imagine it is not acknowledged by the client.
6. Every node that has a copy will, after the retry time elapses, attempt to re-queue it, using a best-effort algorithm to avoid requeueing it multiple times. However, the algorithm is designed so that under partitions the only thing that can happen is multiple nodes putting the message up for delivery again, never the contrary.
7. Eventually the message gets acknowledged. The ack is propagated across the cluster to the nodes that had a copy of the job, and the ack is retained (if there is no memory pressure) to make sure that no node will try to deliver the message again. Once all the nodes report having the acknowledgement, the message is garbage collected and removed from all the nodes holding a copy.
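The lifecycle above can be sketched as a toy simulation (my own simplification, not Disque's actual implementation), covering the replication of steps 1-3 and the ack propagation and garbage collection of step 7:

```python
# Toy sketch of the Disque message lifecycle described above.
N = 3  # replication factor

class Node:
    def __init__(self, name):
        self.name = name
        self.copies = set()   # message ids this node holds
        self.acked = set()    # acks this node has seen

def publish(nodes, msg_id, n):
    # Steps 1-3: replicate to n nodes; fail if n copies are unachievable.
    replicas = nodes[:n]
    if len(replicas) < n:
        raise RuntimeError("replication factor not achievable")
    for node in replicas:
        node.copies.add(msg_id)

def acknowledge(nodes, msg_id):
    # Step 7: propagate the ack, then garbage-collect once every
    # copy-holder has seen it.
    for node in nodes:
        if msg_id in node.copies:
            node.acked.add(msg_id)
    if all(msg_id in node.acked for node in nodes if msg_id in node.copies):
        for node in nodes:
            node.copies.discard(msg_id)

nodes = [Node(f"n{i}") for i in range(5)]
publish(nodes, "job-1", N)
copies_after_publish = sum("job-1" in n.copies for n in nodes)
acknowledge(nodes, "job-1")
copies_after_ack = sum("job-1" in n.copies for n in nodes)
print(copies_after_publish, copies_after_ack)  # 3 0
```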
So basically you can count on Disque trying to deliver the message at any cost UNLESS the specified message time-to-live is reached. You can optionally specify a max life for the message, for example 2 days: if after 2 days no delivery was possible, the message is deleted regardless of its acknowledged state. This is useful because sometimes delivering a message after a given time no longer makes sense.
However, if you set, for example, retry to 60 seconds and TTL to 2 days, it means that all the nodes having a copy will try every minute to deliver a non-acknowledged message again, for 2 days. Just keep in mind that TTL means messages are destroyed after that time.
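If I read the Disque API correctly, both knobs are set per message at publish time via ADDJOB; the queue name and payload here are placeholders:

```
ADDJOB myqueue "payload" 100 REPLICATE 3 RETRY 60 TTL 172800
```

where 100 is the command timeout in milliseconds, RETRY is in seconds, and TTL 172800 is 2 days expressed in seconds.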
So what if, during a partition, I publish another message? Does it get rejected for not being able to reach N servers within the timeout, or has "N" been adjusted down to the number of currently-reachable machines?
I.e. is a partitioned cluster effectively "split-brained", where publishes only appear on one side, or does it stop accepting new messages?
N is a number you specify via the API call with the REPLICATE option. By default it is set to 3, to provide reasonable durability. So if you are on any side of the partition with at least 3 nodes, you can continue without issues.
Two sides of a partition are basically two smaller clusters that operate independently, as far as new messages are concerned. But what about old messages? They'll wait (if there is no memory pressure) to get garbage collected if copies are split between the two sides. However, during the partition the side where the message gets acknowledged will stop re-queueing it ASAP.
Note that even in CP mode, committed messages can be dropped as well (one of the pitfalls I assume you've experienced) when the cluster heals from a partition. Cf. the link in my sibling comment.
When using mirrored queues, Rabbit does ensure all the active mirrors are written to before confirming a publish:
"in the case of publisher confirms, a message will only be confirmed to the publisher when it has been accepted by all of the mirrors"
So if my understanding is correct, wiping the contents of a re-joining mirror shouldn't matter, since no new messages should have been accepted since the partition (unless the "pause" part of pause-minority is only happening after other things like re-election or dropping "dead" slaves, in which case yes pause-minority is useless - this seems doubtful, however).
Hence why I think the problem is slave synchronization.
Basically, when a slave is created (e.g. in response to another slave dying), it only receives NEW messages, not existing messages. So suppose the following sequence of events on a 2-mirror queue:
1. Publish A. Master and slave both contain A.
2. Slave dies; a new slave is created. Master contains A; slave contains nothing.
3. Publish B. Master contains A, B; slave contains B.
4. Master dies; the slave is promoted and a new slave is created. A is lost.
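The sequence above can be sketched as a toy simulation (my own model, not RabbitMQ internals), assuming an unsynchronised new slave receives only messages published after it joins:

```python
# Toy model of an unsynchronised mirrored queue: a newly created slave
# receives only messages published after it joins the queue.
master = ["A"]   # after "Publish A"; the original slave has since died
slave = []       # new slave starts empty: existing messages not copied

def publish(msg):
    master.append(msg)
    slave.append(msg)    # mirrors only see new publishes

publish("B")
# Master dies; the slave is promoted.
promoted = slave
print(promoted)  # ['B'] -- message A is lost
```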
The way around this is setting the policy "ha-sync-mode": "automatic", in which case the act of creating a new slave also replicates the current contents of the master. To the best of my knowledge, if the same Call Me Maybe tests were run with that policy in place, no messages should be lost.
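For reference, such a policy can be applied with rabbitmqctl; the policy name and queue pattern here are placeholders:

```shell
rabbitmqctl set_policy ha-sync \
  "^" '{"ha-mode":"all","ha-sync-mode":"automatic"}' \
  --apply-to queues
```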
But yes, this is precisely what I meant by "fraught with pitfalls". The pause while messages replicate can be disastrous on its own if the queue is large, another issue that has bit me in production.
I do love RabbitMQ, but I wish there was a good CP AMQP broker out there that was planned from the beginning as clustered. Maybe I'll try to write one.