Well, first, you can tune SQLite to get a 1000x speedup over 400 w/s without compromising safety too much. Second, the use case is small and local — I use it as an fprintf replacement.
Second, is 400 concurrent users really a lot? If I had 400 concurrent users, each with a 10-minute engagement time, for 4 hours a day, that's about 400 × (4 × 60 / 10) ≈ 10,000 daily users. If they paid a monthly SaaS fee of $5/user, I'd be making ~$600k/year.
So… considering the implementation simplicity, that seems like a good trade off?
And to be explicit: It looks like InnoDB is ACID compliant and crash consistent, so indeed this criticism appears out of date. Thanks for correcting me.
Yep, and even before 5.5, InnoDB was present (since the early 2000s, IIRC) and was what most people chose for any production-level deployment (MyISAM never even supported transactions, so people had to use something different once they got serious). It just wasn't the default engine until that point, so you had to specify it in your CREATE TABLE statement or use a config setting to change the default.
MySQL 3.x was mostly just okay for simple websites that didn't do much. (MyISAM is good for workloads that are either extremely read-heavy with few writes or vice versa, and it was very performant in those cases for the time.)
MySQL 4.x added subquery support and InnoDB as a standard component, even if not the default.
MySQL 5.0 added cursors, triggers, stored procedures, views.
I think from about this point MySQL was fairly usable in place of some enterprise DBs, depending on use case, and most of the essential features were there, even if not always the default.
MySQL started as a super easy to use and accessible DB, and slowly accreted enough enterprise level features to be able to compete in that field, but it's been in use successfully there for quite a while.
In comparison, my (super limited and possibly incorrect) understanding of Postgres's history is that they focused on stability and enterprise level features first, and then later added some things for more convenience. I think they aren't super different in capabilities now, at least for the core functionality most expect.
When there are enough failings to generalize and the product is owned by a company that I feel is evil incarnate, I'm happy to dismiss them and then proceed to not pay attention for a very long time, yes. In this case, it appears that this particular failing may indeed have been fixed.
With no claims to suitability. Just back of envelope based on 400 write transactions per second.
-- Assume an employee filling out a form on a web site that takes one minute to complete.
-- Assume submits are randomly sent. (A bad assumption)
Per minute that's (1 * 60 * 400 =) 24,000 people submitting a form, or over a million people submitting their form per hour. Nothing Google-scale, but enough for almost any internal business application.
These are also likely just serialized commits per second. With parallelized non-blocking writes and/or explicit micro-batching (SQLite supports batch inserts within a single transaction), you'll be blowing most gov sites out of the water.
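To make the micro-batching idea concrete, here's a minimal sketch (assuming the rusqlite crate; the table and data are invented for illustration) of grouping many inserts into one transaction so they share a single commit:

    use rusqlite::{Connection, Result};

    fn main() -> Result<()> {
        let mut conn = Connection::open("forms.db")?;
        conn.execute(
            "CREATE TABLE IF NOT EXISTS submissions (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)",
            [],
        )?;

        // One transaction = one fsync for the whole batch, instead of one per row.
        let tx = conn.transaction()?;
        {
            let mut stmt = tx.prepare("INSERT INTO submissions (payload) VALUES (?1)")?;
            for i in 0..1_000 {
                stmt.execute([format!("form submission {i}")])?;
            }
        }
        tx.commit()?;
        Ok(())
    }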
Shout out to the great people in the KDE Community.
I was in high school and, being enamored with the open-source concept ("You mean I can modify anything I want about it?" - hah), I wanted to make my mark in a simple little way. I remember fondly lurking around in IRC and somehow deciding I would change the login screen. I got pointed to the KDE greeter channel, and I kept annoying them; I did not know how to build, compile, nothing. Thanks to d_ed for answering my annoying questions; that is how it all started for me.
Recently I experimented a bit with Rust and found the reverse to be true. You cannot compose types in Rust; you compose behaviors, not types. That's a very important distinction, as I found out the hard way.
In that example, both the bicycle and the car have the property `speed`. Imagine you now have multiple types that need the `speed` property. You would need to copy-paste the same code for each new type in order to stay type safe.
* But if I want a single queue to be able to handle different tasks, then it's not clear how that could be done with monomorphization alone. That's why it's called "mono"morphization. It's all about taking abstract implementations and creating instances that do one thing. *
Which was exactly what I was experimenting with: A single queue worker that can handle different cases. Honestly, it made Rust almost not worth it for me. Sadly, I was too deep to turn back so I wound up doing the whole thing in Rust. I have tons of copy-paste code. It is ugly and it is bothering me.
... which does put both cars and bicycles in the same queue, but doesn't eliminate the copy-paste for each new type completely; 'car' and 'bicycle' still wind up with separate 'get_speed' impls, which are textually identical aside from the type names.
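Roughly the shape being described, reconstructed as a sketch (the trait, type, and field names follow the synthetic example in this thread, not actual project code):

    trait HasSpeed {
        fn get_speed(&self) -> f64;
    }

    struct Car { speed: f64 }
    struct Bicycle { speed: f64 }

    // Textually identical impls, repeated once per type.
    impl HasSpeed for Car {
        fn get_speed(&self) -> f64 { self.speed }
    }
    impl HasSpeed for Bicycle {
        fn get_speed(&self) -> f64 { self.speed }
    }

    // A single queue can hold both, via trait objects...
    fn drain(queue: Vec<Box<dyn HasSpeed>>) {
        for item in queue {
            println!("{}", item.get_speed());
        }
    }

    fn main() {
        let queue: Vec<Box<dyn HasSpeed>> = vec![
            Box::new(Car { speed: 120.0 }),
            Box::new(Bicycle { speed: 25.0 }),
        ];
        drain(queue);
    }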
I've heard "monomorphisation" to refer to something a compiler does, but not something a programmer does. I think this is just "repetition"!
The need for something to solve the problems inheritance solves has been known in Rust for a long time. Mostly it's been motivated by the need to implement the HTML DOM, which is fundamentally an inheritance hierarchy, in Servo. There's a longstanding RFC about it:
For reference, I have over a decade of JavaScript experience in industry, and my async Rust rewrite of a large JS project was *more* concise than the heavily refactored and polished NodeJS version (a language I consider more concise than most). If you are having to copy and paste excessively in Rust, that is an issue, but it is not necessarily intrinsic to the language.
For what it's worth, traits largely prevented copy and paste, and where traits fall short there are macros. The classic inheritance example you link to is a tiny percentage of my code and an order of magnitude smaller time sink compared to the code maintenance problems I faced in other languages.
Monomorphization is not a user action of copy-pasting; it's something the compiler does with parametric code. In Rust terms, it means that when you write:
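(The original snippet isn't reproduced here; a minimal stand-in for the kind of code being described would be:)

    fn id<T>(x: T) -> T {
        x
    }

    fn main() {
        let s = id(String::from("hello")); // instantiates id::<String>
        let n = id(1usize);                // instantiates id::<usize>
        println!("{s} {n}");
    }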
The Rust compiler will generate a version of `id` for each of String and usize, so there will be two copies of the function, each a slight variation of the other, in your binary. That process is called monomorphization.
> Which was exactly what I was experimenting with: A single queue worker that can handle different cases. Honestly, it made Rust almost not worth it for me. Sadly, I was too deep to turn back so I wound up doing the whole thing in Rust. I have tons of copy-paste code. It is ugly and it is bothering me.
You can easily use generics; there is no reason to copy-paste code like this.
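One sketch of what that could look like (a hedged illustration, not the parent's actual code; the names are invented): keep the shared `speed` field and its getter in a single generic type and push the per-vehicle details into the type parameter.

    struct Vehicle<Details> {
        speed: f64,
        details: Details,
    }

    // Written once, works for every Vehicle<_>.
    impl<Details> Vehicle<Details> {
        fn get_speed(&self) -> f64 { self.speed }
    }

    struct CarDetails { cylinders: u8 }
    struct BicycleDetails { pedals: u8 }

    type Car = Vehicle<CarDetails>;
    type Bicycle = Vehicle<BicycleDetails>;

    fn main() {
        let car = Car { speed: 120.0, details: CarDetails { cylinders: 4 } };
        let bike = Bicycle { speed: 25.0, details: BicycleDetails { pedals: 2 } };
        println!("{} {}", car.get_speed(), bike.get_speed());
    }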
Would that copy-paste code ever amount to more than some getter methods, like the get_speed() example you provided? If the speed field had some special behaviour, couldn't you wrap it in its own Speed type, with its own methods, and use that from the enclosing types?
I think the solution you've proposed is probably how you'd do it - but isn't inheritance more elegant in this case (i.e. using a language which supports inheritance, if this is important to you)?
No, inheritance is not an elegant way to mix in shared data and behavior. It can seem like it is in a simple case where there is only one set of data and behavior you want to mix in, but it scales very poorly. Nothing is fundamentally a single kind of thing, and multiple inheritance is a mess. It is more elegant to mix the behavior in through delegation, because it scales to however many things you want to mix in. Carrying on the synthetic example in this thread, you can mix in Speed and Pedals into Bicycle but Speed and Cylinders into Car.
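A small sketch of that mix-in-by-delegation idea (types and fields invented for illustration, following the thread's synthetic example):

    struct Speed(f64);
    impl Speed {
        fn kph(&self) -> f64 { self.0 }
    }
    struct Pedals { count: u8 }
    struct Cylinders { count: u8 }

    // Each vehicle composes exactly the components it needs.
    struct Bicycle { speed: Speed, pedals: Pedals }
    struct Car { speed: Speed, cylinders: Cylinders }

    impl Bicycle {
        // The forwarding method is one line; the shared behavior lives in Speed.
        fn get_speed(&self) -> f64 { self.speed.kph() }
    }
    impl Car {
        fn get_speed(&self) -> f64 { self.speed.kph() }
    }

    fn main() {
        let bike = Bicycle { speed: Speed(25.0), pedals: Pedals { count: 2 } };
        let car = Car { speed: Speed(120.0), cylinders: Cylinders { count: 4 } };
        println!("{} {}", bike.get_speed(), car.get_speed());
    }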
Maybe not in every situation, but I do agree that inheritance is more elegant in a lot of situations, in the sense that it gets some behaviour "out of the way".
Sorry, I'm not getting it -- could you please explain further?
> You cannot compose types in Rust. You compose behaviors not types.
But I thought in OOP behaviour is type? (Or, IOW, type includes behaviour.) To me, judging only from this, it feels like Rust isn't quite OO... Is it perhaps just not quite finished yet?
Another aspect: that whole "Rust has something called 'monomorphization', which leads to lots of copy-pasting" reinforces that impression. Is this the same problem that C++ tries to overcome with the "select which inherited implementation to use" operator, and other languages by allowing only single inheritance?
Beyond some small changes I would make to use new-types all over the place, I personally like to rely on Deref impls to mimic inheritance (although not everyone agrees this is a good idea to do too often): https://play.rust-lang.org/?version=stable&mode=debug&editio...
As you can see, I also used a macro by example there to remove some of the duplicated code that you would otherwise have.
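Since the playground link above is truncated, here is a rough reconstruction of the pattern (a new-type holding the shared data, a Deref impl per wrapper, and a macro by example stamping out the boilerplate); it is not the actual linked code:

    use std::ops::Deref;

    struct Speed(f64);
    impl Speed {
        fn kph(&self) -> f64 { self.0 }
    }

    // A macro by example generates the Deref boilerplate for each wrapper type.
    macro_rules! deref_to_speed {
        ($ty:ty) => {
            impl Deref for $ty {
                type Target = Speed;
                fn deref(&self) -> &Speed { &self.speed }
            }
        };
    }

    struct Car { speed: Speed, cylinders: u8 }
    struct Bicycle { speed: Speed, pedals: u8 }
    deref_to_speed!(Car);
    deref_to_speed!(Bicycle);

    fn main() {
        let car = Car { speed: Speed(120.0), cylinders: 4 };
        let bike = Bicycle { speed: Speed(25.0), pedals: 2 };
        // Thanks to Deref, methods on Speed look like methods on Car/Bicycle.
        println!("{} {}", car.kph(), bike.kph());
    }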
Recently I started delving into gRPC vs RPC over RabbitMQ using JSON as the message format. I saw that for small to medium-sized messages, gRPC is actually slower. Of course, this was just a small-scale experiment, so I don't read too much into it.
Does anyone have a sort of infrastructure/architecture guide at a bigger scale for gRPC?
My biggest questions are: How do you actually load balance the servers? What happens if you have a sudden influx of requests but don't want to auto-scale? Do you still need some sort of queueing system in front of the gRPC server?
In my research I wasn't able to find any noteworthy articles about this, which is what triggered my curiosity.
I suppose the problem with an easy go-to playbook is that, if it were easy to pick one solution for every problem, gRPC would likely have picked it by default (I can attest the team, and the external contributors, are top notch, having worked alongside them). Unfortunately some problems, at scale, need to be answered depending on the architecture, with the holistic system in mind, and are not just gRPC issues. The truth is that large-scale systems are hard to build and operate. Perhaps that is why you get help from experienced individuals and consultants.
I do agree some documentation/tooling is lacking and could be improved to guide folks through the process.
I found gRPC not particularly fast on a per-message basis unless you're using the streaming feature. If you have iterated calls, consider streaming instead.
Because it supports full duplex streaming, there's a risk of tunneling your own less than fully specified protocol on top of gRPC. In some circumstances that may be worth taking advantage of, because gRPC takes care of session management, reconnecting, authorization (i.e. it has ways for you to add authorization, like headers) etc.
If you need queuing I think you should use a queue instead.
Never tried it with MsgPack, but I did try Protobuf. The reason I dislike Protobuf is that there is an extra code-generation step I need to do. When using a compiled language, I guess you don't really mind the extra code-generation step, since you also get some safety from the compiler so you don't wind up misusing the generated code. It is not the same when you are using an interpreted language without really strong typing: you have to run a static code analyser and be very careful every time you change the interface.
Since in my experiments I was calling a method in PHP through RabbitMQ from a Rust worker, it just proved a lot simpler to use JSON. Also, I measured the time it took to:
1. Make a request to the PHP API
2. PHP sends the RPC message on the queue
3. The Rust worker processes that message
4. PHP catches the response from the worker
5. PHP returns the response to the HTTP client
It was <10ms running the cluster locally, regardless of whether I used Protobuf or plain JSON.
One observation/warning I have WRT msgpack: I have seen situations where data serialized directly to msgpack can't be rendered as proper JSON. Specifically, JSON requires certain characters to be properly encoded, while msgpack is perfectly happy to carry raw binary bytes.
The specific situation I saw was:
1) an antique Perl application barfs a SQL dump into msgpack
2) fluentd takes that and turns it into a thing that looks like JSON but has a control character in it (think old-time cyan blinky happy face, aka \x03)
3) that thing gets dumped into Kafka
4) Logstash pulls that thing out of Kafka and tries to feed it to Elasticsearch
5) Elasticsearch reports that it doesn't like the blinky happy face, generates an extensive log, and stores the event minus the key/value pair with the blinky happy face.
Certainly many of these steps have an implicit "don't do that!" or "update to a newer thinger!", but the root cause (the Perl serializer producing something that's valid msgpack but not valid JSON) is surprising in an unpleasant way.
You didn't define small or medium size, or the number of messages.
From my experience, server streaming, while great for memory use and for giving the client time to process messages as they come, can be slower than unary. One way to solve that is to have a batched message with a repeated field of the underlying message you want to send; it's an order of magnitude faster.
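For instance, a hedged sketch using the prost crate to stand in for generated protobuf code (the message and field names are made up): instead of streaming one Point per message, stream batches that carry many points in a repeated field.

    // Assumes the prost crate; in practice these structs usually come from codegen.
    #[derive(Clone, PartialEq, ::prost::Message)]
    pub struct Point {
        #[prost(double, tag = "1")]
        pub x: f64,
        #[prost(double, tag = "2")]
        pub y: f64,
    }

    // Each streamed message carries many points, amortizing per-message overhead.
    #[derive(Clone, PartialEq, ::prost::Message)]
    pub struct PointBatch {
        #[prost(message, repeated, tag = "1")]
        pub points: Vec<Point>,
    }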
I read some time ago about server pushback; maybe you need to override/implement your own StreamObserver to notify the client to slow down when the server isn't ready yet, or configure the amount of data the gRPC server can queue. There are quite a few configurations (a lot of them obscure) you can make on the server/gRPC service.
Seems like a lot of work for something already implemented by messaging brokers, though.
In my opinion, clients themselves shouldn't worry about whether the server is ready or not; they should only handle the case where the server does not respond within x seconds and then simply crash or error out. It is the same as you would do with any other external service call.
I think the biggest change with gRPC is to the way I thought. Dumb clients were always something I chose because they were an order of magnitude simpler to reason about. gRPC comes in and changes this by making the clients smart and the servers smarter, which brings a lot of complexity to the table.
It's just a different framework for a different use case. The fact that it's HTTP/2 and supports bi-directional streams is nice for low-latency applications.
It doesn't make RPC over message brokers deprecated, and if that works for you under your conditions, that's great.
You can say the same about message brokers. Is it LIFO, FIFO, or something random? What about acknowledgement, is it late or not? Can the queue hold responses (RabbitMQ advises against using it as a result store)?
In that regard message brokers are complicated; let's just do HTTP calls and let the load balancer take care of routing, it's much simpler and the client is very dumb.
Where Conan doesn't really shine is cross-platform dependencies, for example when a dependency needs to be built with a specific toolchain. So if your goal is write once, run everywhere, you'll still have to build dependencies yourself.
When a dependency needs to be built with a certain toolchain, Conan can't do anything about it. At best it surfaces a problem that might otherwise stay hidden if you were just using pre-made builds.
Loved it! It gets crazy with the pawns, I was so surprised. The visuals of the game are so-so, but they are really cool nonetheless. Two thumbs up from me!
I didn't see it back when I was doing Qt, but writing a Qt app for every platform is hard. In theory you can have Qt apps on Android, iOS, Linux, etc., but in all fairness it is just too much of a hassle. Big headaches with toolchains and dependencies; it is just not worth it nowadays.
"Write once, run everywhere" was a nice motto for Qt, and I even bought it for a while, but yet again it does not work in practice. Flutter, on the other hand, actually works. Sure, it doesn't have all the bells and whistles of OS integration, but it is getting there. I hate the language itself, but I can't deny that it works beautifully.
If you think about it, the Qt way of doing things was quite novel back in the day and had a lot of promise, but I feel the main thing that held it back was C++ (still no official package manager and no cool CLI tools in 2021!!). I am sad that I left Qt behind, since it was an amazingly modern platform when I used it, but I am glad I don't do C++ anymore.
A while back we explored the use of Cassandra. We wanted to keep some event-related data there and for it to be relatively fast read-wise, so that we could do all sorts of reporting on it. So we wrote a lot and wanted to read fast. It seemed like a perfect store for our timestamped events, especially since we didn't even want to use deletes, and it has built-in record deduplication via its primary key. Turns out, it is not that perfect.
Other than what the article described, I can also add:
1. It has a steep learning curve, but you do get to see the advantages while you learn it. But then, everything comes crumbling down.
2. The setup is a pain locally. Then it is a pain to set it up in prod and manage it. The tooling itself feels very unfinished and basic.
3. No querying outside the primary index on AWS Keyspaces if you want it managed. Also, any managed variants are EXPENSIVE. I mean, every database is fast if you only query by the primary index, so why pay extra?
It is just not worth it. For example, we wound up using MongoDB, and it turned out to be fast and scalable with mature tooling; we can keep tons of event-related metadata in it, it is easy to manage, and it doesn't cost a fortune.
I think Cassandra is a better fit for interactive use cases, not for reporting. Also, it's basically super heavy duty; it should start to shine when you're really serving the entire Internet (on the scale of Reddit, Expedia, etc.) and your Cassandra cluster is distributed across DCs around the world.
I haven't really worked in this space for a couple of years so I don't know if the cloud offerings have already completely matched Cassandra's features and robustness.
Of course, I have been totally unable to determine how they merge rows/partitions without cell timestamps. It's a black box.
I was just in a "Keyspaces" meeting where the sales dude basically described Dynamo billing, Dynamo provisioning, and feature shortfalls obviously due to Dynamo, but would not admit it was Dynamo.
We initially looked at Cassandra. We liked its use case, we liked its scalability. However we also ran into maintenance and setup pains.
We ended up going with ScyllaDB, which is a drop-in replacement for Cassandra written in C++. It's much lighter on resource demands, and we didn't have to deal with Zookeeper directly.
Dumb statements all around, but clearly you've never been in a Cassandra environment vs Scylla. Scylla is far, far more reliable, easier to get up and running, and requires a bit less supervision than Cassandra.
Hah! Brings back old memories. I actually did something similar for an RPG I was trying to create from scratch using Qt3D (back in the day when it first came out and was getting stable, Qt 5.8 maybe).
Had the whole isometric world down and used QML to script monsters and abilities.
I think software takes longer to build now than it did before, even with all our software advancements. Setting the market requirements aside completely, the new technologies are very slow to create an MVP with.
For example, a week ago I started a Symfony (PHP framework) project and left it on the default server-side rendered setup. I bought a $20 CSS+HTML template. I created some entities and then just inserted a form for an entity into an HTML template using one line of code. It took me less than an hour to do this, including containerization, DB setup, and figuring out how things work in Symfony.
It blew my mind. I had totally forgotten how easy it was, and I could focus on the things that mattered. Doing the same thing in React/Angular would have taken me much longer, since I would have had to build two things (a frontend and a backend). I think this is why software takes longer to build today: you simply have to build more behind the scenes before the end customer sees a form/page/button.