I was really impressed that I got line speed Ethernet between the Pis, 112MB/s doing a wget of a 1G file from one to another. Things have improved a lot since the original Raspberry Pi from 2012.
I need to figure out how to get the 8th Pi working though, since I'm using one of the PoE ports to power a Mikrotik that I'm using as a wireless bridge onto my existing broadcast domain. Technically I don't need the bridge, since each Pi can connect to my wireless AP over 5GHz 802.11ac just fine, but it feels nicer to funnel upstream data through one point. More importantly, it reduces WiFi collisions, since there wouldn't be 8 devices broadcasting simultaneously and interfering with each other (especially at those close distances) if they all needed to download something from the Internet at the same time... like a Docker image for a Kubernetes deployment :)
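If anyone wants to reproduce the throughput check without eyeballing wget's progress bar, here's a minimal sketch that times a large HTTP download and reports MB/s. It assumes one Pi is serving the test file (e.g. with `python3 -m http.server 8000` in the directory containing it); the address and filename below are placeholders.

```python
#!/usr/bin/env python3
# Rough throughput check between two Pis: time a large HTTP download
# and report MB/s. Assumes another Pi is serving the file, e.g. with
# `python3 -m http.server 8000` run next to test.bin.
import time
import urllib.request

URL = "http://10.0.0.11:8000/test.bin"  # placeholder address/file

start = time.monotonic()
total = 0
with urllib.request.urlopen(URL) as resp:
    while True:
        chunk = resp.read(1024 * 1024)  # read 1 MiB at a time
        if not chunk:
            break
        total += len(chunk)
elapsed = time.monotonic() - start

print(f"{total / 1e6:.0f} MB in {elapsed:.1f} s -> {total / 1e6 / elapsed:.1f} MB/s")
```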
Nice project! I just started researching the pieces I need to build something similar.
What kind of PoE HAT did you use? I was also thinking of giving one of the Pis an SSD with bootp/dhcp and letting the others boot via PXE.
I had been thinking of building a Pi cluster for a while now and I was looking with interest at the turingpi board project [1]; they are taking preorders for the second batch if anyone is interested. But I would like nodes with more than 1GB of RAM so I guess I'll have to wait for that.
Have you tried to experiment with 802.11s and mesh networks to remove some of these Ethernet cables, by any chance? I'm looking into some "advanced" router features these days, and I was surprised that there is a standard for this, although it seems badly supported. That would seem like a good option for a more flexible (in terms of networking) cluster if the nodes are close enough to each other that wireless networking's shortcomings aren't much of an issue.
I couldn't tell by your question if you were referring to the Pi cluster or asking an unrelated question about 802.11s mesh clusters... :)
Here are my answers for both:
Pi cluster: meshing won't save any additional wires here since there is already a wireless bridge set up (between the black Mikrotik device pictured, and an off-camera Mikrotik wireless router.) I chose to wire the cluster simply to guarantee a reliable connection within the cluster. You can absolutely use a Raspberry Pi over WiFi without an Ethernet cable, it would just be a less reliable connection (jitter, congestion), and then of course it would have to be powered conventionally rather than via PoE.
General 802.11s meshing: If all you want is less network cabling in your home, specifically between switches and access points, then it could certainly help. However, the best choice network-wise is almost always a proper deployment with a central router and several APs (broadcasting the same SSID) wired back to it.
I guess I was curious about both, thanks for your answers! I hadn't thought about PoE, having never used it myself, so indeed at one cable per device there's not much more to do.
As others have correctly pointed out, this is a fun, ongoing project on server clusters built using real hardware with real limitations. My laptop from 2018 has 64GB RAM... and would it even matter to say that I already have several high-performance machines? I am not sure what your point is. I did not buy Raspberry Pis to save money, I bought them to have a bunch of small physical servers to play around with. I could have easily started 8 VMs bridged to the network in the same way with virt-manager/QEMU and got 8 much more powerful Linux instances for free if that's what I wanted.
I'm sure that if their goal was a fast server with a modern CPU, then they would have gone for it.
I see their project as a fun hobby. It's very fun to put a lot of tiny low-power computers together in a cluster and learn as you go. Sure, you should take into account the limitations, but you should not feel discouraged if all you want to do is to have fun, learn and build something that is quite unique.
In terms of understanding distributed computing and the failure modes that can occur, the GP's approach is definitely better. If you build distributed systems and only test them through virtualisation, you'll run into all sorts of issues once real-world constraints are added, constraints that the virtualised abstraction otherwise masks.
One small thing they're nice for is if you want to play with a cluster (e.g. nomad, kubernetes, whatever) on a budget. Of course it's not going to be great bang for buck just for performance, but if you're trying out some distributed computing it's nice to be working against real limitations, network and RAM in particular.
Does LXD still do the thing where it's only distributed via snap and therefore forces automatic updates and restarts? It's neat tech (especially supporting containers and VMs), but I just can't bring myself to invest in a platform that forces instability like that.
On Ubuntu it is still only available via snap. I used to run a 4 node Raspberry Pi 4 lxd cluster. One of the nodes failed over a weekend while I was out of town. The containers which were running on that node properly restarted on a different node. That should have been the end of it until I was back in town and could restore the node to operational status. Unfortunately, later that day the lxd snap decided to automatically update. It updated properly on the 3 nodes which remained operational, but of course the failed node could not update. Once the other nodes were upgraded, the cluster refused to come up because the failed node wasn't running the same version of lxd as the rest of the cluster. I mean, technically it wasn't running anything, but as far as lxd was concerned it was running a different version even though it was down.
I noticed the entire cluster was down because my Plex server ran on it and I couldn't watch a movie when I tried later that evening. Luckily I was able to ssh in and determine the issue quickly. Force removing the failed node from the cluster got things going again.
Lxd was quickly replaced by K3S, which is unfortunate because the lxd cluster better fit my needs (reliability aside) and was much easier to configure/manage.
Even for small platform edge K8S clusters I actually find microk8s much easier than K3S, but that would put me back in snap upgrade hell.
I hope Mark Shuttleworth and Stéphane Graber are reading this. Someone has sabotaged a perfectly good docker replacement with a series of really bad decisions.
Having said that, did you try to pin the LXD version? For example, tracking the 4.0/stable snap channel instead of latest/stable, so you only get LTS point releases rather than every monthly feature release.
With the release of 4.6, there was a change in the way LXD handles its dqlite dependency that should mean the long effort to have it packaged for Debian will be coming to a close.
That's rather the tail wagging the dog, isn't it? Besides, it kinda rules out most of the options; almost nobody packages it natively, probably because their build/vendoring/packaging process is insane (see https://linderud.dev/blog/packaging-lxd-for-arch-linux/).
There are quite a few Linux distributions with native packages in their main repository; that includes ArchLinux, Alpine, Gentoo and OpenSUSE among those I'm aware of with active package maintainers. There are also packages for Fedora/CentOS/RHEL through maintained COPR repositories, and I'm sure I'm forgetting some distros.
Go can be a bit annoying to package in general and LXD was made a bit harder by also having a stack of C libraries for some bits, though with 4.6 we kicked out the need for a custom sqlite and for a coroutine library, so on the C side, outside of the C library, it's down to liblxc, libraft and libdqlite making it a bit easier to package.
For distributions that have a policy of splitting every single Go package into their own source packages (as is the case with Debian), our recommendation is to stick to the LTS release of LXD which only gets released every 2 years and where the bugfix releases don't normally alter dependencies.
The normal feature releases come out every month and those just aren't a good fit when it may take you more than a month to get any new dependency packaged independently first...
Anyway, as an upstream, we are very happy to work with packagers to get native packages into as many distributions as possible.
We do maintain the snap package ourselves and certainly do enjoy it as an upstream since it gives us very large distribution and release coverage with a single package.
That being said, at the end of the day, all we care about is that our users get to run an up-to-date, secure LXD. How they get it doesn't really matter to us as an upstream :)
And it's maybe interesting to point out that the majority of our userbase these days are on Chromebooks, effectively using the Gentoo ebuild package of LXD!
LXD works fine on WSL2 since, unlike WSL1, you're now getting a full Linux kernel.
The main issue left is that WSL2 doesn't start your normal init system and so makes it harder to run a daemon... You end up having to manually start it every time which is a bit annoying.
But my understanding is that Microsoft is actively working on this and that we should have the normal init system run in some way in the near future.
LXD is so weird. It feels like the halfway point between chroot and Docker proper. Can someone more familiar with containers explain what its usecase is meant to be?
Not sure if you know, but Docker was originally built on top of LXC and later moved to writing its own library trying to replicate lxc. In its default configuration LXD is generally more secure than Docker, as containers in LXD are mapped to unprivileged user IDs and use shiftfs [1]; there's nothing like this in Docker yet, including Kubernetes. Docker has also usually had more security vulnerabilities than LXD.
So LXD is much better than Docker, except that Docker, in spite of being an inferior solution, became popular thanks to the marketing money spent on it and the hype around it. LXD stayed with people who believe in pragmatic simplicity. Docker has been plagued by privilege escalation for a very long time; check the details, in general Docker has more vulnerabilities than LXD. [2] [3]
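If you want to see what that uid mapping actually looks like, here's a quick sketch: run it inside an unprivileged LXD container and then on the host and compare the output. The specific host-side ranges you'll see depend on your /etc/subuid allocation; the numbers in the comment are only illustrative.

```python
#!/usr/bin/env python3
# Print the user-namespace uid/gid mapping for the current process.
# Inside an unprivileged LXD container, uid 0 typically maps to a high,
# unprivileged uid range on the host (e.g. 1000000+, depending on
# /etc/subuid); on the host itself you just see the identity mapping.
def show(path):
    print(path)
    with open(path) as f:
        for line in f:
            inside, outside, count = line.split()
            print(f"  container id {inside} -> host id {outside} (range {count})")

for p in ("/proc/self/uid_map", "/proc/self/gid_map"):
    show(p)
```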
LXD is not a replacement for docker or for k8s as it offers a different feature set from both of those.
Last time I wanted one of the features it offers, a persistent whole-OS container, I tried to install LXD on Fedora. After trying to get lxc running, failing to do so, and seeing its horror show of a systemd setup while debugging [1], I looked elsewhere and instead settled on rootless podman with --rootfs.
Well LXD, Docker and k8s have a lot of overlap in functionality and features, so it’s just disingenuous to say otherwise.
For over 90% of startups, Docker with k8s is just not necessary and ties them to the managed offering of a specific vendor or cloud provider, with higher costs and a lot of overhead. It's pretty hard for a startup to manage self-hosted k8s: given the large number of moving parts, managing k8s infrastructure is as big a task as managing the startup's product itself.
LXD is decent enough to build a good, highly available, horizontally scalable cluster on the cloud of your choice or on bare metal, and it can be managed by startup teams. Obviously once the startup begins to reach millions of customers and users and has enough revenue, then k8s might be viable.
In the majority of projects, LXD provides much better infrastructure.
> Well LXD, Docker and k8s have a lot of overlap in functionality and features, so it’s just disingenuous to say otherwise.
Tractors and cars have a lot of overlap also. They are still different things and people who want one normally do not consider the other as an alternative.
> For over 90% of startups, Docker with k8s is just not necessary and ties them to the managed offering of a specific vendor or cloud provider, with higher costs and a lot of overhead.
This has not been my experience, but at least there are managed k8s offerings from multiple vendors with some intersection of functionality. Sure, different vendors have different extensions, but there is only one LXD vendor, and the support for LXD is kind of flaky in my experience, whereas I have run 3 different k8s distros on my Fedora laptop without really breaking a sweat.
> In the majority of projects, LXD provides much better infrastructure.
If you are happy with what LXD provides, and if you can actually use it, then great use it. I have yet to be in a situation where I could use it or where I really wanted what it offers. I tried installing it via snap, I tried installing it from copr, I tried manually installing it, then I concluded that I don't actually want anything from it that I cannot get elsewhere.
I run LXD on some machines at work and overall I love it.
Think of it as VM hosting, but with very little overhead thanks to the shared kernel. (It can for some reason now also be used to manage KVM VMs.)
Generally I use it for two use-cases, where I need to do minimal work:
* Giving co-workers a container they can use to host small utilities on a static IP to share, instead of them hosting it on their own desktop. Think compiler-explorer etc.
* Giving co-workers access to our 64 core ThreadRipper for heavy workloads, sandboxed from one another.
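For the first use-case, the day-to-day workflow is really just launching a container per person or service. Here's a rough sketch using the pylxd client library rather than the lxc CLI; the container name, image alias and image server are placeholders, and it assumes you're running it on the LXD host with access to its unix socket.

```python
#!/usr/bin/env python3
# Sketch: create and start a small container for a co-worker's utility,
# using the pylxd client library against the local LXD daemon.
from pylxd import Client

client = Client()  # talks to the local LXD unix socket

config = {
    "name": "sandbox-alice",  # placeholder container name
    "source": {
        "type": "image",
        "protocol": "simplestreams",
        "server": "https://images.linuxcontainers.org",  # public image remote
        "alias": "ubuntu/20.04",  # placeholder image alias
    },
}

container = client.containers.create(config, wait=True)
container.start(wait=True)
print(f"{container.name} is {container.status}")
```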
When does the sandboxing become useful between co-workers? We have a bunch of powerful workstations but we just ssh in and use them. Does this facilitate users installing packages system-wide in their sandbox? Honest question, just trying to work out what we might be missing with our historic setup.
To me, most Ubuntu things like LXD, Mir and snap feel like they were made by someone who did not understand the existing solutions and could not be bothered to understand them. There may be some points where LXD is "better" (they keep bringing up uid/gid mapping), but it really does not provide the same functionality as OCI containers and does not enable the same workflows as OCI containers.
LXD is not an alternative to Docker or K8S, it is something different which offers different features.
And if we are just talking docker, and not k8s, then all the security you can ever want can be found in podman which by default operates rootless and daemonless and works on stock standard OCI containers.
If we are talking k8s there are already runtimes which support rootless operation like cri-o and there are k8s distros that support rootless operation https://github.com/rootless-containers/usernetes - these maybe are not as widely used as they should be and work is ongoing but you will soon see more of them I think.
LXC existed before Docker, and indeed, as I said, Docker was initially built on top of LXC. So LXD is not an afterthought as you tried to put it; it goes back well before Docker and k8s. Docker and k8s became popular because of the marketing money put behind them.
Also, rootless containers have been a feature of LXC since 1.0 in 2013-14, which could not be incorporated into Docker as they tried to reinvent the wheel by writing their own libcontainer, which eventually resulted in many vulnerabilities that impacted even k8s in 2019.
Still today, unless one uses a managed version of k8s or a managed service from a major cloud provider, k8s infrastructure will be insecure, given that most Docker images are still not tested as rootless containers. Also, for a small team it's pretty hard to have a secure self-hosted k8s infrastructure given the sheer complexity and number of moving parts.
We have been using LXC, and later LXD, in our startup. It's a pragmatic, simple container (and now VM) management platform and can help companies build vendor-neutral clusters and high-availability systems, leveraging the knowledge of the HPC and Linux cluster community.
Kubernetes is good for Google-size companies; for startups it's additional overhead and ties their products to vendor-specific Kubernetes distributions, given the complexity and moving parts.
Try the above setup and then see the simplicity. Also give Kubernetes a try and see which one is easier and better able to help. In my view, for running database and blob storage infrastructure as part of the application, VMs are still better than containers.
LXC/LXD are both open source projects, not dependent on a specific vendor or cloud provider to run. Indeed, one can use the LXC/LXD combination directly, or use the Proxmox, OpenStack or OpenNebula cloud management platforms, all of which support LXC/LXD now. All these communities using LXC/LXD contribute back to it.
Funnily enough, Debian, Red Hat, SUSE and Canonical contribute to the LXC project as well.
If you have issues with LXD, try asking your questions on the LXD forum and the community will support you; it does not need Canonical's blessing to do so. That's unlike Anthos k8s from Google, where you need to be a paid subscriber to get support.
Speaking of LXD, can we talk about the mess that LXD on WSL2 is?
Ubuntu 20.04 on WSL2 doesn't use systemd which means no snapd which means no LXD.
This happens due to a series of really bad decisions by Canonical and Microsoft, the biggest one being that "apt install lxd" attempts to install the snap version but fails silently when snapd is missing.
So everything appears fine, but when you dig down you find something is dying, and because of their bad engineering choices it is incredibly difficult to figure out why.