An Inside Look: My IBM Storage Ceph Home Lab!

By Fraser MacIntosh posted 17 days ago

  

I’ve been an IBM Storage Customer Success Manager for the last year, and have worked across the Storage portfolio at IBM for the best part of a decade and a half. Below, I share the backstory of how I came to develop a Ceph Home Lab based on Raspberry Pi single-board computers. It was a fun project, and I hope you find it useful. What follows is a conversation I had with John Sing, IBM Storage Principal Technical Leader, about this project.

John: Fraser, how did you have the idea to build an IBM Storage Ceph Home Lab and why did you do it?

Fraser: Why? I’ve found that my personal way of learning new technologies and new skills is to make something with that technology, if at all possible. This has been difficult with many emerging technologies – it’s not exactly easy to build a disk array or an IBM Storage Scale system in the home lab, even with ubiquitous virtualization systems. IBM Storage Ceph, however, offers an opportunity: its upstream version runs on affordable and ubiquitous single-board computers. In addition, these nodes can be passively cooled, which removes noise in my home office as a concern.

John: What components did you use to build your IBM Storage Ceph Home Lab?

Fraser: It’s built from Raspberry Pi single-board computers: 4 x 1TB Ceph nodes and 4 x Fedora 38 Docker nodes (the Docker nodes are currently being rebuilt to run MicroShift).

Front view:

The Ceph nodes are kept separate because the system has fairly limited compute; running storage separately from compute prevents storage tasks from overloading the compute nodes. The rear of the rack holds the node modules and the cable and patch management.

Rear view:

John: How long did it take to assemble your IBM Storage Ceph Home Lab?

Fraser: I suppose end-to-end the project has taken about four or five years, but that's to get it into the form it is in at the moment. At the very start, it was a single Raspberry Pi running IBM's Node Red software to do home automation. At that point, network-attached storage was commonly available mass storage shared over SMB from a Windows VM. The initial Node Red on Raspbian Linux setup then turned into a "learn containers" project on CentOS Linux, which opened up the option to run more nodes.

As for the nodes’ configuration, the Raspberry Pi 4 uses an SSD attached over USB3 for mass storage and is passively cooled – there are no moving parts and it’s silent in operation!

John: What was the total cost, and what kind of online vendors could the components be bought from?

Fraser: It's hard to give a total cost, because it was built over several years following the release of the Pi 4 in June 2019. Currently the Pi 4 4GB is retailing at ~£50. The SSDs are Kingston A400 960GB and cost ~£50. The NetGear switch, which I'd had kicking around for a while, was bought second-hand from eBay for something like £40.

Likewise, the rack shelf was from an old rack mount that was hanging round in the garage. Raspberry Pi power supplies are about £15 each. The dual USB-C Anker power supplies are no longer manufactured, but cost about £25.

The heatsink case shown above costs ~£12. I used only the top of it, machined so I could bolt down through it to the rack base. The 1U Pi racks were about $40 (IIRC), but I needed a contact in the US to send them to me, as the Amazon postage through UK Amazon would have made it cheaper to fly there and pick them up myself.

Below is the full list of components:

  • 8x Raspberry Pi 4 (4GB+, if possible)
  • 8x 32GB Micro SD
  • 4x 1TB 2.5” SSDs
  • 4x SATA to USB3 Storage Adapters
  • 4x Raspberry Pi official PSU (for Ceph nodes)
  • 2x 2 Port Anker USB-C PSU (for Container nodes)
  • 1x 16 Port Gigabit Managed Ethernet Switch
  • 2x 1U Ultronics Raspberry Pi 3/4 Rack
  • 1x 1U Cat-6 Patch Panel
  • 24x Cat-6 Ethernet Patch Cables
  • 1x 1U Rack Shelf
  • 1x 4U Rack Strip Pair
  • Rack Nuts, Screws, Washers
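
As a rough sanity check on cost, the prices quoted in the interview can be multiplied by the quantities in the list above. The sketch below is only a partial subtotal under stated assumptions: it covers just the items that have both a quoted GBP price and a listed quantity, treats every "~" price as exact, and leaves out the SD cards, adapters, cabling, rack parts, heatsink cases, and the 1U Pi racks (quoted in USD). The item labels are illustrative names, not part numbers.

```python
# Quantity and approximate unit price in GBP, as quoted in the interview.
# Assumption: "~" prices are treated as exact for this rough tally.
PRICED_ITEMS = {
    "Raspberry Pi 4 (4GB)": (8, 50),
    "Kingston A400 960GB SSD": (4, 50),
    "Official Raspberry Pi PSU": (4, 15),
    "Dual USB-C Anker PSU": (2, 25),
    "Second-hand NetGear switch": (1, 40),
}


def subtotal_gbp(items):
    """Sum quantity * unit price over all priced items."""
    return sum(qty * price for qty, price in items.values())


for name, (qty, price) in PRICED_ITEMS.items():
    print(f"{qty} x {name} @ ~GBP {price} = ~GBP {qty * price}")
print(f"Partial subtotal: ~GBP {subtotal_gbp(PRICED_ITEMS)}")
```

The real outlay would be higher once the unpriced items are included, and lower wherever parts were reused from earlier projects.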

John: What were some lessons learned? What do you run on the cluster on a day to day basis?

Fraser: Key lesson: DO NOT SKIMP ON POWER SUPPLIES!! The whole thing can become incredibly unstable if you do. This is simply because a cheap USB power supply designed for charging doesn’t necessarily provide anything like the kind of stable voltage required to operate a computing device such as the Raspberry Pi. I have also found that using quality USB-C cables with power supplies that don’t come with one already attached can resolve some power-related stability problems.

I would make sure you read up on how to deploy Ceph to devices such as a Pi on the Ceph web site. Don’t expect your first build to be the one you end up using; it’s worth hacking around, experimenting a bit, and making sure you know what you are doing. Then blow it all away and start again!

Get the most RAM you can afford on the individual nodes – the RAM isn’t upgradable, so if it’s available and you can afford it, get the 4GB or 8GB versions.

A number of SD card manufacturers, such as Samsung, make cards with high write endurance. These are useful as a less expensive way to provide the OS storage for the individual nodes and the various Ceph bitmaps, if you can’t afford to get additional SSD or NVMe storage.

Finally – a good managed network switch is worth the investment. If you can aggregate the link ports, or run 2.5GigE or even 10GigE, this is really worth doing even with nodes that are individually connected at 1GigE.

The containers I normally run on a day-to-day basis include:

  • Node Red - Home automation
  • Mosquitto - MQTT Broker
  • Z-Wave JS - monitors and controls z-wave devices
  • MariaDB - RDBMS
  • Syncplay - Enables synchronised playback of videos over the internet
  • Minio - S3 target for backup and archive
  • Unifi Controller - Monitors and configures Ubiquiti UniFi devices
  • httpd - Apache Web server
  • Plex - Media Server
  • CC2MQTT - A custom container that pulls data from a CurrentCost energy meter and sends it to an MQTT broker


After the base build of the OS, I pull all the configuration scripts and Docker Compose files from GitHub. The configuration scripts run first; they apply all the updates, install Docker, and so on. Then the containers are pulled and run. This will likely change when I move over to MicroShift at some point, but I'm still aiming for a single command to get the config from git and build everything... In addition to these, there's the stuff that I use to learn about how it works...
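
The "single command" idea can be sketched as a small script that plans the steps in order: clone the config repository, run the configuration scripts, then pull and start each Compose stack. This is a minimal illustration rather than Fraser's actual tooling. The repository URL, checkout path, and file names below are hypothetical placeholders, and it assumes the Docker Compose v2 plugin (`docker compose`) is what ends up installed.

```python
import subprocess
from pathlib import Path


def build_plan(repo_url, checkout, scripts, compose_files):
    """Return the ordered list of commands for a one-shot node bootstrap."""
    checkout = Path(checkout)
    # 1. Fetch configuration scripts and Compose files from git.
    plan = [["git", "clone", repo_url, str(checkout)]]
    # 2. Run each configuration script (updates, Docker install, etc.).
    for script in scripts:
        plan.append(["bash", str(checkout / script)])
    # 3. Pull images, then start each stack in the background.
    for compose_file in compose_files:
        path = str(checkout / compose_file)
        plan.append(["docker", "compose", "-f", path, "pull"])
        plan.append(["docker", "compose", "-f", path, "up", "-d"])
    return plan


def run_plan(plan, dry_run=True):
    """Print every command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


# Hypothetical values, for illustration only.
plan = build_plan(
    repo_url="https://example.invalid/homelab-config.git",
    checkout="/tmp/homelab-config",
    scripts=["scripts/base-setup.sh"],
    compose_files=["compose/mqtt.yml", "compose/media.yml"],
)
run_plan(plan)  # dry run: prints the commands without executing them
```

Keeping the plan separate from its execution makes it easy to review what a rebuild would do before running it against a freshly imaged node.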

Summary: I find that making something is the best way to understand it. When I first tried to learn Linux in the 90s, I quickly learned that I needed to have an aim, otherwise I would not make progress. That initial project aim became a now-defunct “MythTV” Digital Video Recorder, but I learned a lot on the way.

With the move of my storage to Ceph, I can learn how IBM’s storage works in an affordable manner, which allows me to transfer my knowledge from a tiny home lab to some of IBM’s largest storage customers.

John: Thank you, Fraser! This is a fun project, thank you for sharing your thoughts with us.

Want to learn more?
Reach out to me, Fraser MacIntosh, any time via LinkedIn or email.
Or, you can post a thread on the IBM Storage Community and let’s discuss!

