OpenSSD – Open firmware for SSDs

pedrocr · on Dec 24, 2014

Is there any reason we'd actually want this firmware at all vs just using a flash filesystem running on the host CPU? Is there really any significant performance advantage to be had?

I really wish normal SSDs just implemented a passthrough mode to the real flash with enough metadata about blocks that the OS can just deal with it directly. Having to implement filesystems on top of these abstractions just seems wrong. We're setting us up to later find out that we need to join the two layers like ZFS did with RAID.

userbinator · on Dec 24, 2014

Performance advantage, maybe not, but in terms of reliability (which I think for storage devices is far more important than absolute performance) there's definite advantages to having the SSD have its own processor; an SSD is running a realtime control firmware so it can react quickly to events like sudden power loss and act appropriately to flush pending writes and update the BMTs before the power completely dies, whereas the host CPU would basically have no chance at that due to the fact that it's probably doing something else at the time and the latency of communicating to the SSD.

pedrocr · on Dec 24, 2014

I'm not suggesting the SSD shouldn't have it's own CPU, I'm just suggesting it be dumbed down and not run complex and potentially buggy wear leveling algorithms pushing those up the stack.

feld · on Dec 24, 2014

Sure, if you never want a drive warranty. Because they'll probably end up blaming your OS for trashing the flash.

al2o3cr · on Dec 24, 2014

So your suggestion for reducing the bugginess of wear-leveling algorithms is to remove them from a dedicated, known-at-compile-time dedicated hardware environment and instead run them on whatever mystery-meat general-purpose stuff the user has?

Can I get a hit off whatever you've got over there? ;)

jrockway · on Dec 24, 2014

His suggestion is to not trust random hacked-together-for-a-last-second-release vendor code, and instead use a continuously-developed firmware developed in the open with tests and a broad userbase.

The first rule of hardware is that hardware companies can't write software.

baruch · on Dec 24, 2014

It would actually make quite a bit of sense to just have raw flash accessible by the CPU and be managed completely by it. The problem is then that some of the flash trickery is for very specific flash and requires careful calibration, things like programming time which is per-flash chip and sometimes also changes across time.

The ONFI standard only specifies the protocol and the wiring but not the soft parameters, these may be added in there if people wanted but the flash companies are evidently only trying to grow up into the market rather than just provide building blocks as that allows them to squeeze out more profits so I can't see any incentive by them to expose such an interface and support the ecosystem that will spring up and take all of their extra profits and leave them to build massive foundries for peanuts per unit.

baruch · on Dec 24, 2014

Another reason would be "laziness" of the corporates, there is no real interest in spending lots of time and effort for creating the bottom end parts that may or may not improve the overall system and instead the preference is to build the top-level features and let the ssd vendors deal with the flash itself. Especially when the flash properties change between generations and even in the lifetime of the flash itself. Flash generations also come by very quickly as it is already.

justincormack · on Dec 24, 2014

Well you could use these to build a straight passthrough layer, and then work on the file system layer on the host. Probably the easiest way to bootstrap it.

There is probably a benefit in adding some processing on the controller side - eg PCI virtual functions and multiple queues. But mainly you want a block device that reports errors correctly. Not sure where you want to do the error correction (thats like TCP checksum offload, generally done on card vs whole TCP processing).

akira2501 · on Dec 24, 2014

You would, at minimum, need a new interface framing and command format; as it is, something needs to handle the SATA commands on the other end of the bus and translate them to operate the flash.

This would likely require new interface ICs on motherboards and PCI cards to handle this, either that, or you run everything over USB3.0 (which is only 5GB/s).

superuser2 · on Dec 24, 2014

Wouldn't that waste a whole lot of CPU cycles? Granted a lot of CPU cycles are just waiting for disk anyway, but directly managing flash would probably distract the CPU from processing data for at least some workloads.

pedrocr · on Dec 24, 2014

The same argument is used for TCP offload to NICs and from what I've seen it ends up being a bad idea because host CPUs advance much faster than the ones you put in your SSD/NIC/etc. Add to that the fact that you are almost surely leaving optimizations on the table from separating the two layers and I'd very much doubt there would be much performance loss.

And I'd gladly take a performance hit if it meant SSDs become a safe commodity product like hard drives mostly are vs the current situation where a crappy product may very well erase all its data on an unclean shutdown. Manufacturers don't want this of course because there's more margin to be had with the current stuff. Maybe some low end manufacturers could start to implement low-level flash access and Linux using it for the situation to change.

baruch · on Dec 24, 2014

The big pain in such an endeavor is the high cost of an hardware company and the low quantity and the long time it will take to build an ecosystem around such an offering. I'd buy one or a few but beyond the tinkerers there aren't going to be many buyers.

Unless someone of the Google/Facebook clubs will decide it is in their best interest to have such a thing and to enable it to be sold to others as well it is not that likely to happen.

ghshephard · on Dec 24, 2014

But isn't it the case that Hard Drives have firmware managing most of the "on-drive" behavior?

pedrocr · on Dec 24, 2014

They manage the hardware bits (rotation speed, head movement, etc) but they don't present much of an abstraction layer over that (they do remap bad blocks). The SSD equivalent would be a firmware that just presents the block structure with wear counters per block and possibly does the read-modify-write dance for writes smaller than a block and remaps broken blocks to a few it keeps hidden. All the other wear leveling stuff it can just let the OS handle.

zokier · on Dec 24, 2014

I'm hoping that efforts like ONFI and UBIFS would bear some fruit, but that does not seem likely. Backwards compatibility seems to trump architectulal elegance almost always.

baruch · on Dec 24, 2014

A while ago I tried to buy one to play with it, they quoted me $3000 and it is well above my personal budget for playing with it. It is only attainable by a university or a company not mere persons.

I really wish I could get my hands on an SSD that I could program the firmware for to play with things. It would require more than one sample or at least the ability to replace the flash modules since it's likely I'll burn through them with the initial failed attempts. The OpenSSD I looked at did have replaceable flash modules.

jacquesm · on Dec 24, 2014

It's a low volume PCIex card, those aren't cheap to make. There are some cheaper alternatives, maybe you could adapt this firmware to run on one of the cheaper cards.

baruch · on Dec 24, 2014

If there was an option to take an existing high volume product and get the tools and interface documentation to create and modify the firmware for I'd jump on that bandwagon as well. So far I didn't find anything like that.

userbinator · on Dec 24, 2014

There's plenty of existing SSDs using the Indilinx Barefoot:

http://en.wikipedia.org/wiki/Indilinx

You probably need to do a bit of reverse-engineering, but the project has released the firmware source code and controller programming information, so go for it!

justincormack · on Dec 24, 2014

Any examples of the cheaper options?

jacquesm · on Dec 24, 2014

OCZ has a bunch of 'revodrives' that are sub $500, they're PCIex cards as well.

justincormack · on Dec 24, 2014

But they wont run this firmware. There is no information available to start to build a firmware for that platform that I know of.

jacquesm · on Dec 24, 2014

Another 5 to 10 years and flash will be memory mapped and on the mother board. It's simple economics, fewer parts = lower costs. Maybe it'll even go on-die at some point since the footprint of flash is a lot smaller than the footprint of RAM.

ghshephard · on Dec 24, 2014

When you say, "On the mother board" - are you suggesting that there will be dedicated interfaces for flash drives? Or that they will actually be manufactured (soldered on) to the mother board when it's shipped?

Why would Flash Drives be shipped as part of the mother board, but CPUs and Memory aren't? (Unless you are purchasing a Macintosh)

jacquesm · on Dec 24, 2014

Cost and performance are major drivers in computer tech, miniaturization is another. All of these push integration and it's a good bet that what is right now separate will eventually converge, it all converges on a single self contained device (SOC). For RAM/CPU integration the story is a bit different because RAM chips tend to occupy a large amount of space and generate a good bit of heat due to power consumption, but there are right now plenty of systems with RAM and CPU soldered in place.

Factors in favor of placing the RAM closer to the CPU are increased speed and reduction in size but I think that the power consumption will remain a problem for the foreseeable future, process differences and the amount of die space required would be another.

Technically we already have flash drives on the motherboard, they are just a connector away from being a part of the whole. I think longer term the 'upgrade, repair or discard' factor is a lot lower with solid state memory than with spinning drives, stuff tends to get more compact over time and connectors are a source of trouble. So if the connector is already on the motherboard and the device lasts roughly as long as the mb and isn't a huge cost (flash is cheap compared to RAM) then I think it will make economic sense to at some point drop the connector. Once the mSata connector is out of the way there is no real reason not to widen the bus, thats just a couple of traces.

The logic could then be simplified because there is no real reason to simulate a spinning harddrive for a bunch of (slower) memory.

And then the next step to incorporate it into a chip further upstream isn't a big one, especially since it is a relatively compact die, there are already plenty of examples of CPUs with on-die flash, no reason why x86 wouldn't follow that trend.

The biggest stumbling block on that road would be the fact that there are also different processes used for manufacturing flash than for a cpu so you'd be looking at a single carrier with multiple dies or a flash device directly connected to one of the bridge chips.

Cost wise it would make good sense, reliability wise as well. Time will tell.

clarry · on Dec 24, 2014

CPUs have been a part of the motherboard for a long time in laptops, gadgets, and appliances. This trend is increasingly leaking into the desktop space with thin clients and other mini PCs. In fact, all three PCs I've bought in the past six years have CPUs soldered on board..

_h8ft · on Dec 24, 2014

I already have two PCs with "flash on the motherboard" (i.e. mSATA connectors).

uaygsfdbzf · on Dec 24, 2014

In 10 years flash will be dead and we'll be using memristors.

serf · on Dec 24, 2014

I read that ten years ago, too. Not that your wrong about this ten years, but damn have I been patient.

dalke · on Dec 25, 2014

Ain't that the truth.

6.6 years ago, here on HN at https://news.ycombinator.com/item?id=177865 , was a link to "Scientists Create First Memristor: Missing Fourth Electronic Circuit Element" at Wired. User rms said: "I don't think we'll have any keeping up with Moore's Law. In 5 years memristor storage will be everywhere. IBM will develop memristor processors for the Blue Brain project." User TrevorJ said: "I fear that the huge inertia that is the software and hardware industry ... will keep this out of mainstream for 5-8 years."

In 2010 Engadget (at http://www.engadget.com/2010/08/31/hp-labs-teams-up-with-hyn... ) described a collaboration between HP Labs and Hynix. "Williams hopes to see the [memristor] transistors in consumer products by this time 2013, for approximately the price of what flash memory will be selling for at the time but with "at least twice the bit capacity.""

If anything, the optimism has become more pessimistic, as the future horizon lengthened from 5 years to 10. :)

rasz_pl · on Dec 24, 2014

no it wont. Nand flash is

-full of bad cells straight from factory -block erasable

you cant simply plonk it on the board, memory map and expect access like ram.

jacquesm · on Dec 24, 2014

That just requires one level of indirection, which is easily done through a lut of some kind, either in hardware or software.

danielvinson · on Dec 24, 2014

Toshiba NAND has an extraordinarily low failure rate out of the factory. Bad NAND in general is a QA problem which is expensive to fix (because it involves throwing away a lot of usable product) but is eminently fixable.

serf · on Dec 24, 2014

but they solder memory on to boards now, and that doesn't even harbor the same benefits as on-die.

userbinator · on Dec 24, 2014

There's some good technical information here:

http://www.openssd-project.org/wiki/Jasmine_Technical_Resour...

What pleases me most about this is that it's probably the first time this level of documentation has been released for a commercially-used SSD controller; in fact the biggest thing I see coming out of this is not the hardware, but the possible development of alternative open-source firmware for existing commercial SSDs with the same controller. The majority of them are going to be virtually identical to this reference design. The schematics are theoretically enough for anyone to make their own. That's why, for their other platform based on an FPGA, I don't think it's as interesting.

Unfortunately this comes a bit late to save all those bricked OCZ Vertexes (or would that be Vertices) out there, but maybe similarly nonfunctional/damaged drives with the same controller could make good test platforms for this firmware...

Open SSD firmware would be another way to prevent maliciousness; see http://spritesmods.com/?art=hddhack for example.

Someone · on Dec 24, 2014

Prevent maliciousness? I foresee root kits that hide themselves. For example, your SSD could magically develop the equivalent of bad blocks after the OS has loaded the boot sector (first read of block X, if done within Y seconds of power on, returns root kit's boot sector. Any other call returns the uninfected version)

Your anti-malware firmware would have to disable firmware updates to prevent that, moving it into the OCZ Vertex category.

userbinator · on Dec 24, 2014

Government agencies like the NSA already know how to do this.

The usual firmware update happens over the regular SATA interface, and is also controlled by the firmware itself; however, there is a "factory mode" that requires physical access and is always available - it's how the initial firmware is loaded - so even if you use firmware that doesn't allow updating via regular means, you can still update it if you really need to. The factory mode might be via JTAG, or require a specific voltage on a pin upon reset to enable, and that's something that no malware can silently do...

I wonder if OCZ might've not suffered the same fate had they open-sourced their SSD firmware after they bought Indilinx, since one of the biggest problems they had was firmware bugs.

rasz_pl · on Dec 24, 2014

OCZ skimped on simple things like capacitor to sustain last write on power loss.

rasz_pl · on Dec 24, 2014

Wiki mentions OCZ's Vertex/Vertex Turbo/Agility/Solid drives. Does this mean I can buy one of the bricked OCZ drives and start playing with firmware using openssd codebase?

Lack of reasonably priced hardware platforms limits this project to academia.

danielvinson · on Dec 24, 2014

Not all of the Vertex 2/Agility 2 drives which were bricked were due to firmware issues. Around 25% of drives which failed were due to NAND failures and the lost firmware or corrupted firmware fixes often only fixed about 50%-75% of the drives which they were applied to.

Edit: FYI these numbers are purposely not very accurate.

ck2 · on Dec 24, 2014

What's also interesting is the new F2FS by samsung on many linux distributions now (open source).

Shows higher performance on SSD than ext4 and virtually all other file systems.

sanxiyn · on Dec 24, 2014

I am so happy to see a project from South Korea hitting the HN front page.