Hacker News

This is a fascinating and elegant little virtual machine design that I think many here would find interesting.


What blows me away is how practical it is. They've ported it to every piece of junk they had lying around, and written tons of software. It's not just a design on someone's web page; they're making tools, games, art.


It's good for gaming, but sadly it lacks Unicode support for other software. I wouldn't mind basic support via GNU Unifont in future revisions.

Although the complexity would skyrocket.

Then, if I wanted to write an adventure in Spanish, I would omit the tildes (accent marks) and the opening question/exclamation marks, and I would map ñ to ny.


> for software it lacks Unicode support

That's not an inherent limitation of the VM at least, just of the current software ecosystem. It would be possible and practical to make a tiny unifont-based Unicode text library. Though handling bidirectional text, joined scripts and vertical scripts would be something else… :(


This illustrates just how complex even the "basic" stuff turns out to be in practice. Text rendering hates you[1]. Even if you take obvious shortcuts/compromises like limiting yourself to monospace, you'll get bitten by "double width" characters. Kudos to those who try, rather than bundling a copy of Chrome with their chat app.

[1]: https://faultlore.com/blah/text-hates-you/


I have tried to tell other people this (and more; there are even more problems with Unicode than that article mentions), but they don't believe me and think Unicode is good anyway.

I design my own programs and specifications to avoid Unicode as much as possible, even when multilingual text (sometimes even in languages that Unicode does not have) is desirable.


Yeah, that's why ASCII with a few accented chars and glyphs [áéíóüñ] would be enough, with just a small extended western table. Optional as a library, OFC. By default you write ASCII chars from a table with a "display" device. I prefer uxn's simplicity over my own language's complex locales. Anyway, as I said, a .tal library for extended chars wouldn't be very big.

Some fonts for uxn (look at the ~rabbit repos at git.sr.ht) already include extended chars.


> [...] ASCII with a few accented chars and glyphs [áéíóüñ] would be enough [...]

Making a bespoke ASCII extension would be a step back - by about 30 years. You don't need a lot of code to support UTF-8; if you're concerned about runtime memory usage, you can make your rune type take 8 bits and support only the U+0000-00FF range[1]. It happens to cover all of [áéíóüñ] and a whole bunch of other languages - unfortunately, not my native one, which would leave me gravely upset ;P

[1]: https://www.unicode.org/charts/PDF/U0080.pdf


Then you could just implement ISO-8859-1 instead. It is the same range but a simpler encoding, and external programs could be used for conversion if necessary.


1. UTF-8 encoding is trivial and everything already speaks it by default nowadays; 2. it is useful to explicitly differentiate byte arrays from text; 3. if you change your mind later and decide you do want to support Japanese (like here: https://100r.co/site/niju.html), you haven't dug yourself into a hole.


1. UTF-8 is not an unreasonable encoding, but it is an encoding of an unreasonable character set, and it is unnecessary here. It is better to avoid conversions that would then be needed in both programs. Not everything uses UTF-8 and Unicode; I continue to use (and write) programs that do not. (And, if necessary, a conversion program can also be written in uxn; the fact that uxn does not use Unicode does not prevent this. You can implement whatever character sets/encodings you want.)

2. Sometimes it might, but that has nothing to do with uxn. Sometimes the distinction isn't helpful anyway, and sometimes differentiating byte arrays from text causes problems of its own (which isn't really so uncommon).

3. The niju program does not use Unicode and does not need it; it works better without it. If you do want more sophisticated Japanese text, even then there are better ways than using Unicode.


The problem is that uxn is ported to tons of devices. Would UTF-8 work on platforms like the Nintendo DS or DOS with a simple header file?


UTF-8 is an extremely simple and lightweight text encoding. Check out Plan 9's man page on UTF; it would fit on a t-shirt: https://plan9.io/magic/man2html/6/utf

Unicode is also just a representation for text, plus a handful of common operations: you work with arrays of characters rather than arrays of bytes. It was worth its cost on 1992 hardware; the Nintendo DS is over a decade more recent.

I recommend studying libutf in sbase[0]. It's not a single header file solution (although utf.h[1] is an excellent place to start reading), but it does provide a fairly comprehensive implementation. There's also a good introduction to Unicode in Plan 9's C programming guide[2]. Even if you choose to only support runes that fit in a single byte, you gain the ability to tell byte blobs apart from text, which is useful both for reasoning about your program, and for future-proofing it, in case you needed to put places like Łódź or Πάτρα on your map.

[0]: http://git.suckless.org/sbase

[1]: http://git.suckless.org/sbase/file/utf.h.html

[2]: https://plan9.io/sys/doc/comp.html


Should do! All the detail would be inside uxn, and the whole idea is that uxn virtualizes away from the underlying environment.


For a more mainstream take on the concept, it's also worth checking out RISC-V. Implementing a baseline RISC-V interpreter VM without any extensions is about a day of work (I've done it; it's a little more complex than uxn's ISA, but not by much), and it can run essentially any piece of software you throw at it, since real compilers can target it. (You still need to supply the rest of the VM to get I/O and such.)




