Apparently, all I know about strings is correct.

ableal · on May 10, 2013

"Thank you, Mark Pilgrim".

(Unless you already knew it ten years ago. Back when he wrote "Everything you thought you knew about strings is wrong," it was quite true of most every programmer, and it was thanks to this piece and similar ones, by Spolsky and others, that the information got spread around.)

There must be some phrase for this opposite of the "self fulfilling prophecy": the cautionary phrase that causes itself to become false in the future ;-)

ygra · on May 10, 2013

As for me, I found Spolsky's article lacking, too. But lurking for years on the Unicode ML is probably not something most people do. You learn a lot there, though.

Roboprog · on May 10, 2013

And, he didn't even touch on the horror that is EBCDIC. Once you've had to touch that, the idea of "code point" for a character is something you can't ignore, hoping that things just work "most of the time" -- ASCII A != EBCDIC A.
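For anyone who hasn't run into this, a minimal sketch in Python of what "ASCII A != EBCDIC A" means. It assumes cp037 as the EBCDIC code page, which is just one common variant that ships with CPython; the comment above doesn't name a specific one.

```python
# Illustration only: cp037 is an assumed EBCDIC variant, picked because
# CPython bundles a codec for it.
ascii_a = "A".encode("ascii")
ebcdic_a = "A".encode("cp037")

print(ascii_a.hex())   # 41
print(ebcdic_a.hex())  # c1

# Reading the EBCDIC byte with the wrong codec gives a different character.
misread = ebcdic_a.decode("latin-1")
print(misread)         # Á (U+00C1), not A
```

Same character, two different byte values, which is why the encoding has to travel with the data.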

reeses · on May 14, 2013

EBCDIC A != EBCDIC A :-)

It's easier to read text on punchcards in EBCDIC of whatever variation, though.
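A rough sketch of the "variation" point, assuming cp037 and cp500 as two example EBCDIC code pages (both have codecs in CPython; other variants exist). In these two the letter A actually encodes the same way, and it is punctuation such as the square bracket that moves around.

```python
# Two EBCDIC code pages bundled with CPython; chosen only for illustration.
variants = ["cp037", "cp500"]

# Print the byte each variant uses for a letter and two punctuation marks.
for ch in "A[!":
    encoded = {name: ch.encode(name).hex() for name in variants}
    print(ch, encoded)

# Compare the encodings directly.
same_letter = "A".encode("cp037") == "A".encode("cp500")
same_bracket = "[".encode("cp037") == "[".encode("cp500")
print(same_letter, same_bracket)
```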

sluu99 · on May 10, 2013

Are there still (many) people using EBCDIC?

Roboprog · on May 10, 2013

Yes. High volume printing is often done using IBM's AFP/MODCA print language, which typically has the text in IBM's EBCDIC encoding.

(disclosure: I once worked at a company that made tools to port code and data from IBM minicomputers to Unix & MS platforms, and also at the largest [format,] print & mail shop in the US)

Roboprog · on May 10, 2013

Or the funky 6-bit character set that the old CDC Cyber mainframes used to use! ("What is this lower case 'a' you speak of???")

sluu99 · on May 10, 2013

My thought exactly too haha