ARM64 and You (mikeash.com)
296 points by zdw on Sept 27, 2013 | 106 comments


This is a reasonable, short overview of the programmer-visible side of changes in A64.

For those interested, there's another side. A64 drops all the features of the ISA (inline variable shifts, conditional execution, variable-width instructions) that are hard to implement in a fast, high-power CPU. If a CPU is to have no 32-bit ARM compatibility at all, there's no reason one couldn't make a 4GHz 4-wide superscalar one based on A64.


This is true, but that last sentence is a doozy. The AArch64 ISA drops that stuff. The A7 CPU, which must remain compatible with legacy code, must not. So yes, a theoretical CPU would have a much easier design task, but in the real world we'll never see one.

And in any case: there's another reasonably well-known company out there with an even cruftier ISA making 4+GHz 6-wide superscalar cores with backwards 32 (and 16!) bit compatibility modes. Instruction decode is the sort of thing programmers understand, so we tend to get hung up on it when talking about CPUs. It's really not a meaningful design limitation to a CPU core implemented with hundreds of millions of transistors.


I would not be surprised if an iPhone without support for 32-bit apps came out within a few years. Apple has been pretty willing to break existing apps with iOS updates, so there's no real expectation that something which works today will continue to work indefinitely with no changes.


This.

Apple's track record proves that they are more than happy to drop anything more than a couple of years old if it means general forward progression.

My real bugbear is that to some extent they've created the modern day IE6 by not updating Safari on the old iPad.

Also, if you look at some legacy iOS apps on the App Store (designed pre-6.x), they are pretty much 100% broken on 6.x+.


There is already the ARM "big.LITTLE" architecture which pairs fast high-power cores with slow low-power ones on the same die. In the future I could imagine that extended so that a subset of cores are 64-bit only.

As others mentioned, though, I don't think Apple will need to do that since they've got a closed ecosystem. They can just declare one day that they won't accept App Store submissions that don't include a 64-bit version and very quickly move the entire universe of iOS software. There will be some no-longer maintained apps that aren't updated but they can just say "Sorry, this app doesn't support iOS 8"

There are also binary-translation techniques like what they did with Rosetta in the PowerPC->Intel transition. This wouldn't need to be done on the phone; they could just do the translation once on the app-store side.


> in the real world we'll never see one.

Not from Apple, maybe. I know there is at least one in the works for the server market.

> And in any case: there's another reasonably well-known company out there with an even cruftier ISA making 4+GHz 6-wide superscalar cores with backwards 32 (and 16!) bit compatibility modes. Instruction decode is the sort of thing programmers understand, so we tend to get hung up on it when talking about CPUs.

You absolutely can implement a fast core out of almost any ISA (just look at IBM System z!) if you have sufficient resources. The point is that A64 is amenable to being made into a fast CPU without Intel-scale design effort.


I'm not so sure we won't see an AArch64-only version. A server version would be fine with dropping the 32-bit part.


As would a desktop/laptop, which has no legacy 32-bit applications/drivers to be concerned about.


ARM already has multiple-cycle instructions and variable clock rates. Why shouldn't they release a faster-clocked processor where every instruction in 32 bit mode takes 2 clock cycles?


Wait, AArch64 really drops conditional execution?

When I was learning x86 assembly, discovering CMOV was fantastic; it drastically simplified all my code (versus cmp+je+hundreds of extraneous labels, notwithstanding macros). The fact that ARM could do that for almost all instructions was one of the main reasons, in my mind, why ARM was considered a "cleaner" architecture than x86.

EDIT: crisis averted, a comment further down ( https://news.ycombinator.com/item?id=6458457 ) clarifies the change.


Wait, AArch64 really drops conditional execution?

Some, yes (but it is retained in 32 bit mode).

The reason it was dropped is that it's hard to implement. It adds an implicit data dependency between an instruction and previous results. In a pipeline, the results of previous instructions aren't available immediately; there is usually a delay of many cycles before a "previous" instruction actually finishes and retires. So if your "current" instruction needs the result of the one before it, it's going to have to wait, stalling the pipeline. Adding extra dependencies, such as instructions that depend on the state of the flags register, is not desirable. IIRC it also makes register renaming hard when one implicit register (the flags) is heavily used.
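For what it's worth, A64 does keep a small conditional-select family (CSEL and friends), so a simple ternary can still compile branch-free; only full per-instruction predication is gone. A rough sketch of the kind of code this covers:

```c
#include <stdint.h>

/* A64 keeps conditional select (CSEL) even though per-instruction
 * predication is gone; compilers typically lower a simple ternary
 * like this to a flag-setting compare followed by one csel,
 * avoiding a branch entirely. */
int64_t clamp_floor(int64_t x, int64_t floor) {
    return (x < floor) ? floor : x;
}
```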


There is one - power consumption.


Well yes, it won't fit in a phone. The point was that making a CPU that rivals current desktop CPUs is possible using A64, so long as you are willing to expand your power budget to meet them.


Yes, it's interesting to see a really new ISA, much newer than the old high-end contenders like SPARC and POWER.


It never occurred to me that they'd be using tagged pointers for Objective-C runtime stuff. Of course it's obviously a good idea, but only after hearing it does it become so. Objective-C is always more dynamic than you think it is, so taking implementation cues from other dynamic language runtimes makes perfect sense.

It appears that they've been using tagged pointers on the desktop since 10.7, which I never realized: http://objectivistc.tumblr.com/post/7872364181/tagged-pointe...
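A tagged-pointer scheme is easy to sketch in C. This is an illustrative layout only (the low-bit tag and the shift are my assumptions, not Apple's actual runtime encoding):

```c
#include <stdbool.h>
#include <stdint.h>

/* Objects are at least pointer-aligned, so the low bit of a real
 * heap pointer is always 0; setting it marks the word as carrying
 * an inline payload instead of a heap address. */
typedef uintptr_t objref;

static bool is_tagged(objref r)     { return (r & 1u) != 0; }
static objref tag_int(intptr_t v)   { return ((uintptr_t)v << 1) | 1u; }
static intptr_t untag_int(objref r) { return (intptr_t)r >> 1; /* arithmetic shift keeps the sign */ }
```

Reading a "tagged NSNumber" then never touches the heap at all: the value rides along inside the pointer word itself.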


This was one of the most interesting things the post detailed, and it makes so much sense: we do a substantial number of allocs and reads for NSDecimalNumbers, etc. in our payment-based app, and I can imagine the heap savings and memory read/write savings we'll get as a result would be of some significance. Pretty interesting innovation.


Tagged pointers have been in use since the late '60s in most garbage-collected languages of the day.


It's a good idea that caused a world of hurt in the 24-bit to 32-bit transition. There were some Macs that didn't have 32-bit clean ROMs, because they were doing things with pointers that would never have to reference more than 4 megs of RAM.

Interesting that Apple now has the coordination and control to make it a non-nightmare.


I was surprised they didn't use them already, since tagged pointers are an ancient idea implemented in many languages. I guess they might not have been useful enough in Objective-C on 32-bit architectures.


They already use them in 64-bit Objective-C on the desktop.


One of the biggest impacts of moving to 64-bit is increased memory pressure. While all of Apple's apps and daemons are running 64-bit, most users will be actively using third party apps that are 32-bit-only for a while. This means that on average there is less memory available in the system, because the amount of RAM is unchanged in the 5s, and there will now be code from both 64-bit and 32-bit binaries resident, rather than just 32-bit binaries.

Apple has done some work to alleviate this extra memory pressure at the kernel level. grep for WKdm in the xnu sources if you're interested.


It is interesting to watch ARM finally adopting many of the great architectural solutions that MIPS used 22 years ago, back in 1991, when it launched the MIPS R4000 family of 64 bit processors. [1]

[1] http://groups.csail.mit.edu/cag/raw/documents/R4400_Uman_boo...


The original ARM ISA felt very VAX-inspired to me, such as the elegant (but ultimately inefficient) use of a general-purpose register for the program counter.

I've only just started looking at AArch64 but I agree that it feels a lot more like MIPS though. I think that's a good thing.


>This allows compiling if statements and similar without requiring branching. Intended to increase performance, it must have been causing more trouble than it was worth, as ARM64 eliminates conditional execution.

Probably because so many projects use Thumb (the default for iOS projects in Xcode, for example), which doesn't include most instructions for conditional execution. From what I can tell, it also sounds like compilers weren't making very effective use of those instructions anyway.

Also, these were originally meant to compensate for a lack of branch prediction, which as I understand it, has changed drastically in recent years.


> With ARM64, there are 32 integer registers, with a dedicated zero register, link register, and frame pointer register. One further register is reserved for the platform, leaving 28 general purpose integer registers.

but http://www.arm.com/files/downloads/ARMv8_Architecture.pdf says:

* 31 general purpose registers accessible at all times

* Improved performance and energy

* General purpose registers are 64-bits wide

* No banking of general purpose registers

* Stack pointer is not a general purpose register

* PC is not a general purpose register

* Additional dedicated zero register available for most instructions

Which one is it?

By the way, the ARMv8 resources are quite interesting overall and a bit more in-depth than the article. http://www.arm.com/products/processors/armv8-architecture.ph...


I'm not seeing the conflict between what I wrote and your quote from the architecture docs. Is it just confusion because the dedicated link register, frame pointer, and platform-reserved register are part of the ABI rather than the ISA?


There is no dedicated frame pointer-- that would be part of the ABI. The stack pointer and zero register are both encoded as register 31, and the meaning depends on context.


My statement encompasses both the ISA and the ABI, which I hope was implied by my previous comment.... From the perspective of a userland software writer, there's not much point in trying to distinguish between the two.


Is the frame pointer register really "dedicated" by the compiler? (I haven't had the time to upgrade everything I need to upgrade yet to install the new Xcode and check this directly.) With 32-bit ARM, there is a register denoted "fp", but with iOS 2.0 Apple started compiling without a dedicated frame pointer (instead using that register temporarily for some kind of thread-local storage variable, before moving that elsewhere and freeing up the register entirely). I was under the impression that dedicated frame pointers are only used when you have less-than-awesome compilers that are unable to keep track of the moving stack pointer target as it performs optimizations.


It's "required" by Apple's ABI that the frame pointer always point to a valid stack frame; just the same as on armv7. It's for debugging and ensuring backtraces are always valid. Not modifying the current frame pointer complies with this, which is what -fomit-leaf-frame-pointer does. But it's an ABI violation to use it as a general-purpose register.

It sounds like you're partially confusing this with r9, which is a completely different story - r9 was globally reserved by the system until iOS 3.0, then allowed. This equivalent in arm64 is x18, which again is globally reserved by the system.


Yeah, I got the backstory mixed up with r9, I think because it coincided with fp being largely renamed to r11 due to iOS using r7 as the frame pointer instead (which I now remember the patch I had to merge for). Sorry :(.


“fp” is an alias for r11; despite its name, it was never used as a frame pointer in iOS (the iOS tools set up a frame pointer in r7 by default, as many of the original thumb instructions cannot easily access r8-r12).

More generally, a frame pointer of some sort is necessary to deal with variable-sized stack allocations, and often useful for performance analysis and debugging tools, so many compilers set them up by default even when they aren’t strictly necessary.


The jumbling around of the register purposes (r11 moving to r7 despite r11 being "fp") combined with the early change to the meaning of r9 is what bit me; sorry about the confusion :(.


The link register isn't reserved in either AAPCS64 or Apple's ABI so it's effectively 29 GPRs. Yes it has a specially-defined purpose at function bounds, but it's perfectly fine to use it for whatever within a function, just like it was in armv7.


There are 31 GPRs. The zero register shares an encoding with SP, and is its own separate thing.

From the 31 GPRs, x30 is the link register and x29 is the frame pointer. x30 can be used for other purposes within a routine, but x29 must always hold a valid frame record in the iOS ABI. Additionally, iOS reserves x18 (“the platform register”) entirely. So there are really 28 GPRs, or 29 if you include x30/lr, which is something of a hybrid.


Thank you for the breakdown of how performance is affected with the new architecture.

I've had a few quibbles about where the performance gains would come from, and all too often I was told that the increases would be realized solely through the larger memory address space. That just didn't seem right to me.

I really like the use of the otherwise unused space in the 64-bit pointers.


"On ARM64, 19 bits of the isa field go to holding the object's reference count inline." That's really awesome.


I hope by "Perform an atomic store of the new isa value." he means "Perform an atomic compare-and-set of the new isa value."

A64 doesn't eliminate conditional execution completely. It just pares it down to the basics: branch (obviously), add/sub, select, compare (for flattening conditionals like `a && b && c`).

Another thing removed from A32 was the optional shift on operand 2-- which was taking up 7/32 bits for most instructions.

This has a few more that were missed: http://nominolo.blogspot.com/2012/07/arms-new-64-bit-instruc...


It's not a compare-and-set. Rather, it uses ARM's atomic instructions where the load creates a reservation on the memory address, and the store succeeds only if the reservation is still present, with any other stores to that address (or nearby addresses) breaking the reservation.

You can use this pattern to implement compare-and-set, but you don't need compare-and-set to use that pattern directly.

Edit: I wasn't sure how to encode this into the steps in the article, so it's a bit vague on that part. Suggestions welcome.
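In portable C the same reservation pattern is usually written as a weak compare-exchange loop, which compilers lower to the ldxr/stxr exclusive pair on ARM64. A sketch, with a made-up bit position for the inline count (not the runtime's actual layout):

```c
#include <stdatomic.h>
#include <stdint.h>

#define RC_SHIFT 45                        /* hypothetical inline-count position */
#define RC_ONE   ((uint64_t)1 << RC_SHIFT)

/* atomic_compare_exchange_weak maps naturally onto the
 * load-exclusive/store-exclusive pair: the "weak" form is allowed
 * to fail spuriously, exactly as a broken reservation would. */
static void retain(_Atomic uint64_t *isa) {
    uint64_t old = atomic_load_explicit(isa, memory_order_relaxed);
    while (!atomic_compare_exchange_weak(isa, &old, old + RC_ONE)) {
        /* old was reloaded by the failed exchange; just retry */
    }
}
```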


Also commonly called load-link, store-conditional: http://en.wikipedia.org/wiki/Load-link/store-conditional


Great write-up, thanks!

I expect we'll see ARMv8 architectures in the next round of flagship phones. Apple's a little ahead of the curve, but it won't be long till competitors catch up.

In the context of Apple, it's interesting to think about how they're going to take this next. ARM process and architecture improvements are likely to lead to chips with high-enough performance to be used in mainstream desktop applications – Is it possible we're going to see something like an ARM/x86 dual-processor Macbook platform that allows ARM's low power consumption supplement Intel's performance?


Macs with ARM processors don't seem like a possibility. Intel's Haswell processors have been shown to have comparable performance per watt, which is expected to get better with Broadwell.

ARM64 Apple chips are a play for iPads. Current iPads lag on performance compared to something like a Bay Trail Intel tablet or a Haswell-equipped Surface tablet. There is going to be a convergence point for Intel: a tablet with Haswell-level performance, a fanless chassis, and a $500 price. Apple needs to converge there to compete.


The bit about memory-mapped files is interesting, considering that these devices aren't using magnetic disks. The conventional file API of seeking and streams suddenly feels a bit anachronistic. Of course, flash memory is often optimized for sequential reads, but still: it's far more amenable to the memory-mapped model than magnetic media ever was.
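For reference, the memory-mapped model in question is just POSIX mmap: map the file once, then treat it as an ordinary array. A minimal sketch (error handling trimmed for brevity):

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count newlines by mapping the file instead of read()/seek loops. */
int count_newlines(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    fstat(fd, &st);
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                          /* mapping stays valid after close */
    if (p == MAP_FAILED) return -1;
    int n = 0;
    for (off_t i = 0; i < st.st_size; i++)
        if (p[i] == '\n') n++;
    munmap(p, st.st_size);
    return n;
}
```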


Memory mapped files have a problem (that may be less relevant for iOS) with error reporting.

Explicit APIs can have explicit error codes. Memory accesses don't have much opportunity to report errors, so have to resort to awful signals and such (that nobody handles properly).


Apple's NSData API even has a flag just for this, NSDataReadingMappedIfSafe. Basically, it uses memory mapping if the file is on the root filesystem, and otherwise just reads it all in conventionally. This is because if you end up memory mapping a file on a USB stick and the user yanks it, you'll segfault, and nobody likes a crashing app.

On the subject of memory mapping and magnetic disks, one amusing bit of history is that GNU's Hurd kernel originally implemented filesystems by memory mapping the entire hard drive and working from there. This worked fine at first, but started to cause major trouble when HDs grew beyond 4GB and Hurd was still running on 32-bit CPUs. I believe they ended up redoing it all without memory mapping so they could grow beyond that limit.


Why didn't ARM call it ARM64? It's hard to believe ARM64 didn't cross their minds before they decided AArch64 was the better name, so there could be another reason.


The "ARM123" naming scheme was used to refer to specific ARM cores prior to the "Cortex" naming scheme. While "ARM64" isn't ambiguous in and of itself, it's troublingly close to ARM60, the first ARM CPU with a 32-bit address space.


I still wonder why in the world Apple went with just 1GB of RAM in the 5s. Even the Nexus 4 that I bought contract-free for $200 has 2GB of RAM.


1) Because the iPhone doesn't need 2 GB RAM.

2) They took the money they saved and devoted it elsewhere (perhaps the Sapphire home button? :)

When it comes down to the bill of materials, every cent really does count when it scales across several million units sold.


2GB of DDR also takes twice as much power to keep alive as 1GB. If you don't need it, it's a waste of battery.


1) Because the iPhone doesn't need 2 GB RAM.

I disagree. Retina assets are huge. When programming for iOS (mobile in general) the biggest issue is memory usage and running out. Do anything with a lot of graphics and you start to bump into limits.


The iPhone doesn't need a 64-bit desktop-class processor, either.

I personally think Apple strategically held off the RAM upgrade 'til next year's iPhone, so that they could have a "killer feature" to lean on if they don't manage to figure out a more novel^Winnovative one in time.


> The iPhone doesn't need a 64-bit desktop-class processor, either.

It doesn't need it, but it certainly helps, as benchmarks and my own development of an app that does live video effects has shown.

so that they could have a "killer feature" to lean on if they don't manage to figure out a more novel

Apple has never marketed, nor revealed (to my knowledge) what amount of RAM is in an iOS device model. This is always determined later by a 3rd party.


The extra RAM also helps. Any good OS will not fail to find a use for extra RAM.


Are you seriously claiming that Apple kept the RAM below 1GB so that "Now has >1GB' could be their headline feature for next year?


Yes. What's wrong with the idea?


It seems way more likely that in the decision to go with a 64-bit CPU vs. 2 GB RAM, the pros outweighed the cons. I mean, who knows, but these kinds of decisions aren't made based on the simplistic criteria you're claiming they are. There are pros and cons (and benefits and compromises) to any engineering decision (and product design decision, and business decision, and marketing decision, etc. etc. etc.).

I could easily see a situation where a bunch of people were sitting around saying, "y'know, we don't absolutely positively NEED a 64-bit processor, but it doesn't cost much more and there might be some performance gains and if nothing else it might be a good bullet point for marketing. On the other hand, doubling the RAM will double what we pay for RAM, it won't help performance, and it will eat into the power budget, and it wouldn't make a good bullet point for marketing. The only downside is that denim_chicken will think we're being nefarious and will tell the world about it on HN."

I have no idea if that's what happened, but Occam's Razor suggests that it's more likely than the combination of pure evil and pure incompetence that you've been postulating here.


(none of which means that they aren't going to do exactly what you say they will, either, just not for the reasons you seem to be basing it on)


But why didn't Apple increase the RAM _and_ upgrade the processor?


Because increasing costs and increasing power usage wasn't outweighed by any marketing/sales gains to be had by doing so?


On the other hand, doubling the RAM will double what we pay for RAM, it won't help performance, and it will eat into the power budget, and it wouldn't make a good bullet point for marketing.

The power consumption angle is a complete and utter non-issue. It generally comes up as an apologetic canard to justify Apple's choice here, but in the holistic sense the power difference between 1GB and 2GB is negligible. Note that the ARMv8 processor, however, is a serious power pig.

However the limited memory should be an issue for people. I personally seriously considered a 5S to replace my GS3, largely for the fantastic camera, but the window of credible life for the 5S is simply too short -- 1GB just isn't enough, and seems especially deficient compared to such a fantastic processor.


The power consumption difference between 1GB and 2GB is far from negligible, both in auto-refresh (active) and self-refresh (sleep) mode.

It's a significant fraction of total power draw when asleep. You also have to account for PMU efficiency being pretty low at low current, so minor changes make an even larger difference. There are plenty of publicly available figures you can find to research this: find the "IDD6" self-refresh current in an LPDDR datasheet, scale to the amount of DDR you want, and compare with the battery rating adjusted for voltage.
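A back-of-envelope version of that calculation, with placeholder figures (none of these numbers are from a real datasheet or battery spec):

```c
/* All inputs are illustrative placeholders, not measured values.
 * idd6_ma:  self-refresh current for 1GB, from an LPDDR datasheet
 * dram_v:   DRAM supply voltage
 * batt_mah: battery capacity; batt_v: nominal battery voltage
 * Returns hours to drain the battery on self-refresh alone. */
static double standby_hours(double idd6_ma, double dram_v,
                            double batt_mah, double batt_v) {
    double batt_mwh = batt_mah * batt_v;  /* battery energy in mWh */
    double sr_mw    = idd6_ma * dram_v;   /* self-refresh power in mW */
    return batt_mwh / sr_mw;
}
```

Converting both sides to energy (mWh) is the "adjusted for voltage" step: mAh at battery voltage can't be compared directly with mA at DRAM voltage.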


It's a significant fraction of total power draw when asleep.

So a device with 2GB should have a significantly shorter standby time than one with 1GB, right, given that, by your claims, it's a significant fraction? Only that isn't true at all: straight comparisons between, for instance, the GS3 international (1GB) and the GS3 Snapdragon (2GB) show absolutely no reduction in two-week+ standby time. Of course all else isn't the same (it never is), and there are other power-profile changes between them, but it certainly isn't remotely a significant power draw.

Because in the profile of a smartphone it is absolutely negligible. Your phone is always in radio contact with the cell tower, which absolutely dwarfs all other power consumers. When you actively turn it on, the screen and the CPU absolutely dominate power consumption. There is no case where memory in a smartphone is remotely a significant power consumer.

The iPhone has 1GB because that maximizes Apple's profits. Every justification is like the hilariously silly claims when the iPhone was 3.5", when so many had to justify why 3.5" was the ultimate size and aspect ratio. And it'll immediately shift again once Apple adds 2GB and a 5" screen to the iPhone 6.


Of course Apple tries to maximize the profit. I fail to find any business that doesn't try to do the same. It's called for-profit after all.

The question is, why doesn't Apple try to earn even more money by keeping the same A6 processor as in iPhone 5? Why bother to upgrade to A7 at all? Also, including an extra GB of RAM would cost Apple much less than upgrading to a whole new processor, don't you think?


There is no such thing as an 'ARMv8' processor that can even be a 'power pig'. And you haven't given any reason why 1GB is not enough. You are just making this stuff up.


We are talking about the Apple iPhone 5s. In that context, anyone not a moron knows exactly what I am talking about. Shush, you have zero interesting things to say and are just a defensive blowhard. Maybe Apple will send you a t-shirt or something.


I'll take every down arrow, but it is nonsense that melange isn't sitting in the sub-zero realm as well. Their garbage post was factually wrong, added absolutely nothing to the discussion, and is the classic demonstration of a buyers' defense.


Please offer some kind of reference supporting your claim that the arm chip in the iPhone 5s is power inefficient.


You mean the ARMv8 processor that doesn't exist? How about the fact that the iPhone 5S includes a significantly higher capacity battery, but in any CPU usage scenario sees a sometimes significant longevity regression.


So iOS 7 plays no part in power consumption? Only the CPU? How about apps?

This is AnandTech's iPhone 5s battery review: http://www.anandtech.com/show/7335/the-iphone-5s-review/9

The 5s outperforms the iPhone 5 in four out of five tests. Regression was only seen in one scenario. The thing is, I don't remember coming across any smartphone review, iPhone or otherwise, that singles out the CPU as the source of a power consumption change. It's always a combination of different factors.

Now it's totally possible that the new CPU is a power pig, as you claimed. Unless you can provide proof that validates it, though, I have to agree with others that you're making it up.


The 5s outperforms iPhone 5 in four out of five tests. Regression was only seen in one scenario.

The same review that notes the increased power consumption of the CPU? That one?

The iPhone wifi, display, and surrounding platform is identical to the iPhone 5. The LTE/3G chipset is improved (not surprising as it's a considerable power consumer, which was why Apple held out on LTE for a while). On the wifi test, where all else is the same as before, the iPhone 5S saw a 10% longevity decline despite a 10% larger battery.

Quite humorous seeing so many so desperately defensive about this, when the original (and completely unsubstantiated) claim was that going to 2GB would see a marked increase in power consumption. We know from these very results that you provided that the device did see a 20% or more decrease in longevity, mAh for mAh, despite the fact that the CPU is generally a small consumer of power (for the whole device to consume 20% more power, the CPU had to have increased significantly more). Power "pig" is relative, and obviously it's a ridiculously low-power processor by any normal metric, but compared to the one it replaced... yeah.


>The same review that notes the increased power consumption of the CPU? That one?

What's your point? It's a beefier SOC, and possibly more power hungry. Nobody disputes that.

You've tried to frame rebukes of your posts as "desperately defensive". For me I take issue with what you said below:

> The power consumption angle is a complete and utter non-issue. It generally comes up as an apologetic canard to justify Apple's choice here, but in the >holistic sense the power difference between 1GB and 2GB is negligible. Note that the ARMv8 processor, however, is a serious power pig

You seem to be absolutely sure, as though you had numbers to back it up. You dismiss the OP as "completely unsubstantiated", but ironically you can't prove your points either. Should I take you seriously?


If there was a reference to support your claim, you would have provided one. As I said before - you're just making this stuff up.


You are a boring blowhard. You continually demand evidence of others while providing absolutely nothing of substance yourself, aside from a demonstration that you're a flag waving Apple "fanboy". You are what is wrong with technology discussions. Again, shush. Go somewhere else.


Well, for one, Apple has never marketed the amount of RAM in iOS devices. It's not even listed on the tech spec (neither is CPU frequency).

Apple's marketing tends to stick to what users will understand. Saying "Your apps will run twice as fast" people get, "Your phone will have twice as much memory", people don't.

Now, granted, saying 64-bit is somewhat idiosyncratic for Apple, but if you look closely they're saying that it's the first phone with a 64-bit processor. While people probably don't know why they want a 64-bit processor, they do know why they want a phone with cutting edge technology. As for why it's good, they keep referring to speed, which people can relate to. Still slightly unusual though.


Apple never mentions how much RAM is in iOS devices.


Well, if nothing else, Apple has never, ever mentioned the RAM in marketing material (in fact, last year, most people assumed the iPhone 5 had 512MB until it was available to look at).


Because the iPhone doesn't need 2 GB RAM.

Neither does the Nexus 4. It's a performance optimization that allows applications to stay resident without being freeze-dried, so to speak, helping multitasking. Could the 5s benefit from more memory? Absolutely, if you're jumping between various applications it makes a difference.

In any case, I wish my iPad had 2GB. With iOS 7 it seems to even discard images on richer webpages that are scrolled out of view. It is starting to seem very confined, and it's hard to view the 5s as a longer term option when it is already living in tight bounds (despite the fact that it has an amazing processor).


Power/battery. Extra RAM doesn't come for free power-wise; it costs energy to keep its state.


The new ARM processor is said to be more power hungry than an extra 1GB of RAM.


Of course it is... when active. However, you can do a lot more about idling a processor than you can about idling RAM; at a minimum, DRAM requires power for the refresh cycle.


From the article:

  The biggest change is an inline retain count, which eliminates the need to perform a costly hash table lookup for retain and release operations in the common case. Since those operations are so common in most Objective-C code, this is a big win.


Only using 33 bits for memory addressing is troublesome. 33 bits is 8GB of RAM, which is small potatoes for a desktop. Why couldn't they have left it at 38 or even 40 bits? Or is this limitation only part of the Objective-C runtime?


It comes down to the OS. Basically, when a new process is created, the OS sets up its address space and decides where it will allow new memory to be mapped. For whatever reason (I'm not entirely clear just yet), iOS 7 in 64-bit mode goes for an 8GB address space.

As far as I know, there's nothing preventing that from being increased on future hardware or even on the 5S with future OS updates. I believe the CPU itself supports a 48-bit virtual address space.


CPython has reference counts as a part of the object in memory. The claims of "large memory consumption" are nonsense, especially since small integer objects and strings are aggressively interned.

And increasing just one aligned integer is certainly cheaper than the bit masking the solution here entails (all of which is neatly hidden away in the 'increment of the correct portion' part).
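The layout being described is just an ordinary header field. A minimal sketch of the CPython-style idea (not CPython's actual struct definition):

```c
/* The reference count lives inline in every object header, so
 * retain/release is a plain aligned increment/decrement: no hash
 * table lookup and no bit masking required. The trade-off is an
 * extra word of header on every object. */
typedef struct {
    long refcnt;   /* inline count, like CPython's ob_refcnt */
    void *type;    /* a type pointer would follow in real life */
} obj_header;

static void incref(obj_header *o) { o->refcnt++; }
static int  decref(obj_header *o) { return --o->refcnt == 0; } /* 1 => deallocate */
```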


Remember that this decision was made back when the entire system might have 32MB of RAM. Does CPython even fit in that, as a single process, let alone a full multitasking UNIX?

Additional RAM consumption has costs of its own, in terms of cache usage. Adding an extra 8 bytes for every object in the system is not insignificant. Masking and shifting is extremely cheap.

If you've run the benchmarks and can show your approach is better, by all means, please share.


Anybody know when 64-bit arm processors might be released as a blade or mini-server to work with?


Not sure what the pricing will be like, but AMD is aiming to release their Cortex-A57-based CPUs in 2014 [1].

[1]http://techreport.com/news/25338/new-amd-embedded-roadmap-sh...


First, a note on the name: the official name from ARM is "AArch64", but this is a silly name that pains me to type. Apple calls it ARM64, and that's what I will call it too.

What ARM calls ARM related periphery is canonical, whether you think it's silly or not.

However, the overarching entity is called ARMv8, with the 64-bit state called AArch64 (which can be contrasted with the AArch32 state, also part of ARMv8), and the instruction set is actually called A64.


Not all official names survive a confrontation with reality, where 'easier to remember' and 'easier to pronounce' have value, too.

Do you use the terms IA-32e and EM64T, too (both are/were Intel's official names for what people now typically call x64 or x86-64)?


EM64T & x86_64 only exist because Intel has too much pride to call it what it is: AMD64.


The name "x86_64" existed before AMD fully released their CPUs, and was latched onto by some of the open-source communities, so that specific one isn't exactly Intel's. But you're right about EM64T. :)


I do not believe this is true. Do you have a reference that used the name x86_64 before Intel shipped its first AMD64-compatible CPU?


http://en.wikipedia.org/wiki/X86-64 says "Prior to launch, "x86-64" and "x86_64" were used to refer to the instruction set. Upon release, AMD named it AMD64." and refers to https://wiki.debian.org/DebianAMD64Faq which says ""AMD64" is the name chosen by AMD for their 64-bit extension to the Intel x86 instruction set. Before release, it was called "x86-64" or "x86_64", and some distributions still use these names."

I, myself, had an AMD Opteron machine on which I ran an x86-64 build, with that name for the architecture, of SUSE Linux before Intel released their EM64T.


Here's a source to back up maggit's Wikipedia citations:

http://www.amd.com/us/press-releases/Pages/Press_Release_715...

Dated 8/10/2000.


Awesome, thanks. I stand corrected.


I've covered this previously here: https://news.ycombinator.com/item?id=5498682

In short, IA-32e was Intel's internal implementation name, which was changed to EM64T, which was changed again and is now called INTEL64.

x86-64 / x86_64 / x64 are common names for the ISA created by various OSes; INTEL64 and AMD64 are implementations. The distinction is important because the implementations are not identical.


"What ARM calls ARM related periphery is canonical, whether you think it's silly or not."

And why, exactly, should I care?


Because the canon contains the words of consecration, by which we dedicate ourselves to divine purpose.


Maybe you don't; the comment was really aimed more generally, given that we're discussing it here. But when you start renaming things, other things stop making sense. For instance:

"It's important to note that ARM64 includes a full 32-bit compatibility mode that allows running normal 32-bit ARM code without any changes and without emulation."

If ARM64 == AArch64 (which is what you decided), then no, this doesn't make sense. ARMv8 includes both AArch64 and optionally AArch32, the former running A64, the latter running traditional ARM. There are ARMv8 designs that actually can't run any traditional 32-bit code.


As I pointed out, that's Apple renaming it, not me. I'm just following their lead.

As for 32-bit compatibility, that has nothing to do with renaming, it's just me being imprecise. It would be equally imprecise if I had used "AArch64". Thank you for pointing it out, though, I've fixed it to say that it is the A7 that includes 32-bit compatibility.


I personally use Debian port names. Debian is pretty sensible about it.

Debian uses "amd64" and "arm64".


> What ARM calls ARM related periphery is canonical, whether you think it's silly or not.

Ah, yes, that must be why everyone in the world calls the 64bit x86 instruction set either AMD64 or IA-32E.


In this case the author directly conflates an architecture, a state, and an instruction set, which is exactly why this sort of 'well, I will call it...' nonsense is ill-advised. But hey, good try, right?


Thanks for your correction. Oh wait, you didn't correct anything!

Do you really think that if I had called it AArch64 throughout instead of ARM64, I wouldn't have written... whatever it is you're offended by in terms of conflating various parts? You're nuts!


Is there also a model with new registers but 32-bit pointers, like x32?



