Thank you, 0x12 and all of you, for your interest in my quirky creation. Naturally I follow modern FPGA technology avidly, but I guess it's obvious that my project predates that sort of thing. 2901 style Bit-Slice components were the wonder of wonders not so long ago! But modern programmable logic brings us to another level altogether.
Another revolution of course is the community of the Internet. My KimKlone project was conceived and implemented as a one-man effort, and for years it remained a de facto secret since I had no way to publicize it and almost no-one to discuss it with. These are magical times we live in, and I'm grateful to share your ideas and comments.
This is nice stuff. I built a TV-Typewriter using Don Lancaster's Cheap Video Cookbook. He pioneered this technique of 'co-processing' with the 6502. However, if you were going to do this today, you would be better off just building a 6502 into an FPGA [1] or buying an existing design [2] and modifying it. Jan Gray did a nice RISC CPU [3] that clearly got some inspiration from the 6502, and it was very small in terms of gates (although that isn't such a big deal these days).
It is well worth your time to get an FPGA evaluation kit (my all-time favorite is the Altera DE-2) and learn for yourself how straightforward it is to build your own CPU with instructions that do the things you need, rather than what the manufacturer thought you might need.
World record for a stock 6502 without cooling is ~25 MHz, done one semi-drunk Friday evening in a lab at Commodore by Leonard Tramiel and associates.
"Let's see how fast this thing can go before the smoke gets out."
[Told to me by Leonard. I miss Friday beer-bashes surrounded by lab equipment, all kinds of stuff can happen. Richard Frick has the Atari distance record for a reverse-biased 555 timer; it blew its top about 15 feet.]
Extremely clever use of the illegal opcodes in the original 6502. Check out the way he fakes what the processor sees on the data bus, presenting a different opcode from the one that was actually read from memory.
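To make the substitution idea concrete, here is a minimal Python sketch of a lookup table sitting between memory and the CPU's data bus. It is only an illustration: the opcode value 0x03, the names `build_prom` and `byte_seen_by_cpu`, and the choice to substitute only on opcode fetches (the 6502 flags those cycles on its SYNC pin) are assumptions made for this example, not the KimKlone's actual mapping.

```python
# Hypothetical sketch: a lookup "PROM" between memory and the CPU's data bus.
# The opcode values are illustrative assumptions, not the KimKlone's real map.

NOP = 0xEA                # official 6502/65C02 NOP
HYPOTHETICAL_NEW_OP = 0x03  # assumed "unused" opcode chosen for illustration


def build_prom(remap):
    """Return a 256-entry table; bytes not listed in remap pass through unchanged."""
    prom = list(range(256))
    for fetched, presented in remap.items():
        prom[fetched] = presented
    return prom


def byte_seen_by_cpu(prom, memory_byte, is_opcode_fetch):
    """Substitute only during opcode fetches; operands and data are untouched."""
    return prom[memory_byte] if is_opcode_fetch else memory_byte


# The CPU is shown a harmless NOP, while external logic (not modelled here)
# would see the original byte and carry out the "new" instruction.
PROM = build_prom({HYPOTHETICAL_NEW_OP: NOP})
```

In this model the external hardware decodes the byte that memory really delivered, and the CPU only ever executes opcodes it already understands.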
If you haven't watched this video about reverse engineering the 6502 (and how the decoder works, etc.), find the time. It's well worth the ~50 minutes. Mind blowing stuff: http://www.youtube.com/watch?v=reIYvmuWHhk&sns=tw
Interesting work, but I couldn't help but notice one part: "The KimKlone represents an architectural extension of the 65C02. The most striking improvement is efficient linear access to a 16 Mbyte Address Space."
Incidentally, this is what the 65816 was made for.
Nope, the 65816 address space is flat. If you read the docs and look at the opcodes, I can see how it gives the impression of being segmented (with all the "bank" terminology), but it's a full 24-bit address space with no restrictions.
Which is probably the case also for this hack. It's probably possible to devise logic that would remove this limitation, but I assume that would be mentioned in the article.
It's conceptually simple, but there are two problems:
1) You have to detect near jumps that land back inside the same bank (at least a jump from an instruction ending at 0xffff to one starting at 0x0000; on the other hand, you can plausibly ignore this, as it is a very unlikely case).
2) More importantly, as I understand it, this coprocessor contraption does not interact in any way with the lower 16 bits of the address bus. You would need to actually snoop on the address bus and detect the wraparound, and since this thing is built from MSI logic, detecting a 1 -> 0 transition on all bits of the address bus, while conceptually trivial, would require a significant amount of hardware. You could instead detect the 1 -> 0 transition on only the 15th bit of the address, but then you really need to detect jumps in microcode and disable this logic when a jump occurs.
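The trade-off between the two detectors can be sketched in a few lines of Python. This is a toy model that compares consecutive 16-bit bus addresses; the function names are made up for illustration, and a real bus also carries data accesses between instruction fetches, which this ignores.

```python
def full_wrap(prev_addr, cur_addr):
    """Expensive detector: every address bit went from 1 to 0 (0xFFFF -> 0x0000)."""
    return prev_addr == 0xFFFF and cur_addr == 0x0000


def bit15_fell(prev_addr, cur_addr):
    """Cheap detector: only the top address bit is watched for a 1 -> 0 transition."""
    return bool(prev_addr & 0x8000) and not (cur_addr & 0x8000)


def next_bank(bank, prev_addr, cur_addr, detector):
    """Bump an 8-bit bank register (hypothetical) whenever the detector fires."""
    return (bank + 1) & 0xFF if detector(prev_addr, cur_addr) else bank


# A genuine wraparound trips both detectors.
both_fire = full_wrap(0xFFFF, 0x0000) and bit15_fell(0xFFFF, 0x0000)

# A jump from 0x9000 to 0x1000 trips only the cheap one: a false positive,
# which is why the jump would have to be detected and the logic disabled.
false_positive = bit15_fell(0x9000, 0x1000) and not full_wrap(0x9000, 0x1000)
```

Even the full detector is fooled by the case in point 1: a jump whose last byte sits at 0xffff and whose target is 0x0000 looks identical to sequential execution running off the end of the bank.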
Guy expands the 6502 to a 16M address space by intercepting the data bus, remapping unused opcodes, and making clever use of the spurious signals the CPU generates when executing other undefined opcodes. He adds a few registers to make the whole thing transparent from an assembler programmer's point of view. In other words, there is no difference to the programmer between native and newly minted instructions.
On top of that he boosts the speed of his Forth interpreter by concentrating on a frequently used construct called 'NEXT', in a way that should make anybody who has tried to optimize the inner loop of some VM or language proud. After all, what better way to optimize in such a situation than to be able to mold the instruction set to your desire?
He then uses this home-brew Frankenstein contraption as his benchtop computer for multiple years to do real work (instead of just shooting some pretty pictures and calling it a day).
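For anyone who hasn't met 'NEXT': in a Forth-style threaded interpreter it is the tiny dispatch step that runs between every pair of words, which is why shaving cycles off it pays so well. The Python below is a generic sketch of that step, not the KimKlone's implementation; `run`, `lit`, `add` and `dup` are names invented for the example. In a real Forth each primitive ends with its own NEXT, whereas here the loop body plays that role.

```python
def run(thread, stack):
    """Execute a list of primitives; the loop body does the job NEXT does."""
    ip = 0                    # instruction pointer into the thread
    while ip < len(thread):
        word = thread[ip]     # NEXT, step 1: fetch the word that ip points at
        ip += 1               # NEXT, step 2: advance the instruction pointer
        word(stack)           # NEXT, step 3: jump to that word's code
    return stack


def lit(n):
    """Build a primitive that pushes the constant n."""
    def push(stack):
        stack.append(n)
    return push


def add(stack):
    """Pop two values and push their sum."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


def dup(stack):
    """Duplicate the top of the stack."""
    stack.append(stack[-1])


# Equivalent to the Forth phrase: 2 3 + DUP +
result = run([lit(2), lit(3), add, dup, add], [])
```

Every word executed costs one trip through those three steps, so on a 6502 the handful of instructions implementing them dominates the interpreter's overhead.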
The most significant part (I think, probably wrong) is on page 5[0] where he details the invalid instructions that do more than a NOP and why they're useful.
Sorry about the TLDR situation; it's on my list to revise the article by prepending an abstract. BTW suggestions and questions about the article are welcome.
@mjhall - you're quite right; the strange, phantom 65c02 operations dramatically expanded what I could do with this project. It's extremely cooperative of the CPU to generate a memory address while leaving all registers unchanged! Although it's true I could've used the PROM to map NOPs onto CMP or BIT instructions and gotten my addresses that way, that approach preserves the registers but still stomps the Flags. In contrast, the "LDD" operations are ideal for the job -- an opportunity handed to me on a silver platter!
-- Jeff