Does anyone else feel sad that you'll never get to design silicon at this kind of low level? I hope that someday nanotech reaches a point where I can design and print chips on a 1 µm process or something.
(I guess I could always buy hundreds of thousands of transistors and start soldering...)
While it's not "easy", I was surprised to learn recently that DIY microprocessors are far more accessible than I thought.
Here is BreakingTaps fabricating homemade chips, though he does have a fair bit of equipment:
https://youtu.be/RuVS7MsQk4Y?si=c8PH-FMfcfWLJLiG
There are a few different techniques that can be used. Most of the lithography techniques use hydrofluoric acid, but if you are okay with slower build times you can etch the circuits directly with ion beams or electron beams.
But the short story is you need an overhead projector, a microscope, and a few gnarly chemicals. Otherwise fairly straightforward.
At this level? You can use your favourite HDL and an FPGA, and then look into things like TinyTapeout. It's not as inaccessible as it seems!
HDL is fun and all, but actually choosing the physical layout and sizing of transistors seems like a much more fun puzzle.
It does take a time investment, but it’s absolutely possible today. https://tinytapeout.com/specs/analog/
If you’re willing to do something that’s not manufacturable, but way easier to understand, then you might try this: https://tinytapeout.com/siliwiz/
If anyone is interested in adder taxonomy and why one might select one architecture over another in VLSI, I can highly recommend this slide deck:
https://pages.hmc.edu/harris/cmosvlsi/4e/lect/lect17.pdf
See slides 36 and 37 in particular.
I especially like slide 36, the "time cube" diagram of adder architectures.
Author here if anyone has questions about the obscure details of Kogge-Stone adders :-)
Hmm, I thought you explained the FDIV bug was (in part) because the Boron† used carry-save adders to generate quotient digits? I don't see how you'd combine carry-save adders and Kogge–Stone carry lookahead. This post does mention the carry-save adder and link the other post, but doesn't explain the relationship between the two adders (are they the same in some galaxy-brain way I can't imagine? does one of them feed the other?)
______
† https://en.wikipedia.org/wiki/Systematic_element_name
The short answer is that a large carry-save adder feeds into this 8-bit carry-lookahead adder to generate the table index for division. In more detail, the division algorithm uses a ≈64-bit carry-save adder to hold the partial remainder during a division. The problem is that a carry-save adder holds the result in two pieces, the sum bits and the carry bits, which is what makes it fast. However, the division algorithm needs to use the top 7 bits as an index into the infamous division table, but this won't work if the value is in two pieces. The solution is to add the two pieces together using the carry-lookahead adder and then you have the table index.
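To make the two-piece representation concrete, here is a minimal Python sketch of a carry-save accumulator whose top bits are resolved by a small carry-propagate add. It is an illustration only, not the Pentium's circuit: the 64-bit width, the function names, and the way the 8-bit sum is trimmed to a 7-bit index are all assumptions.

```python
WIDTH = 64                  # assumed width; the comment above only says "≈64-bit"
MASK = (1 << WIDTH) - 1

def carry_save_add(sum_bits, carry_bits, addend):
    """One 3:2 compression step: three inputs in, a (sum, carry) pair out.

    Every bit position is computed independently, so no carry ripples
    across the word. That independence is what makes the step fast.
    """
    new_sum = sum_bits ^ carry_bits ^ addend
    majority = (sum_bits & carry_bits) | (sum_bits & addend) | (carry_bits & addend)
    new_carry = majority << 1
    return new_sum & MASK, new_carry & MASK

def table_index(sum_bits, carry_bits, adder_bits=8, index_bits=7):
    """Add only the top bits of the two pieces to get a single value.

    Assumption: the top `adder_bits` of each piece are added and the top
    `index_bits` of that sum form the index. Carries from the lower bits
    are ignored, so the result is an estimate of the true top bits.
    """
    shift = WIDTH - adder_bits
    small_sum = ((sum_bits >> shift) + (carry_bits >> shift)) & ((1 << adder_bits) - 1)
    return small_sum >> (adder_bits - index_bits)

# Accumulate a few values without ever propagating a full-width carry.
s, c = 0, 0
for value in (0x1234_5678_9ABC_DEF0, 0x0FED_CBA9_8765_4321, 0x7000_0000_0000_0001):
    s, c = carry_save_add(s, c, value)

resolved = (s + c) & MASK   # the full-width add the hardware avoids doing every step
```

The real value only exists once `s` and `c` are added, which is why a lookup cannot use either piece on its own.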
The obvious question is why didn't they just use a carry-lookahead adder in the first place? The answer is that a carry-lookahead adder works better for smaller words (e.g. 8 bits), since its size is O(N^2) or O(N log N), depending on how you implement it. So you're better off with a large carry-save adder and a small carry-lookahead adder.
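For a feel of where the O(N log N) comes from, here is a rough Python model of a Kogge–Stone adder using generate/propagate bits packed into integers. It is a software sketch of the general technique, not a description of the Pentium's gates; the function names and the cell-counting helper are made up for illustration.

```python
def kogge_stone_add(a, b, width=8):
    """Add two `width`-bit values with a Kogge-Stone parallel-prefix network.

    Returns (sum, carry_out). Bit i of `g` and `p` holds the generate and
    propagate signal for a group of bits ending at position i.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    g = a & b               # bit generates a carry on its own
    p = a ^ b               # bit passes an incoming carry along
    half_sum = p            # saved for the final XOR with the carries

    dist = 1
    while dist < width:     # log2(width) levels for a power-of-two width
        # Each position merges its group with the group `dist` bits below.
        # Both updates use the old p, as the hardware does in parallel.
        g, p = (g | (p & (g << dist))) & mask, (p & (p << dist)) & mask
        dist *= 2

    carries = (g << 1) & mask           # carry into bit i is the group generate of bit i-1
    carry_out = (g >> (width - 1)) & 1
    return half_sum ^ carries, carry_out

def prefix_cell_count(width):
    """Count the merge cells: level with span `dist` needs (width - dist) of them."""
    count, dist = 0, 1
    while dist < width:
        count += width - dist
        dist *= 2
    return count

print(kogge_stone_add(0xFF, 0x01))                    # (0, 1)
print(prefix_cell_count(8), prefix_cell_count(64))    # 17 321
```

Going from 8 to 64 bits takes the merge-cell count from 17 to 321 in this model, before counting the long wires between levels, which gives some intuition for keeping the lookahead adder small.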
I see. I guess I had the impression that the carry and sum bits had been used directly to index the table, which is of course a thing you can do; it's just that 7 bits in the carry-save adder is the equivalent of ≈3 bits.
I just remembered your article on the Pentium's standard cells, where you noted that some gates used BiCMOS to reduce propagation delay. Were any of the gates in the adder structures BiCMOS?
No, the adder was all CMOS. But there are some BiCMOS drivers visible at the bottom of one of the photos. The NPN transistors are big squares, unlike the CMOS transistors.
> In the TMS 1000, the program counter steps through the program pseudo-randomly rather than sequentially. The program is shuffled appropriately in the ROM to counteract the sequence, so the program executes as expected and a few transistors are saved.
Two wrongs make a right.
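For anyone curious how the shuffled-ROM trick in that quote works out, here is a small Python sketch. The 6-bit LFSR, its feedback taps, and the start address are hypothetical choices for illustration, not the actual TMS 1000 feedback logic.

```python
def next_pc(pc):
    """Hypothetical 6-bit LFSR step: shift left, feed back bit5 XOR bit4."""
    feedback = ((pc >> 5) ^ (pc >> 4)) & 1
    return ((pc << 1) | feedback) & 0x3F

def pc_sequence(start=1):
    """Addresses visited before the counter returns to `start`."""
    seq, pc = [], start
    while True:
        seq.append(pc)
        pc = next_pc(pc)
        if pc == start:
            return seq

def place_program(instructions, start=1):
    """'Assembler' side: store each instruction where the counter will go next."""
    rom = [None] * 64
    pc = start
    for instruction in instructions:
        rom[pc] = instruction
        pc = next_pc(pc)
    return rom

def run(rom, count, start=1):
    """'Hardware' side: fetch `count` instructions in LFSR order."""
    fetched, pc = [], start
    for _ in range(count):
        fetched.append(rom[pc])
        pc = next_pc(pc)
    return fetched

program = ["LOAD", "ADD", "STORE", "HALT"]   # placeholder mnemonics
rom = place_program(program)
print(run(rom, len(program)) == program)     # True
print(len(pc_sequence()))                    # 63
```

The shuffle happens once at assembly time and the counter undoes it at run time, so the program still executes in order. With these taps the counter cycles through 63 of the 64 addresses and never reaches address 0.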