rhfish's profile

CTO

Russell's achievements include co-designing the Sh-Boom Processor, included in IEEE's "25 Microchips That Shook the World". He has a BSEE from Georgia Tech and an MSEE from Arizona State.


rhfish's contributions
  • 02.14.2012
  • Future of computing - Part 3: The ILP Wall and pipelines
  • Embedding eDRAM in a logic process, such as a CPU's, is about 3X denser than static RAM. That is why IBM and INTEL use it for their CPU caches (http://en.wikipedia.org/wiki/EDRAM). However, eDRAM costs about 10X as much as the same memory capacity built using a memory process, and performance tends to be about half that of commodity DRAM.
  • 02.14.2012
  • Future of computing - Part 3: The ILP Wall and pipelines
  • Re: eDRAM - In our original 1989 work, Moore and I described making the CPU from DRAM transistors. However, the flurry of merged CPU/memory efforts in the mid-to-late 90s used embedded DRAM instead. The primary advantage of adding eDRAM to an existing logic design is ease of implementation: designers could use all the existing tools, logic libraries, and even existing architectures such as MIPS. The primary disadvantage of eDRAM is stupendous cost. A logic-process transistor costs about 500x as much as a transistor made using a DRAM process, and adding the steps to fabricate the eDRAM capacitors increases cost even more. (Confirm the DRAM cost for 1 billion transistors on www.dramexchange.com.) Furthermore, eDRAMs are only about 1/4 the density and 1/2 the performance of similar true DRAMs. To design TOMI Aurora and TOMI Borealis, however, we had to create our own logic library using DRAM process parameters. Autorouters were not happy with only 3 layers of metal in a DRAM process, so our CPUs were laid out by hand. If 1980 microprocessors had cost $50k instead of $100, there would be no PC business, no cellphones, and no iPads. We thought that $2, fast, low-power, multicore chips optimized for Big Data might be interesting. Apparently so do a few others. Russell Fish
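The ratios quoted in the two eDRAM comments above can be combined into a rough cost figure. This is only a sketch that multiplies the comment's approximate numbers (10X cost per bit, 1/2 the performance); it is not vendor data.

```python
# Back-of-the-envelope arithmetic using the rough ratios quoted in the
# comments above (illustrative figures, not measured vendor data).

QUOTED = {
    "cost_per_bit_vs_dram": 10.0,   # "eDRAM costs about 10X as much"
    "density_vs_dram": 0.25,        # "only about 1/4 the density"
    "performance_vs_dram": 0.5,     # "1/2 the performance"
}

def cost_per_bandwidth_vs_dram(q=QUOTED):
    """Relative cost of a unit of delivered memory bandwidth.

    Paying ~10x per bit for ~half the performance means each unit of
    bandwidth costs roughly 20x the commodity-DRAM figure.
    """
    return q["cost_per_bit_vs_dram"] / q["performance_vs_dram"]

print(cost_per_bandwidth_vs_dram())  # 20.0
```

The same ratios explain the comment's conclusion: on a per-bit or per-bandwidth basis, eDRAM only makes sense when ease of implementation outweighs raw cost.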
  • 11.17.2011
  • The future of computers - Part 1: Multicore and the Memory Wall
  • Massive multicore chips such as GreenArrays, Tilera, Nvidia TESLA, and INTEL's Knights Corner are good choices for applying intense computing power to small datasets. Each of the above options has particular strengths: GreenArrays' is low power, Tilera's is their crosspoint switch, Nvidia's is their experience in graphics, and INTEL's is compatibility with a familiar legacy architecture. Example applications include graphics rendering, encryption/decryption, video encoding/decoding, and some specialized scientific computing such as fluid flow or thermal modelling. As a class, these applications could be called "small data", since much of their operation is performed on datasets that will largely fit within their caches. As such, they are less limited by the Memory Wall than "big data" cloud apps.
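The "small data fits in cache" argument can be made concrete with the standard average-memory-access-time calculation. The hit and miss latencies below are assumed example values, not figures from the comment.

```python
# Sketch: why cache-resident ("small data") workloads feel the Memory
# Wall less than "big data" workloads. Latency figures are illustrative
# assumptions: a few-cycle cache hit vs. a DRAM access costing
# hundreds of cycles.

def amat(hit_rate, hit_cycles=4, miss_cycles=200):
    """Average cycles per memory access for a given cache hit rate."""
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles

# A "small data" kernel whose working set fits in cache:
small_data = amat(hit_rate=0.99)   # ~6 cycles per access
# A "big data" scan that misses the cache half the time:
big_data = amat(hit_rate=0.50)     # ~102 cycles per access
```

Even a modest miss rate lets DRAM latency dominate, which is the Memory Wall the earlier comments describe.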
  • 11.17.2011
  • The future of computers - Part 1: Multicore and the Memory Wall
  • From my perspective, the primary advantage of 3D stacks (like INTEL/Micron's HMC) is space saving. The speed improvement from HMC mostly comes from the interface chip driving a low-voltage differential bus. It looks somewhat like the old Rambus, and that bus, rather than the physical proximity of the stack, is probably responsible for the speedup. The differential interface chip also reduces the module power. The primary disadvantage of 3D stacks is cost: if a single chip in the stack is bad, the entire stack is trash. The TSV (through-silicon via) technology is pretty tough to make, and in the case of Micron's chips I believe it adds more than 20% to the die size. Heat is the biggest problem when you try to add a CPU to an already thermally challenged DRAM stack. Even small CPUs like ARM generate a lot of heat when running at speed, and that heat increases DRAM leakage. DARPA does seem to favor the stack approach, however.
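The two cost effects described above can be sketched numerically: if any die in the stack is bad the whole stack is scrapped, so stack yield is roughly the per-die yield raised to the stack height, on top of the quoted ~20% TSV area overhead. The per-die yield and stack height below are assumed example values.

```python
# Sketch of 3D-stack cost effects from the comment above.
# Assumed example numbers: 95% per-die yield, 8-die stack.
# The ~20% TSV area overhead is the figure quoted in the comment.

def stack_yield(per_die_yield, dies_in_stack):
    """Probability every die in the stack is good (independent yields)."""
    return per_die_yield ** dies_in_stack

def silicon_area_per_good_stack(base_die_area_mm2, dies_in_stack=8,
                                tsv_area_overhead=0.20, per_die_yield=0.95):
    """Total silicon area consumed per good stack, including scrap."""
    area_with_tsv = base_die_area_mm2 * (1.0 + tsv_area_overhead)
    total_area = area_with_tsv * dies_in_stack
    return total_area / stack_yield(per_die_yield, dies_in_stack)

# With 95% per-die yield, an 8-die stack yields only ~66%:
print(round(stack_yield(0.95, 8), 3))  # 0.663
```

Multiplying a sub-unity yield across eight dies is why a single bad chip is so expensive: the scrap cost of the whole stack is charged against every good one.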