If you're building complex hardware devices like home-designed CPUs, this may come in handy! Prototyping home-designed CPUs isn't an easy task at all. One might say designing one is difficult, but let's assume you've already done that. State machines and function generation can be a nightmare to debug. Once you've found the bugs, actually fixing them may prove to be even more of a nightmare. Depending on the school of thought, some people use FPGAs, PALs or CPLDs. They have a wonderful collection of tools already.
Some of us prefer to use ROMs. A number of hobbyist hardware designers instinctively run for the ROM programmer whenever anything like an ALU needs to be built. This article and the Python class it describes is for them. It's the result of making too many of these with individual programs, and needing to generalise the code to save my own time.
The ROMtools Python package helps you generate ROMs for painfully complex multi-input, multi-output functions. It can help you fit a function in a single ROM, or, if you prefer, generate bit slice ROMs.
The Theory
Any function that doesn't need to keep state (i.e., no flip-flops) that can be coded using a PAL can also be coded using a ROM, PROM, EPROM, EEPROM, or Flash RAM device (or even a RAM device, if you don't mind the transient nature, or if you're using it in software). The function's inputs are the ROM's address pins. The function's outputs are the data pins. Since a ROM's contents are arbitrary, there are no limitations to what function or data can be coded in there. An entire ALU can be constructed and programmed into a ROM, and various hobbyists have done exactly that. My own CFT CPU's ALU is calculated and made this way.
The only constraint is the size of the ROM, and the fact that the vast majority of parallel ROMs are 8 bits wide.
ROMtools to the Rescue
ROMtools provides you with a simple way of defining your table's inputs and outputs and filling the table with data. When you write the tables to disk, ROMtools takes care of slicing the data into multiple ROM images, if necessary. In this way you can get outputs of more than 8 bits at a time. When using multiple ROMs, they don't necessary have to be identical — different sizes may be stipulated, and ROMtools will happily output images for these.
In its current version, ROMtools makes binary ROM images and Verilog ‘binary’ (i.e. human-readable ASCII binary) files. It's quite easy to output in whatever other format you might prefer.
ROMtools is written in Python. It should run out of the box on every 21st Century operating system (so, pretty much everything except Windows). For Windows, you'll need to download a Python interpreter.
Examples
Like most things Python, it's easiest to explain the simplicity that is ROMtools by example. Let's make a 4-bit full adder with carry in and carry out.
Four-Bit Full Adder
from romtools import FunctionTable
tt = FunctionTable('c_in b:4 a:4', 'c_out y:4')
for a in xrange(16):
for b in xrange(16):
for c in xrange(2):
total = a + b + c
inputs = dict(a=a, b=b, c_in=c)
outputs = dict(y=total, c_out=int(total > 15))
tt.put(inputs, outputs)
tt.writeBin('full-adder')
Declaring the FunctionTable
object describes its inputs and ouputs. Each input or output is given a name and a width in bits. Inputs and outputs are given in most significant to least significant order, as you would in most hardware description languages. In this case, we have three inputs: a
(four bits), b
(also four bits) and c_in
(carry in; one bit). Note how single-bit signals don't need to be specified explicitly using :n
. Our inputs total 9 bits; carry in is the most significant bit.
There are two outputs: c_out
(carry out), which is the most significant bit, and y
, which occupies the least significant four bits.
The three nested loops iterate through all possible values of the inputs (c
is carry-in). How you nest the loops doesn't matter at all for this example. In bizarrely complex cases, things may be different. The sum is calculated as total = a + b + c
. Finally, two dicts
are constructed with the inputs and outputs, and tt.put()
is called to store the corresponding value in the ROM image.
At the end of it all, we save the ROM as a binary image called full-adder-00.bin
.
Eight-Bit Four-Operation ALU
Here's a much more complex example, an entire 8-bit ALU. It can do additions and the bitwise operations AND
, OR
and XOR
. It also outputs a zero flag which is 1
if the result is zero and a negative flag which is 1
if the result is negative in two's complement (i.e. bit 7 is set1). Additions have a carry-in and carry-out bit. The output is eleven bits wide, so it won't actually fit on a single ROM. Instead, we use two ROMs, which output as follows:
- ROM 00 includes the least significant 4 bits of the result, plus carry out, for a total of five used bits.
- ROM 01 includes the most significant 4 bits of the result, plus the zero and negative flags, for a total of six used bits.
This is obviously a bit contrived, but actual hardware may have special needs like these due to wiring or routing contraints, et cetera. Here's the code:
from romtools import FunctionTable
tt = FunctionTable('op:2 c_in b:8 a:8', 'c_out z n y:8', singleROM=False)
tt.rom('op c_in b a', 'y/3-0 c_out') # ROM 0: low nybble of Y & carry out
tt.rom('op c_in b a', 'y/7-4 z n') # ROM 1: high nybble of Y & flags
ops = {
# OP Function to calculate Y Function to calculate carry out
0: (lambda a, b, c: a + b + c, lambda a, b, c: int((a + b + c) >= 256)),
1: (lambda a, b, c: a & b, lambda a, b, c: 0),
2: (lambda a, b, c: a | b, lambda a, b, c: 0),
3: (lambda a, b, c: a ^ b, lambda a, b, c: 0),
}
for op in xrange(4):
fx_y, fx_c = ops[op]
for a in xrange(256):
for b in xrange(256):
for c in xrange(2):
inputs = dict(op=op, a=a, b=b, c_in=c)
result = fx_y(a, b, c)
zero = int(result == 0)
neg = int((result & 0x80) != 0)
outputs = dict(y=result, c_out=fx_c(a, b, c), z=zero, n=neg)
tt.put(inputs, outputs)
tt.report()
tt.writeBin('8-bit-alu')
The two major differences in this example are how multiple ROMs are defined using the FunctionTable.rom()
method. The singleROM=False
option is necessary to allow this. The other difference is slicing: in specifying what input and output signals go to each ROM, we have the option of using the signal/y-x
notation, so that bits x
to y (inclusive) of signal
will be added to the ROM. Bit numbers start at zero, and the highest bit is specified first. As always, inputs and outputs are specified in order of decreasing significance (hence the choice of range specification). If you've used a hardware description language, you should be right at home here.
The four operations are coded 0–3 and are ADD
, AND
, OR
and XOR
respectively. Two lambda
functions for each operation take care of calculating the result.
As you can see, calling the put()
method is no different.
After running the code, you'll get something like this (courtesy of the FunctionTable.report()
method):
Addresses set: 524288 / 524288
Time elapsed: 00:01:05 (8010 calcs per second)
ROMS:
ROM 00: ([ROM: 19 x 5 bits (27040/29040), checksum 4160], 'op c_in b a', 'y/3-0 c_out')
ROM 01: ([ROM: 19 x 6 bits (27040/29040), checksum 9f76], 'op c_in b a', 'y/7-4 z n')
The report()
method shows you the speed of the table generation, and how the ROMs were generated out of the function table. An EPROM/Flash RAM part number is included — a 4 MBit part, in this case. For each ROM, you also get a list of its input and output bits. Binary files for each ROM are saved separately.
Download
ROMtools is available in the usual collection of formats.