The Heart of a CPU
Here's a very similar Verilog statement:
Superficially the same, but very different. This states a requirement for an output that always reacts to a change in selector. The value of out is continuously driven by the selected input, and if selector changes at any time, out changes immediately.
In other words, suppose I had 10 statements like that driving out0, out1, out2, and so on. In C, there would be no way for out9 to change at the same time as the others. The CPU is really only looking at one of them at a time (unless you posit some exotic multiprocessor system that chops your code up inside a function). However, in Verilog the requirement to do this for each output will cause all of them to be serviced at the same time. In an FPGA, that means there will be 10 mux gates, one to drive each output. There is no "cycling through" the 10 items. Each one does what it is supposed to do all the time, independent of the others.
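To make the "all at the same time" idea concrete, here is a minimal sketch with three continuous assignments instead of ten. The module name, the signal widths, and the expressions are assumptions chosen for illustration; this is not code from vtach.

```verilog
// Hypothetical demo: three independent continuous assignments.
// Names, widths, and expressions are illustrative assumptions.
module concurrent_mux_demo;
  reg        selector;
  reg  [3:0] a, b;
  wire [3:0] out0, out1, out2;

  // Each assign describes its own mux. None of them waits for the others.
  assign out0 = selector ? a : b;
  assign out1 = selector ? b : a;
  assign out2 = selector ? (a + 4'd1) : (b + 4'd1);

  initial begin
    a = 4'd3; b = 4'd7; selector = 1'b0;
    #1 $display("selector=%b out0=%0d out1=%0d out2=%0d",
                selector, out0, out1, out2);
    selector = 1'b1;   // one change on selector...
    #1 $display("selector=%b out0=%0d out1=%0d out2=%0d",
                selector, out0, out1, out2);   // ...and all three outputs have followed
    $finish;
  end
endmodule
```

When selector flips, all three outputs update in the same simulated instant. There is no order in which the assign statements "run".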
That's great if you are running on an FPGA. However, for now I'm using the Icarus simulator. Note the word simulator! Icarus pretends a certain amount of time has passed and then recomputes all outputs. Then it advances the simulated time a little and repeats. As long as you refer to simulated time, the behavior is the same as it would be for a real FPGA. To drive the simulation, you often add an initial block containing procedural code that runs step by step, like a regular program. Consider the test bench for vtach:
`timescale 1us/1ns
module vtach_tb();
  reg clk;
  reg reset;

  top dut(clk,reset);

  always #1 clk=~clk;

  initial
    begin
      $display("vtach - A CARDIAC implmentation in Verilog");
      $display("By Al Williams [email protected]");
      $display("Licensed under the Lesser GPL");
      $display("See files COPYING and COPYING.LESSER for details");
      $dumpfile("vtach_tb.vcd");
      $dumpvars;
      clk=1'b0;
      reset=1'b1;
      dut.mem.row0=13'h101;  // load location 1 into acc (acc=500)
      dut.mem.row0=13'h500;  // output location 0 (print 101)
      dut.mem.row0=13'h033;  // Input to location 33 (X)
      dut.mem.row0=13'h533;  // output location 33
      dut.mem.row0=13'h200;  // add acc + location 0 (500+101=601)
      dut.mem.row0=13'h733;  // sub acc - location 33 (601-X)
      dut.mem.row0=13'h610;  // store acc to location 10
      dut.mem.row0=13'h510;  // output location 10
      dut.mem.row0=13'h820;  // goto location 20
      dut.mem.row2=13'h599;  // output return address from jump (should be 9)
      dut.mem.row2=13'h900;  // halt!
//    dut.mem.row0=13'h800;
// Only use this line with io_input.v
// not with io_input_keyboard.v
//    dut.execunit.in.inputvalues=13'h222;
      #5 reset=1'b0;
    end
endmodule
The timescale command tells the simulator to treat our timing values as having units of 1 µs. It also tells the simulator to resolve simulated time to a precision of 1 ns. This allows you to visualize the output with a precision of 1 ns without having to specify every delay at such a fine resolution.
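The split between time unit and precision is easier to see in a tiny experiment. The following is a sketch only; the module name and the delay values are made up for illustration. It uses the standard $realtime system function, which reports the current simulated time in time units (microseconds here).

```verilog
`timescale 1us/1ns
// Hypothetical demo of time unit vs. precision (not part of vtach).
module timescale_demo;
  initial begin
    // #1 means one time unit: 1 us.
    #1      $display("after #1      : %f us", $realtime);
    // Fractional delays are allowed because the precision is finer than the unit.
    #1.5    $display("after #1.5    : %f us", $realtime);
    // 0.0014 us is 1.4 ns; the simulator rounds the delay to the 1 ns precision.
    #0.0014 $display("after #0.0014 : %f us", $realtime);
    $finish;
  end
endmodule
```

The last delay is the interesting one: anything finer than the second number in the timescale line gets rounded away.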
The line that names the top component is like an object instantiation in C++ or Java. It creates another module and "connects" the inputs and outputs. Keep in mind, this isn't like making a function call because it occurs all the time, not just when some invisible finger points to this line of code.
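As a rough illustration of instantiation, the sketch below wires up the same small module twice: once by position, the way the test bench connects top, and once by name. The toggler module and all of its port names are hypothetical and exist only for this example.

```verilog
// Hypothetical module used only to demonstrate instantiation.
module toggler(input wire clk, input wire reset, output reg q);
  always @(posedge clk)
    if (reset) q <= 1'b0;   // hold q low while in reset
    else       q <= ~q;     // otherwise flip on every rising edge
endmodule

module instantiate_demo;
  reg  clk, reset;
  wire q_pos, q_named;

  // Positional connection: order must match the module's port list.
  toggler by_position(clk, reset, q_pos);
  // Named connection: order does not matter.
  toggler by_name(.reset(reset), .q(q_named), .clk(clk));

  always #1 clk = ~clk;

  initial begin
    clk = 1'b0; reset = 1'b1;
    #4 reset = 1'b0;
    #6 $display("q_pos=%b q_named=%b (both instances should agree)",
                q_pos, q_named);
    $finish;
  end
endmodule
```

Both instances exist for the whole simulation and both respond to every clock edge. Nothing "calls" them.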
The CPU needs a clock and the line that reads:
always #1 clk=~clk;
simulates the clock by pausing 1 µs (the time unit set by the timescale command) and then inverting the clock signal. This happens all the time, so you wind up with a nice square wave, with a 2 µs period, to drive the system.
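Here is a small self-contained sketch of that clock generator, separate from vtach, that counts rising edges to confirm the period. The module name and the edges counter are assumptions added for the demonstration.

```verilog
`timescale 1us/1ns
// Hypothetical stand-alone clock demo (not part of vtach).
module clock_demo;
  reg     clk;
  integer edges;

  // Invert the clock every 1 us, giving a 2 us period.
  always #1 clk = ~clk;

  // Count rising edges so the period can be checked.
  always @(posedge clk) edges = edges + 1;

  initial begin
    // clk must be given a known value first; inverting an unknown (x) stays x.
    clk   = 1'b0;
    edges = 0;
    #10 $display("rising edges in 10 us: %0d", edges);
    $finish;
  end
endmodule
```

Rising edges land at 1, 3, 5, 7, and 9 µs, so the count after 10 µs should be 5. Note that the initial block has to set clk to a known value, just as the vtach test bench does, or the clock never starts.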
Here's another example of something that happens all the time (from the execution unit in alu.v):
always @(posedge clk) begin
  if (rst==1'b0)   // do nothing if in reset
    case (ir[11:8])
      4'b0000:   // INP
      . . .
Here, the requirement is that the system determines when the positive edge of the clock occurs. It also tests the reset line, and if the system is not in reset it does a case statement on the first digit of the instruction register.
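The same pattern can be tried in isolation. The sketch below is not the vtach execution unit; it is a hypothetical stand-in that borrows only the structure (clocked block, reset test, case on ir[11:8]) and two op codes that appear in the test bench comments, 0 for input and 5 for output. Treating ir[7:0] as the operand field is an assumption made for the display.

```verilog
`timescale 1us/1ns
// Hypothetical decoder sketch; only the structure mirrors the article's fragment.
module decode_demo;
  reg        clk, rst;
  reg [12:0] ir;

  always #1 clk = ~clk;

  always @(posedge clk) begin
    if (rst == 1'b0)          // do nothing if in reset
      case (ir[11:8])
        4'b0000: $display("INP, operand digits %h", ir[7:0]);
        4'b0101: $display("OUT, operand digits %h", ir[7:0]);
        default: $display("op code %h not handled in this sketch", ir[11:8]);
      endcase
  end

  initial begin
    clk = 1'b0; rst = 1'b1; ir = 13'h033;
    #4 rst = 1'b0;      // release reset between clock edges
    #2 ir  = 13'h533;   // change the instruction between clock edges
    #2 $finish;
  end
endmodule
```

The rising edges at 1 and 3 µs fall inside reset, so nothing is decoded until the edge at 5 µs. The stimulus changes only at even times, which keeps it clear of the clock edges.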
The rest of the test bench (another way of saying test driver, by the way) is inside an initial block. The simulator will execute the statements within as though they were a program. The # notation gives delays (in µs) to wait between statements.
I'll talk more about what's in this initialization as well as other modules of vtach next time. Meanwhile, the figure below shows a sample run of vtach displayed using GTKWave. Over the next few weeks I will finish up the rest of the instructions (right now, op codes 3 and 4 don't do anything) and also share more about the structure of a Verilog design and some of the unusual features of vtach. I've licensed vtach under the LGPL and, when it is done, you'll be able to find a living copy on OpenCores.
If you want to experiment in the meantime, all you need is the Icarus simulator and GTKWave. The go script in the downloads will run the test bench (and its very simple program). You can use GTKWave to examine the resulting vcd file.
If that sounds mysterious, don't worry. There will be more to come, including what a vcd file is and why you'd want to examine it. But I'll save that for next time.