Wishbone Interface

June 2nd, 2015

SoC designs need a standard bus architecture so that IP cores can communicate. There are other interconnection schemes, such as AMBA and CoreConnect, but Wishbone has some advantages over them. First, designers can pick freely available IP cores and connect them to Wishbone according to the design specification. Second, it covers most of the features offered by AMBA and CoreConnect.
The Wishbone SoC interconnection architecture is a flexible design methodology for use with semiconductor IP cores. It is a standard interconnection scheme and a common interface specification that supports structured design on large projects. IP cores developed independently can be tied together and tested through the standard IP core interface. Many reusable designs that are compatible with the WISHBONE standard are available. Designers collect the IP cores they need and integrate them to complete the SoC design. These cores are free of cost and reusable, which helps in making a low cost, portable, reusable design.

The objectives of Wishbone are:
• To create a portable interface that supports both ASIC and FPGA design and is independent of the semiconductor technology and of the logic signalling level.
• To provide a standard interface that can be written in hardware description languages such as Verilog and VHDL.
• To provide a simple, compact, logical IP core hardware interface that requires few logic gates.
• To support single clock cycle data transfers.
• To provide a variety of bus transfer cycles whose data transfer protocol is independent of the application specific function of the IP core.
• To support different types of interconnection architecture.


The figure given below shows the Wishbone interconnection scheme.


Here the master initiates the communication by providing address and control signals, and the slave whose address range matches responds to the master. The INTERCON provides the connection between master and slave over which data is transferred. The SYSCON provides the Wishbone clock and reset signals needed for proper functioning of the system. Fig. 1 above illustrates the arrangement. The Wishbone INTERCON is not tied to any particular clock frequency (its speed is limited only by the target technology) and can be described in hardware description languages like Verilog or VHDL.
The Wishbone interface specification defines the signalling method used by the master, slave and SYSCON modules. It also defines how to document Wishbone compatible IP cores so that they can be reused. Every Wishbone compatible IP core must be provided with a Wishbone datasheet, which contains the following information:
• The revision level of the Wishbone specification to which the core is designed
• Whether the IP core is a master or a slave
• The signal names used in the design. If any signal name differs from the one defined in the specification, it must be cross referenced to the original signal name
• How the master reacts to the error and retry signals, and how the slave generates the error and retry signals
• For a design that supports tag signals, the TAG TYPE and the operation of each tag
• The port size, which must be 8 bit, 16 bit, 32 bit or 64 bit
• The data transfer ordering, LITTLE ENDIAN or BIG ENDIAN
• Any constraint on the clock signal, stated in terms of clock frequency

Interface signals are divided into three groups: master signals, slave signals, and signals common to both master and slave. They follow these rules:
• Signals must allow master and slave to be used with a variable interconnection
• Signals must support all basic types of bus cycle
• A handshaking mechanism must let either the master or the participating slave adjust the data transfer rate
• Every interface must support the acknowledge signal; the retry and error signals are optional
• Address and data bus widths can be 8 bit, 16 bit, 32 bit or 64 bit
• All signals must be either inputs or outputs. Bidirectional signals may be used only if the target device supports them

Three types of bus cycle are supported by the Wishbone interface:
• Single read/write
• Block read/write
• Read modify write (RMW)
All bus cycles follow the handshaking protocol for the transfer of data between master and slave.

The handshaking protocol governs every transfer of data between master and slave. A transfer is divided into four phases:
• Operation requested
• Slave ready
• Operation over
• Ready for new operation
The master asserts strobe to indicate that it wants to transfer data, and strobe remains asserted until the slave asserts its acknowledge signal to show that it is taking part in the transfer. The master samples the terminating acknowledge signal at every rising edge of the clock; once it is seen asserted, the master negates strobe, which completes the operation. In response the slave negates its acknowledge signal, indicating that it is ready for a new operation.
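
The handshake described above can be sketched as a very small slave. This is only an illustration under stated assumptions: the module name wb_reg_slave, the single 8 bit register and the lowercase port names (modelled on the CLK_I, RST_I, CYC_I, STB_I, WE_I, DAT_I, DAT_O and ACK_O signals of the specification) are choices made here, not part of any existing core.

```verilog
// Hypothetical single-register Wishbone slave (classic cycle, one wait state).
module wb_reg_slave (
    input  wire       clk_i,  // clock from SYSCON
    input  wire       rst_i,  // synchronous reset from SYSCON
    input  wire       cyc_i,  // a bus cycle is in progress
    input  wire       stb_i,  // strobe: master requests a transfer
    input  wire       we_i,   // 1 = write, 0 = read
    input  wire [7:0] dat_i,  // write data from the master
    output reg  [7:0] dat_o,  // read data to the master
    output reg        ack_o   // acknowledge: terminates the transfer
);
    reg [7:0] storage;        // the one register this slave exposes

    always @(posedge clk_i) begin
        if (rst_i) begin
            storage <= 8'h00;
            dat_o   <= 8'h00;
            ack_o   <= 1'b0;
        end else begin
            // Acknowledge each request for exactly one clock, then drop
            // ack_o again so the slave is ready for a new operation.
            ack_o <= cyc_i & stb_i & ~ack_o;
            if (cyc_i & stb_i & ~ack_o) begin
                if (we_i) storage <= dat_i;   // write transfer
                else      dat_o   <= storage; // read transfer
            end
        end
    end
endmodule
```

Because ack_o is registered, read data and acknowledge become valid on the same clock edge, and the master can negate strobe on the edge where it samples ack_o high.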


Power Operator ** in Verilog ( Especially 2 to Power N)

May 24th, 2015

The 2001 edition of Verilog introduced the power operator, written **. The expression x**y, for example, means x to the power of y. The most common use of the power operator is 2 to the power N, which is also the easiest to understand from a synthesis perspective. This code snippet, for example, checks whether all bits of an N bit register are one.


parameter N = 4;



assign clock_out_tick   =  (reg_value == 2**N-1)  ?  1'b1  :  1'b0;

Let us say N is 4. In that case 2**N-1 will have a value of 15 or 4'b1111. This is part of a code that gives a "clock tick", a 1 signal for 1 clock period, when the counter reaches its maximum value.

If N were not a parameter we could have written this code as

assign clock_out_tick   =  (reg_value == 4'b1111)  ?  1'b1  :  1'b0;

You will also often find it used in localparam declarations, as in

localparam N                = 4;
localparam ALL_BITS_1       = 2 ** N - 1;

Anytime you see 2**N-1, mentally read it as a value with all N bits set to 1.
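
As a rough sketch of how the pieces above fit together, the module below wraps the comparison in a free running N bit counter. The module name tick_counter and the synchronous reset port are assumptions made for this example.

```verilog
// Hypothetical tick generator: clock_out_tick is 1 for one clock period
// each time the N bit counter holds its maximum value, 2**N - 1.
module tick_counter #(parameter N = 4) (
    input  wire clk,
    input  wire reset,          // synchronous, active high
    output wire clock_out_tick
);
    localparam ALL_BITS_1 = 2**N - 1;   // 15 when N = 4

    reg [N-1:0] reg_value;

    always @(posedge clk)
        if (reset) reg_value <= 0;
        else       reg_value <= reg_value + 1'b1;  // wraps after ALL_BITS_1

    assign clock_out_tick = (reg_value == ALL_BITS_1) ? 1'b1 : 1'b0;
endmodule
```

Since both operands of ** are constants here, the synthesis tool evaluates 2**N - 1 at elaboration time and no multiplier hardware is involved.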

In general, the operands of the power operator can take values other than the 2 and N used here. For example:

4**3 evaluates to 4*4*4
3**0 evaluates to 1

Some special cases
Hopefully you do not come across these, but just in case: if you raise 0 to a negative power, the result is undefined.

0 ** (-1) evaluates to 1/0, which gives 'x
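
The cases above can be tried in any simulator with a few $display calls. This is a minimal sketch; the module name power_demo is made up, and 2**(-1) is an extra case added here for comparison.

```verilog
// Hypothetical demo module: prints the result of a few ** expressions.
module power_demo;
    initial begin
        $display("4**3    = %0d", 4**3);     // 4*4*4 = 64
        $display("3**0    = %0d", 3**0);     // anything to the power 0 is 1
        $display("2**(-1) = %0d", 2**(-1));  // integer operands, negative exponent
        $display("0**(-1) = %0d", 0**(-1));  // the undefined case
    end
endmodule
```

For integer operands the language standard specifies 0 for the third expression and x for the last one, so a conforming simulator should print those values.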

Difference with C

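In written math and in some languages, ^ denotes the power operation, so 2^3 means 2 cubed. C has no power operator at all: there ^ is bitwise exclusive OR, and powers are computed with the pow() library function. In Verilog the ^ was likewise taken up for Exclusive OR, and hence the FORTRAN style ** was used for the power operator. Section 11.4.3 of the Verilog reference manual describes the power operator.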



Verilog Clock Divide by 3 – Synthesis Issue and others

May 22nd, 2015

A clock divide by 3 has to work, in one way or another, on the positive as well as the negative edge of the clock. We started with the following Verilog code to implement the divide by 3.

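module clk_div3(clk, reset, clk_out);

input clk;
input reset;
output clk_out;

reg [1:0] r_reg;
wire [1:0] r_nxt;
reg clk_track;

// Sensitive to both clock edges: fine in simulation, not synthesizable
always @(posedge clk, negedge clk)
  if (reset) begin
    r_reg <= 2'b0;
    clk_track <= 1'b0;
  end
  else if (r_nxt == 2'b11) begin
    // Every third edge: restart the count and toggle the output
    r_reg <= 0;
    clk_track <= ~clk_track;
  end
  else
    r_reg <= r_nxt;

assign r_nxt = r_reg + 2'b01;
assign clk_out = clk_track;

endmodule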

The code works well as far as compilation is concerned. However, when we tried to implement it on a Spartan 3,
the Xilinx tools reported the following errors during synthesis.

ERROR:Xst:899 – “comparator.v” line 37: The logic for <r_reg> does not match a known FF or Latch template. The description style you are using to describe a register or latch is not supported in the current software release.
ERROR:Xst:899 – “comparator.v” line 38: The logic for <clk_track> does not match a known FF or Latch template. The description style you are using to describe a register or latch is not supported in the current software release.

It turns out that FPGAs in general, and the Spartan 3 in particular, do not support flip flops that trigger on the positive
as well as the negative edge.

For example, if we change the line

always @(posedge clk,negedge clk)

to

always @(posedge clk)

synthesis proceeds without error. However, we now have a divide by 6 and not a divide by 3 circuit, as you can see from the following screenshot captured on a Xilinx evaluation board.


So what is the solution ?

One approach is to have two separate always blocks, one working on the positive and one on the negative edge of the clock, and then to combine the two in some way.

To understand it, consider that we keep two counters, one that counts rising edges of the clock and another that counts falling edges. Each counter counts 0, 1, 2 and then 0 again, as in the following diagram.


We can make use of these counters to generate an output that is high for three half cycles of the input clock and low for the next three half cycles. Look carefully at the diagram and look for the periods where at least one of the counters is 2. Then assign 1 to clk_out for these periods and 0 for the rest of the time. If you can understand this, the following code should be easy to understand.

module clk_div3(clk, reset, clk_out);

input clk;
input reset;
output clk_out;

reg [1:0] pos_count, neg_count;

// Counts rising edges: 0, 1, 2, 0, ...
always @(posedge clk)
  if (reset) pos_count <= 0;
  else if (pos_count == 2) pos_count <= 0;
  else pos_count <= pos_count + 1;

// Counts falling edges: 0, 1, 2, 0, ...
always @(negedge clk)
  if (reset) neg_count <= 0;
  else if (neg_count == 2) neg_count <= 0;
  else neg_count <= neg_count + 1;

// High whenever either counter is 2: three half cycles out of every six
assign clk_out = ((pos_count == 2) | (neg_count == 2));

endmodule

Notice the statement

assign clk_out = ((pos_count == 2) | (neg_count == 2));

This basically implements the logic that we observed. Also notice that we could replace the above with

assign clk_out = ((pos_count == 1) | (neg_count == 1));

or with

assign clk_out = ((pos_count == 0) | (neg_count == 0));

Here is the output waveform for this code. This was synthesized and verified on Spartan 3.


The output clock in this scheme may have glitches at the crossings produced by the combinational logic.
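
To check the division ratio in simulation before going to the board, a small test bench can print the spacing between rising edges of clk_out. The sketch below assumes the two counter clk_div3 module listed above is compiled alongside it; the test bench name, the 10 ns input period and the reset timing are arbitrary choices.

```verilog
`timescale 1ns/1ps
// Hypothetical test bench for the two counter clk_div3 module.
module clk_div3_tb;
    reg  clk   = 1'b0;
    reg  reset = 1'b1;
    wire clk_out;
    time last_rise = 0;       // time of the previous rising edge of clk_out

    clk_div3 dut (.clk(clk), .reset(reset), .clk_out(clk_out));

    always #5 clk = ~clk;     // 10 ns input clock period

    // Report the distance between consecutive rising edges of clk_out.
    always @(posedge clk_out) begin
        if (last_rise != 0)
            $display("clk_out period = %0d ns", $time - last_rise);
        last_rise = $time;
    end

    initial begin
        #22  reset = 1'b0;    // release reset after both counters are cleared
        #200 $finish;
    end
endmodule
```

With a 10 ns input clock, a working divide by 3 should report a 30 ns output period. A zero delay simulation like this one does not model gate delays, so it will not reveal the glitches mentioned above.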

FPGAs generally have dedicated PLLs and clock circuits that provide low jitter clocks. They generally can multiply by 2 or divide by 1.5, 2, 3, 4.5 etc. These should be used, especially for higher clock frequencies. The idea presented here can be used for slower clocks and for division of a clock by an odd number.


Verilog Initial block synthesis

May 4th, 2015

In Verilog, initial blocks are used in test benches. There may be one or more initial blocks, all of which start executing concurrently at time 0.

Can an initial block also be used in synthesis? For example, you may be tempted to write the code as follows

module dff_not_synthesizable(clkin, q, d);
input clkin, d;
output q;
reg q;

initial  // Not synthesizable
  q <= 0;

always @(posedge clkin)
  q <= d;

endmodule


We are trying to make sure that initially the output is 0 using the initial statement

q <= 0;

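However, this code should not be expected to synthesize as intended. We should use a reset input instead, so that the output of the D flip flop is driven low at the time of reset.

Initial blocks should be treated as non synthesizable blocks. Some FPGA tools do accept them to set the power-up value of a register, but ASIC synthesis tools ignore them, so a design that relies on them is not portable.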

Why the hell then is it included in Verilog ? To confuse the students ?

Well, generating synthesizable code is not the only purpose of Verilog. A lot of effort goes into verifying the code that is synthesizable. Hence we need a lot of other constructs, and the initial block is one of them. $display, $monitor etc. are other such constructs that will not synthesize but will help in testing the code.
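
Both points can be seen side by side in a short sketch: a flip flop that takes its starting value from a reset input, and a test bench that uses initial blocks and $monitor where they belong. The module names dff_with_reset and dff_tb, the synchronous active high reset and all the delays are assumptions made for this example.

```verilog
// Synthesizable: the starting value comes from reset, not from initial.
module dff_with_reset (input clkin, input reset, input d, output reg q);
    always @(posedge clkin)
        if (reset) q <= 1'b0;   // synchronous, active high reset
        else       q <= d;
endmodule

// Not synthesizable, and not meant to be: this is only a test bench.
module dff_tb;
    reg  clkin = 1'b0, reset = 1'b1, d = 1'b0;
    wire q;

    dff_with_reset dut (.clkin(clkin), .reset(reset), .d(d), .q(q));

    always #5 clkin = ~clkin;   // free running clock

    // Prints a line whenever one of the listed signals changes.
    initial $monitor("t=%0d reset=%b d=%b q=%b", $time, reset, d, q);

    initial begin
        #12 reset = 1'b0;       // release reset after the first clock edge
        #10 d = 1'b1;
        #20 d = 1'b0;
        #20 $finish;
    end
endmodule
```

Only dff_with_reset would be handed to the synthesis tool; dff_tb exists purely for simulation.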


Verilog Tutorial expanded and future plans

April 30th, 2015

We have expanded and updated the Verilog Tutorial and we expect it to be helpful to beginners as well as experienced users (in future if not now). We now have more people contributing to this tutorial. Expect an online compiler to be up and running. Numerous examples have been added.

We also have plans to bring you an FPGA board that you can use to run and learn with the actual generation of the synthesizable code. We will update you as soon as possible.

Update : We have started adding Verilog examples. Here are some



i.MX6 DDR3 memory Calibration

April 13th, 2015

The i.MX6 boards, which come in several flavors, have been found to fail in extreme memory tests. One of the quick ways to address this is to calibrate the DDR3 memory and see if that fixes the issue. We will outline the steps that have been verified to work.

1. Download the DDR stress tester from the Freescale post at https://community.freescale.com/docs/DOC-96412

We will use the procedure that calibrates over the serial port. We may look at the USB method at some other time. The file to use is DDR_Stress_Tester_V1.0.2_UART1_for_SDboot&JTAG.zip. Download and unzip this file and navigate to find the file named DDR_Stress_Tester_V1.0.2_UART1\ddr-stress-test-mx6dq.bin.

Copy this file to your SD Card using the procedure in this blog post: http://referencedesigner.com/blog/transferring-kernel-in-i-mx6-android-system-using-adb/2359/

2. Boot the board, stop at u-boot and load the DDR stress test file using

U-Boot > ext2load mmc 0:1 0x907000 /ddr-stress-test-mx6dq.bin
U-Boot > go 0x907000

Note that this assumes the SD Card uses an ext2 filesystem. The original command for a FAT SD Card, as mentioned on the Freescale site, is

U-Boot > fatload mmc 2:1 0x907000 ddr-stress-test-mx6dq.bin
U-Boot > go 0x907000

3. Bring out the serial port on pins CSI0_DAT10 and CSI0_DAT11 of your board. These pins are used as the UART on which the stress tester prints its output.

At this stage the tool prints its prompts and the calibration results, for example

DDR Stress Test (1.0.2) for MX6DQ
Build: Dec 10 2013, 14:26:05
Freescale Semiconductor, Inc.

=======DDR configuration==========
BOOT_CFG3[5-4]: 0x00, Single DDR channel.
DDR type is DDR3
Data width: 64, bank num: 8
Row size: 14, col size: 10
Chip select CSD0 is used
Density per chip select: 1024MB

What ARM core speed would you like to run?
Type 0 for 650MHz, 1 for 800MHz, 2 for 1GHz, 3 for 1.2GHz
ARM set to 800MHz

Please select the DDR density per chip select (in bytes) on the board
Type 0 for 2GB; 1 for 1GB; 2 for 512MB; 3 for 256MB; 4 for 128MB; 5 for 64MB; 6 for 32MB
For maximum supported density (4GB), we can only access up to 3.75GB. Type 9 to select

DDR density selected (MB): 512

Calibration will run at DDR frequency 528MHz. Type ‘y’ to continue.
If you want to run at other DDR frequency. Type ‘n’
DDR Freq: 528 MHz

Would you like to run the write leveling calibration? (y/n)
Please enter the MR1 value on the initilization script
This will be re-programmed into MR1 after write leveling calibration
Enter as a 4-digit HEX value, example 0004, then hit enter
0004 You have entered: 0x0004
Start write leveling calibration
Write leveling calibration completed
MMDC_MPWLDECTRL0 ch0 after write level cal: 0x001E001E
MMDC_MPWLDECTRL1 ch0 after write level cal: 0x002F0023
MMDC_MPWLDECTRL0 ch1 after write level cal: 0x002C0036
MMDC_MPWLDECTRL1 ch1 after write level cal: 0x001D0034

Start: HC=0x02 ABS=0x18
End: HC=0x04 ABS=0x4C
Mean: HC=0x03 ABS=0x32
End-0.5*tCK: HC=0x03 ABS=0x4C
Final: HC=0x03 ABS=0x4C
Start: HC=0x01 ABS=0x68
End: HC=0x04 ABS=0x40
Mean: HC=0x03 ABS=0x14
End-0.5*tCK: HC=0x03 ABS=0x40
Final: HC=0x03 ABS=0x40
Start: HC=0x01 ABS=0x58
End: HC=0x04 ABS=0x28
Mean: HC=0x02 ABS=0x7F
End-0.5*tCK: HC=0x03 ABS=0x28
Final: HC=0x03 ABS=0x28
Start: HC=0x01 ABS=0x5C
End: HC=0x04 ABS=0x3C
Mean: HC=0x03 ABS=0x0C
End-0.5*tCK: HC=0x03 ABS=0x3C
Final: HC=0x03 ABS=0x3C
Start: HC=0x02 ABS=0x14
End: HC=0x04 ABS=0x68
Mean: HC=0x03 ABS=0x3E
End-0.5*tCK: HC=0x03 ABS=0x68
Final: HC=0x03 ABS=0x68
Start: HC=0x02 ABS=0x14
End: HC=0x04 ABS=0x50
Mean: HC=0x03 ABS=0x32
End-0.5*tCK: HC=0x03 ABS=0x50
Final: HC=0x03 ABS=0x50
Start: HC=0x01 ABS=0x50
End: HC=0x04 ABS=0x0C
Mean: HC=0x02 ABS=0x6D
End-0.5*tCK: HC=0x03 ABS=0x0C
Final: HC=0x03 ABS=0x0C
Start: HC=0x00 ABS=0x7C
End: HC=0x04 ABS=0x44
Mean: HC=0x02 ABS=0x60
End-0.5*tCK: HC=0x03 ABS=0x44
Final: HC=0x03 ABS=0x44

DQS calibration MMDC0 MPDGCTRL0 = 0x4340034C, MPDGCTRL1 = 0x033C0328

DQS calibration MMDC1 MPDGCTRL0 = 0x43500368, MPDGCTRL1 = 0x0344030C

Note: Array result[] holds the DRAM test result of each byte.
0: test pass. 1: test fail
4 bits respresent the result of 1 byte.
result 00000001:byte 0 fail.
result 00000011:byte 0, 1 fail.

Starting Read calibration…

ABS_OFFSET=0x00000000 result[00]=0x11111111
ABS_OFFSET=0x04040404 result[01]=0x11111111
ABS_OFFSET=0x08080808 result[02]=0x11111111
ABS_OFFSET=0x0C0C0C0C result[03]=0x11111111
ABS_OFFSET=0x10101010 result[04]=0x11111111
ABS_OFFSET=0x14141414 result[05]=0x11011111
ABS_OFFSET=0x18181818 result[06]=0x01011010
ABS_OFFSET=0x1C1C1C1C result[07]=0x00011000
ABS_OFFSET=0x20202020 result[08]=0x00011000
ABS_OFFSET=0x24242424 result[09]=0x00010000
ABS_OFFSET=0x28282828 result[0A]=0x00010000
ABS_OFFSET=0x2C2C2C2C result[0B]=0x00000000
ABS_OFFSET=0x30303030 result[0C]=0x00000000
ABS_OFFSET=0x34343434 result[0D]=0x00000000
ABS_OFFSET=0x38383838 result[0E]=0x00000000
ABS_OFFSET=0x3C3C3C3C result[0F]=0x00000000
ABS_OFFSET=0x40404040 result[10]=0x00000000
ABS_OFFSET=0x44444444 result[11]=0x00000000
ABS_OFFSET=0x48484848 result[12]=0x00000000
ABS_OFFSET=0x4C4C4C4C result[13]=0x00000000
ABS_OFFSET=0x50505050 result[14]=0x00000000
ABS_OFFSET=0x54545454 result[15]=0x00000110
ABS_OFFSET=0x58585858 result[16]=0x00100110
ABS_OFFSET=0x5C5C5C5C result[17]=0x00101110
ABS_OFFSET=0x60606060 result[18]=0x00101111
ABS_OFFSET=0x64646464 result[19]=0x01101111
ABS_OFFSET=0x68686868 result[1A]=0x11101111
ABS_OFFSET=0x6C6C6C6C result[1B]=0x11111111
ABS_OFFSET=0x70707070 result[1C]=0x11111111
ABS_OFFSET=0x74747474 result[1D]=0x11111111
ABS_OFFSET=0x78787878 result[1E]=0x11111111
ABS_OFFSET=0x7C7C7C7C result[1F]=0x11111111


Starting Write calibration…

ABS_OFFSET=0x00000000 result[00]=0x10111111
ABS_OFFSET=0x04040404 result[01]=0x10110111
ABS_OFFSET=0x08080808 result[02]=0x10110011
ABS_OFFSET=0x0C0C0C0C result[03]=0x10100010
ABS_OFFSET=0x10101010 result[04]=0x00100000
ABS_OFFSET=0x14141414 result[05]=0x00100000
ABS_OFFSET=0x18181818 result[06]=0x00100000
ABS_OFFSET=0x1C1C1C1C result[07]=0x00000000
ABS_OFFSET=0x20202020 result[08]=0x00000000
ABS_OFFSET=0x24242424 result[09]=0x00000000
ABS_OFFSET=0x28282828 result[0A]=0x00000000
ABS_OFFSET=0x2C2C2C2C result[0B]=0x00000000
ABS_OFFSET=0x30303030 result[0C]=0x00000000
ABS_OFFSET=0x34343434 result[0D]=0x00000000
ABS_OFFSET=0x38383838 result[0E]=0x00000000
ABS_OFFSET=0x3C3C3C3C result[0F]=0x00000000
ABS_OFFSET=0x40404040 result[10]=0x00000000
ABS_OFFSET=0x44444444 result[11]=0x00000000
ABS_OFFSET=0x48484848 result[12]=0x00000000
ABS_OFFSET=0x4C4C4C4C result[13]=0x00000000
ABS_OFFSET=0x50505050 result[14]=0x00000000
ABS_OFFSET=0x54545454 result[15]=0x00000000
ABS_OFFSET=0x58585858 result[16]=0x00000000
ABS_OFFSET=0x5C5C5C5C result[17]=0x00000000
ABS_OFFSET=0x60606060 result[18]=0x00001000
ABS_OFFSET=0x64646464 result[19]=0x00001000
ABS_OFFSET=0x68686868 result[1A]=0x01001000
ABS_OFFSET=0x6C6C6C6C result[1B]=0x01011111
ABS_OFFSET=0x70707070 result[1C]=0x01011111
ABS_OFFSET=0x74747474 result[1D]=0x11011111
ABS_OFFSET=0x78787878 result[1E]=0x11111111
ABS_OFFSET=0x7C7C7C7C result[1F]=0x11111111


MMDC registers updated from calibration

Read DQS Gating calibration
MPDGCTRL0 PHY0 (0x021b083c) = 0x4340034C
MPDGCTRL1 PHY0 (0x021b0840) = 0x033C0328
MPDGCTRL0 PHY1 (0x021b483c) = 0x43500368
MPDGCTRL1 PHY1 (0x021b4840) = 0x0344030C

Read calibration
MPRDDLCTL PHY0 (0x021b0848) = 0x3E34363A
MPRDDLCTL PHY1 (0x021b4848) = 0x3E3E344A

Write calibration
MPWRDLCTL PHY0 (0x021b0850) = 0x30383C3A
MPWRDLCTL PHY1 (0x021b4850) = 0x4032483A

The DDR stress test can run with an incrementing frequency or at a static freq
To run at a static freq, simply set the start freq and end freq to the same value
Would you like to run the DDR Stress Test (y/n)?