Computer Architecture Lab/HOWTO

A collection of tips for your microprocessor design.

Memory

You will probably need some memory for your processor. The simplest way to start is to use the small on-chip memory blocks in the FPGA. You can instantiate the vendor-specific components (e.g. altsyncram). However, to get a vendor-independent design, you can also describe the memory in plain VHDL.

Note: In current FPGAs the on-chip memories cannot be used in an asynchronous mode. At a minimum, the address, data input, and write enable are registered.

ROM

A ROM can be described by a simple VHDL case statement:

--
--  rom.vhd
--
--  generic VHDL version of ROM
--
--      DONT edit this file!
--      it is automatically generated
--

library ieee;
use ieee.std_logic_1164.all;

entity rom is
generic (width : integer; addr_width : integer);    -- for compatibility
port (
    clk         : in std_logic;
    address     : in std_logic_vector(9 downto 0);
    q           : out std_logic_vector(7 downto 0)
);
end rom;

architecture rtl of rom is

    signal areg     : std_logic_vector(9 downto 0);
    signal data     : std_logic_vector(7 downto 0);

begin

process(clk) begin

    if rising_edge(clk) then
        areg <= address;
    end if;

end process;

    q <= data;

process(areg) begin

    case areg is

        when "0000000000" => data <= "10000000";
        when "0000000001" => data <= "10000000";
        when "0000000010" => data <= "11000000";
        when "0000000011" => data <= "10000000";
        when "0000000100" => data <= "00011011";
        when "0000000101" => data <= "11000001";
        ....
        when "1111011011" => data <= "10000000";
        when "1111011100" => data <= "10000000";
        when "1111011101" => data <= "10000000";
        when "1111011110" => data <= "10000000";

        when others => data <= "00000000";
    end case;
end process;

end rtl;

Write a small program to generate this VHDL file from your assembler output, or let your assembler write its output directly as VHDL (see Jopa.java for an example).

Another tool to generate a ROM VHDL file (romgen.cpp) is available in pacman_003.zip.
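
As an alternative to the case statement, the ROM contents can also be written as a constant array, which tends to be easier for a generator program to emit. The following is only a minimal sketch and is not the output format of Jopa.java or romgen.cpp. The entity name rom_array and the type and constant names are made up for illustration, and only the six words visible at the start of the listing above are filled in. Whether a particular synthesis tool maps this description to an on-chip memory block depends on the tool, so check the synthesis report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- rom_array is a hypothetical name used for this sketch only.
entity rom_array is
port (
    clk         : in std_logic;
    address     : in std_logic_vector(9 downto 0);
    q           : out std_logic_vector(7 downto 0)
);
end rom_array;

architecture rtl of rom_array is

    type rom_type is array(0 to 1023) of std_logic_vector(7 downto 0);

    -- A generator program would emit one line per ROM word here.
    -- Only the first six words of the case-statement example are shown.
    constant rom_data : rom_type := (
        0 => "10000000",
        1 => "10000000",
        2 => "11000000",
        3 => "10000000",
        4 => "00011011",
        5 => "11000001",
        others => "00000000"
    );

    signal areg     : std_logic_vector(9 downto 0);

begin

process(clk) begin

    if rising_edge(clk) then
        areg <= address;    -- registered address, as in the case version
    end if;

end process;

    -- combinational read from the registered address
    q <= rom_data(to_integer(unsigned(areg)));

end rtl;
```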

RAM

Even dual-ported memories can now (with careful VHDL coding) be described in plain VHDL, as the following example shows:

--
--	sdpram.vhd
--
--	Simple dual port ram with read and write port
--		and independent clocks
--
--	When using different clocks following warning is generated:
--		Functionality differs from the original design.
--	Read during write at the same address is undefined.
--
--	Author: Martin Schoeberl (mschoebe@mail.tuwien.ac.at)
--
--	2006-08-03	adapted from simulation only version
--

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sdpram is
generic (width : integer := 32; addr_width : integer := 7);
port (
	wrclk		: in std_logic;
	data		: in std_logic_vector(width-1 downto 0);
	wraddress	: in std_logic_vector(addr_width-1 downto 0);
	wren		: in std_logic;

	rdclk		: in std_logic;
	rdaddress	: in std_logic_vector(addr_width-1 downto 0);
	dout		: out std_logic_vector(width-1 downto 0)
);
end sdpram ;

architecture rtl of sdpram is

	signal reg_dout			: std_logic_vector(width-1 downto 0);

	subtype word is std_logic_vector(width-1 downto 0);
	constant nwords : integer := 2 ** addr_width;
	type ram_type is array(0 to nwords-1) of word;

	signal ram : ram_type;

begin

process (wrclk)
begin
	if rising_edge(wrclk) then
		if wren='1' then
			ram(to_integer(unsigned(wraddress))) <= data;
		end if;
	end if;
end process;

process (rdclk)
begin
	if rising_edge(rdclk) then
		reg_dout <= ram(to_integer(unsigned(rdaddress)));
		dout <= reg_dout;
	end if;
end process;

end rtl;

For your design, connect a single clock to both wrclk and rdclk.
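
A minimal sketch of such a single-clock connection, assuming the sdpram entity listed above (the version without rden) has been compiled into the work library. The wrapper entity ram_single_clk and its port names are hypothetical. The generic values are simply the defaults of sdpram.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- ram_single_clk and its port names are made up for this sketch.
entity ram_single_clk is
port (
	clk		: in std_logic;
	wr_addr		: in std_logic_vector(6 downto 0);
	wr_data		: in std_logic_vector(31 downto 0);
	wr_en		: in std_logic;
	rd_addr		: in std_logic_vector(6 downto 0);
	rd_data		: out std_logic_vector(31 downto 0)
);
end ram_single_clk;

architecture rtl of ram_single_clk is
begin

	ram0: entity work.sdpram
		generic map (width => 32, addr_width => 7)
		port map (
			wrclk		=> clk,		-- same clock on both ports
			data		=> wr_data,
			wraddress	=> wr_addr,
			wren		=> wr_en,
			rdclk		=> clk,
			rdaddress	=> rd_addr,
			dout		=> rd_data
		);

end rtl;
```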

External Memory

For the 32-bit 1 MB SRAM on the FPGA board you can use the SimpCon-compatible memory interface (sc_sram32.vhd) that is provided with JOP.

Xilinx Enabled RAM

General

This RAM implementation lets you use the Xilinx Block RAMs within your design. It is based on the code from the Xilinx XST Manual. For small RAMs (e.g. an address bus width of 3), distributed RAM is used instead, since such small memories need fewer logic cells that way. The code was synthesised using ISE Webpack 8.2.03i and tested on a Micromodule from Trenz Elektronik (carrying an XC3S1000-4). The test routine fills the RAM over a serial port and reads back the RAM contents. It is not included in the wiki since it only works with additional hardware (a serial port). Additionally, the code was synthesised using Quartus II 6.0sp1 WebPack and tested on the lab hardware.

Implementation Summary

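Step                         | Single Port RAM | Dual Port RAM with one Write Port | Dual Port RAM with two Write Ports
-----------------------------------------------------------------------------------------------------------------------
Behavioural Simulation       | OK              | OK                                | OK
Synthesis (XC3S1000)         | OK              | OK                                | OK
Synthesis (EP1C12)           | 1 warning       | TBD                               | TBD
Timing Simulation (XC3S1000) | OK              | OK                                | OK
Timing Simulation (EP1C12)   | TBD             | TBD                               | TBD
Test (XC3S1000)              | OK              | TBD                               | TBD
Test (EP1C12)                | OK              | TBD                               | TBD

TBD = to be done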

Code

Simple Dual-Port RAM

The following code is synthesized to on-chip memory, which is preferred for the register file.

--
--
--  This file is a part of JOP, the Java Optimized Processor
--
--  Copyright (C) 2006, Martin Schoeberl (martin@jopdesign.com)
--
--  This program is free software: you can redistribute it and/or modify
--  it under the terms of the GNU General Public License as published by
--  the Free Software Foundation, either version 3 of the License, or
--  (at your option) any later version.
--
--  This program is distributed in the hope that it will be useful,
--  but WITHOUT ANY WARRANTY; without even the implied warranty of
--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
--  GNU General Public License for more details.
--
--  You should have received a copy of the GNU General Public License
--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
--

--
--	sdpram.vhd
--
--	Simple dual port ram with read and write port
--		and independent clocks
--	Read and write address, write data is registered. Output is not
--	registered. Read enable gates the read address. Is compatible
--	with SimpCon.
--
--	When using different clocks following warning is generated:
--		Functionality differs from the original design.
--	Read during write at the same address is undefined.
--
--	If read enable is used a discrete output register is synthesized.
--	Without read enable the 
--
--	Author: Martin Schoeberl (martin@jopdesign.com)
--
--	2006-08-03	adapted from simulation only version
--	2008-03-02	added read enable
--

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sdpram is
generic (width : integer := 32; addr_width : integer := 7);
port (
	wrclk		: in std_logic;
	data		: in std_logic_vector(width-1 downto 0);
	wraddress	: in std_logic_vector(addr_width-1 downto 0);
	wren		: in std_logic;

	rdclk		: in std_logic;
	rdaddress	: in std_logic_vector(addr_width-1 downto 0);
	rden		: in std_logic;
	dout		: out std_logic_vector(width-1 downto 0)
);
end sdpram ;

architecture rtl of sdpram is

	signal reg_dout			: std_logic_vector(width-1 downto 0);

	subtype word is std_logic_vector(width-1 downto 0);
	constant nwords : integer := 2 ** addr_width;
	type ram_type is array(0 to nwords-1) of word;

	signal ram : ram_type;

begin

process (wrclk)
begin
	if rising_edge(wrclk) then
		if wren='1' then
			ram(to_integer(unsigned(wraddress))) <= data;
		end if;
	end if;
end process;

process (rdclk)
begin
	if rising_edge(rdclk) then
		if rden='1' then
			reg_dout <= ram(to_integer(unsigned(rdaddress)));
		end if;
	end if;
end process;

	dout <= reg_dout;

end rtl;
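
A register file usually needs two read ports, while this memory offers only one. One common way around that is to instantiate the memory twice and write both copies in parallel. The sketch below assumes the sdpram version with rden listed above is compiled into the work library. The entity regfile, its port names, and the choice of 32 registers are hypothetical and not taken from JOP.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- regfile and all of its port names are made up for this sketch.
entity regfile is
generic (width : integer := 32; addr_width : integer := 5);
port (
	clk		: in std_logic;
	wr_addr		: in std_logic_vector(addr_width-1 downto 0);
	wr_data		: in std_logic_vector(width-1 downto 0);
	wr_en		: in std_logic;
	rd_addr_a	: in std_logic_vector(addr_width-1 downto 0);
	rd_data_a	: out std_logic_vector(width-1 downto 0);
	rd_addr_b	: in std_logic_vector(addr_width-1 downto 0);
	rd_data_b	: out std_logic_vector(width-1 downto 0)
);
end regfile;

architecture rtl of regfile is
begin

	-- Both copies receive the same writes; each copy serves one read port.
	bank_a: entity work.sdpram
		generic map (width => width, addr_width => addr_width)
		port map (
			wrclk => clk, data => wr_data, wraddress => wr_addr, wren => wr_en,
			rdclk => clk, rdaddress => rd_addr_a, rden => '1', dout => rd_data_a
		);

	bank_b: entity work.sdpram
		generic map (width => width, addr_width => addr_width)
		port map (
			wrclk => clk, data => wr_data, wraddress => wr_addr, wren => wr_en,
			rdclk => clk, rdaddress => rd_addr_b, rden => '1', dout => rd_data_b
		);

end rtl;
```

As the header comment of sdpram.vhd states, a read during a write to the same address is undefined, so the surrounding pipeline has to deal with that case itself.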

Single Port RAM

-------------------------------------------------------------------------
--
-- Filename: sp_ram.vhd
-- =========
--
-- Short Description:
-- ==================
--   Single port ram.
--
-- Description:
-- ============
--   Implementation of a single port ram.
--   Based on the ram code described in the Xilinx XST Manual.
--
--   The memory uses low active control signals. The following
--   combinations are possible:
--
--    nwr | ncs | Description
--   ----------------------------
--     0  |  0  | Write access
--     1  |  0  | Read access
--     X  |  1  | Inactive
--
--   The address and data bus width can be specified using
--   the generic map. The default is eight bit data bus and
--   eight bit address bus.
--
-- Verification:
-- =============
--   Simulated using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--   Synthesized using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--   Tested on a XC3S1000-4
--
-------------------------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sp_ram is
  generic
  (
    DATA_WIDTH : integer := 8;
    ADDR_WIDTH : integer := 8
  );
  port
  (
    clk : in std_logic;
    addr : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_w : in std_logic_vector(DATA_WIDTH - 1 downto 0);
    data_r : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    nwr : in std_logic;
    ncs : in std_logic
  );
end sp_ram;

architecture beh of sp_ram is
  subtype ram_entry is std_logic_vector(DATA_WIDTH - 1 downto 0);
  type ram_type is array(0 to (2 ** ADDR_WIDTH) - 1) of ram_entry;
  signal ram : ram_type;
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if ncs = '0' then
        if nwr = '0' then
          ram(to_integer(unsigned(addr))) <= data_w;
        else
          data_r <= ram(to_integer(unsigned(addr)));
        end if;
      end if;
    end if;
  end process;
end beh;
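
Before synthesis, a short testbench helps to check the low-active control signals. The following is a minimal simulation-only sketch, assuming sp_ram is compiled into the work library. The testbench name sp_ram_tb, the 10 ns clock period, and the test address and data are arbitrary choices.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- sp_ram_tb is a hypothetical testbench, for simulation only.
entity sp_ram_tb is
end sp_ram_tb;

architecture sim of sp_ram_tb is
  signal clk : std_logic := '0';
  signal addr : std_logic_vector(7 downto 0) := (others => '0');
  signal data_w : std_logic_vector(7 downto 0) := (others => '0');
  signal data_r : std_logic_vector(7 downto 0);
  signal nwr : std_logic := '1';
  signal ncs : std_logic := '1';
  signal done : boolean := false;
begin
  dut : entity work.sp_ram
    generic map (DATA_WIDTH => 8, ADDR_WIDTH => 8)
    port map (clk => clk, addr => addr, data_w => data_w,
              data_r => data_r, nwr => nwr, ncs => ncs);

  -- free-running clock that stops once the test has finished
  clk <= not clk after 5 ns when not done else '0';

  process
  begin
    -- drive inputs on the falling edge so they are stable
    -- at the next rising edge
    wait until falling_edge(clk);
    addr <= x"03";
    data_w <= x"A5";
    nwr <= '0';             -- write access
    ncs <= '0';
    wait until falling_edge(clk);
    nwr <= '1';             -- read access at the same address
    wait until falling_edge(clk);
    assert data_r = x"A5"
      report "read back mismatch" severity error;
    ncs <= '1';             -- inactive
    done <= true;
    wait;
  end process;
end sim;
```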

Dual Port RAM with one Write Port

-------------------------------------------------------------------------
--
-- Filename: dp_ram_1w.vhd
-- =========
--
-- Short Description:
-- ==================
--   Dual port ram with only one writing port.
--
-- Description:
-- ============
--   Implementation of a dual port ram with only one writing port.
--   Based on the ram code described in the Xilinx XST Manual.
--
--   Both ports are using the same clock!
--   The memory uses low active control signals. The following
--   combinations are possible:
--
--   Port 1:
--    nwr1 | ncs1 | Description
--   ----------------------------
--      0  |   0  | Write access
--      1  |   0  | Read access
--      X  |   1  | Inactive
--
--   Port 2:
--    ncs2 | Description
--   ---------------------
--      0  | Read access
--      1  | Inactive
--
--   The address and data bus width can be specified using
--   the generic map. The default is eight bit data bus and
--   eight bit address bus.
--
-- Verification:
-- =============
--   Simulated using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--   Synthesized using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--
-------------------------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dp_ram_1w is
  generic
  (
    DATA_WIDTH : integer := 8;
    ADDR_WIDTH : integer := 8
  );
  port
  (
    clk : in std_logic;
    addr1 : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_w1 : in std_logic_vector(DATA_WIDTH - 1 downto 0);
    data_r1 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    nwr1 : in std_logic;
    ncs1 : in std_logic;
    addr2 : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_r2 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    ncs2 : in std_logic
  );
end dp_ram_1w;

architecture beh of dp_ram_1w is
  subtype ram_entry is std_logic_vector(DATA_WIDTH - 1 downto 0);
  type ram_type is array(0 to (2 ** ADDR_WIDTH) - 1) of ram_entry;
  signal ram : ram_type;
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if ncs1 = '0' then
        if nwr1 = '0' then
          ram(to_integer(unsigned(addr1))) <= data_w1;
        else
          data_r1 <= ram(to_integer(unsigned(addr1)));
        end if;
      end if;
    end if;
  end process;

  process(clk)
  begin
    if rising_edge(clk) then
      if ncs2 = '0' then
        data_r2 <= ram(to_integer(unsigned(addr2)));
      end if;
    end if;
  end process;
end beh;

Dual Port RAM with two Write Ports

-------------------------------------------------------------------------
--
-- Filename: dp_ram.vhd
-- =========
--
-- Short Description:
-- ==================
--   Dual port ram with two writing ports.
--
-- Description:
-- ============
--   Implementation of a dual port ram with two writing ports.
--   Based on the ram code described in the Xilinx XST Manual.
--
--   Both ports are using the same clock!
--   The memory uses low active control signals. The following
--   combinations are possible:
--
--    nwrx | ncsx | Description
--   ----------------------------
--      0  |   0  | Write access
--      1  |   0  | Read access
--      X  |   1  | Inactive
--
--   The address and data bus width can be specified using
--   the generic map. The default is eight bit data bus and
--   eight bit address bus.
--
-- Verification:
-- =============
--   Simulated using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--   Synthesized using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--
-------------------------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dp_ram is
  generic
  (
    DATA_WIDTH : integer := 8;
    ADDR_WIDTH : integer := 8
  );
  port
  (
    clk : in std_logic;
    addr1 : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_w1 : in std_logic_vector(DATA_WIDTH - 1 downto 0);
    data_r1 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    nwr1 : in std_logic;
    ncs1 : in std_logic;
    addr2 : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_w2 : in std_logic_vector(DATA_WIDTH - 1 downto 0);
    data_r2 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    nwr2 : in std_logic;
    ncs2 : in std_logic
  );
end dp_ram;

architecture beh of dp_ram is
  subtype ram_entry is std_logic_vector(DATA_WIDTH - 1 downto 0);
  type ram_type is array(0 to (2 ** ADDR_WIDTH) - 1) of ram_entry;
  shared variable ram : ram_type;
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if ncs1 = '0' then
        if nwr1 = '0' then
          ram(to_integer(unsigned(addr1))) := data_w1;
        else
          data_r1 <= ram(to_integer(unsigned(addr1)));
        end if;
      end if;
    end if;
  end process;

  process(clk)
  begin
    if rising_edge(clk) then
      if ncs2 = '0' then
        if nwr2 = '0' then
          ram(to_integer(unsigned(addr2))) := data_w2;
        else
          data_r2 <= ram(to_integer(unsigned(addr2)));
        end if;
      end if;
    end if;
  end process;
end beh;

Automation

As your processor's build process gets more complex, a batch-oriented build will come in handy, and make will be your friend. The following listing shows how to run Quartus II from within a Makefile:

#
#	Quartus build process
#		called by jopser, jopusb,...
#
qsyn:
	echo $(QBT)
	echo "building $(QBT)"
	-rm -r quartus/$(QBT)/db
	-rm quartus/$(QBT)/jop.sof
	-rm jbc/$(QBT).jbc
	-rm rbf/$(QBT).rbf
	quartus_map quartus/$(QBT)/jop
	quartus_fit quartus/$(QBT)/jop
	quartus_asm quartus/$(QBT)/jop
	quartus_tan quartus/$(QBT)/jop

The example is taken from the Makefile for JOP.

JOP as an Example

If you want to see a Java processor (JOP) running on the board, follow the instructions in Jopwiki Getting started. The processor is open source, and you can borrow some ideas from it.

Latex2wiki

For those who prefer LaTeX over wiki markup for documentation, this tool might be interesting: Latex2wiki.