
LatticeNEWS April 2009


LatticeMico32 System Support of "Inline Memories" for LatticeMico32-Based Platforms

System platforms built using LatticeMico32 System's MicoSystemBuilder tool can now include inline memories.

The LatticeMico32 CPU can now be connected to on-chip memory (EBRs) via its dedicated processor local bus rather than the Wishbone interface. Memory connected to the CPU in this higher-speed manner is termed an inline memory. As shown in Figure 1, the processor local bus provides a fast point-to-point link between the inline memories and the instruction and data ports of LatticeMico32.

There are two types of inline memories:

  • Instruction Inline Memory – This memory component is connected to the instruction port of LatticeMico32 and holds only the program code of a software application.
  • Data Inline Memory – This memory component is connected to the data port of LatticeMico32 and holds the read-only or read/write data of a software application.

 

Figure 1: LatticeMico32 with Inline Memories

 

In designs built with MicoSystemBuilder, all components (including LatticeMico32) are connected to the system Wishbone bus. Two aspects inherent to this architecture can degrade LatticeMico32 performance:

  • Every transaction between a bus master and a slave takes more than one clock cycle. Any access from LatticeMico32 to EBR via the Wishbone interface therefore takes more than one cycle to complete, stalling LatticeMico32 for those cycles.
  • While a transaction between a bus master and a slave is in progress, no transaction initiated by another bus master can commence. Any access from LatticeMico32 to EBR via the Wishbone interface can therefore remain pending for an indeterminate amount of time.

Inline memories overcome both of these drawbacks of keeping the application program and data needed by LatticeMico32 in EBRs on the Wishbone bus. The dedicated point-to-point link between LatticeMico32 and an inline memory guarantees that there can be no resource contention. That is, LatticeMico32 can initiate accesses to the inline memory even while transactions are in progress on the system Wishbone bus, which decouples the performance of LatticeMico32 from system-level bus activity. In addition, this point-to-point link is tightly coupled with LatticeMico32 and provides single-cycle access.

Key Features of Inline Memories

  • Dedicated memories for instructions and for data.
  • The size of each inline memory is limited only by the available EBR resources on the FPGA.
  • Guaranteed single-cycle access to the inline memories from LatticeMico32.
  • Deterministic LatticeMico32 performance for an application held in inline memories, since execution is independent of the utilization/loading of the system Wishbone bus.
  • Inline memories can coexist with other Wishbone-based memory components in a LatticeMico32 design.

Applicability of Inline Memories

The most obvious designs in which inline memories can be used are those in which the entire software already resides in on-chip memory on the system Wishbone bus. In such designs, the Wishbone-based on-chip memory can be replaced with inline memories for higher application performance without using any additional EBR resources. In other situations, though, the software footprint of a LatticeMico32-based design is large enough to require off-chip memory such as DRAM or SRAM. Such designs can also make use of inline memories, but the limited size of these memories means that only time-critical portions of the application can be located in them. One example is a Real-Time Operating System (RTOS) running on LatticeMico32. One of the key tenets of an RTOS is that certain events, such as interrupts, are time-bound and must be handled as fast as possible. In such implementations, the portions of software that handle these time-critical events may be mapped to inline memories to guarantee the response time.

Performance Advantage vs. Wishbone-based Memory Without LatticeMico32 Caches

This direct connection gives LatticeMico32 single-cycle read/write access to inline memory. Figure 2 shows a cycle-level comparison of a read access from LatticeMico32 serviced by inline memory and by on-chip memory (EBR) connected via the Wishbone interface. A read access initiated to inline memory completes in the following cycle, while a read access initiated to a Wishbone-based EBR takes four cycles. Writes initiated by LatticeMico32 behave similarly. Thus, deploying program code or data to inline memory can provide at least a 3x speedup over Wishbone-based memories.
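The cycle counts quoted for Figure 2 can be turned into a rough back-of-the-envelope estimate. The sketch below is illustrative only: the two constants are taken from the text above (single-cycle inline access, four cycles for a read of a Wishbone-based EBR), and the function names are hypothetical rather than part of any Lattice API. It counts memory-access cycles alone and ignores bus contention, which would only add to the Wishbone figure.

```c
#include <stdint.h>

/* Per-access cycle counts as quoted in the text for Figure 2. */
enum {
    INLINE_ACCESS_CYCLES   = 1, /* single-cycle access over the local bus */
    WISHBONE_ACCESS_CYCLES = 4  /* read of a Wishbone-based EBR, no contention */
};

/* Total cycles spent on n_accesses memory accesses at a fixed per-access cost. */
uint64_t access_cycles(uint64_t n_accesses, unsigned cycles_per_access)
{
    return n_accesses * cycles_per_access;
}

/* Cycles saved by serving n_accesses from inline memory instead of Wishbone. */
uint64_t cycles_saved(uint64_t n_accesses)
{
    return access_cycles(n_accesses, WISHBONE_ACCESS_CYCLES)
         - access_cycles(n_accesses, INLINE_ACCESS_CYCLES);
}
```

Under these assumptions, `cycles_saved(1000)` evaluates to 3000, that is, three stall cycles avoided per access.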

 

Figure 2: LatticeMico32 and Inline Memories

 

Performance Advantage vs. Wishbone-Based Memory With LatticeMico32 Caches

It is common to configure LatticeMico32 with instruction and data caches to reduce the performance impact of accessing Wishbone-based memories, because caches can, in theory, provide single-cycle access. In practice, though, there are common situations in which a single-cycle cache access is not possible and inline memory affords a performance advantage. These scenarios are:

  • Any cache access (read or write) that results in a miss initiates an access to a memory component on the Wishbone interface. The cache access therefore takes multiple cycles to complete.
  • The data cache in LatticeMico32 is write-through, i.e., any write to the data cache from LatticeMico32 immediately results in an access to a memory component on the Wishbone interface. As a rule, then, all data cache writes are multi-cycle accesses.
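The two scenarios above can be folded into a simple cycle-count model. The following is a minimal sketch, not a description of the actual LatticeMico32 cache: it assumes that a cache hit costs one cycle and that every read miss and every write-through write costs a fixed, caller-supplied number of Wishbone cycles (a real miss that refills a whole cache line would cost more). All function and parameter names are hypothetical.

```c
#include <stdint.h>

/*
 * Memory-access cycles for a cached LatticeMico32 backed by Wishbone memory.
 * The caller must ensure read_misses <= reads.
 */
uint64_t cached_cycles(uint64_t reads, uint64_t read_misses,
                       uint64_t writes, unsigned wishbone_cycles)
{
    uint64_t read_hits = reads - read_misses;

    return read_hits                        /* hits: one cycle each          */
         + read_misses * wishbone_cycles    /* misses go out to Wishbone     */
         + writes * wishbone_cycles;        /* write-through: always go out  */
}

/* Memory-access cycles when the same accesses target inline memory. */
uint64_t inline_cycles(uint64_t reads, uint64_t writes)
{
    return reads + writes;                  /* single cycle per access */
}
```

With 100 reads of which 10 miss, 20 writes, and a four-cycle Wishbone access, `cached_cycles(100, 10, 20, 4)` gives 210 cycles, against 120 for `inline_cycles(100, 20)`.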

Creating LatticeMico32 Designs with Inline Memories

Inline memories can be automatically instantiated in a design through the LatticeMico32 GUI in MicoSystemBuilder (MSB). Once instantiated, they appear like any other memory component within the design. The sizes of the inline memories are limited only by the number of EBR resources available on the FPGA. These memories can be located at any address within the 4 GB (i.e., 32-bit addressable) range; the designer can select the address or let MSB place them automatically.
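When base addresses are chosen by hand rather than placed by MSB, a quick sanity check of the memory map can catch mistakes early. The sketch below is an illustrative assumption, not an MSB feature: the function names are hypothetical, and it only checks that a region fits inside the 32-bit address space and that two regions do not overlap.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [base, base + size) is non-empty and lies within the 4 GB space. */
bool region_fits_32bit(uint32_t base, uint32_t size)
{
    /* Widen to 64 bits so base + size cannot wrap around. */
    return size != 0 && (uint64_t)base + size <= (UINT64_C(1) << 32);
}

/* True if the half-open ranges [base_a, end_a) and [base_b, end_b) intersect. */
bool regions_overlap(uint32_t base_a, uint32_t size_a,
                     uint32_t base_b, uint32_t size_b)
{
    uint64_t end_a = (uint64_t)base_a + size_a;
    uint64_t end_b = (uint64_t)base_b + size_b;

    return base_a < end_b && base_b < end_a;
}
```

Two regions that merely touch, such as one ending exactly where the next begins, are reported as non-overlapping.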

From a software developer’s perspective, inline memories are just another memory component in the design. The developer has the option to automatically map the entire application, or the default application sections (.boot, .text, .rodata, .data, and .bss), to the inline memories via LatticeMico32 C/C++ SPE. This process is identical to the one used to map software to Wishbone-based memories. On the other hand, one of the most obvious uses of inline memories, as mentioned earlier, is to map only the time-critical portions of software code and data to them. To achieve this, the software developer must identify these time-critical portions and isolate them into user-defined sections, which are then mapped to inline memories via user-defined linker scripts. The process of creating user-defined sections and linker scripts is generic and widely used in the open-source community. For the software developer’s convenience, it is extensively documented (with example code) in the LatticeMico32 Software Developer User Guide.
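As a rough illustration of the user-defined-section approach, the sketch below uses the GCC `section` attribute to tag one function and one variable. The section names `.inline_text` and `.inline_data`, the macro names, the linker region names in the comment, and the timer handler itself are all hypothetical; the names and linker script actually required should be taken from the LatticeMico32 Software Developer User Guide.

```c
#include <stdint.h>

/*
 * Hypothetical section names. A matching GNU ld script would route them to
 * the inline memories, along the lines of (region names are placeholders):
 *
 *   .inline_text : { *(.inline_text) } > instruction_inline_memory
 *   .inline_data : { *(.inline_data) } > data_inline_memory
 */
#if defined(__GNUC__) && defined(__ELF__)
#define INLINE_CODE __attribute__((section(".inline_text")))
#define INLINE_DATA __attribute__((section(".inline_data")))
#else
#define INLINE_CODE /* no-op on toolchains without ELF section attributes */
#define INLINE_DATA
#endif

/* Time-critical state, placed in the user-defined data section. */
INLINE_DATA static volatile uint32_t tick_count = 0;

/* Time-critical handler, placed in the user-defined code section. */
INLINE_CODE void timer_isr(void)
{
    tick_count++;
}

/* Ordinary code stays in the default .text section. */
uint32_t ticks_seen(void)
{
    return tick_count;
}
```

Only the tagged function and variable move; everything else keeps its default section, so the bulk of the application can remain in Wishbone-based or off-chip memory.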