AArch64 Device Memory Attributes

Om Narasimhan
Oct 29, 2019


AArch64 system programmers who work with devices and Device memory often encounter the Device-specific memory attributes Gather, Reorder, and Early Write Acknowledgement. These are usually abbreviated as suffixes on the memory type, ranging from GRE (all three permitted) to nGnRnE (none permitted), with the combinations nGnRE and nGRE in between. This article explains these attributes and their relevance to Device memory, and gives some examples of where each is useful. It also covers the constraints on using Device memory for instruction fetches.

Gather(G)/non-Gather(nG)

If the Gather attribute is set for a block of Device memory, it means that:

  • Multiple memory accesses of the same type (multiple reads or multiple writes) to the same memory location can be combined into a single transaction.
  • Multiple memory accesses of the same type to different memory locations within the block can be combined into a single memory transaction on an interconnect.

Either of these behaviors is permitted only if the ordering and coherency rules of the memory location can be followed. In case of conflicts, ordering and coherency rules take precedence.

The most observable effect of these behaviors is that the number of memory accesses arriving at the endpoint (memory) can be lower than the number of accesses generated by the running program. Conversely, for memory regions with the non-Gather attribute, the number and size of the memory accesses arriving at the endpoint are exactly the same as those generated by the running program.
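To make the effect on access counts concrete, here is a toy model in C. It is only a sketch and does not describe how any real interconnect works: the names `bus_model`, `bus_write`, `flush_gather`, and the 16-byte merge window are all assumptions made for illustration. It counts how many transactions reach a pretend endpoint when contiguous writes may or may not be merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MERGE_WINDOW 16u /* assumed maximum size of a merged transaction */

/* Hypothetical toy model: counts transactions seen by an endpoint. */
struct bus_model {
    bool     gather;        /* true models G, false models nG */
    bool     pending;       /* a merged write is waiting to be issued */
    uint64_t pend_addr;     /* start address of the pending write */
    size_t   pend_len;      /* length in bytes of the pending write */
    unsigned endpoint_txns; /* transactions that reached the endpoint */
};

/* Issue whatever is pending as one endpoint transaction. */
static void flush_gather(struct bus_model *b)
{
    if (b->pending) {
        b->endpoint_txns++;
        b->pending = false;
    }
}

/* Model a write of len bytes at addr generated by the program. */
static void bus_write(struct bus_model *b, uint64_t addr, size_t len)
{
    assert(len > 0 && len <= MERGE_WINDOW);
    if (!b->gather) {
        b->endpoint_txns++; /* nG: one access in, one access out */
        return;
    }
    /* G: merge if contiguous with the pending write and it still fits. */
    if (b->pending && addr == b->pend_addr + b->pend_len &&
        b->pend_len + len <= MERGE_WINDOW) {
        b->pend_len += len;
        return;
    }
    flush_gather(b);
    b->pending = true;
    b->pend_addr = addr;
    b->pend_len = len;
}

/* Write four consecutive 4-byte words and return the endpoint count. */
static unsigned run_four_word_writes(bool gather)
{
    struct bus_model b = { .gather = gather };
    for (uint64_t i = 0; i < 4; i++)
        bus_write(&b, 0x1000 + 4 * i, 4);
    flush_gather(&b); /* in this model, a barrier would force the same flush */
    return b.endpoint_txns;
}
```

In this model, `run_four_word_writes(false)` reports four endpoint transactions, while `run_four_word_writes(true)` reports one, because the four contiguous words fit inside the assumed merge window.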

Some important points:

  • Gathering across memory barriers (DMB/DSB and variants) is not permitted. (There are some qualifications to this rule, which are set aside for a later discussion.)
  • Gathering across memory fences (load-acquire, store-release) is not permitted.
  • If the target memory is non-Gather, a read from that memory cannot be satisfied from a cache, or from a buffer that is not part of the target endpoint for that address in the memory system.
  • In other words, the read MUST reach the target endpoint, and the result must be returned by the target endpoint.
  • The ARM architecture defines only the PE-observable behavior of the Gather attribute. Implementations are free to gather accesses in ways that the PE cannot observe.
  • That is, the semantics of the Gather attribute do not define how the target endpoint implements the memory access.

Device memory that maps registers internal to a device and is exposed through an MMIO interface is generally given the non-Gather attribute. Some examples of such registers are:

  • Control registers: Different bits may control different aspects or behaviors of the device. Combining writes to such registers can sometimes cause unexpected behavior.
  • Read-only registers with side effects: Status registers are often implemented as clear-on-read (CoR) registers. Each read of such a register must be performed individually and must not be combined with other reads.
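The clear-on-read case can be sketched with a small software model. This is purely illustrative: `cor_status`, `cor_raise`, `cor_read`, `poll_twice`, and the event bits `EV_A` and `EV_B` are hypothetical names, and the "gathered" branch simply assumes that the second program read is satisfied from the first read's result instead of reaching the register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EV_A 0x1u /* hypothetical event bit */
#define EV_B 0x2u /* hypothetical event bit */

/* Toy model of a clear-on-read status register. */
struct cor_status {
    uint32_t bits;       /* events latched since the last read */
    unsigned reads_seen; /* reads that actually reached the register */
};

/* The device latches an event; an event must set at least one bit. */
static void cor_raise(struct cor_status *s, uint32_t ev)
{
    assert(ev != 0);
    s->bits |= ev;
}

/* A read returns the latched events and clears them as a side effect. */
static uint32_t cor_read(struct cor_status *s)
{
    uint32_t v = s->bits;
    s->bits = 0;
    s->reads_seen++;
    return v;
}

/* The program polls twice; event B arrives between the two polls. */
static void poll_twice(bool reads_gathered, uint32_t seen[2])
{
    struct cor_status s = {0};
    cor_raise(&s, EV_A);
    seen[0] = cor_read(&s);
    cor_raise(&s, EV_B);
    /* Gathered: the second program read reuses the first result. */
    seen[1] = reads_gathered ? seen[0] : cor_read(&s);
}
```

With individual reads, the program observes `EV_A` and then `EV_B`. With the two reads merged, it observes `EV_A` twice and never sees `EV_B` in these two polls, which is the kind of surprise the non-Gather attribute rules out.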

Reorder(R)/non-Reorder(nR)

For all Device memory with the non-Reordering attribute, the order of memory accesses arriving at a single peripheral of size SIZE (as defined by the peripheral) must be the same as the order generated by the program running on the PE. That is, the accesses must arrive in program order.

Note: SIZE is generally the same size that applies to the ordering guarantee provided by a DMB instruction.

An example of memory where the non-Reorder attribute is recommended is a PCIe device’s internal register space that is mapped into host MMIO space through one or more PCIe BARs.
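A toy model shows why register space usually wants program order. Everything here is hypothetical: the device has an address register and a doorbell register (`REG_ADDR`, `REG_DOORBELL`), and it latches the address at the moment the doorbell is written. The `reordered` flag stands in for an interconnect that is allowed to swap the two writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { REG_ADDR, REG_DOORBELL }; /* hypothetical register numbers */

/* Toy device: a doorbell write launches work on the current address. */
struct dev_model {
    uint32_t addr_reg;      /* last value written to REG_ADDR */
    uint32_t launched_addr; /* address captured when the doorbell rang */
};

/* A write as it arrives at the device. */
static void dev_write(struct dev_model *d, int reg, uint32_t val)
{
    if (reg == REG_ADDR)
        d->addr_reg = val;
    else if (val != 0)
        d->launched_addr = d->addr_reg;
}

/*
 * The program writes the buffer address, then rings the doorbell.
 * Returns the address the device actually launched with.
 */
static uint32_t program_dma(bool reordered, uint32_t buf)
{
    struct dev_model d = {0};
    assert(buf != 0); /* 0 means "nothing launched" in this model */
    if (!reordered) {
        dev_write(&d, REG_ADDR, buf);
        dev_write(&d, REG_DOORBELL, 1);
    } else {
        /* Arrival order swapped relative to program order. */
        dev_write(&d, REG_DOORBELL, 1);
        dev_write(&d, REG_ADDR, buf);
    }
    return d.launched_addr;
}
```

In program order the device launches with the buffer address that was written. With the arrival order swapped it launches with the stale value 0, even though the program itself did nothing wrong.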

An example of memory where the Reorder attribute MAY be recommended is a PCIe device’s internal RAM that is mapped into host MMIO space through one or more PCIe BARs. In such a case, the implementor needs to ensure that reordered transactions won’t cause unexpected behavior.

An implementation is permitted to perform an access that has the Reordering attribute in a manner consistent with the requirements of the non-Reordering attribute. The converse is not true.

For an implementation that supports the non-Reordering attribute, there are no additional ordering requirements, beyond those that apply to Normal memory, for accesses between:

  • Areas with the non-Reordering attribute and areas with the Reordering attribute.
  • Areas with the non-Reordering attribute and Normal memory.
  • Areas with the non-Reordering attribute and other peripherals (with the access size defined by the other peripheral).

Early Write Acknowledgement (E/nE)

For endpoint memory where the PE requires that the acknowledgement of a write comes from the endpoint itself, the non-Early Write Acknowledgement attribute guarantees that:

  • Only the endpoint of the write operation can return a Write Acknowledgement.
  • No intermediate agents can return a Write Acknowledgement.

This means that a DSB instruction executed by the PE that performed the write completes only after the write has reached the endpoint and been acknowledged by it.

Note

  • This attribute defines where the acknowledgement must come from. It does not control the order in which such acknowledgements are returned.
  • PCI/PCIe configuration space is a good example of where the nE attribute fits. Memory areas within PCIe devices that expect posted memory writes are good examples of where the E attribute fits.
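In software, the three attributes are selected together through an 8-bit attribute field in MAIR_ELx. ARMv8-A defines four Device types, encoded as 0b0000dd00: Device-nGnRnE (0x00), Device-nGnRE (0x04), Device-nGRE (0x08), and Device-GRE (0x0C). The sketch below maps the three flags to those encodings. The helper names `mair_device_attr` and `mair_set_slot` are made up for this example, and writing the resulting value to the system register is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the MAIR_ELx attribute byte for a Device memory type,
 * or -1 if the combination is not one of the four defined types.
 */
static int mair_device_attr(bool gather, bool reorder, bool early_ack)
{
    if (!gather && !reorder && !early_ack) return 0x00; /* Device-nGnRnE */
    if (!gather && !reorder &&  early_ack) return 0x04; /* Device-nGnRE  */
    if (!gather &&  reorder &&  early_ack) return 0x08; /* Device-nGRE   */
    if ( gather &&  reorder &&  early_ack) return 0x0C; /* Device-GRE    */
    return -1;
}

/* Place an 8-bit attribute into slot idx (0..7) of a MAIR_ELx value. */
static uint64_t mair_set_slot(uint64_t mair, unsigned idx, uint8_t attr)
{
    assert(idx < 8);
    mair &= ~(UINT64_C(0xFF) << (8 * idx));      /* clear the old slot */
    return mair | ((uint64_t)attr << (8 * idx)); /* install the new one */
}
```

A translation table entry then refers to the slot by its index, so a driver that wants nGnRE for a register BAR would point the mapping at whichever slot holds 0x04.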

Device memory and instruction fetches

Software should not make the CPU fetch instructions from Device memory. Such fetches may result in undefined behavior. Even so, there are some circumstances under which such fetches may occur:

  • If an area of Device memory is not marked as execute-never for the current exception level, and address translation is enabled, then the implementation may perform speculative instruction fetches from that area.
  • If an area of Device memory is not marked as execute-never for the current exception level, and a branch instruction causes the PC to point into that area, then the implementation can do one of the following:
      • Generate a Permission fault.
      • Treat the area as Normal, Non-cacheable memory.
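The usual way to avoid both cases is to mark every Device mapping execute-never. Below is a minimal sketch of building a stage 1 level 3 page descriptor for the EL1&0 translation regime with both UXN (bit 54) and PXN (bit 53) set. It assumes a 4 KiB granule and a 48-bit output address, leaves the access permission and shareability fields at zero for brevity, and uses hypothetical helper names (`make_device_pte`, `pte_may_execute`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_VALID       (UINT64_C(1) << 0)
#define PTE_PAGE        (UINT64_C(1) << 1)    /* level 3 page descriptor */
#define PTE_ATTRINDX(i) ((uint64_t)(i) << 2)  /* MAIR_ELx slot, bits [4:2] */
#define PTE_AF          (UINT64_C(1) << 10)   /* Access flag */
#define PTE_PXN         (UINT64_C(1) << 53)   /* privileged execute-never */
#define PTE_UXN         (UINT64_C(1) << 54)   /* unprivileged execute-never */

/* Build a page descriptor for a Device mapping that can never be executed. */
static uint64_t make_device_pte(uint64_t phys, unsigned mair_idx)
{
    assert(mair_idx < 8);
    assert((phys & UINT64_C(0xFFF)) == 0); /* 4 KiB granule assumed */
    assert((phys >> 48) == 0);             /* 48-bit output address assumed */
    return phys | PTE_VALID | PTE_PAGE | PTE_ATTRINDX(mair_idx) |
           PTE_AF | PTE_PXN | PTE_UXN;
}

/* True if either execute-never bit is clear, so some level could execute. */
static bool pte_may_execute(uint64_t pte)
{
    return (pte & (PTE_PXN | PTE_UXN)) != (PTE_PXN | PTE_UXN);
}
```

A check like `pte_may_execute` can be run over the mappings a driver creates, as a cheap guard against accidentally leaving a Device region executable.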

Please feel free to leave your comments and questions.
