Non-Uniform Memory Access (NUMA) Support
There aren't any ARM NUMA systems available at the moment, but one can use the Linux NUMA subsystem to group cores that share an L2 cache into the same node. For certain workloads this can reduce the number of accesses to main memory.
A patch set that adds NUMA support to ARM has been submitted to the linux-arm-kernel mailing list (LAKML). It takes advantage of the topology information in a TC2 to group the A15s in one node and the A7s in another. It can also be used to assign CPU cores arbitrarily to a configurable number of nodes, which helps when developing NUMA support in userspace.
The patch set can be found on the linux-arm-kernel archives at: http://lists.infradead.org/pipermail/linux-arm-kernel/2012-December/137113.html and http://lists.infradead.org/pipermail/linux-arm-kernel/2012-December/137114.html
Git branch for the Arndale board
I've set up an autorebasing branch here:
Enabling NUMA support
For the Arndale board, one can check out the git branch above (otherwise apply the patch sets in the linux-arm-kernel mailing list to 3.7-rc8).
To enable NUMA in the kernel, one must select:
Kernel Features --> Memory model --> Either Sparse or Discontiguous (NOT flat memory)
General Setup --> Prompt for development and/or incomplete code/drivers.
System Type --> NUMA Support (EXPERIMENTAL).
System Type --> NUMA Support (EXPERIMENTAL) --> Maximum NUMA nodes (as a power of 2) set appropriately.
To activate NUMA, the following must be appended to the kernel command line: numa=fake=2
This sets up 2 nodes and allocates memory and CPUs evenly between them. More options are explained in the menuconfig help for NUMA. Any NUMA-related information can be found in dmesg (prefixed by "NUMA: ").
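Once the board has booted, the resulting node layout can be checked from userspace. The following is a minimal sketch, assuming the patched kernel exposes the usual sysfs NUMA directories (/sys/devices/system/node/nodeN with a cpulist file), as mainline NUMA kernels do. The helper name list_numa_nodes and its optional sysfs-root argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: list the NUMA nodes the kernel exposes through sysfs.
# list_numa_nodes is a hypothetical helper; the optional first argument
# (a sysfs root, default /sys) is an assumption made for testability.
list_numa_nodes() {
    sysfs_root="${1:-/sys}"
    for node_dir in "$sysfs_root"/devices/system/node/node[0-9]*; do
        # Skip the unexpanded glob when no node directories exist.
        [ -d "$node_dir" ] || continue
        node_name=$(basename "$node_dir")
        if [ -r "$node_dir/cpulist" ]; then
            # cpulist holds the CPUs assigned to this node, e.g. "0" or "0-1".
            echo "$node_name: cpus $(cat "$node_dir/cpulist")"
        else
            echo "$node_name"
        fi
    done
}
```

With numa=fake=2 on a dual-core board, one would expect list_numa_nodes to print two lines, one per node. Running dmesg | grep "NUMA: " alongside it shows the kernel's own view of the setup.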
NUMA utilities (numactl & libnuma)
For userspace, one has to hook up syscall #379 to sys_migrate_pages (in numactl's syscall.c). The libnuma libraries can be found at: http://oss.sgi.com/projects/libnuma/
Automatic testing of NUMA support
A sensible test configuration would have the same number of NUMA nodes as CPU cores. The following configuration options assume a dual-core system (tested on an Arndale board).
The kernel must be compiled with the following options:
One must also boot the kernel with the following appended to the command line:
Test suite (numactl 2.0.8)
To set up a test suite do the following:
$ mkdir testdir && cd testdir
$ wget ftp://oss.sgi.com/www/projects/libnuma/download/numactl-2.0.8.tar.gz
$ tar xvf numactl-2.0.8.tar.gz
Define __NR_migrate_pages to be 379 in syscall.c in an #if defined(__arm__) section. The following patch does this:
--- numactl-2.0.8.orig/syscall.c 2012-10-11 21:52:26.000000000 +0100
+++ numactl-2.0.8/syscall.c 2013-02-15 13:13:10.725600572 +0000
@@ -109,6 +109,9 @@
 #define __NR_migrate_pages 272
+#elif defined (__arm__)
+#define __NR_migrate_pages 379
+
 #elif !defined(DEPS_RUN)
 #error "Add syscalls for your architecture or update kernel headers"
 #endif
Note that __NR_finit_module was introduced in kernel 3.8-rc1. As a result, the syscall number for migrate_pages appears to have been incremented to 380, so that value should be used above instead on 3.8-rc1 and later.
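The version dependence described above can be sketched as a small shell helper. It is only an illustration of the rule stated on this page (379 before 3.8, 380 from 3.8 onwards) and is meaningful only for kernels carrying the NUMA patch set. The function name migrate_pages_nr is hypothetical, and the 380 value is the tentative figure from the note above, not a verified one.

```shell
#!/bin/sh
# Sketch: pick the migrate_pages syscall number for the numactl patch,
# following this page's rule of thumb. migrate_pages_nr is a hypothetical
# helper, and the 379/380 split is an assumption taken from the note above.
migrate_pages_nr() {
    ver_major=${1%%.*}                 # "3.8.0-rc1" -> "3"
    ver_rest=${1#*.}                   # "3.8.0-rc1" -> "8.0-rc1"
    ver_minor=${ver_rest%%[!0-9]*}     # "8.0-rc1"   -> "8"
    if [ "$ver_major" -gt 3 ] ||
       { [ "$ver_major" -eq 3 ] && [ "$ver_minor" -ge 8 ]; }; then
        echo 380
    else
        echo 379
    fi
}
```

For example, migrate_pages_nr "$(uname -r)" on the target board gives the value to use in the #define for the running kernel.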
All the unit tests will pass except regress-io (aka regress3). This is because regress-io expects a PCI bus, which is unlikely to be present. To stop regress-io from being run, locate the following line in the Makefile:
test: all regress1 regress2 test_numademo regress3
and remove regress3 from the end.
Make sure to build the test suite with a make clean first to clean up some stale zero byte executables in the archive. Then run make && make test. The remaining unit tests should complete without any errors.
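The Makefile edit and build steps above can also be scripted. The following is a minimal sketch, assuming the test: line in the Makefile ends with regress3 exactly as quoted above. The helper name drop_regress3 is hypothetical.

```shell
#!/bin/sh
# Sketch: remove the trailing "regress3" from the "test:" rule of the
# numactl Makefile. drop_regress3 is a hypothetical helper; it assumes
# regress3 is the last word on the "test:" line, as quoted on this page.
drop_regress3() {
    makefile_path="${1:-Makefile}"
    # Write to a temporary file and move it back, rather than relying on
    # in-place editing support in sed.
    sed '/^test:/s/[[:space:]]*regress3[[:space:]]*$//' "$makefile_path" \
        > "$makefile_path.tmp" &&
        mv "$makefile_path.tmp" "$makefile_path"
}

# Typical use, from inside the numactl-2.0.8 directory:
#   drop_regress3 Makefile && make clean && make && make test
```

Only the test: line is touched, so the regress3 target itself stays in the Makefile and can still be run by hand on a system that does have a PCI bus.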
LEG/Engineering/Kernel/NUMA (last modified 2013-02-23 14:57:01)