BoF agenda and discussion

Goal: use modern CPU features in Linaro kernels

  • NEON
    • debug support (e.g. making sure the registers are saved in core files)
    • string routines
      • Need to measure how much is gained by switching to these
      • Cortex-A8 versus A9 makes a difference, so measure both
      • UDP traffic, netperf, TCP checksums etc. would be good workloads to measure
      • Could be used for some crypto implementations as well
  • We don't use Altivec
  • Build the kernel as Thumb 2
    • Need to measure whether highmem has an impact when it's enabled but not used
      • Benchmark needs to use a lot of paging
    • SD driver had performance degraded with HIGHMEM turned on
    • Need to test on all boards because it does break some boards right now
    • Need to ensure there is no performance degradation in kernel drivers as a result of highmem
  • SMP on UP
    • Russell working on a set of patches providing that
    • Architecture independent
    • Needs to be ported to Thumb 2
  • SMP
    • Hotplug is being worked on in the power management working group
    • SMP and PREEMPT often expose funny bugs, but PREEMPT is often turned on
    • Private peripheral interrupts will be an issue with SMP
      • Could remove the current hack with the local timer and do it properly
    • Will posted some patches to fix kgdb SMP support
      • Mostly generic code, using the wrong atomic operations
  • Thumb 2
    • Kprobes has to be reimplemented entirely for Thumb 2
    • Ftrace without dynamic ftrace would be crazy, so we need to make sure both ftrace and dynamic ftrace work on Thumb 2
    • oprofile or perf events do userspace backtraces
      • Need wider discussion with the toolchain working group
      • Need to fix the userspace side to use libunwind (Toolchain WG)
      • Need to fix the kernel side to do proper backtraces in any case
      • Investigate how userspace can pass unwind tables to the kernel to do this properly
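For the NEON string-routine and checksum measurements discussed above, a plain scalar baseline is useful to compare any optimized routine against, both for correctness and for timing. The following is a minimal sketch of an RFC 1071 style Internet checksum in portable C. It is not the kernel's implementation; the name csum_scalar is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scalar reference checksum (hypothetical helper, not kernel code):
 * 16-bit ones' complement sum over the buffer, RFC 1071 style.
 * Bytes are combined as big-endian 16-bit words.
 */
uint16_t csum_scalar(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    assert(buf != NULL || len == 0);

    while (len > 1) {
        sum += ((uint32_t)buf[0] << 8) | buf[1];
        /* Fold the carry back in right away so long buffers cannot overflow. */
        sum = (sum & 0xffff) + (sum >> 16);
        buf += 2;
        len -= 2;
    }

    /* An odd trailing byte is treated as the high byte of a final word. */
    if (len == 1)
        sum += (uint32_t)buf[0] << 8;

    /* Final fold of any remaining carries into 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

An optimized routine can be checked against this one on random buffers of odd and even lengths before timing the two on both Cortex-A8 and Cortex-A9.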

Multiple libc implementations: check memcpy etc. in each

  • glibc

Some hardware (Marvell perhaps) may be too slow even to saturate the network.

  • Marvell and others tend to use a basic design they reuse for different products and don't care that much about performance.
  • U-Boot built as Thumb 2 as a nice to have
  • Cache maintenance operations on ARMv7 are expensive
    • Cache geometry is probed every time some cache maintenance is required
    • Need to fix this for better performance; should improve DMA and I/O substantially
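The cache geometry point above amounts to probing once and reusing the result. A minimal user-space sketch of that idea follows. Everything in it is an assumption for illustration: probe_cache_geometry() is a hypothetical stand-in for reading the hardware cache ID registers, the geometry values are made up, and real kernel code would need to do this per CPU at boot rather than lazily.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cache_geometry {
    uint32_t line_size;   /* bytes per cache line */
    uint32_t num_sets;
    uint32_t num_ways;
};

/* Counts how often the expensive probe runs; only here for illustration. */
static unsigned int probe_calls;

/* Hypothetical stand-in for the hardware probe; the values are invented. */
static struct cache_geometry probe_cache_geometry(void)
{
    struct cache_geometry g = { 64, 128, 4 };

    probe_calls++;
    return g;
}

/* Probe on first use only, then hand back the stored copy. */
const struct cache_geometry *get_cache_geometry(void)
{
    static struct cache_geometry cached;
    static bool valid;

    if (!valid) {
        cached = probe_cache_geometry();
        valid = true;
    }
    return &cached;
}

/* Example consumer: number of lines to walk for a full clean by set/way. */
uint32_t cache_total_lines(void)
{
    const struct cache_geometry *g = get_cache_geometry();

    assert(g->line_size != 0);
    return g->num_sets * g->num_ways;
}
```

With this shape, repeated maintenance operations pay for the probe once. The lazy initialization shown here is not thread-safe, which is one reason an in-kernel version would more likely fill the structure in during early boot.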


  • Confirm with Toolchain WG what's missing and implement support for NEON registers in core files
  • Loïc to check if Toolchain WG can provide suitable memcpy routines and test cases for the kernel -- it's different in that the context might be more limited in the kernel
  • Integrate optimized memcpy routines and measure performance
  • Help finish and then integrate SMP on UP patchset, test on all supported platforms
  • Fix SMP on UP for Thumb 2 and test
  • Find a good benchmark for highmem and then benchmark a kernel with and without highmem, when not using high memory
  • Enable highmem on all architectures, test that they build and boot
  • Benchmark drivers like SD, USB etc. with highmem turned on
  • Merge hotplug support from power mgmt wg
  • Turn PREEMPT on in test farm
  • Investigate PPI (Private Peripheral Interrupt) on SMP -- might not work with some devices, XXX which devices?
  • Build and test all kernels for Thumb 2
  • Implement Kprobes for Thumb 2
  • Fix dynamic ftrace for Thumb 2
  • (experimental) Build u-boot in Thumb 2 mode
  • Investigate and fix kgdb on SMP
  • Cache the probed cache geometry at runtime instead of re-probing it for every cache maintenance operation


