Team

  • Technical Lead: Michael Hope
  • Primary TSC Sponsor:
  • Secondary TSC Sponsor:

Priorities

  • Improve GCC Performance
  • Widen the uptake of Linaro GCC through documentation, recipes, and binary releases
  • Standardize benchmarking

Ongoing

  • Continue developing and releasing Linaro GCC and Linaro QEMU
  • Downstream engagements: Ubuntu, Yocto, etc.
  • Benchmarking via EEMBC and SPEC
    • Add AutoBench and Telecom

Components

Component | Description | Release frequency | Upstream
gcc-linaro 4.5 | Linaro GCC 4.5 development and maintenance branch | Monthly | FSF
gcc-linaro 4.6 | Linaro GCC 4.6 development branch | Monthly | FSF
gdb-linaro | Linaro GDB 7.2 maintenance branch | Monthly | FSF
qemu-linaro | Linaro QEMU development branch | Monthly | QEMU
cortex-strings | Optimised string routines for the Cortex-A series | On demand | Linaro

GDB on ARM will soon be on par with GDB for x86. No further development work is planned for the 11.11 cycle.

Bionic support, IFUNC, libffi, ltrace, perf, and valgrind are in a good state and need no further work.

OpenOCD, LLVM, binutils, GOLD, and hard float do not appear on any priority lists.

Also missing are any new end-developer tools such as profiling or trace.

Assumptions

  • The canonical platform is a dual-core Cortex-A9 with NEON. All optimizations are aimed at this platform.

TRs

Ref | Name | Priority | Description | Estimate (months)
T1 | Performance | | |
T1.1 | Thumb-2 performance | High | Continue improving Thumb-2 performance | 18
T1.2 | NEON performance | High | Continue improving NEON performance | 12
T1.3 | Generic Cortex-A tuning | Medium | Add infrastructure for a good blend of performance across all Cortex-A micro-architectures | 2
T1.4 | 64 bit sync primitives | Medium | Round out sync primitive support by adding 64 bit support to GCC, libgcc, GLIBC, and the kernel | 1
T1.5 | String routines everywhere | High | Make the string routines available in distributions and common libcs such as GLIBC, Bionic, and Newlib | 2
T1.6 | Better intrinsics | Low | Improve the NEON intrinsics so that they are usable as a replacement for in-line assembly and on par with RVDS | 1
T2 | Benchmarking | | |
T2.1 | Add SPEC and EEMBC | Medium | Benchmark using SPEC 200x and more from the EEMBC suite | 1
T2.2 | Publish benchmarks | Medium | Provide regular, public benchmark results | 2
T3 | Consumption | | |
T3.1 | Linaro GCC | High | Regularly release and provide support for Linaro GCC 4.5 and 4.6 | 3
T3.2 | Linaro GDB | Medium | Regularly release and provide support for Linaro GDB 7.2 | 1
T3.3 | Linaro QEMU | Medium | Regularly release and provide support for Linaro QEMU | 2
T3.4 | In distributions | Medium | Make available in and have first-class support for Android, Yocto, and Ubuntu | 2
T3.5 | Binary builds | Medium | Make available as a binary build that runs on any recent Linux, with basic end-user support | 1
T3.6 | Deeper validation | Low | Use distributions such as Yocto and Android as a testsuite | 2
T4 | Emulation | | |
T4.1 | Maintain existing models | High | Maintain QEMU Versatile Express and BeagleBoard models so that all changes are upstream and they work with the latest evaluation builds | 1
T4.2 | Initial A15 support | Medium | Add support for the published Cortex-A15 features to GCC and QEMU | 1
T4.3 | A15 planning | Medium | Plan Cortex-A15 system emulation support for QEMU | 1
T4.4 | Emulation speed | Low | Improve the QEMU emulation speed through optimisation and multi-core support | 3
T4.5 | Device Tree support | Low | Add Device Tree support to help QEMU by informing the kernel what is and what isn't modeled | 1
T4.6 | QEMU improvements | Low | Make QEMU more versatile through deeper correctness, record/replay support, selectively enabling features, and save/restore support | 1
T4.7 | Low-cost A9 model | Wishlist | Add a low-cost A9 model to QEMU to supplement or replace vexpress | 2
T5 | Tools and future | | |
T5.1 | Cross debug | High | Check and support using GDB as a cross debugger targeting ARM. Includes working with multiarch | 1
T5.2 | STM Support | Medium | Develop a standard kernel STM driver and integrate it into the kernel and first-tier tools | 6
T5.3 | GDB server completeness | Medium | Extend GDB server to be on par with native GDB by adding hardware watchpoints and any other missing features | 2
T5.4 | Fast tracepoints | Low | Add tracepoint and then fast tracepoint support to GDB server | 2
T6 | Future | | |
T6.1 | Good backtracing | Medium | Show the way forward on reliable backtracing by doing a libunwind based proof-of-concept | 2
T6.2 | GCC backend re-work | Wishlist | Re-engineer and refactor parts of the backend to be more productive in the future and decrease the maintenance cost | 2
T6.3 | QEMU blue sky | Wishlist | Investigate features such as better diagnostics, tracepoints, usable timing numbers, and reversible debugging for future cycles | 2

[Figure: allocation.png]

Resourcing

+1 Engineer Tester for doing development test, integration, builds, and benchmarks.

Performance

The stretch goal is a 5 % improvement across a set of benchmarks including SPEC 200x and parts of EEMBC. Individual benchmarks may show a significant (> 20 %) improvement and individual hotspots may show large (> 100 %) improvements.
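
This page does not say how the aggregate figure is computed. One common convention, used by SPEC for its overall scores, is the geometric mean of per-benchmark ratios. The sketch below assumes that convention; the benchmark names and run times are placeholders invented for illustration, not measured results.

```python
import math


def geomean_speedup(baseline_times, candidate_times):
    """Geometric mean of per-benchmark speedups (baseline time / candidate time)."""
    ratios = [baseline_times[name] / candidate_times[name] for name in baseline_times]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Placeholder run times in seconds (hypothetical, not real measurements).
baseline = {"bench_a": 100.0, "bench_b": 200.0, "bench_c": 50.0}
candidate = {"bench_a": 80.0, "bench_b": 200.0, "bench_c": 50.0}

speedup = geomean_speedup(baseline, candidate)
improvement_pct = (speedup - 1.0) * 100.0
print(f"aggregate improvement: {improvement_pct:.1f} %")
```

In this made-up case a single benchmark running 25 % faster moves the three-benchmark aggregate by a little under 8 %, which shows why large individual gains can coexist with a modest overall target.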

The general approach is to benchmark a range of compilers, identify the biggest areas of improvement, improve them, and repeat. Specific areas we may investigate are:

  • Tuning the scheduler for register pressure
  • NEON strided load/stores
  • NEON if-conversions with loads
  • Tweak Thumb-2 code generation to improve the mix of 16 and 32 bit instructions
  • Improve conditional code generation for Thumb-2
  • Unaligned access
  • Unaligned block copy
  • NEON instruction coverage
  • Better use of the IRA register allocator
  • NEON over widening
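
The "benchmark a range of compilers, identify the biggest areas of improvement" step described above could be sketched roughly as follows. This is only an illustration under assumed inputs: the compiler names, benchmark names, and timings are hypothetical, and `rank_gaps` is not an existing tool.

```python
def rank_gaps(results, ours):
    """Return (benchmark, gap) pairs, largest gap first.

    results maps compiler name -> {benchmark name -> run time in seconds}.
    The gap is our run time divided by the best run time among all
    compilers, so 1.0 means we are already the fastest on that benchmark.
    """
    gaps = {}
    for bench, our_time in results[ours].items():
        best = min(times[bench] for times in results.values() if bench in times)
        gaps[bench] = our_time / best
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical timings in seconds for two compilers.
results = {
    "compiler_x": {"bench_a": 10.0, "bench_b": 20.0, "bench_c": 30.0},
    "compiler_y": {"bench_a": 8.0, "bench_b": 25.0, "bench_c": 30.0},
}

ranking = rank_gaps(results, "compiler_x")
for bench, gap in ranking:
    print(f"{bench}: {gap:.2f}x the best time")
```

Benchmarks at the top of the ranking are where another compiler is furthest ahead, and so are candidates for the next round of investigation.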

Areas that have been done are:

  • SMS
  • Auto-detection of vector size
  • Multiple vector size support
  • Profiler driven feedback

Areas that aren't worthwhile are:

  • ARMv6 SIMD instructions
  • ARMv5 saturated math instructions

Notes

New development will be done in Linaro GCC 4.6 and then backported to 4.5 if practical. Strictly speaking, 4.5 should be in maintenance mode, but enough people have picked it up that we should continue to improve it for at least the next cycle.

  • Benchmarking:
    • We are actually using SPEC2000, because SPEC2006 requires more than 1 GB of RAM
    • For A5 benchmarking, ARM to sponsor hardware, otherwise drop it
    • At the moment we're not rebuilding the platform to benchmark, but doing so would account for gcc-linaro improvements across the board, specifically on SPEC2006
  • QEMU:
    • Would like the A9 baseline to be a low-cost board instead of Versatile
    • Should be able to run a tier-1 evaluation build

Cycles/1111/TechnicalTopics/Toolchain (last modified 2011-07-12 19:18:51)