Software Codec Optimization

The Linaro TSC is interested in optimizing codecs, audio codecs in particular.

AAC, VP8, JPEG, JPEG-turbo. Quite a few NEON implementations exist, but there is not a lot of conversation between their authors and the implementations are not always very good. Linaro does not want to spend resources on work that others are already doing. How can work be synchronized across different vendors? It would be good to have a collection of codec implementations that can be reused.

Contributor agreement - registration needed.

Runtime detection of NEON should not be used; instead, applications should be built against the NEON-optimized code.

Questions

  • Are there applications which care a lot about optimization?
  • Using NEON in the kernel? Currently you can't.
  • Crypto/OpenSSL optimization? OpenSSL is also of interest for Android certification.
  • Are there features of NEON that Orc is not using?
  • NEON: there is more than one NEON implementation/optimization of the same code, e.g. at least two libjpeg-turbo variants (the MeeGo version, and a forked Android version with optimizations from Qualcomm). libjpeg: people rarely use it as a shared library. A centralized version of libjpeg-turbo is needed. How to do that?
  • Does Linaro track specific areas of NEON optimization? Setting up a mailing list for discussion is possible, but is there member interest in that sort of work?
  • Can we provide a collection of good optimizations in order to discourage proprietary ones? Organizations could then run benchmarks internally to validate the parts they reuse.
  • Uptake one level up: get the optimizations into Linux distributions?
  • Doing a runtime check for the existence of NEON? How to handle that so that the properly optimized library is loaded?
  • Power management questions: how to power off NEON components?
  • Hard FP: only gives a benefit where floating-point values are passed through the API; irrelevant for NEON optimization.
  • NEON: do we need to look at anything beyond what we have checked so far (AAC, VP8, JPEG)?
    • MP3 - do we want to make the effort to optimize it (not low-hanging fruit, and no big improvement expected)?
    • Skype SILK - is there a NEON version for that?
    • Low latency voice codecs (which ones are interesting to look at)?
  • Does multicore make software decoding more sensible? Software is certainly more flexible. E.g. video decoding does not parallelize in the way you'd want on a GPU.
  • People who could work on the optimization?
    • ARM: SW optimizing team exists but may need to advertise the work to be done and define who could do work
    • People from the community (based on IRC interaction):
    • Potentially university students? Google Summer of Code projects? Linaro should set this up so that students do not get stuck on upstream difficulties. We probably need to work with upstream projects which would accept such optimizations.
  • How to set up the work
    • Project index - similar to a Google Summer of Code project page - Linaro can set it up on the wiki
    • What would draw contributors to add to this wiki page?
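
As a rough illustration of the runtime-check question raised in the list above, the following Python sketch reads the "Features" line that 32-bit ARM Linux kernels print in /proc/cpuinfo and picks a library name accordingly. The library names (libcodec_neon.so, libcodec_generic.so) and the function names are hypothetical, chosen only for illustration; this is one possible approach under those assumptions, not how any particular project handles it.

```python
def cpu_features(cpuinfo_text):
    """Return the set of flags on the first 'Features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def pick_library(cpuinfo_text,
                 neon_lib="libcodec_neon.so",        # hypothetical name
                 generic_lib="libcodec_generic.so"):  # hypothetical name
    """Choose the NEON build only when the kernel reports the 'neon' flag."""
    if "neon" in cpu_features(cpuinfo_text):
        return neon_lib
    return generic_lib


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read cpuinfo, returning an empty string where the file is unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    # On a machine without a 'neon' flag this falls back to the generic build.
    print(pick_library(read_cpuinfo()))
```

The chosen name could then be handed to a loader such as ctypes.CDLL. Keeping the parsing separate from the file read makes the selection logic testable on non-ARM build machines.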

OpenCV (computer vision library): has algorithms that could be optimized by adding NEON routines. Could Freescale's optimization of OpenCV be investigated?

Android's use of compositions + colour conversions.

Optimization Forum Proposal

* Wikipage to hold existing optimized software implementations (codecs, encryption, checksumming)

  • Also capture requests for software-optimizing
  • How to message it? NEON-optimizing usefully constrains optimization type; maybe ARM optimizing; take into account that OpenCL is another optimization avenue
  • Aggregate per domain? For instance, Codecs/formats available, and then noting what is optimized
    • Potentially grade optimization level of codecs

* Mailing list to hold conversation around site-specific optimization
* Potentially hold a day or set of sessions in Cambridge August sprint
* Look at libjpeg-turbo and the optimizations that can be done there

  • Mans, Tom, Mandeep to work with Darrell on -turbo

ACTIONS:

* Study how Android does hardware capability selection to dynamically load NEON vs non-NEON components?
* Look at OpenSSL on ARM; evaluate NEON optimization potential
* Look at libjpeg Android loading: how is support for hardware codecs handled?
* Investigate how libjpeg-turbo could cater for handling hardware decoding

  • Switching would need to be dynamic because for smaller images the latency overhead could negate the benefit
  • How would the hardware codec be exposed to libjpeg-turbo
  • What about decoding hundreds of images in a gallery
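
The dynamic-switching point above can be sketched as a simple size threshold. Everything here is an assumption made for illustration: the function name, the "hardware"/"software" labels, and especially the threshold value, which would have to be measured on the target platform rather than taken from this example.

```python
def choose_decoder(width, height, hw_available, min_hw_pixels=512 * 512):
    """Pick the hardware decoder only for images large enough to amortize its setup latency.

    min_hw_pixels is a placeholder threshold, not a measured value.
    """
    if hw_available and width * height >= min_hw_pixels:
        return "hardware"
    return "software"


def plan_gallery(image_sizes, hw_available):
    """Decide per image for a gallery given as a list of (width, height) pairs."""
    return [choose_decoder(w, h, hw_available) for (w, h) in image_sizes]
```

For example, plan_gallery([(160, 120), (4000, 3000)], True) would send the thumbnail to the software path and the large photo to the hardware path. A real policy for the gallery case might also consider batching, which this sketch does not attempt.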

* Investigate OpenCV implementation -- how much of it is vectorizable; what has been SSE'd; how hard will it be to add NEON routines (Mandeep: some Freescale work done)
* Do a broad sweep of Ubuntu packages that are using SSE and missing NEON for candidates for work

  • pixman, skia probably already have NEON code
  • transcoding for MMS might be another use case
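
One possible first pass at the package sweep described above is to scan unpacked source trees for SSE intrinsics headers and check whether arm_neon.h is included anywhere. The sketch below assumes each package's source has already been unpacked into a directory. The function names are hypothetical, and the heuristic misses hand-written assembly (for example NEON code in .S files), so it would only produce a list of candidates for manual review.

```python
import os
import re

# Standard x86 SSE intrinsics headers and the ARM NEON intrinsics header.
SSE_HEADERS = ("xmmintrin.h", "emmintrin.h", "pmmintrin.h", "tmmintrin.h", "smmintrin.h")
NEON_HEADER = "arm_neon.h"
SOURCE_EXTS = (".c", ".h", ".cc", ".cpp", ".hpp")
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def scan_tree(root):
    """Return (uses_sse, uses_neon) for the C/C++ sources under root."""
    uses_sse = uses_neon = False
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(SOURCE_EXTS):
                continue
            with open(os.path.join(dirpath, name), errors="replace") as f:
                headers = {os.path.basename(h) for h in INCLUDE_RE.findall(f.read())}
            uses_sse = uses_sse or any(h in headers for h in SSE_HEADERS)
            uses_neon = uses_neon or NEON_HEADER in headers
    return uses_sse, uses_neon


def is_candidate(root):
    """A tree is a candidate when it includes SSE intrinsics but never arm_neon.h."""
    uses_sse, uses_neon = scan_tree(root)
    return uses_sse and not uses_neon
```

Running is_candidate over each unpacked package directory would give a rough shortlist; packages such as pixman that already carry NEON code in assembly would need to be filtered out by hand.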

* Look at use of composition/blending and color conversion in SurfaceFlinger; potential for NEON work or handled by the GPU already?
* Investigate which upstream projects could accept NEON optimizations

  • NEON: do we need to look at anything beyond what we have checked so far (AAC, VP8, JPEG)?
  • Skype SILK - is there a NEON version for that?
  • Low latency voice codecs (which ones are interesting to look at)?
  • MP3 - do we want to make the effort to optimize it (not low-hanging fruit, and no big improvement expected)?

Events/2011-06-MMWG/CodecOptimization (last modified 2011-06-13 22:20:34)