WARNING WARNING WARNING

Information in this wiki is obsolete and will not be supported. Please see file HOWTO.md in the openCSD github repository for how coresight support on Linux has been integrated with the perf framework for both acquisition and decoding of traces.

Coresight Trace Decoding with DS-5

In this wiki are step-by-step instructions to perform the decoding of Coresight compressed streams using ARM's DS-5 integrated development environment. Using DS-5 there are two ways to decode traces: 1) using the GUI and 2) from a command line prompt. We have decided to document the latter as it makes it easier to automate in a script and is better suited for the format of this wiki.

Agenda

First we will start by adding a small demonstration routine to the kernel, something that will narrow the scope of the tracing and make verification of the decoded traces easier. From there we will configure the coresight component in accordance with our modification and enable trace collection. The third step will gather the trace and metadata on the target for export to a host where decoding can be done.

The host part will start with an explanation of the environment needed to perform the trace decoding. Since DS-5 is not connected to a D-Stream unit (and the target), it is important to build the environment that would normally have been deduced from interactions with the D-Stream tool. Once the environment is set up properly trace decoding can be started, followed by the verification of the resulting decompression with objdump.

Environment and Assumptions

The host we are using is a 64-bit Ubuntu 14.04 and the target platform a big.LITTLE TC2 from ARM (2 x A15 + 3 x A7). The target is running an upstream kernel (3.19-rc1 at the earliest, vexpress_defconfig) with a Linaro "nano" file system.

Preparing the Kernel

Adding a Simple Test Section

To get acquainted with coresight and trace collection we decided to narrow the amount of code traced to a very small and well defined section. It was also desirable to run the test code at the time of our choosing, on the CPU we wanted and controlled via command line.

The following patch fulfils all of the above requirements by replacing the body of sysrq's "sysrq_handle_showregs()" routine with a small loop and a single read of CP15's contextID register.

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 454b658..0128f3f 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -260,10 +260,17 @@ static struct sysrq_key_op sysrq_showallcpus_op = {
 
 static void sysrq_handle_showregs(int key)
 {
-       struct pt_regs *regs = get_irq_regs();
-       if (regs)
-               show_regs(regs);
-       perf_event_print_debug();
+       volatile int loop = 0;
+       u32 contextidr;
+
+       while (loop++ < 5)
+               ;
+
+       asm volatile(
+       "       mrc     p15, 0, %0, c13, c0, 1\n"
+       : "=r" (contextidr));
+
+       pr_err("looped around %d times with: %d idr: 0x%x\n", loop, current->pid, contextidr);
 }
 static struct sysrq_key_op sysrq_showregs_op = {
        .handler        = sysrq_handle_showregs,

The above is only for the purpose of this example. For real use-case scenarios the address range of the function under scrutiny should be used.

Removing Configuration Guards

When the coresight framework comes up on a platform the initialisation code will configure the first address range comparator to trace the entire text area of the kernel. This is useful for debugging and having a feel for how healthy coresight is on a specific target. To avoid going through a wealth of configuration commands we re-use this initial configuration by simply narrowing the address range to fit the beginning and end address of the above "sysrq_handle_showregs()" routine.

Because this address range comparator has already been configured (by the coresight framework), attempting to modify the initial range will cause an error. To prevent the error from cropping up and to reduce the number of configuration steps for this example, we have decided to remove the configuration check:

diff --git a/drivers/coresight/coresight-etm3x.c b/drivers/coresight/coresight-etm3x.c
index b77befd..37d1c9c 100644
--- a/drivers/coresight/coresight-etm3x.c
+++ b/drivers/coresight/coresight-etm3x.c
@@ -817,6 +817,7 @@ static ssize_t addr_range_store(struct device *dev,
                spin_unlock(&drvdata->spinlock);
                return -EPERM;
        }
+#if 0
        if (!((drvdata->addr_type[idx] == ETM_ADDR_TYPE_NONE &&
               drvdata->addr_type[idx + 1] == ETM_ADDR_TYPE_NONE) ||
              (drvdata->addr_type[idx] == ETM_ADDR_TYPE_RANGE &&
@@ -824,7 +825,7 @@ static ssize_t addr_range_store(struct device *dev,
                spin_unlock(&drvdata->spinlock);
                return -EPERM;
        }
-
+#endif
        drvdata->addr_val[idx] = val1;
        drvdata->addr_type[idx] = ETM_ADDR_TYPE_RANGE;
        drvdata->addr_val[idx + 1] = val2;

Again, this is strictly for the purpose of this example. Normal trace scenarios would cancel the initial setup and provide the required configuration steps for their particular needs.

But for now a new kernel should be recompiled with the above changes and booted on the target.

On Target Configuration

When the target has booted move to the main coresight configuration directory:

root@linaro-developer:~# cd /sys/bus/coresight/devices/
root@linaro-developer:/sys/bus/coresight/devices# ls
00000000.replicator  20030000.tpiu    2201c000.ptm  2203c000.etm  2203e000.etm
20010000.etb         20040000.funnel  2201d000.ptm  2203d000.etm
root@linaro-developer:/sys/bus/coresight/devices#

Keep in mind that, depending on the platform, the list of components and their addresses will differ from what is shown here for the TC2. Once in the main configuration directory at least one trace sink needs to be "activated". The trace sink must be on a downward path from the trace source that will be enabled.

root@linaro-developer:/sys/bus/coresight/devices# echo 1 > 20010000.etb/enable_sink
root@linaro-developer:/sys/bus/coresight/devices#

Next the beginning and end of the traced routine must be identified in the "System.map":

8026b200 t pty_close
8026b360 t ptmx_open
8026b4a8 t sysrq_handle_crash
8026b4e8 t sysrq_reset_seq_param_set
8026b53c t sysrq_handle_showregs         <----
8026b590 t sysrq_handle_loglevel         <----
8026b5c0 t sysrq_disconnect
8026b5f4 t sysrq_connect
8026b6e0 t sysrq_do_reset
8026b6fc t sysrq_reinject_alt_sysrq
8026b7d4 t sysrq_handle_SAK

If "sysrq_handle_showregs" is to be traced, the values for the beginning and the end of the range should be 0x8026b53c and 0x8026b590 respectively, the latter being the start address of the next symbol in the map. Note that the values are bound to change based on the kernel configuration and target.
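
The lookup above can be scripted. The following is a minimal sketch, assuming a "System.map" in the usual "address type name" layout; the helper name "symbol_range" and the sample text are illustrative only and not part of any kernel tooling.

```python
def symbol_range(system_map_text, symbol):
    """Return (start, end) for `symbol`.

    `end` is the address of the next symbol located at a higher
    address, which is how the range was chosen by hand above.
    """
    entries = []
    for line in system_map_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        entries.append((int(fields[0], 16), fields[2]))
    entries.sort(key=lambda entry: entry[0])

    for index, (addr, name) in enumerate(entries):
        if name == symbol:
            for next_addr, _ in entries[index + 1:]:
                if next_addr > addr:
                    return addr, next_addr
            raise ValueError("no symbol follows %s" % symbol)
    raise KeyError(symbol)


# Sample taken from the System.map excerpt shown above.
SAMPLE = """\
8026b4e8 t sysrq_reset_seq_param_set
8026b53c t sysrq_handle_showregs
8026b590 t sysrq_handle_loglevel
8026b5c0 t sysrq_disconnect
"""

start, end = symbol_range(SAMPLE, "sysrq_handle_showregs")
# String in the form expected by the "addr_range" sysfs entry.
addr_range_value = "%x %x" % (start, end)
print(addr_range_value)  # 8026b53c 8026b590
```

The resulting string is what would be echoed to the source's "addr_range" entry.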

With the above in mind the initial address range of the first comparator can be modified:

root@linaro-developer:/sys/bus/coresight/devices# echo 8026b53c 8026b590 > 2201d000.ptm/addr_range
root@linaro-developer:/sys/bus/coresight/devices# cat 2201d000.ptm/addr_range
0x8026b53c 0x8026b590
root@linaro-developer:/sys/bus/coresight/devices#

Thanks to the initial configuration of the first comparator by the coresight framework, this is all the configuration that is needed. Care must be taken to choose the right source to configure. For this example we have decided to work with PTM1, which is associated with CPU1. As such only code executed by CPU1 that falls within the configured comparator range will be traced.

Next the source can be enabled but before doing so it is a good idea to probe the status of the sink, in this case the ETB:

root@linaro-developer:/sys/bus/coresight/devices# cat 20010000.etb/status
Depth:          0x2000
Status:         0x8
RAM read ptr:   0x0
RAM wrt ptr:    0x0       <----
Trigger cnt:    0x0
Control:        0x0
Flush status:   0x2
Flush ctrl:     0x0
root@linaro-developer:/sys/bus/coresight/devices#

All the values have their default setting and the RAM write pointer is at address 0. Enabling the source will see the first synchronisation information being picked up by the sink:

root@linaro-developer:/sys/bus/coresight/devices# echo 1 > 2201d000.ptm/enable_source   <--- Enable the source
root@linaro-developer:/sys/bus/coresight/devices# cat 20010000.etb/status
Depth:          0x2000
Status:         0x0
RAM read ptr:   0x0
RAM wrt ptr:    0x1    <--- First synchronisation data is received in the ETB buffer.
Trigger cnt:    0x0
Control:        0x1
Flush status:   0x0
Flush ctrl:     0x2001
root@linaro-developer:/sys/bus/coresight/devices#

At this point the source has been configured with the beginning and end address of the code to be traced and a path from source to sink has been instantiated. The remaining step is to force one of the processors to execute the test code:

root@linaro-developer:/sys/bus/coresight/devices# echo p > /proc/sysrq-trigger    <--- Force the code to be executed
sysrq: looped around 6 times with: 1821 idr: 0x933    <--- Output from the test code
root@linaro-developer:/sys/bus/coresight/devices# cat 20010000.etb/status
Depth:          0x2000
Status:         0x0
RAM read ptr:   0x0
RAM wrt ptr:    0x1      <--- The RAM write pointer hasn't moved  - why?
Trigger cnt:    0x0
Control:        0x1
Flush status:   0x0
Flush ctrl:     0x2001
root@linaro-developer:/sys/bus/coresight/devices#

Echo'ing the letter "p" to "/proc/sysrq-trigger" will force the test code to be executed, but from the above capture it is clear that no trace data have been gathered. This is because the processor that executed the command (probably CPU0) isn't associated with the PTM that was configured, in this case PTM1 (associated with CPU1). To force the right processor to execute the code the utility "taskset" is used:

root@linaro-developer:/sys/bus/coresight/devices# echo $$
1821
root@linaro-developer:/sys/bus/coresight/devices# taskset -p 0x2 1821
pid 1821's current affinity mask: 1f
pid 1821's new affinity mask: 2
root@linaro-developer:/sys/bus/coresight/devices#
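
The mask passed to "taskset" has one bit per CPU, bit N standing for CPU N. The sketch below only shows how the two masks in the capture above are formed; the helper name "affinity_mask" is illustrative.

```python
def affinity_mask(cpus):
    """Build a taskset-style bitmask: bit N set means CPU N is allowed."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


print(hex(affinity_mask([1])))        # 0x2, CPU1 only
print(hex(affinity_mask(range(5))))   # 0x1f, all five TC2 CPUs
```

On a target that ships Python 3, "os.sched_setaffinity(pid, {1})" should achieve the same pinning without "taskset", although that has not been tried on the file system used here.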

After forcing the affinity of the console process to CPU1, trace data are collected:

root@linaro-developer:/sys/bus/coresight/devices# echo p > /proc/sysrq-trigger   <--- Force the code to be executed
sysrq: looped around 6 times with: 1821 idr: 0x933    <--- Output from the test code
root@linaro-developer:/sys/bus/coresight/devices# cat 20010000.etb/status
Depth:          0x2000
Status:         0x0
RAM read ptr:   0x0
RAM wrt ptr:    0xf    <--- Trace data have been collected
Trigger cnt:    0x0
Control:        0x1
Flush status:   0x0
Flush ctrl:     0x2001
root@linaro-developer:/sys/bus/coresight/devices#

When trace data is showing up on the configured sink(s), the collection can be stopped:

root@linaro-developer:/sys/bus/coresight/devices# echo 0 > 2201d000.ptm/enable_source
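
The before/after comparison of the ETB "status" output can be automated. This is a minimal sketch that assumes the "Name: 0xVALUE" layout shown in the captures above; the function names are illustrative, and the check ignores the case where the write pointer wraps around after the buffer fills up.

```python
def parse_etb_status(text):
    """Turn 'Name:   0xVALUE' lines into a dict of integers."""
    status = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        tokens = value.split()
        if not sep or not tokens:
            continue
        try:
            # Only the first token is the value; annotations are ignored.
            status[name.strip()] = int(tokens[0], 16)
        except ValueError:
            continue
    return status


def trace_collected(before, after):
    """True when the RAM write pointer advanced between two snapshots."""
    key = "RAM wrt ptr"
    return parse_etb_status(after)[key] > parse_etb_status(before)[key]


BEFORE = """\
Depth:          0x2000
RAM read ptr:   0x0
RAM wrt ptr:    0x1
"""

AFTER = """\
Depth:          0x2000
RAM read ptr:   0x0
RAM wrt ptr:    0xf
"""

print(trace_collected(BEFORE, AFTER))  # True
```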

Collection of Trace and Metadata

From there the "metadata" associated with the trace collection must be harvested. This is an important step and one must be cautious and precise. Failing to do so will result in trace data being garbled and lost.

First the configuration of the trace source must be recorded - creating a new directory is probably a good idea:

root@linaro-developer:~# mkdir trace
root@linaro-developer:~# cd trace
root@linaro-developer:~/trace# cat /sys/bus/coresight/devices/2201d000.ptm/status > 2201d000.ptm.status
root@linaro-developer:~/trace# ls
2201d000.ptm.status
root@linaro-developer:~/trace#

Next the content of the ETB buffer associated with the trace needs to be collected.

root@linaro-developer:~/trace# dd if=/dev/20010000.etb of=cstrace.bin bs=1
32768+0 records in
32768+0 records out
32768 bytes (33 kB) copied, 0.400984 s, 81.7 kB/s
root@linaro-developer:~/trace# ls
2201d000.ptm.status  cstrace.bin
root@linaro-developer:~/trace#
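
The 32768 bytes read by "dd" agree with the "Depth" of 0x2000 reported by the ETB, under the assumption that the depth is expressed in 32-bit words. A small sanity check along those lines (the helper name is illustrative):

```python
def etb_dump_size(depth_words, word_bytes=4):
    """Expected size in bytes of a full ETB dump.

    Assumes the ETB depth is reported in 32-bit words.
    """
    return depth_words * word_bytes


print(etb_dump_size(0x2000))  # 32768
```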

Last but not least a capture of the kernel image in memory is required for decoding to happen properly:

root@linaro-developer:~/trace# cat /proc/iomem | grep "Kernel code"
  80008000-805d13c3 : Kernel code
root@linaro-developer:~/trace#
root@linaro-developer:~/trace# dd if=/dev/mem skip=$((0x80008000)) bs=1 count=$((0x5c93c3)) of=kernel_dump.bin
6067139+0 records in
6067139+0 records out
6067139 bytes (6.1 MB) copied, 74.7211 s, 81.2 kB/s
root@linaro-developer:~/trace#

The value of "count" has been obtained by subtracting the beginning from the end kernel address (0x805d13c3 - 0x80008000 == 0x5c93c3).
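
The same arithmetic can be derived directly from the "/proc/iomem" output. The sketch below follows the subtraction used above so that the numbers match; since ranges in "/proc/iomem" are normally inclusive, this leaves out the final byte of the range. The function name is illustrative.

```python
def kernel_code_range(iomem_text):
    """Return (base, count) for the 'Kernel code' entry of /proc/iomem.

    `count` is end - start, as in the dd command above.
    """
    for line in iomem_text.splitlines():
        span, sep, label = line.partition(" : ")
        if sep and label.strip() == "Kernel code":
            start_text, end_text = span.strip().split("-")
            start, end = int(start_text, 16), int(end_text, 16)
            return start, end - start
    raise ValueError("no 'Kernel code' entry found")


base, count = kernel_code_range("  80008000-805d13c3 : Kernel code\n")
print(hex(base), hex(count), count)  # 0x80008000 0x5c93c3 6067139
```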

As a final tally, 3 files are required for the proper decoding of the trace data:

root@linaro-developer:~/trace# ls
2201d000.ptm.status  cstrace.bin  kernel_dump.bin
root@linaro-developer:~/trace#

Host Side Configuration and Decoding

This part can be split into two main activities, namely setting up the configuration database and configuring the individual components. It is tightly coupled with the TC2, but porting to other platforms is possible.

Configuration Database

To begin, the existing DS-5 configuration database needs to be copied to a working directory:

mpoirier@t430:~/work/linaro/coresight/linaro$ cp -dpR /usr/local/DS-5/sw/debugger/configdb .
mpoirier@t430:~/work/linaro/coresight/linaro$ ls
configdb
mpoirier@t430:~/work/linaro/coresight/linaro$

For this copy all the sub-directories under "Boards" can be deleted with the exception of "ARM Development Boards". Under "ARM Development Boards", remove all the entries except for the one that represents your target and rename it to something meaningful. If the target you are working with is not already supported, start with the one that is the closest match and make the required modifications. In our case we renamed "Versatile_Express_V2P-CA15_A7" to "linaroTC2":

mpoirier@t430:~/work/linaro/coresight/linaro/configdb/Boards/ARM Development Boards/linaroTC2$ ls
coredump.py  coredump.xml  dtsl_config_script.py  project_types.xml  tc2.rvc
mpoirier@t430:~/work/linaro/coresight/linaro/configdb/Boards/ARM Development Boards/linaroTC2$

In the "Snapshot View" section of "project_types.xml", all occurrences of "cpu_X" need to be modified to represent something more meaningful:

mpoirier@t430:~/work/linaro/coresight/mathieu/configdb/Boards/ARM Development Boards/linaroTC2$ diff -ruN project_types.xml.orig project_types.xml
--- project_types.xml.orig      2014-07-31 16:20:25.130923639 -0600
+++ project_types.xml   2014-07-21 15:00:14.000000000 -0600
@@ -357,31 +357,31 @@
                     <name language="en">View Cortex-A7_0</name>
                     <xi:include href="../../../Include/coredump_activity_description.xml"/>
                     <xi:include href="../../../Include/coredump_connection_type.xml"/>
-                    <core connection_id="cpu_0" core_definition="Cortex-A7"/>
+                    <core connection_id="Cortex-A7_0" core_definition="Cortex-A7"/>
                 </activity>
                 <activity id="ICE_DEBUG" type="Debug">
                     <name language="en">View Cortex-A7_1</name>
                     <xi:include href="../../../Include/coredump_activity_description.xml"/>
                     <xi:include href="../../../Include/coredump_connection_type.xml"/>
-                    <core connection_id="cpu_1" core_definition="Cortex-A7"/>
+                    <core connection_id="Cortex-A7_1" core_definition="Cortex-A7"/>
                 </activity>
                 <activity id="ICE_DEBUG" type="Debug">
                     <name language="en">View Cortex-A7_2</name>
                     <xi:include href="../../../Include/coredump_activity_description.xml"/>
                     <xi:include href="../../../Include/coredump_connection_type.xml"/>
-                    <core connection_id="cpu_2" core_definition="Cortex-A7"/>
+                    <core connection_id="Cortex-A7_2" core_definition="Cortex-A7"/>
                 </activity>
                 <activity id="ICE_DEBUG" type="Debug">
                     <name language="en">View Cortex-A15_0</name>
                     <xi:include href="../../../Include/coredump_activity_description.xml"/>
                     <xi:include href="../../../Include/coredump_connection_type.xml"/>
-                    <core connection_id="cpu_3" core_definition="Cortex-A15"/>
+                    <core connection_id="Cortex-A15_0" core_definition="Cortex-A15"/>
                 </activity>
                 <activity id="ICE_DEBUG" type="Debug">
                     <name language="en">View Cortex-A15_1</name>
                     <xi:include href="../../../Include/coredump_activity_description.xml"/>
                     <xi:include href="../../../Include/coredump_connection_type.xml"/>
-                    <core connection_id="cpu_4" core_definition="Cortex-A15"/>
+                    <core connection_id="Cortex-A15_1" core_definition="Cortex-A15"/>
                 </activity>
                 <activity id="ICE_DEBUG" type="Debug">
                     <name language="en">View Cortex-A15x2 SMP</name>
mpoirier@t430:~/work/linaro/coresight/mathieu/configdb/Boards/ARM Development Boards/linaroTC2$

Changes to "coredump.py" must be made accordingly:

mpoirier@t430:~/work/linaro/coresight/mathieu/configdb/Boards/ARM Development Boards/linaroTC2$ diff -ruN coredump.py.orig coredump.py
--- coredump.py.orig    2014-07-31 16:19:57.338924626 -0600
+++ coredump.py 2014-07-25 07:11:42.738500706 -0600
@@ -32,86 +32,77 @@
 
 
     def discoverDevices(self):
-        # Find all the cores: named cpu_0, cpu_1, ..., cpu_N
+        # cores
         self.cores = []
-        for devNum, name in self.getDevicesOfName("cpu"):
-            dev = Device(self, devNum, name)
-            self.cores.append(dev)
-            self.addDeviceInterface(dev)
-
-        # create SMP devices if enough cores are present
-        #   first 3 are assumed to be A7 cores
-        if len(self.cores) > 0:
-            smp = RDDISyncSMPDevice(self, "smp_a7", self.cores[:3])
-            self.addDeviceInterface(smp)
-        #   next 2 are assumed to be A15 cores
-        if len(self.cores) > 2:
-            smp = RDDISyncSMPDevice(self, "smp_a15", self.cores[3:])
-            self.addDeviceInterface(smp)
-            
-            # big.LITTLE
-            clusters = [ DeviceCluster("big", self.cores[3:]), DeviceCluster("LITTLE", self.cores[:3]) ]
-            bl = RDDISyncSMPDevice(self, "smp_bl", clusters)
-            self.addDeviceInterface(bl)
-
-
+        cortexA15s = []
+        cortexA7s = []
+        for name in ["Cortex-A15_0", "Cortex-A15_1"]:
+            devID = self.findDevice(name)
+            core = Device(self, devID, name)
+            self.cores.append(core)
+            cortexA15s.append(core)
+            self.addDeviceInterface(core)
+        for name in ["Cortex-A7_0", "Cortex-A7_1", "Cortex-A7_2"]:
+            devID = self.findDevice(name)
+            core = Device(self, devID, name)
+            self.cores.append(core)
+            cortexA7s.append(core)
+            self.addDeviceInterface(core)
+        # SMP sync groups
+        smpA7 = RDDISyncSMPDevice(self, "smp_a7", cortexA7s)
+        self.addDeviceInterface(smpA7)
+        smpA15 = RDDISyncSMPDevice(self, "smp_a15", cortexA15s)
+        self.addDeviceInterface(smpA15)
+        # big.LITTLE
+        clusters = [ DeviceCluster("big", cortexA15s), DeviceCluster("LITTLE", cortexA7s) ]
+        bl = RDDISyncSMPDevice(self, "smp_bl", clusters)
+        self.addDeviceInterface(bl)
         # Find all the trace sources
         streamID = 1
         self.traceSources = []
-        for devNum, name in self.getDevicesOfName("PTM"):
-            ptm = PTMTraceSource(self, devNum, streamID, name)
+        for name in ["PTM_0", "PTM_1"]:
+            devID = self.findDevice(name)
+            ptm = PTMTraceSource(self, devID, streamID, name)
             self.traceSources.append(ptm)
             streamID += 1
-        for devNum, name in self.getDevicesOfName("ETM"):
-            etm = ETMv3_5TraceSource(self, devNum, streamID, name)
+        for name in ["ETM_0", "ETM_1", "ETM_2"]:
+            devID = self.findDevice(name)
+            etm = ETMv3_5TraceSource(self, devID, streamID, name)
             self.traceSources.append(etm)
             streamID += 1
-            
-        # Other trace sources
-        try:
-            itmDev = self.findDevice("ITM_0")
-            itm = ITMTraceSource(self, itmDev, streamID, "ITM_0")
-            self.traceSources.append(itm)
-            streamID += 1
-        except DTSLException, e:
-            pass
-            
         # Define associations between cores and trace sources
         self.__traceSourceCores = {
-            'PTM_0': 'cpu_3',
-            'PTM_1': 'cpu_4',
-            'ETM_0': 'cpu_0',
-            'ETM_1': 'cpu_1',
-            'ETM_2': 'cpu_2',
+            'PTM_0': 'Cortex-A15_0',
+            'PTM_1': 'Cortex-A15_1',
+            'ETM_0': 'Cortex-A7_0',
+            'ETM_1': 'Cortex-A7_1',
+            'ETM_2': 'Cortex-A7_2'
         }
 
 
     def setupFileTrace(self):
         '''Setup file trace capture'''
-
         self.fileTrace = FileTraceCapture(self, "File")
-
         self.addTraceCaptureInterface(self.fileTrace)
-
         self.registerTraceSources(self.fileTrace)
-
         for s in self.fileTrace.getTraceSources():
             s.setSnapshotMode(True)
             s.setEnabled(True)
-
         self.addManagedTraceDevices(self.fileTrace.getName(), [ self.fileTrace ])
         
         
     def openFileTrace(self):
-
         # core dump location is connection address
         snapshotFile = self.getConnectionAddress()
         snapshotPath = os.path.split(snapshotFile)[0]
-
         # reset file trace
         traceMode = "None"
         self.fileTrace.setTraceFile(None)
-
         # load metadata
         if snapshotPath:
             metadata = CSLIBTraceDumpMetadata.read(snapshotFile)
@@ -122,18 +113,15 @@
                 dumpFormat = dumpMetadata.get("format")
                 self.fileTrace.setTraceFile(dumpFile)
                 self.fileTrace.setTraceFormat(dumpFormat)
-
                 traceMode = "File"
-
                 break
-
         self.setManagedDevices(self.getManagedDevices(traceMode))
 
 
     def registerTraceSources(self, traceCapture):
         for source in self.traceSources:
             coreName = self.__traceSourceCores.get(source.getName(), None)
-            if coreName:
+            if coreName is not None:
                 self.registerCoreTraceSource(traceCapture, coreName, source)
             else:
                 self.registerTraceSource(traceCapture, source)
mpoirier@t430:~/work/linaro/coresight/mathieu/configdb/Boards/ARM Development Boards/linaroTC2$

Configuration of Individual Components

A fair amount of configuration is required to properly represent the coresight components to DS-5. For this we adopt a bottom-up approach, starting with individual component configuration and gradually moving up the ladder to finish with the main scripts. The remainder of this example will assume that a new "TraceDecode" directory has been created at the same level as the "configdb" from the above section.

mpoirier@t430:~/work/linaro/coresight/linaro$ mkdir TraceDecode
mpoirier@t430:~/work/linaro/coresight/linaro$ ls
configdb  TraceDecode
mpoirier@t430:~/work/linaro/coresight/linaro$

CPU Configuration Files

To begin with, a file that represents each processor needs to be created. For TC2 we have cortex_a15_[0, 1].ini and cortex_a7_[0, 1, 2].ini:

mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ ls
cortex_a15_0.ini    cortex_a15_1.ini    cortex_a7_0.ini    cortex_a7_1.ini    cortex_a7_2.ini  
mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$

The content of the files is almost identical, except for the "name" that has to match the entries that were modified in "project_types.xml" in the above section. As such "cortex_a15_0.ini" will look like this:

mpoirier@t430:~/work/linaro/coresight/mathieu/TraceDecode$ cat cortex_a15_0.ini
[device]
name=Cortex-A15_0

[regs]
R15=0x80008000
R13=0
CPSR=0x1D3

[dump]
file=kernel_dump.bin
address=0x80008000
length=0x005c93c3

"R15" and "address" must match the base address of the kernel in memory as provided by "cat /proc/iomem | grep "Kernel code"", in this example 0x80008000. The "length" field must be the size of the kernel code in memory, again 0x805d13c3 - 0x80008000 == 0x005c93c3. The other ".ini" files are expected to have the same information, except for the "name". "R13" and "CPSR" are not required for trace decoding but the debugger will complain if it can't find them. As such the above values should be replicated in all other files.

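Since the five CPU files differ only by their "name", they can be generated rather than written by hand. The following is a minimal sketch that reproduces the layout of "cortex_a15_0.ini" shown above; the function names and the default file name are illustrative, and the register values are simply those used in this example.

```python
CPU_INI_TEMPLATE = """\
[device]
name={name}

[regs]
R15=0x{base:08x}
R13=0
CPSR=0x1D3

[dump]
file={dump_file}
address=0x{base:08x}
length=0x{length:08x}
"""

TC2_CORES = ["Cortex-A15_0", "Cortex-A15_1",
             "Cortex-A7_0", "Cortex-A7_1", "Cortex-A7_2"]


def cpu_ini(name, base, length, dump_file="kernel_dump.bin"):
    """Return the text of one CPU .ini file."""
    return CPU_INI_TEMPLATE.format(name=name, base=base,
                                   length=length, dump_file=dump_file)


def tc2_cpu_inis(base, length):
    """Map file names (e.g. cortex_a15_0.ini) to their content."""
    files = {}
    for name in TC2_CORES:
        filename = name.lower().replace("-", "_") + ".ini"
        files[filename] = cpu_ini(name, base, length)
    return files


files = tc2_cpu_inis(0x80008000, 0x5c93c3)
print(files["cortex_a15_0.ini"])
```

Writing each entry of "files" to the "TraceDecode" directory yields the five files listed earlier.
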
Coresight Sources Configuration Files

From the same "TraceDecode" directory a file for each potential source also needs to be created. Once again, for TC2 we have ptm_[0, 1].ini and etm_[0, 1, 2].ini. The content of each file gives information about what the source is, its capabilities and how it was configured when the trace was generated. It is derived from the output of the "status" entry of each source. Above, the status of PTM1 was redirected to "2201d000.ptm.status", which can now be used to build "ptm_1.ini":

mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ cat 2201d000.ptm.status
ETMCCR: 0x8d294004
ETMCCER: 0x34c01ac2
ETMSCR: 0x00000000
ETMIDR: 0x411cf312
ETMCR: 0x10001401
ETMTRACEIDR: 0x00000002
Enable event: 0x0000406f
Enable start/stop: 0x00000000
Enable control: CR1 0x00000001 CR2 0x00000000
mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$
mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ cat ptm_1.ini 
[device]
name=PTM_1

[regs]
ETMCR(0x000)=0x10001401
ETMIDR(0x079)=0x411CF312
ETMCCER(0x07A)=0x34C01AC2
ETMTRACEIDR(0x080)=0x00000002
mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$

It is important to have one ".ini" file for each source in the system. On the other hand, the content of a file only needs to be accurate for the trace source(s) that were enabled for that trace capture. For example, if traces have been collected for CPU0 and CPU3, then "status" would be collected for PTM_0 and ETM_1, with the content of "ptm_0.ini" and "etm_1.ini" updated accordingly.
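
Converting a saved "status" capture into the matching ".ini" file is mechanical. The sketch below assumes the "NAME: 0xVALUE" layout of "2201d000.ptm.status" and reuses the four register offsets that appear in "ptm_1.ini" above; the function name is illustrative.

```python
# Register names and offsets exactly as they appear in ptm_1.ini above.
ETM_REG_OFFSETS = [
    ("ETMCR", "0x000"),
    ("ETMIDR", "0x079"),
    ("ETMCCER", "0x07A"),
    ("ETMTRACEIDR", "0x080"),
]


def source_ini(device_name, status_text):
    """Build the .ini content of a trace source from its status capture."""
    values = {}
    for line in status_text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()

    lines = ["[device]", "name=" + device_name, "", "[regs]"]
    for reg, offset in ETM_REG_OFFSETS:
        lines.append("%s(%s)=0x%08X" % (reg, offset, int(values[reg], 16)))
    return "\n".join(lines) + "\n"


STATUS = """\
ETMCCR: 0x8d294004
ETMCCER: 0x34c01ac2
ETMSCR: 0x00000000
ETMIDR: 0x411cf312
ETMCR: 0x10001401
ETMTRACEIDR: 0x00000002
"""

print(source_ini("PTM_1", STATUS))
```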

Top Level Configuration Files

Two more files are required. The first one is called "trace.ini" and should have the following content:

mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ cat trace.ini 
[buffer0]
component=ETB
format=coresight
mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ 

The second one, "snapshot.ini", tells DS-5 where to find the various components (CPU and coresight sources) found on the system, the metadata and the trace buffer. For TC2 and based on the above the content would be:

mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ cat snapshot.ini 
[device_list]
device0=cortex_a7_0.ini
device1=cortex_a7_1.ini
device2=cortex_a7_2.ini
device3=cortex_a15_0.ini
device4=cortex_a15_1.ini
device5=ptm_0.ini
device6=ptm_1.ini
device7=etm_0.ini
device8=etm_1.ini
device9=etm_2.ini

[trace]
metadata=trace.ini
buffer0=cstrace.bin
mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$

The order in which components are listed and the order in which the CPUs were booted are irrelevant. It is, however, important to represent all the CPUs and the sources they are associated with.
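
Because "snapshot.ini" is just a numbered list of the device files followed by the trace section, it can be produced from a list of file names. A minimal sketch, with an illustrative function name and defaults taken from this example:

```python
def snapshot_ini(device_files, metadata="trace.ini", buffer_file="cstrace.bin"):
    """Return the content of snapshot.ini for the given device files."""
    lines = ["[device_list]"]
    for index, filename in enumerate(device_files):
        lines.append("device%d=%s" % (index, filename))
    lines += ["", "[trace]", "metadata=" + metadata, "buffer0=" + buffer_file]
    return "\n".join(lines) + "\n"


TC2_DEVICE_FILES = [
    "cortex_a7_0.ini", "cortex_a7_1.ini", "cortex_a7_2.ini",
    "cortex_a15_0.ini", "cortex_a15_1.ini",
    "ptm_0.ini", "ptm_1.ini",
    "etm_0.ini", "etm_1.ini", "etm_2.ini",
]

print(snapshot_ini(TC2_DEVICE_FILES))
```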

Trace Decoding

Before decoding the trace data DS-5 needs a handle on the "vmlinux" image that was used on the target and be told where to put the extracted trace data. Here we create the file "tracedump.cmd" under "TraceDecode":

mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ cat tracedump.cmd 
add-symbol-file "/home/mpoirier/work/linaro/coresight/kernel1/builds/vexpress/vmlinux"
trace report FILE="tracereportptm1.txt" source=PTM_1 columns=record_type,cycles,address,opcode,branch,detail OUTPUT_PATH="/home/mpoirier/work/linaro/coresight/linaro/TraceDecode/"
mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$

If more than one source is present in the trace, another line should be added to specify where to put the decoded traces and how to format the information:

mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ cat tracedump.cmd 
add-symbol-file "/home/mpoirier/work/linaro/coresight/kernel1/builds/vexpress/vmlinux"
trace report FILE="tracereportptm1.txt" source=PTM_1 columns=record_type,cycles,address,opcode,branch,detail OUTPUT_PATH="/home/mpoirier/work/linaro/coresight/linaro/TraceDecode/"
trace report FILE="tracereportptm0.txt" source=PTM_0 columns=record_type,cycles,address,opcode,branch,detail OUTPUT_PATH="/home/mpoirier/work/linaro/coresight/linaro/TraceDecode/"

mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$
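
When several sources are involved, the "trace report" lines follow a regular pattern and can be generated. The sketch below only reproduces the command syntax shown above; the function name and the report naming scheme (PTM_1 becoming "tracereportptm1.txt") are taken from this example rather than from any DS-5 requirement.

```python
def tracedump_cmd(vmlinux, output_path, sources,
                  columns="record_type,cycles,address,opcode,branch,detail"):
    """Return the content of a tracedump.cmd script for the given sources."""
    lines = ['add-symbol-file "%s"' % vmlinux]
    for source in sources:
        # PTM_1 -> tracereportptm1.txt
        report = "tracereport%s.txt" % source.lower().replace("_", "")
        lines.append(
            'trace report FILE="%s" source=%s columns=%s OUTPUT_PATH="%s"'
            % (report, source, columns, output_path))
    return "\n".join(lines) + "\n"


print(tracedump_cmd(
    "/home/mpoirier/work/linaro/coresight/kernel1/builds/vexpress/vmlinux",
    "/home/mpoirier/work/linaro/coresight/linaro/TraceDecode/",
    ["PTM_1", "PTM_0"]))
```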

The last piece of the puzzle is calling the debugger utility with the right options. For this example "ds5tracedump.sh" was created:

mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$ cat ds5tracedump.sh 
/usr/local/DS-5/bin/debugger --cdb-root "/usr/local/DS-5/sw/debugger/configdb:/home/mpoirier/work/linaro/coresight/linaro/configdb" --cdb-entry "ARM Development Boards::linaroTC2::Snapshot View::Snapshot View::View big.LITTLE::Snapshot" --cdb-entry-param rvi_address="/home/mpoirier/work/linaro/coresight/linaro/TraceDecode/snapshot.ini" --script "/home/mpoirier/work/linaro/coresight/linaro/TraceDecode/tracedump.cmd"
mpoirier@t430:~/work/linaro/coresight/linaro/TraceDecode$

The option --cdb-root identifies the configuration databases to use. Both the original from DS-5 and the newly created one from the above steps need to be specified, separated by a colon. --cdb-entry is for internal DS-5 representation and is used when decoding traces using the GUI. The path that is represented should be in line with the "Snapshot" of the system that traces have been harvested from. --cdb-entry-param gives a path to the "snapshot.ini" file created earlier and --script is a pointer to the .cmd file, also created above.
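
For scripted use, the same invocation can be assembled as an argument list, which avoids shell quoting problems with entries such as "ARM Development Boards". This is a sketch only: it uses the four options described above and nothing else, the function name is illustrative, and the paths are those of this example.

```python
def debugger_argv(ds5_root, custom_configdb, cdb_entry, snapshot_ini, script):
    """Assemble the DS-5 debugger command line as a list of arguments."""
    # Both databases are passed in one colon-separated --cdb-root value.
    cdb_root = ":".join([ds5_root + "/sw/debugger/configdb", custom_configdb])
    return [
        ds5_root + "/bin/debugger",
        "--cdb-root", cdb_root,
        "--cdb-entry", cdb_entry,
        "--cdb-entry-param", "rvi_address=" + snapshot_ini,
        "--script", script,
    ]


WORK = "/home/mpoirier/work/linaro/coresight/linaro"
argv = debugger_argv(
    "/usr/local/DS-5",
    WORK + "/configdb",
    "ARM Development Boards::linaroTC2::Snapshot View::Snapshot View"
    "::View big.LITTLE::Snapshot",
    WORK + "/TraceDecode/snapshot.ini",
    WORK + "/TraceDecode/tracedump.cmd",
)
print(argv)
```

The list can then be handed to "subprocess.run(argv, check=True)" on a host where DS-5 is installed.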

At this stage the only thing remaining is to call the DS-5 debugger tool and let it do the rest:

mpoirier@t430:~/work/linaro/coresight/mathieu/TraceDecode$ ./ds5tracedump.sh 
Connected to stopped target Snapshot Viewer
Generating trace report, trace source: PTM_1, file:/home/mpoirier/work/linaro/coresight/linaro/TraceDecode/tracereportptm1.txt
Done.
Disconnected from stopped target Snapshot Viewer
mpoirier@t430:~/work/linaro/coresight/mathieu/TraceDecode$

The decompressed stream will be found in "tracereportptm1.txt", in accordance with what was specified in the "tracedump.cmd" file.

Proving the Accuracy of the Decompressed Data

Now that we have a decompressed stream we need to prove the above steps are actually correct. First, let's take a look at the recovered data (tracereportptm1.txt):

mpoirier@t430:~/work/linaro/coresight/mathieu/TraceDecode$ cat tracereportptm1.txt
Info                                    Tracing enabled
Instruction     106378866       0x8026B53C      E52DE004        false   PUSH     {lr}
Instruction     0       0x8026B540      E24DD00C        false   SUB      sp,sp,#0xc
Instruction     0       0x8026B544      E3A03000        false   MOV      r3,#0
Instruction     0       0x8026B548      E58D3004        false   STR      r3,[sp,#4]
Instruction     0       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4]
Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4
Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1
Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4]
Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c
Timestamp                                       Timestamp: 17106715833
Instruction     319     0x8026B54C      E59D3004        false   LDR      r3,[sp,#4]
Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4
Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1
Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4]
Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c
Instruction     9       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4]
Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4
Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1
Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4]
Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c
Instruction     7       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4]
Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4
Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1
Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4]
Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c
Instruction     7       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4]
Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4
Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1
Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4]
Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c
Instruction     10      0x8026B54C      E59D3004        false   LDR      r3,[sp,#4]
Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4
Instruction     0       0x8026B554      E2833001        false   ADD      r3,r3,#1
Instruction     0       0x8026B558      E58D3004        false   STR      r3,[sp,#4]
Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c
Instruction     6       0x8026B560      EE1D3F30        false   MRC      p15,#0x0,r3,c13,c0,#1
Instruction     0       0x8026B564      E1A0100D        false   MOV      r1,sp
Instruction     0       0x8026B568      E3C12D7F        false   BIC      r2,r1,#0x1fc0
Instruction     0       0x8026B56C      E3C2203F        false   BIC      r2,r2,#0x3f
Instruction     0       0x8026B570      E59D1004        false   LDR      r1,[sp,#4]
Instruction     0       0x8026B574      E59F0010        false   LDR      r0,[pc,#16] ; [0x8026B58C] = 0x80550368
Instruction     0       0x8026B578      E592200C        false   LDR      r2,[r2,#0xc]
Instruction     0       0x8026B57C      E59221D0        false   LDR      r2,[r2,#0x1d0]
Instruction     0       0x8026B580      EB07A4CF        true    BL       {pc}+0x1e9344 ; 0x804548c4
Info                                    Tracing enabled
Instruction     13570831        0x8026B584      E28DD00C        false   ADD      sp,sp,#0xc
Instruction     0       0x8026B588      E8BD8000        true    LDM      sp!,{pc}
Timestamp                                       Timestamp: 17107041535
mpoirier@t430:~/work/linaro/coresight/mathieu/TraceDecode$

From the snapshot we see that the loop condition is checked 6 times before co-processor 15 is accessed. An external function is then called before the routine returns, which is consistent with the sample code we introduced in our kernel:

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 454b658..0128f3f 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -260,10 +260,17 @@ static struct sysrq_key_op sysrq_showallcpus_op = {
 
 static void sysrq_handle_showregs(int key)
 {
-       struct pt_regs *regs = get_irq_regs();
-       if (regs)
-               show_regs(regs);
-       perf_event_print_debug();
+       volatile int loop = 0;
+       u32 contextidr;
+
+       while (loop++ < 5)
+               ;
+
+       asm volatile(
+       "       mrc     p15, 0, %0, c13, c0, 1\n"
+       : "=r" (contextidr));
+
+       pr_err("looped around %d times with: %d idr: 0x%x\n", loop, current->pid, contextidr);
 }
 static struct sysrq_key_op sysrq_showregs_op = {
        .handler        = sysrq_handle_showregs,
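
The iteration count can also be read off the report mechanically rather than by eye. The following is a minimal Python sketch and not part of the original procedure: the record layout is inferred from the report shown above, and the names RECORD, parse_report and executions_per_address are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter

# Assumed record layout, inferred from the report shown above:
#   Instruction <cycles> <address> <opcode> <true|false> <disassembly>
RECORD = re.compile(
    r"^Instruction\s+(\d+)\s+0x([0-9A-Fa-f]+)\s+([0-9A-Fa-f]{8})\s+(true|false)\s+(.*)$"
)


def parse_report(lines):
    """Yield (address, opcode, disassembly) for every Instruction record.

    Info and Timestamp records do not match RECORD and are skipped.
    """
    for line in lines:
        match = RECORD.match(line.strip())
        if match:
            yield int(match.group(2), 16), match.group(3).lower(), match.group(5).strip()


def executions_per_address(lines):
    """Count how many times each traced address shows up in the report."""
    return Counter(address for address, _, _ in parse_report(lines))


# A short excerpt in the same layout as tracereportptm1.txt (two loop passes).
SAMPLE = """\
Info                                    Tracing enabled
Instruction     0       0x8026B54C      E59D3004        false   LDR      r3,[sp,#4]
Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4
Instruction     0       0x8026B55C      DAFFFFFA        true    BLE      {pc}-0x10 ; 0x8026b54c
Timestamp                                       Timestamp: 17106715833
Instruction     319     0x8026B54C      E59D3004        false   LDR      r3,[sp,#4]
Instruction     0       0x8026B550      E3530004        false   CMP      r3,#4
"""

counts = executions_per_address(SAMPLE.splitlines())
print("CMP at 0x8026B550 seen", counts[0x8026B550], "times in the excerpt")
```

Feeding the lines of the full "tracereportptm1.txt" to executions_per_address instead of the excerpt should give a count of 6 for the CMP at 0x8026B550, matching the six checks of the loop condition visible in the report.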

So far so good. To close the loop, objdump is used to cross-reference a disassembly of the object file with what coresight and DS-5 have given us. First "sysrq.o" is disassembled:

mpoirier@t430:~/work/linaro/coresight/kernel1$ arm-linux-gnueabi-objdump -d -S drivers/tty/sysrq.o > sysrq.dump
mpoirier@t430:~/work/linaro/coresight/kernel1$

From there, looking at the content of "sysrq.dump" and, more specifically, at "sysrq_handle_showregs", we find:

00000094 <sysrq_handle_showregs>:
  94:   e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
  98:   e24dd00c        sub     sp, sp, #12
  9c:   e3a03000        mov     r3, #0
  a0:   e58d3004        str     r3, [sp, #4]
  a4:   e59d3004        ldr     r3, [sp, #4]
  a8:   e3530004        cmp     r3, #4
  ac:   e2833001        add     r3, r3, #1
  b0:   e58d3004        str     r3, [sp, #4]
  b4:   dafffffa        ble     a4 <sysrq_handle_showregs+0x10>
  b8:   ee1d3f30        mrc     15, 0, r3, cr13, cr0, {1}
  bc:   e1a0100d        mov     r1, sp
  c0:   e3c12d7f        bic     r2, r1, #8128   ; 0x1fc0
  c4:   e3c2203f        bic     r2, r2, #63     ; 0x3f
  c8:   e59d1004        ldr     r1, [sp, #4]
  cc:   e59f0010        ldr     r0, [pc, #16]   ; e4 <sysrq_handle_showregs+0x50>
  d0:   e592200c        ldr     r2, [r2, #12]
  d4:   e59221d0        ldr     r2, [r2, #464]  ; 0x1d0
  d8:   ebfffffe        bl      0 <printk>
  dc:   e28dd00c        add     sp, sp, #12
  e0:   e8bd8000        ldmfd   sp!, {pc}
  e4:   00000020        .word   0x00000020

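The opcodes are exactly what the trace decompression process yielded. The only differences are expected ones: the addresses in "sysrq.dump" are offsets within the unlinked object file, and the BL to printk carries a placeholder target (ebfffffe) that is only resolved when the kernel is linked.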
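
The cross-check between the two listings can also be scripted. The following is a minimal Python sketch under stated assumptions, not the method used in this wiki: both line formats are inferred from the listings above, the function names are hypothetical, and the load base is derived from the first traced address (0x8026B53C) and the symbol offset reported by objdump (0x94).

```python
import re

# Line layouts assumed from the DS-5 report and the objdump listing above.
TRACE_RE = re.compile(r"^Instruction\s+\d+\s+0x([0-9A-Fa-f]+)\s+([0-9A-Fa-f]{8})\s")
OBJDUMP_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]{8})\s")


def traced_opcodes(lines):
    """Map each traced address to the opcode reported for it."""
    opcodes = {}
    for line in lines:
        match = TRACE_RE.match(line.strip())
        if match:
            opcodes[int(match.group(1), 16)] = int(match.group(2), 16)
    return opcodes


def object_opcodes(lines):
    """Map each object-file offset to the opcode objdump printed for it."""
    opcodes = {}
    for line in lines:
        match = OBJDUMP_RE.match(line)
        if match:
            opcodes[int(match.group(1), 16)] = int(match.group(2), 16)
    return opcodes


def compare(trace_lines, objdump_lines, entry_address, symbol_offset):
    """Return (offset, object opcode, traced opcode) for every disagreement."""
    base = entry_address - symbol_offset  # run-time address of offset 0
    traced = traced_opcodes(trace_lines)
    in_object = object_opcodes(objdump_lines)
    mismatches = []
    for address in sorted(traced):
        offset = address - base
        if in_object.get(offset) != traced[address]:
            mismatches.append((offset, in_object.get(offset), traced[address]))
    return mismatches


# Three records from each listing, copied from the output shown above.
TRACE = """\
Instruction     106378866       0x8026B53C      E52DE004        false   PUSH     {lr}
Instruction     0       0x8026B540      E24DD00C        false   SUB      sp,sp,#0xc
Instruction     0       0x8026B580      EB07A4CF        true    BL       {pc}+0x1e9344 ; 0x804548c4
"""
OBJDUMP = """\
  94:   e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
  98:   e24dd00c        sub     sp, sp, #12
  d8:   ebfffffe        bl      0 <printk>
"""

for offset, expected, seen in compare(
    TRACE.splitlines(), OBJDUMP.splitlines(), 0x8026B53C, 0x94
):
    print(f"offset {offset:#x}: object {expected:#010x}, trace {seen:#010x}")
```

With these inputs only the BL at offset 0xd8 is reported. That is expected, since the object file holds an unrelocated branch target while the trace shows the linked one. Any other mismatch would point at a decoding problem or at a kernel image that does not correspond to the object file.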

Conclusion

Using the latest 30-day evaluation version of DS-5 Ultimate or Professional edition, it is possible to decode compressed data streams collected with the coresight framework currently being upstreamed by Linaro. This example concentrated on the Vexpress TC2 system, but others can be adapted just as easily.

All the files that were used in this wiki are available here. It is strongly advised to validate the decoding process in another environment using these files before attempting to decode new trace data.

Resting on the Shoulders of Giants

Trace decoding using DS-5 and this wiki would not have been possible without the significant contribution of Tony Armitstead and Al Grant, both employees of ARM Ltd. Yours truly only collected their words and followed their leads.



WorklingGroups/Kernel/Coresight/traceDecodingWithDS5 (last modified 2018-12-10 18:32:04)