Clang provides a few options to generate timing reports. Among them, -ftime-report and -ftime-trace can be used to analyze the performance of Clang's internal passes.

* -fproc-stat-report records the time and memory usage of spawned processes (ld, and gas if -fno-integrated-as is specified).
* -ftime-trace, introduced in 2019, generates Clang timing information in the Chrome Trace Event format (JSON). The format supports nested events, providing a rich view of the front end.
* -ftime-report: The option name is borrowed from GCC.

This post focuses on the traditional -ftime-report, which uses a line-based textual format.

-ftime-report output

The output consists of information about multiple timer groups. The last group spans the largest interval and encompasses timing data from other groups.

Up to Clang 19, the last group is called "Clang front-end time report". You would see something like the following.

% clang -c -w -ftime-report ~/Dev/testsuite/sqlite3.i

The "Clang front-end timer" timer measured the time spent in clang::FrontendAction::Execute, which includes lexing, parsing, semantic analysis, LLVM IR generation, optimization, and machine code generation. However, "Code Generation Time" and "LLVM IR Generation Time" belonged to the default timer group "Miscellaneous Ungrouped Timers". This caused confusion for many users. For example, https://aras-p.info/blog/2019/01/12/Investigating-compile-times-and-Clang-ftime-report/ elaborates on the issues.

To address the ambiguity, I revamped the output in Clang 20.

...

The last group has been renamed and changed to cover a longer interval within the invocation. It provides timing information for four stages:

The -ftime-report output further elaborates on these stages through additional groups:

Examples:
"Pass execution timing report" (first instance)

===-------------------------------------------------------------------------===
                         Pass execution timing report
===-------------------------------------------------------------------------===
  Total Execution Time: 3.0009 seconds (3.0016 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.9626 ( 32.7%)   0.0162 ( 26.6%)   0.9788 ( 32.6%)   0.9790 ( 32.6%)  InstCombinePass
   0.3203 ( 10.9%)   0.0056 (  9.2%)   0.3259 ( 10.9%)   0.3263 ( 10.9%)  InlinerPass
   0.3123 ( 10.6%)   0.0068 ( 11.1%)   0.3190 ( 10.6%)   0.3187 ( 10.6%)  SimplifyCFGPass
   ...
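
Because each timer row is a fixed sequence of "seconds ( percent%)" cells followed by the timer name, the report is easy to post-process. The following is a minimal sketch, not part of Clang or LLVM: `parse_timer_group` is a hypothetical helper that reads the header line of one group to learn the column labels, and it assumes the layout shown above (additional columns, such as those produced by memory tracking, are not handled).

```python
import re

# One "seconds ( percent%)" cell, e.g. "0.9626 ( 32.7%)".
CELL = re.compile(r"(\d+\.\d+)\s*\(\s*\d+\.\d+%\)")
# One header label, e.g. "---User Time---" or "--- Name ---".
LABEL = re.compile(r"-{2,}\s*([A-Za-z+ ]+?)\s*-{2,}")


def parse_timer_group(text):
    """Return one dict per timer row of a single group.

    Each dict maps a column label (taken from the header line) to seconds
    and stores the timer name under "Name". A trailing "Total" row, if
    present in the input, is returned like any other row.
    """
    columns = None
    rows = []
    for line in text.splitlines():
        if columns is None:
            # Everything before the header (banner, total time) is skipped.
            labels = LABEL.findall(line)
            if labels and labels[-1] == "Name":
                columns = labels[:-1]
            continue
        cells = list(CELL.finditer(line))
        if not cells or len(cells) != len(columns):
            continue  # blank line, "...", or a layout this sketch ignores
        row = {label: float(m.group(1)) for label, m in zip(columns, cells)}
        row["Name"] = line[cells[-1].end():].strip()
        rows.append(row)
    return rows


if __name__ == "__main__":
    sample = """\
   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.9626 ( 32.7%)   0.0162 ( 26.6%)   0.9788 ( 32.6%)   0.9790 ( 32.6%)  InstCombinePass
   0.3203 ( 10.9%)   0.0056 (  9.2%)   0.3259 ( 10.9%)   0.3263 ( 10.9%)  InlinerPass
"""
    for row in parse_timer_group(sample):
        print(row["Name"], row["Wall Time"])
```

Keying the rows by the header labels, rather than by position, keeps the sketch usable when a group prints fewer time columns.
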

When -ftime-report=per-pass-run is specified, a timer is created for each pass object. This can result in significant output, especially for modules with numerous functions, as each pass will be reported multiple times.

As clang -### -ftime-report shows, clangDriver forwards -ftime-report to Clang cc1. Within cc1, this option sets the codegen flag clang::CodeGenOptions::TimePasses. This flag enables the use of llvm::Timer objects to measure the execution time of specific code blocks.

From Clang 20 onwards, the placement of the timers can be understood through the following call tree.

cc1_main

llvm/lib/Support/Timer.cpp implements the timer feature. A Timer belongs to a TimerGroup. Timer::startTimer and Timer::stopTimer generate a TimeRecord. In clang/tools/driver/cc1_main.cpp, llvm::TimerGroup::printAll(llvm::errs()); dumps the TimerGroup and TimeRecord information to stderr.
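
The relationship between the three classes can be illustrated with a toy model. This is a simplified sketch written in Python for brevity, not LLVM's implementation: the class names mirror the C++ ones, but the fields, the `all_groups` registry, and the output layout are assumptions made for illustration.

```python
import os
import sys
import time


class TimeRecord:
    """Wall, user and system seconds (a cut-down stand-in for llvm::TimeRecord)."""

    def __init__(self, wall=0.0, user=0.0, system=0.0):
        self.wall, self.user, self.system = wall, user, system

    @classmethod
    def now(cls):
        cpu = os.times()
        return cls(time.perf_counter(), cpu.user, cpu.system)

    def __sub__(self, other):
        return TimeRecord(self.wall - other.wall, self.user - other.user,
                          self.system - other.system)

    def __add__(self, other):
        return TimeRecord(self.wall + other.wall, self.user + other.user,
                          self.system + other.system)


class TimerGroup:
    all_groups = []  # hypothetical registry so print_all can reach every group

    def __init__(self, description):
        self.description = description
        self.timers = []
        TimerGroup.all_groups.append(self)

    def print(self, out, sort_timers=True):
        timers = self.timers
        if sort_timers:
            # Largest wall time first.
            timers = sorted(timers, key=lambda t: t.total.wall, reverse=True)
        out.write(f"{self.description}\n")
        for t in timers:
            r = t.total
            out.write(f"{r.user:8.4f} {r.system:8.4f} {r.wall:8.4f}  {t.name}\n")

    @classmethod
    def print_all(cls, out):
        for group in cls.all_groups:
            group.print(out)


class Timer:
    def __init__(self, name, group):
        self.name = name
        self.total = TimeRecord()  # accumulated over all start/stop pairs
        self._started = None
        group.timers.append(self)

    def start(self):
        self._started = TimeRecord.now()

    def stop(self):
        self.total = self.total + (TimeRecord.now() - self._started)
        self._started = None


if __name__ == "__main__":
    group = TimerGroup("Toy timing report")
    timer = Timer("sum of squares", group)
    timer.start()
    sum(i * i for i in range(200_000))
    timer.stop()
    TimerGroup.print_all(sys.stderr)
```

The point of the model is that a timer only accumulates the difference between two snapshots, while grouping and printing are handled by the group, so one call at the end of the process can dump everything.
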

There are a few cl::opt options:

* sort-timers (default: true): sort the timers in a group in descending wall time.
* track-memory: record increments or decrements in malloc statistics. In glibc 2.33 and above, this utilizes mallinfo2::uordblks.
* info-output-file: dump output to the specified file.

Examples:

clang -c -ftime-report -mllvm -sort-timers=0 a.c

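
When the compiler is driven from a script, each cl::opt option needs its own -mllvm prefix, as in the command above. A minimal sketch of building such a command line follows; `time_report_command` is a hypothetical helper, and the file name `report.txt` in the demo is only a placeholder.

```python
import shlex


def time_report_command(source, llvm_options=(), compiler="clang"):
    """Build an argv list that compiles `source` with -ftime-report."""
    argv = [compiler, "-c", "-ftime-report"]
    for option in llvm_options:
        # The driver forwards the argument that follows -mllvm to LLVM's
        # cl::opt parser, so every option gets its own -mllvm.
        argv += ["-mllvm", option]
    argv.append(source)
    return argv


if __name__ == "__main__":
    argv = time_report_command(
        "a.c", ["-sort-timers=0", "-info-output-file=report.txt"])
    # Print the command instead of running it; pass argv to
    # subprocess.run to execute it where clang is installed.
    print(shlex.join(argv))
```
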
The cl::opt option -time-passes can be used with the LLVM internal tools opt and llc, e.g.

opt -S -passes='default<O2>' -time-passes < a.ll

On Apple platforms, LLVM_SUPPORT_XCODE_SIGNPOSTS=on builds enable os_signpost for startTimer/stopTimer.