@@ -115,110 +104,61 @@ standards, fast compilation, and low memory use. Like LLVM, Clang provides a
modular, library-based architecture that makes it suitable for creating or
integrating with other development tools. Clang is considered a
production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86
(32- and 64-bit), and for darwin-arm targets.</p>
<p>In the LLVM 2.8 time-frame, the Clang team has made many improvements:</p>
<ul>
<li>Clang C++ is now feature-complete with respect to the ISO C++ 1998 and 2003 standards.</li>
<li>Added support for Objective-C++.</li>
<li>Clang now uses LLVM-MC to directly generate object code and to parse inline assembly (on Darwin).</li>
<li>Introduced many new warnings, including <code>-Wmissing-field-initializers</code>, <code>-Wshadow</code>, <code>-Wno-protocol</code>, <code>-Wtautological-compare</code>, <code>-Wstrict-selector-match</code>, <code>-Wcast-align</code>, <code>-Wunused</code> improvements, and greatly improved format-string checking.</li>
<li>Introduced the "libclang" library, a C interface to Clang intended to support IDE clients.</li>
<li>Added support for <code>#pragma GCC visibility</code>, <code>#pragma align</code>, and others.</li>
<li>Added support for SSE, AVX, ARM NEON, and AltiVec.</li>
<li>Improved support for many Microsoft extensions.</li>
<li>Implemented support for blocks in C++.</li>
<li>Implemented precompiled headers for C++.</li>
<li>Improved abstract syntax trees to retain more accurate source information.</li>
<li>Added driver support for handling LLVM IR and bitcode files directly.</li>
<li>Major improvements to compiler correctness for exception handling.</li>
<li>Improved generated code quality in some areas:
<ul>
<li>Good code generation for X86-32 and X86-64 ABI handling.</li>
<li>Improved code generation for bit-fields, although important work remains.</li>
@@ -751,343 +637,187 @@ infrastructure, which allows us to implement more aggressive algorithms and make
it run faster:</p>
<ul>
<li>The clang/gcc -momit-leaf-frame-pointer argument is now supported.</li>
<li>The clang/gcc -ffunction-sections and -fdata-sections arguments are now
supported on ELF targets, as they are in GCC.</li>
<li>The MachineCSE pass is now tuned and on by default. It eliminates common
subexpressions that are exposed when lowering to machine instructions.</li>
<li>The "local" register allocator was replaced by a new "fast" register
allocator. This new allocator (which is often used at -O0) is substantially
faster and produces better code than the old local register allocator.</li>
<li>A new LLC "-regalloc=default" option is available, which automatically
chooses a register allocator based on the -O optimization level.</li>
<li>The common code generator code was modified to promote illegal argument and
return value vectors to wider ones when possible instead of scalarizing
them. For example, &lt;3 x float&gt; is now passed in one SSE register
instead of three on X86. This generates substantially better code, since the
rest of the code generator was already expecting this.</li>
<li>The code generator uses a new "COPY" machine instruction. This speeds up
the code generator and eliminates the need for targets to implement the
isMoveInstr hook. Also, the copyRegToReg hook was renamed to copyPhysReg
and simplified.</li>
<li>The code generator now has a "LocalStackSlotPass", which optimizes stack
slot access for targets (like ARM) that have limited stack displacement
addressing.</li>
<li>A new "PeepholeOptimizer" is available, which eliminates sign and zero
extends, and optimizes away compare instructions when the condition result
is available from a previous instruction.</li>
<li>Atomic operations now get legalized into simpler atomic operations if not
natively supported, easing the implementation burden on targets.</li>
<li>We have added two new bottom-up, pre-allocation, register-pressure-aware schedulers:
<ol>
<li>The hybrid scheduler schedules aggressively to minimize schedule length when registers are available and avoids overscheduling in high-pressure situations.</li>
<li>The instruction-level-parallelism scheduler schedules for maximum ILP when registers are available and avoids overscheduling in high-pressure situations.</li>
</ol></li>
<li>The tblgen type inference algorithm was rewritten to be more consistent and
diagnose more target bugs. If you have an out-of-tree backend, you may
find that it exposes bugs in your target description. The rewrite also
allows limited support for writing patterns for instructions that return
multiple results (e.g. a virtual register and a flag result). The
'parallel' modifier in tblgen was removed; you should use the new support
for multiple results instead.</li>
<li>A new (experimental) "-rendermf" pass is available which renders a
MachineFunction into HTML, showing live ranges and other useful
details.</li>
<li>The new SubRegIndex tablegen class allows subregisters to be indexed
symbolically instead of numerically. If your target uses subregisters, you
will need to adapt it to use SubRegIndex when you upgrade to 2.8.</li>
<!-- SplitKit -->
<li>The -fast-isel instruction selection path (used at -O0 on X86) was rewritten
to work bottom-up on basic blocks instead of top-down. This makes it
slightly faster (because the MachineDCE pass is no longer needed) and
allows it to generate better code in some cases.</li>