There was an initial AnTuTu benchmark where CT+ scored high by 'skipping' parts of instructions. The 'problem' was fixed and the benchmark was revised, bringing down the CT+ score substantially.
I think it is possible that both benchmarks are valid in different contexts. The former is representative of a real-world scenario where the x86 compiler optimizes the executable to skip 'unnecessary' instructions. However, for a pure hardware test, competing processors need to execute the same (or equivalent) set of instructions, even if that means executing unnecessary instructions. This is what was done in the revised benchmark.
I am just talking about the possibility; I don't know what the real case is. But if the above thought is true, CT+ and its successors will have an additional advantage in running real-world applications, through compiler optimization. In real-world situations, CT+/BT would run as fast as in the first AnTuTu benchmark.
Another question is why the AnTuTu benchmark needs a C/C++ compiler at all, whether ICC or GCC. Is it based on C/C++ code? Does it bypass the Java-based Android OS on Android platforms and deal directly with the processor?
AnTuTu is a C-based benchmark and needs a C compiler. On Android you can have native applications and Java-based ones. Most apps are in Java. The OS itself is mostly written in C.
No, it's not possible that both benchmarks are valid. The point of a benchmark is to attempt an apples-to-apples comparison. Even basic changes to the compile options invalidate the test (or rather the comparisons to other results).
Yes, using ICC would (in many cases) generate better code than GCC. As would using different compile options. As would using ARM's own compiler. All approaches would make the code run faster and produce better scores. Basically ARM code would also run faster using a better compiler and turning on switches that enable optimization.
ICC 'eliminated' parts of the test through optimization. This is a legitimate way for compilers to improve performance but it 'breaks' the benchmark as all logic needs to be run for the test to be valid.
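To make the 'eliminated' part concrete, here is a minimal C sketch of how this kind of thing can happen. It is not AnTuTu's code; `kernel`, `run_discarded`, `run_observed` and `sink` are made-up names for illustration. When a result is never used, an optimizing compiler is allowed to drop the work that produced it. Storing the result through a `volatile` object is one common way to force the value to be produced, although the compiler may still compute it in a cheaper way than the loop as written.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical benchmark kernel: sums i*i + 1 for i in [0, n).
   Unsigned arithmetic wraps, so there is no undefined behavior. */
static uint32_t kernel(uint32_t n)
{
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++) {
        acc += i * i + 1u;
    }
    return acc;
}

/* The result is discarded. kernel() has no side effects, so an
   optimizing compiler is free to remove the call completely.
   Timing this function may measure almost nothing. */
void run_discarded(uint32_t n)
{
    (void)kernel(n);
}

/* The result is written to a volatile object, which counts as an
   observable side effect, so the value has to be produced. */
volatile uint32_t sink;

uint32_t run_observed(uint32_t n)
{
    uint32_t result = kernel(n);
    sink = result;
    return result;
}

/* Sanity check a harness might run once before timing anything:
   0+1, 1+1, 4+1, 9+1 add up to 18. */
void self_check(void)
{
    assert(run_observed(4) == 18u);
}
```

Whether a given compiler actually removes the call in `run_discarded` depends on the compiler and its flags, which is exactly why the flags end up being part of the result.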
Your paragraph #2 about "work" performed is very good and I think you are correct. It is for very similar reasons that I have big problems with Geekbench. Geekbench runs the floating point benchmark on ARM in single-core scalar mode and multi-core scalar mode. On x86, it runs the same AND adds a single/multi-core VECTOR version. Geekbench still stirs the SCALAR versions of the code into the x86 score EVEN though x86 would never execute that code in non-vector form. If everyone is so upset about the inequity of these results, why aren't they upset about the extra scalar results Geekbench stirs into its one composite number?
Invariably a benchmark developer is faced with the decision to choose between how work is done by the tools on the system AND how much of that performance is deliverable using other means. It is very tough.
I have no problem with saying that the compiler used is part of the result. SPEC has BASE LINE and OPTIMIZED runs that can be submitted. The requirement for BASE LINE is that there be no special flags for a particular benchmark.
Benchmarking is tough.
If AnTuTu needs a C/C++ compiler, then it is written in C/C++.
Any OS exposes an "API", or application programming interface, which defines the set of functions that a programmer can call from source code.
The OS ALSO defines an application binary interface (ABI), which defines where the function parameters are placed when the OS gets a function call.
It is not hard to write a thin translation layer to match a compiler with an OS interface.
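As a rough illustration of what a thin layer like that can look like at the source level, here is a small C sketch. Everything in it is hypothetical: `os_write` stands in for whatever entry point the OS really exposes, and `rt_puts` stands in for the function the compiler's runtime expects to find. The real ABI details (which registers or stack slots hold the parameters) are handled by the compiler and are not visible in C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for an OS entry point (hypothetical). It copies len bytes
   into a fake device buffer and returns the byte count, or -1 on error. */
static char device_buffer[64];
static size_t device_used;

static long os_write(int handle, const void *data, size_t len)
{
    if (handle != 1 || len > sizeof device_buffer - device_used) {
        return -1;
    }
    memcpy(device_buffer + device_used, data, len);
    device_used += len;
    return (long)len;
}

/* The thin layer: the signature the runtime wants (hypothetical),
   translated into the call the OS understands. */
long rt_puts(const char *s)
{
    assert(s != NULL);
    return os_write(1, s, strlen(s));
}
```

The layer does no real work of its own. It only rearranges arguments, which is why it costs very little at run time.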
I'm not sure why you have an issue with the vector tests under Geekbench. Yes, they are included in the final score, but you can remove them when doing like-for-like comparisons. I haven't checked, but they could actually make x86 scores better. The reason there are no vector tests for ARM is the Tegra2, which has no SIMD unit :) Geekbench 3 is around the corner, so I would expect it to include vector scores for ARM.
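Removing the vector tests for a like-for-like comparison could be sketched roughly like this in C. The plain arithmetic mean is an assumption for illustration, not Geekbench's documented formula, and `struct subtest` and `composite` are made-up names. Any scores fed into it would have to come from real runs.

```c
#include <assert.h>
#include <stddef.h>

/* One subtest result. The layout is hypothetical. */
struct subtest {
    const char *name;
    double score;
    int is_vector;   /* 1 if the subtest only exists where vector code is run */
};

/* Average the subtests, optionally skipping the vector-only ones.
   ASSUMPTION: a simple arithmetic mean, used only to show the idea.
   Returns 0.0 when no subtest is selected. */
double composite(const struct subtest *t, size_t n, int include_vector)
{
    double sum = 0.0;
    size_t used = 0;

    assert(t != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!include_vector && t[i].is_vector) {
            continue;
        }
        sum += t[i].score;
        used++;
    }
    return used ? sum / (double)used : 0.0;
}
```

Calling `composite(results, n, 0)` on both platforms compares only the subtests they have in common, while `composite(results, n, 1)` gives the number with the vector tests mixed in.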