From Android Authority:
The Exynos 5 Octa and the State of Samsung SoCs
by Daniel Charlton on June 2, 2013 8:47 am
"All of this leads up to the Exynos 5 Octa and the Galaxy S4. The Octa is used in the GT-I9500 variant of the Galaxy S4 and is supposed to solve the problem of Cortex-A15 power consumption by using ARM’s big.LITTLE architecture. big.LITTLE allows the use of two core clusters, one for high-performance tasks and one for low-performance tasks. In the Octa, this is a quad-core Cortex-A15 cluster and a quad-core Cortex-A7 cluster, all built on 28nm. In big.LITTLE, there are supposed to be three modes for managing threads across all of the cores in both clusters. Evidence thus far suggests that the Octa really only supports one of these modes – the least efficient. Even worse, it appears that this limitation is due to crippled hardware in the SoC and not something that can be fixed in software."
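For anyone unsure what the three modes are: they are usually described as cluster migration, core migration (per-core switching between paired big and little cores), and global task scheduling (all eight cores visible to the OS). Below is a rough Python sketch of how they differ for four running loads. The names `Mode`, `visible_cores` and `place`, and the policy that any single heavy load pulls the whole cluster up, are my own assumptions for illustration, not Samsung's or ARM's actual code.

```python
from enum import Enum

class Mode(Enum):
    CLUSTER_MIGRATION = "cluster"       # whole cluster switches: A15s or A7s
    CORE_MIGRATION = "core"             # each A15/A7 pair switches on its own
    GLOBAL_TASK_SCHEDULING = "gts"      # OS sees all eight cores at once

BIG, LITTLE = "A15", "A7"
CORES_PER_CLUSTER = 4

def visible_cores(mode):
    """Number of cores the OS scheduler can use at the same time."""
    if mode is Mode.GLOBAL_TASK_SCHEDULING:
        return 2 * CORES_PER_CLUSTER
    return CORES_PER_CLUSTER

def place(mode, wants_big):
    """Return the core type each load ends up on.

    wants_big: one boolean per load, True if that load is heavy.
    Assumed policy (hypothetical): under cluster migration, a single
    heavy load moves every load to the big cluster.
    """
    if mode is Mode.CLUSTER_MIGRATION:
        chosen = BIG if any(wants_big) else LITTLE
        return [chosen] * len(wants_big)
    # Core migration and global task scheduling can decide per load.
    return [BIG if heavy else LITTLE for heavy in wants_big]

loads = [True, False, False, False]   # one heavy load, three light ones
print(place(Mode.CLUSTER_MIGRATION, loads))
print(place(Mode.CORE_MIGRATION, loads))
print(visible_cores(Mode.GLOBAL_TASK_SCHEDULING))
```

With one heavy load and three light ones, the cluster-migration case runs all four on A15s, while the other two modes only need one A15. That is why cluster migration is called the least efficient of the three.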
"According to Samsung, there is no hardware problem at all and the company chose cluster-migration because it “show[s] increased performance/efficiency.” But the statement does not match up with most people’s understanding of ARM’s big.LITTLE architecture. Moreover, ARM demonstrated core-migration working on a pre-release version of the Octa. And Samsung’s released kernel source code for the Octa includes the drivers for core-migration. But that code does not work in the final release version of the Octa and Samsung has been coy in giving a straight answer. Based on this, Linus Torvalds wrote: “quite frankly, the fact that the Exynos 5 currently only works in ‘either or’ configuration almost certainly means that there is something fundamentally wrong with the hardware design, to the point where no amount of ‘complex patches’ can fix it.” While this certainly makes it seem that Samsung has done something wrong in the Octa, the chip is still entirely based on designs from ARM. Torvalds goes on to point out, "
They chose to do it this way because the A15 does not scale very well along the performance-power curve. The other big problem with core migration is that, in addition to the cost of context switching, apps have to be compiled for the lowest common instruction set, which limits the use of instruction set extensions.
It's the other way around: big/little is what allowed the A15 to be so powerful and power hungry. Big/little isn't a hack, it's an integral part of ARM's design, and it carries over into the next-generation CPU design (and its GPUs). If the interconnect doesn't work, then ARM has problems.
Every big/little CPU config uses the exact same ISA on both clusters, which allows extremely quick context switches.
The A7 and A15 have the same instruction set by design, as will the A53 and A57. The bigger problem is that the CCI (the cache coherent interconnect) is apparently not working. Is that an ARM problem or a Samsung-specific one? If it's an ARM problem, then you can forget about building any serious large-memory ARM SMP server systems. ARM flaps its gums a lot, but they are not looking too smart at the moment. Also, having Torvalds publicly disparage your architecture is not good PR; he has slagged off Itanium for years, which has not done it any PR favors.
"Torvalds goes on to point out, he has “very little reason to believe that ARM engineers got their cache handling right. They've never done that before. They've had some of the crappiest caches on the planet.” So, it is very likely that the problem is not even inside Samsung’s control."
I will be interested to see how they approach cache coherency. Moving data around the system to keep it coherent wiggles some big transistors. Intel made a big change in the way they handle coherency in the 45nm Nehalem CPU, which resulted in a big performance jump. Intel likely has a couple of patents on that design that will generate some revenue if someone chooses to use it.
The other interesting problem is what the scheduler looks like if they operate in either 4-big or 4-little mode. When I push the start button, does a little wake up or does a big wake up? If I start an app, does it choose the littles or the bigs? If I have 4 apps running, and 1 would like a big core while 3 would work fine with little cores, what does it choose?
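To make those questions concrete, here is a toy Python sketch of one way an either-or switcher could decide: start on the littles, jump to the bigs when load crosses an upper threshold, and drop back only once load falls below a lower one, so it does not flap between clusters. The function names and the 0.8 / 0.3 thresholds are made up for illustration; I have no idea what the real governor on the Octa does.

```python
def next_cluster(current, load, up=0.8, down=0.3):
    """Pick the cluster for the next interval from the current load (0.0 to 1.0).

    Hypothetical hysteresis policy: switch up above `up`, switch down
    below `down`, and otherwise stay where we are.
    """
    if current == "little" and load > up:
        return "big"
    if current == "big" and load < down:
        return "little"
    return current

def run(loads, start="little"):
    """Replay a sequence of load samples and record the cluster chosen for each."""
    cluster = start
    trace = []
    for load in loads:
        cluster = next_cluster(cluster, load)
        trace.append(cluster)
    return trace

# In either-or mode the decision has to be made for all apps together,
# so one plausible input is the load of the hungriest app (an assumption).
app_loads = [0.9, 0.2, 0.2, 0.2]       # 1 app wants a big core, 3 do not
print(next_cluster("little", max(app_loads)))
print(run([0.1, 0.9, 0.5, 0.2]))
```

Under these assumptions the device wakes on a little, and the one hungry app in the 4-app case drags all four onto the bigs, which is exactly the waste the either-or design cannot avoid.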
These are very hard problems to solve and ARM and those who design their own versions will have to do a lot of head scratching.