The Next Performance Challenge: The Battle for the Burst

In day-to-day use, few people regularly complain about the performance of their tech devices. For the most part, people are content with the experience of using them. Sure, there are some curmudgeonly (or cheap) folks hanging onto older devices that still offer sub-par performance, but they’re becoming the exception instead of the norm.

That doesn’t mean all the performance challenges for tech devices have been solved, however—far from it. In fact, now that the overall bar for performance has been raised to a “good enough” level, component and system designers can finally start to tackle some of the thornier challenges that have been with us for some time.

One of the biggest challenges involves what I’ll call the “bursting” issue. It seems no matter how content we are with a given device’s performance, there almost always comes a moment (or two, or three…) where the performance doesn’t live up to our expectations. Streaming a video, taking multiple photos, playing an online game, and other types of activities can cause a hiccup in an otherwise decent performance experience. These moments may not last long, but they absolutely impact our overall opinion of the device, application or service we’re using.

In virtually all cases, these brief slowdowns involve a burst in activity, or a series of bursts, that places strain on an otherwise solid-performing system. Interestingly, these bursts can cause challenges in several different device subsystems—CPU, graphics, storage, modem and other connectivity—sometimes individually and sometimes simultaneously.

Regardless, some of the more interesting efforts to increase performance in all of these areas are now directed towards battling these burst issues. In the case of CPUs, it often involves more sophisticated chip architectures, with more simultaneous compute threads and pipelines, better branch prediction, larger caches and other enhancements that can ensure the chip is working as effectively as possible. For graphics, some of these same principles also apply, but there are also improvements in geometry engines, programmable shaders, and more.

For storage, meeting these challenges requires faster types of flash memory, more sophisticated controller chips, and better error correction algorithms. In the case of modems and other radios, new technology standards like LTE Advanced, 802.11ac and 802.11ad make a difference, but implementing specific technologies within those standards, like carrier aggregation and multi-user MIMO, also has a big influence on driving higher levels of throughput.

Raw performance improvement is also a factor in all cases, because sometimes it takes raising the overall performance bar to be prepared for the sudden spikes that inevitably occur. While there are different ways of reaching new performance levels in each of those respective component areas, general improvements in silicon manufacturing, die shrinks to smaller process technologies, and Moore’s Law overall conspire to make performance enhancements possible in all of these areas.

In the world of audio equipment, the ability to handle extremes in signal strength is called headroom. Well-designed audio equipment, whether it’s used for listening, creating or recording purposes, has plenty of headroom in order to handle the sudden bursts in volume that often occur in music. Not surprisingly, it adds cost to design and build in that extra headroom. There are always debates about how much headroom is actually necessary and how much it’s worth paying for. While there aren’t necessarily any real right or wrong answers, it’s generally understood that having a decent amount of headroom helps with the overall performance of the audio component (or system) and is worth spending an additional amount on.

For the device and component industry, where “good enough” performance is becoming an increasing threat to upgrade purchases for existing devices, the trick will be to explain how performance headroom can be a valuable, worthwhile investment. Part of the problem is that many existing performance benchmark tests are designed to show off typical tasks, not the bursts in activity that are increasingly the bottleneck for better system performance. As mentioned previously, day-to-day performance on most devices is typically fine for most users, so showing increases in that area can seem like overkill. If new benchmarks were built around how well a device handles (or fails to handle) these bursts, however, that might provide an entirely new way of looking at today’s performance challenges.
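
To make that concrete, here is a minimal sketch of what a burst-oriented micro-benchmark could look like, written in Python. The workload (repeated hashing), the phase lengths and the percentile reported are all illustrative assumptions rather than any existing benchmark suite; the idea is simply to compare tail latency for the same operation when it arrives evenly spaced versus back to back.

```python
import hashlib
import statistics
import time


def one_op(payload: bytes) -> None:
    # Stand-in for a single unit of "bursty" work (e.g. saving a photo or
    # encoding a video frame); here it is just repeated hashing of a buffer.
    for _ in range(50):
        payload = hashlib.sha256(payload).digest()


def run_phase(ops: int, gap_s: float) -> list:
    """Run `ops` operations spaced `gap_s` seconds apart; return per-op latencies."""
    latencies = []
    payload = b"x" * 4096
    for _ in range(ops):
        start = time.perf_counter()
        one_op(payload)
        latencies.append(time.perf_counter() - start)
        time.sleep(gap_s)
    return latencies


def report(label: str, latencies: list) -> None:
    ordered = sorted(latencies)
    p99 = ordered[int(0.99 * (len(ordered) - 1))]
    print(f"{label:>6}: median={statistics.median(ordered) * 1e3:.2f} ms, "
          f"p99={p99 * 1e3:.2f} ms")


if __name__ == "__main__":
    # Steady phase: light, evenly spaced work -- the "good enough" case.
    report("steady", run_phase(ops=200, gap_s=0.01))
    # Burst phase: the same operations fired back to back with no gaps.
    report("burst", run_phase(ops=200, gap_s=0.0))
```

The interesting number is how far the burst phase’s 99th-percentile latency drifts from its median: a device with real headroom keeps the two close together, while a device without it shows exactly the kind of hiccup described above.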

Explaining some of these kinds of concepts in a meaningful way to typical consumers may not be an easy task, but it’s a critical one for future growth.

On a separate and unrelated note, this column marks the one year anniversary of the launch of my company, TECHnalysis Research, as well as the appearance of my weekly column on Tech.pinions.com. I’d just like to give a quick note of thanks for all the support, interest and feedback I’ve received over this past year. It’s been great. Thank you!

Published by

Bob O'Donnell

Bob O’Donnell is the president and chief analyst of TECHnalysis Research, LLC, a technology consulting and market research firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on Twitter @bobodtech.

20 thoughts on “The Next Performance Challenge: The Battle for the Burst”

  1. This is the one area where Apple has a big advantage. By designing their own chips, and showing that they are willing to pay for more silicon in their processor than their competitors, they can create the various subsystems on the processor to give them the headroom they need.

    1. Actually, my view is that Apple are doing the reverse in the A8
      – fewer cores, so less reserve oomph on tap
      – no big.LITTLE, so the lonely “big” side of the equation has to take more care of power consumption
      – a focus on device looks and thinness, with a small battery so power consumption is more of a concern.

      Apple are indeed customizing ARM’s designs, but to me, it seems they go for steady performance, compared for example to the increasingly widespread 4+4 big.LITTLE configuration of the competition.

      1. You take a very narrow view of what is important in a processor in a mobile device.

        But let me make a few points first. Apple makes the most powerful single core in the ARM market. The new Tegra core is similar, but needs a higher clock frequency to match that performance. Single-thread performance is still what is most important in mobile phones.

        Another point: because the A8 runs at a lower frequency than all these other chips you are talking about, Apple has much more room to grow performance. Plus, the thermal dissipation is a lot less at a lower frequency, which is important in a phone.

        I don’t really understand how you can say the A8 has less reserve oomph when it is one of the fastest mobile processors out there, and its heat signature allows it to run with less throttling.

        The real issue that I am talking about is not the cores or the GPU. It is the other subsystems at work. The A8 includes custom video decoders (as do many others), it has custom processing elements that take care of encryption (as will others if they choose to implement that part of the ARM design), and it has custom processing for the camera, which is of the highest quality. An Apple phone also includes an ultra-low-power processing element for tracking motion. These are the subsystems that Apple is willing to spend on a bit more than other companies. I expect them to add more and more of them to their processors in the future.

        1. This was not about performance in general, but about reserve for bursts. Which of the things you list applies specifically to bursts? Indeed, thermal throttling does if it’s a long (several minutes) burst, but… that’s it?
          As far as I know, the VPU in the A8 is a standard part, just like the GPU is. Encryption is no longer optional in 64-bit ARMv8; it was optional in earlier versions only (and often implemented). All the other SoCs also have dedicated camera HW (many support higher resolutions, too). Ditto the external chip for tracking motion; Motorola has one for voice, for example…
          I’m not sure how Apple is willing to spend a bit more on SoCs than other companies. By going for 2 cores when many use 8?

          1. Again, if you have the FASTEST processor, you have the most reserve. Apple has the FASTEST processor. For being a guy who seems to only care about larger numbers, you should have figured that out by now.

          2. Not so fast. It’s not enough to have the fastest car on the straightaway; you also need torque to maintain smoothness of the ride. I’m not saying the A8 doesn’t have these attributes, I don’t know, just saying…

          3. Every mobile processor shares one design goal: the ability to get to its lowest power state as fast as possible and then back up to full power as fast as possible. This happens within a few hundred cycles, and when the chip is running at a billion and a half cycles every second, a few hundred cycles is a fraction of a microsecond, which is basically negligible. Your analogy doesn’t really hold up, since acceleration for every mobile processor from 0 to full bore is basically nothing.

          4. Actually, I see the number of cores as the analogue to “torque,” and what you describe as the analogue to speed.

            Under diverse load (uphill, many threads) more cores help. Under a flat-out single thread, what you just described matters.

          5. Well, actually it does; that’s why there was a move from ever-higher clock speeds to multiple cores over a decade ago. It’s also the reason for turbo boost, for when the loads aren’t as high.

          6. Heat and power dissipation were the primary reasons for moving to multiple cores. They couldn’t scale performance very well anymore. You can only push clock speed so high, and they had reached peak performance. They went to multiple cores instead. However, multi-threaded programming is not up to snuff even now, which is one of the reasons for turbo boost.

            But I’m really having difficulty with the fundamental argument: that the 4-core processor with lower total performance has more headroom than the 2-core processor with higher total performance. It’s just a stupid argument to make. It is especially stupid when the two-core chip operates at a lower clock speed, which allows much easier performance gains in the future. The 2-core chip also exhibits less throttling, making its performance even better over the long run.

            Now, you are saying that the lower-performance chips are better because they have more cores, which are more difficult to program for. Or somehow the performance is better because they can leave one core on and just process with that? I don’t know.

          7. I’m not saying you’re entirely wrong, but I do think you are neglecting how threads of execution are handled. Multiple slower cores can (and do) outperform fewer faster cores. I’m not a CS guy, but even I know that. Of course that depends on the type of workload, whether it’s amenable to parallelism or not. Running disparate processes, as PCs and mobile devices do, fits that bill.
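
As a rough way to quantify the cores-versus-single-thread-speed tradeoff argued in the thread above, here is a minimal Amdahl’s-law sketch in Python. The two hypothetical designs (two fast cores versus four cores at 60% of the speed) and the parallel fractions are illustrative assumptions, not measurements of the A8 or any competing SoC.

```python
def exec_time(work: float, parallel_fraction: float,
              cores: int, per_core_speed: float) -> float:
    """Amdahl-style execution time: the serial part runs on one core,
    the parallel part is split evenly across all cores."""
    serial = work * (1.0 - parallel_fraction) / per_core_speed
    parallel = work * parallel_fraction / (per_core_speed * cores)
    return serial + parallel


if __name__ == "__main__":
    WORK = 100.0  # arbitrary work units
    # Hypothetical designs: a few fast cores vs. more, slower cores.
    designs = {"2 fast cores": (2, 1.0), "4 slower cores": (4, 0.6)}
    for frac in (0.3, 0.7, 0.95):  # share of the workload that can run in parallel
        times = {name: exec_time(WORK, frac, cores, speed)
                 for name, (cores, speed) in designs.items()}
        winner = min(times, key=times.get)
        summary = ", ".join(f"{name} = {t:.1f}s" for name, t in times.items())
        print(f"parallel fraction {frac:.2f}: {summary} -> {winner} wins")
```

Under these assumptions the two fast cores win for mostly serial workloads and only lose once the work is overwhelmingly parallel, which is roughly the point both sides of the thread are circling: the answer depends on how parallel the workload actually is.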

  2. Explaining some of these kinds of concepts in a meaningful way to typical consumers may not be an easy task, but it’s a critical one for future growth.

    I think that this generally has to be combined with software and marketing.

    For example, if companies can persuade consumers that editing video on their smartphones is very useful, then those customers will demand higher performing hardware. To do this, the companies will have to provide compelling video editing software for smartphones. They will also have to air commercials that show how easy and satisfying it is to do.

    Likewise, if companies can convince customers that smartphones are capable of console-level game performance, they will also want the best hardware. To do this, companies will have to put good GPUs on their smartphones, combined with developer tools (APIs) that are optimised for graphics performance. They will also have to showcase great games running on their hardware.

    My observation is, ever since Steve Jobs started preaching the digital hub, Apple has consistently made an effort to do this. They’re pretty good at it by now.

    1. Agreed that software is a key part of this story. It’s advanced applications that tend to drive these “bursty” performance requirements.

      1. I was wondering what you felt about the PC industry back in the early 2000s. About the time that Steve Jobs started talking about the digital hub.

        If I remember correctly, at that time, PCs were already “good enough” for office productivity and basic web surfing. I recall some people were already proclaiming the PC to be dead, to which Steve Jobs replied “We don’t think the PC is dying at all. We don’t think the PC is moving from the center at all. We think it’s evolving. Just like it has since it was invented in 1975 and ’76.”

        What Steve Jobs then did was to launch iPhoto, iMovie, iDVD, iTunes, GarageBand and iWeb in the iLife suite: apps that were necessary for the digital hub strategy, but which also required a lot of processing capability and made powerful PCs more desirable.

        Is history about to repeat itself? Will we see dramatic new applications that will demand more performance of PCs or smartphones? VR is an obvious possibility, but I’m sure that there are more.

    1. I agree with you; those are but some of the reasons I favor “speeds and feeds” (so maligned by some). You’re never fast enough in general. You’re only fast enough for specific tasks. That may be many, but when you need speed, you need it bad.
          That, and speed and experience are aligned, not mutually exclusive.
