Flow Claims That with Its Companion Chip and Some Muscle, it Can Boost the Performance of Any CPU by 100x | TechCrunch - Latest Global News

A Finnish startup called Flow Computing is making one of the boldest claims ever heard in silicon: By adding a proprietary companion chip, any CPU can instantly double its performance, and increase it up to 100x with software optimizations.

If it works, it could help the industry keep pace with AI makers’ insatiable demand for computing power.

Flow is a spin-off of VTT, a government-funded research organization in Finland that is something of a national lab. The chip technology it is commercializing, which it calls the Parallel Processing Unit, is the result of research conducted in that lab (although VTT is an investor, the intellectual property belongs to Flow).

Flow is the first to admit that this claim is ridiculous on its face. You can’t just magically squeeze extra performance out of CPUs across different architectures and codebases. If that were the case, Intel or AMD or whoever would have done it years ago.

But Flow has been working on something that has long been theoretically possible; it’s just that no one has managed to pull it off.

Central processing units have come a long way since the early days of vacuum tubes and punch cards, but in some fundamental ways they’re still the same. Their main limitation is that, as serial rather than parallel processors, they can only do one thing at a time. Of course, they switch that thing between multiple cores and paths a billion times a second — but those are all ways to accommodate the single-lane nature of the CPU. (A GPU, on the other hand, performs many related calculations simultaneously, but specializes in certain operations.)

“The CPU is the weakest link in computing,” says Flow co-founder and CEO Timo Valtonen. “It is not up to the task, and that must change.”

CPUs have become very fast, but even with a nanosecond response time, there is enormous waste in the execution of instructions, simply because of the fundamental limitation that one task must be completed before the next can start. (I’m simplifying here, as I’m not a chip engineer myself.)

Flow claims to have removed this limitation, transforming the CPU from a single-lane road into a multi-lane highway. While the CPU is still limited to handling one task at a time, Flow’s PPU, as they call it, essentially performs nanosecond traffic management on the chip to move tasks in and out of the processor faster than was previously possible.

Imagine the CPU as a chef working in the kitchen. The chef can only go so fast, but what if that person had a superhuman assistant that puts knives and tools in the chef’s hands, clears the prepared food, puts in new ingredients, and takes care of all the tasks that are not part of the actual cooking? The chef still only has two hands, but now he can work ten times faster.

Chart (note the logarithmic scale) showing the improvements of an FPGA PPU-enhanced chip over unmodified Intel chips. Increasing the number of PPU cores continually improves performance.
Image credits: Flow Computing

It’s not a perfect comparison, but it gives you an idea of what’s happening here, at least according to Flow’s internal testing and demos with the industry (and they are talking with everyone). The PPU doesn’t increase the clock speed or otherwise push the system in any way that would result in extra heat or power draw; in other words, it doesn’t ask the chef to chop twice as fast. It just makes more efficient use of the CPU cycles that are already taking place.
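The chef analogy can be turned into a toy calculation. The sketch below is not Flow’s design, which the article does not describe in that much detail. It only illustrates the general idea of overlapping preparation with work at an unchanged working speed. The function names and the task timings are made up for illustration.

```python
# Toy model of the chef analogy. This is NOT Flow's PPU design; it only shows
# how overlapping "setup" with "cooking" shortens total time without making
# the chef (the CPU) work any faster. All timings are invented.

def serial_time(tasks):
    """Chef alone: does the setup and then the cooking for each task in turn."""
    return sum(setup + cook for setup, cook in tasks)


def assisted_time(tasks):
    """Chef plus one assistant: the assistant prepares tasks back to back,
    and the chef starts cooking a task as soon as it is prepared and the
    previous dish is done (a two-stage pipeline)."""
    setup_done = 0  # time at which the assistant finishes preparing the current task
    cook_done = 0   # time at which the chef finishes the current dish
    for setup, cook in tasks:
        setup_done += setup
        cook_done = max(cook_done, setup_done) + cook
    return cook_done


# Ten hypothetical tasks, each needing 1 unit of setup and 1 unit of cooking.
tasks = [(1, 1)] * 10
print(serial_time(tasks))    # 20
print(assisted_time(tasks))  # 11
```

In this made-up case the chef cooks at exactly the same pace in both runs; the time saved comes entirely from no longer waiting on setup.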

This is nothing new, says Valtonen. “It has been studied and discussed in the academic world. Parallelization is already possible, but it destroys the legacy code and is then useless.”

So it could be done. It just couldn’t be done without rewriting all the code in the world from scratch, which makes it a nonstarter. A similar problem was solved by another Nordic computing company, ZeroPoint, which achieved a high level of memory compression while remaining transparent to the rest of the system.

So the great achievement of Flow is not its high-speed traffic management, but the fact that it manages this without changing any code on any CPU or architecture it has tested. It sounds kind of crazy to claim that arbitrary code can run twice as fast on any chip without any changes other than integrating the PPU into the chip.

Therein lies the biggest challenge to Flow’s commercial success: unlike a software product, Flow’s technology has to be integrated at the chip design level, meaning it can’t be applied retroactively, and the first chip with a PPU would inevitably be a long way off. Flow has shown that the technology works in FPGA-based test setups, but chipmakers would have to commit considerable resources to see the gains in question.

The founding team of Flow, from left: Jussi Roivainen, Martti Forsell and Timo Valtonen.
Photo credits: Flow Computing

However, the magnitude of these gains, and the fact that CPU improvements have been iterative and marginal over the past few years, might have chipmakers knocking on Flow’s door pretty urgently. If you can really double your performance in a generation with a layout change, that’s a no-brainer.

Further performance gains are achieved by refactoring and recompiling software to work better with the PPU-CPU combination. Flow says it has seen performance gains of up to 100x in code that has been modified (but not necessarily completely rewritten) to take advantage of the technology. The company is working to offer recompilation tools to simplify this task for software makers looking to optimize for Flow-enabled chips.
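One way to see why modified code could gain far more than unmodified code is Amdahl’s law, a standard formula for the speedup available when only part of a program can run in parallel. Flow does not cite it in this article, and the fractions and unit counts below are purely illustrative assumptions, not Flow’s measurements.

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of
# the work that can run in parallel and n is the number of parallel units.
# The function name and the example inputs are illustrative assumptions.

def amdahl_speedup(parallel_fraction, n_units):
    """Return the ideal speedup for a given parallel fraction and unit count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


# If half of a program parallelizes, the speedup can never exceed 2x,
# no matter how many parallel units are added.
print(amdahl_speedup(0.5, 64))

# A speedup approaching 100x requires about 99% of the work to be parallel.
print(amdahl_speedup(0.99, 256))
print(amdahl_speedup(0.999, 256))
```

Under this simple model, raising the parallel fraction, for example by refactoring and recompiling, matters far more than adding parallel units once the serial part dominates.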

Tirias Research analyst Kevin Krewell, who was briefed on Flow’s technology and brought in as an outside expert on the matter, was more concerned about industry adoption than fundamentals.

He rightly pointed out that AI acceleration is the biggest market right now, served by specialty chips like Nvidia’s popular H100. While a PPU-accelerated CPU would bring benefits across the board, chipmakers may not want to stray too far from their current course. And it raises the question of whether these companies are willing to invest significant resources in a largely unproven technology when they likely have a five-year plan that would be derailed by that choice.

Will Flow’s technology become an indispensable part of every chipmaker’s business, bringing the company fortune and prominence? Or will penny-pinching chipmakers stay the course and continue to extract profits from the ever-growing computing market? Probably somewhere in between, but it’s telling that even if Flow has achieved a great engineering feat here, like all startups, the company’s future depends on its customers.

Flow is emerging from stealth with €4 million (approximately $4.3 million) in pre-seed funding led by Butterfly Ventures, with participation from FOV Ventures, Sarsia, Stephen Industries, Superhero Capital and Business Finland.
