> And in our new Tensor G3 chip, every major subsystem has been upgraded, paving the way for on-device generative AI.
This is definitely weasel-worded, but note that this does not actually have to mean that all the generative AI stuff can be done on Tensor G3. They could (credibly, but, again, weasel-y) claim that the stuff they've done on G3 is prep work that is "paving the way" for a future chip to be able to do it all.
But either way, I'm not terribly surprised? Generative AI on a mobile SoC (even with special-purpose hardware) still sounds like a bit of a stretch at this point, no? At least, doing it on-device with acceptable performance and power consumption seems a little unlikely?
I would say it's also done for copyright reasons. Google will not do Assistant read-aloud if the article is behind a paywall. But the new read-aloud that is done on-device allows for reading almost any website.
Google has an enormous advantage over Apple in data center compute. For one thing, they don't have to pay the Nvidia monopoly tax to do inference. It's silly to color this as a disadvantage.
> As we covered exclusively earlier this week, in an extraordinary move, Google went out of its way to block reviewers from being able to easily install popular benchmark apps through its Play Store during the review embargo period. This actually also extended into the post-launch period too, however Google lifted the ban after our article went live. Tests using Primate Labs’ popular cross-platform benchmark Geekbench 6 showed that -- despite having quite new CPU architecture -- Tensor G3 performance is closer to the mid-range Qualcomm Snapdragon 7+ Gen 2 than it is to its current flagship chip the Snapdragon 8 Gen 2.
> Our work with Tensor has never been about speeds and feeds, or traditional performance metrics. It’s about pushing the mobile computing experience forward. And in our new Tensor G3 chip, every major subsystem has been upgraded, paving the way for on-device generative AI. It includes the latest generation of Arm CPUs, an upgraded GPU, new ISP and Imaging DSP and our next-gen TPU, which was custom-designed to run Google’s AI models.
Then why the internet requirement? Were they trying something that failed at the last minute, forcing them to ship it the way it is? Or was this always just advertising material?
> Our work with Tensor has never been about speeds and feeds, or traditional performance metrics
Meanwhile, on [1] they say:
> Pixel 8 and Pixel 8 Pro are equipped with Google Tensor G3, Google’s fastest, most efficient, and most secure chip yet. Every major component of the chip has been upgraded for enhanced performance and efficiency. Not only has the number of machine learning models on device more than doubled since 2021, but their complexity and sophistication have increased as well. And Tensor is running many of these complex models at the same time.
Sounds to me like it was very much about performance...
> Were they trying something that failed last minute
Seems likely. Your chip underperforms? Just block people from using benchmarking apps, and also tell the press you never cared about speed. Problem solved.
This kind of gaming is rife across the whole benchmarking industry. Lies, damned lies, and benchmarks.
The massive problem, which has particularly afflicted Qualcomm and Intel in the past, is that the gap between peak and sustained performance is absolutely huge: running at peak triggers thermal conditions that necessitate dynamic down-clocking to below-normal speeds, which then persists for a surprisingly long time before normal performance resumes. Certain groups would detect known benchmarking code and alter the acceptable thermal parameters for the duration of the benchmark run (i.e. allow the device to become unusually hot).
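In pseudocode, the trick amounts to something like the sketch below. The package names and thresholds are invented for illustration; real implementations lived in closed-source power/thermal HALs, not app-level Python.

```python
# Hypothetical sketch of benchmark whitelisting in a thermal governor.
# Package names and temperature limits below are made up for illustration.
BENCHMARK_PACKAGES = {
    "com.primatelabs.geekbench6",            # hypothetical example entries
    "com.futuremark.dmandroid.application",
}

NORMAL_LIMIT_C = 42.0  # skin-temperature ceiling enforced for ordinary apps
BENCH_LIMIT_C = 48.0   # relaxed ceiling quietly applied to known benchmarks

def throttle_limit(foreground_package: str) -> float:
    """Return the skin-temperature ceiling the governor enforces."""
    if foreground_package in BENCHMARK_PACKAGES:
        # Let the device run unusually hot so the benchmark sees peak clocks.
        return BENCH_LIMIT_C
    return NORMAL_LIMIT_C
```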
As others have mentioned, this has all the hallmarks of an execution failure, whether by Google, Samsung, both, or some unknown third party. Google has a cultural problem of believing way too much in theoretical, untested solutions instead of aiming for the boring approach that is known to work but is only 95% as good on paper, only to find when building it that reality is different enough to more than nullify the theoretical advantages. It has bitten them in this area before, and it more than likely will again.
Now that you mention it... I have only seen one workstation review that benchmarked machines for 24 hours to make sure they didn't thermally saturate and throttle, and I'm wondering why that isn't more of a standard metric for non-mobile gear. Even mobile workstations may be put under heavy load for long stretches, and this would show whether their cooling keeps up or falls short.
No, Apple is not the same here. Their devices do not suffer anything like the same prolonged throttling after running at peak, but they can do nothing to stop developers from draining the battery, and that's what is going on. If they were to try, developers would throw a giant hissy fit.
I have worked with prototypes from other manufacturers that had to be recalled after burning people. I have also found bugs in SoCs that their manufacturers were lying to OEMs about, and had to explain to different OEMs what was going on, after they had shipped millions of units.
The simple truth is that modern developers are irresponsible, and so the manufacturers get to play everyone off against each other, spreading as much self-serving nonsense as possible.
The notebookcheck.net article states:
"In a highly unusual move, it has been revealed that Google has blocked reviewers of the Pixel 8 and Pixel 8 Pro from installing popular benchmarks including Geekbench and 3D Mark."
There's an embedded URL linking to a YouTube video, which shows that the Google Play Store won't allow the user to install Geekbench or 3DMark on the pre-release Pixel 8 review devices.
I saw some phone geeks suspecting a last-minute addition of thermal dispersion under the LCD, as well as a frequency limiter to manage heat. Not sure where the suspicion came from, but it sounds plausible.
There's a bit of a difference between an on-device chip for some AI tasks (like all modern smartphones and some computers have) and high-end server-class graphics cards/GPUs/accelerators that take hundreds of watts to run generative AI.
State of the art for low end generative AI is still running on some sort of desktop class GPU with optimised models and getting not-so-great results. A phone can't do that.
Lots of ML happens on device, more is moving to the device. Generative AI is not ready for that yet from what I can tell (based on external experience, not Google specific).
"...The result of this full-stack optimization is running Stable Diffusion on a smartphone under 15 seconds for 20 inference steps to generate a 512x512 pixel image — this is the fastest inference on a smartphone and comparable to cloud latency. User text input is completely unconstrained."
Hasn't shown up in a shipping cellphone yet AFAIK.
There's a big difference between tech demos and production software for a general audience. Generative AI is only barely production software at the best of times, so almost any compromises mean it's not shippable.
This blog post is interesting, but there are so many caveats to it. It's a 1B-parameter model, which is tiny. Inference takes ~15 seconds, which sucks for UX, but that's also 15 s of sustained, extremely high load on the device, meaning battery drain that would probably make this unshippable as well. It's also worth noting that, as far as I can tell, the images in the blog post were not generated by this process; they're just stock Stable Diffusion examples.
It's good this research is being done of course, but I think we're a few years away from this being shipped in any real form on device.
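For a concrete sense of the workload being discussed, here's roughly what the 20-step, 512x512 setting looks like with the open diffusers library on a desktop GPU. Qualcomm's on-device stack is proprietary, so this is only meant to make the compute budget tangible, not to reproduce their result.

```python
# Same class of workload as the Qualcomm demo: ~1B-parameter SD 1.x model,
# 20 denoising steps, 512x512 output. Runs on a CUDA GPU with ~8 GB VRAM.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a photo of a mountain lake at sunrise",  # unconstrained text input
    num_inference_steps=20,                   # the sub-15s figure is 20 steps
    height=512,
    width=512,
).images[0]
image.save("out.png")
```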
You stated, "State of the art for low end generative AI is still running on some sort of desktop class GPU with optimised models and getting not-so-great results. A phone can't do that."
The Qualcomm article states, "Read on to learn how Qualcomm AI Research performed full-stack AI optimizations using the Qualcomm AI Stack to deploy Stable Diffusion on an Android smartphone for the very first time. "
OK, I guess it depends on what you consider state of the art. I was thinking in terms of the best option that is somewhat generally available, i.e. the best thing I could do with the tools and techniques out there. Qualcomm's research is better, but at an earlier stage. Maybe that is state of the art! Not quite as useful as what I had in mind, though.
The problem is that all of these mobile AI chips are custom and proprietary, with custom SDKs, and usually don't even have meaningful market share (looking at you, Qualcomm, Microsoft, and Google). So beyond the OEM, is it worth any third party's time to invest in developing for them? A general-purpose GPU seems like a better pick.
It's pretty obvious - some ML tasks are easy enough to do on the phone (e.g. speech recognition); some are not (e.g. inpainting).
Yeah, they are being a bit slippery about it, but I don't think it's really a surprise that ML models that require an RTX 4090 to run aren't going to work on a phone.
From the keynote, the only thing I remember being called out as offloaded to the cloud was the HDR video - a feature that will show up eventually on the phone.
All of the AI photo-related tasks that the Google employees demoed, i.e. inpainting et al, happened in realtime.
Rewatching the keynote just now, I see some of the Magic Editor-related scenes have the following disclaimer in tiny white text, "Features simulated and sequences shortened throughout ad. Magic Editor coming soon" - see https://www.youtube.com/watch?v=pxlaUCJZ27E&t=2416s
While I mostly agree with this, apparently some AI features run fine offline on the Pixel 7 Pro but not on the 8/Pro (as per a comment on the article). And I'm not sure the computation for AI wallpapers is heavy enough to need to be offloaded either. But then again, I think this was more of a bug than an intended feature.
I wonder if they can get there. Apple did bet on on-device ML, with varied results. I doubt Google can get past their "we run everything in the cloud" mindset.
Sort of. They were early to use vector databases for semantic search. Now we use embeddings from a neural net as features for semantic search; back then, I think they used lots of labels and analyzed features like beats per minute, chord progression, etc. Happy to be corrected, of course.
Has anyone looked at the network traffic while using the Magic Editor to see if it's actually offloading the processing? Or is this just an inferred conclusion based on it needing an internet connection? Is it possible that it requires an internet connection to fetch new models or something, but the processing is still on-device? Even if certificate pinning blocks inspecting the payloads, watching per-app data usage during an edit would at least distinguish a one-time model download from a per-edit upload.
All sorts of stuff in photos breaks in degoogled phones.
It wouldn’t surprise me at all if they decided to disable on device features unless you sent them your data so they could monetize it, share with oppressive regimes, etc.
This makes some amount of sense: generative AI is _very_ expensive.
But generative is only one class of AI workload; prediction/inference is probably what the Tensor is used for most. I would not be shocked to find out that the "find the object in this photo" part of Magic Eraser runs on-device while the "figure out what to put in the picture after the object is removed" bit is done server-side.
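A hypothetical sketch of that split is below. The endpoint and the segmentation helper are invented for illustration; this is not Google's actual API, just the architecture being speculated about.

```python
# Hybrid split: cheap segmentation on-device, heavy inpainting in the cloud.
import requests

def run_local_segmenter(image_bytes: bytes, x: int, y: int) -> bytes:
    # Stand-in for a small on-device model (e.g. TFLite via NNAPI) that
    # returns a binary mask of the object the user tapped.
    raise NotImplementedError("on-device segmentation model goes here")

def magic_erase(image_bytes: bytes, tap_x: int, tap_y: int) -> bytes:
    # Step 1 (on-device, fast): find the object mask locally.
    mask = run_local_segmenter(image_bytes, tap_x, tap_y)

    # Step 2 (cloud, heavy): a server-side generative model fills in
    # plausible background where the object was.
    resp = requests.post(
        "https://example.invalid/inpaint",  # placeholder endpoint
        files={"image": image_bytes, "mask": mask},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.content  # the inpainted image
```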
The Tensor G1 was probably pretty close to that, but I think subsequent generations have been diverging more. They are still made at Samsung's fabs, and I wouldn't be surprised if Samsung is still helping to design the chips.
Either way, I haven't seen anything to be impressed by when it comes to the Tensor SoCs themselves. The performance is lackluster, and in spite of Google's marketing, the AI acceleration isn't clearly better than what Qualcomm or Apple are offering.
If that's the case, why do phones with the Tensor G1 and G2 only receive 3 major updates? That even applies to the Pixel Fold, a $1800 phone released a couple months ago. There's no Qualcomm to blame here, it appears to be purely a business decision by Google.
> Google is not a naive poor company that isn't capable to have a lawyer company setting up proper support contracts.
Set up contracts with whom, exactly? Google doesn’t control the other OEMs, and the other OEMs are the ones negotiating with Qualcomm, not Google. Unless you’re specifically talking about the Pixel line, which doesn’t have enough market share to force Qualcomm to agree to anything.
Are you perhaps suggesting Google should suddenly and unilaterally prevent Android from working on Qualcomm devices until Qualcomm agrees to support it for X years? Which, if it were even possible, would hurt the other OEMs more than it would hurt Google?
Or are you suggesting that Google stop working with the other OEMs until they agree to negotiate with Qualcomm to support their devices longer?
Google can’t prevent Qualcomm or the other OEMs from compiling Android for their devices; it’s not closed source like Windows. Your comment is confusing.
If Android were closed source, and if Google were negotiating this from the beginning, your comment would make sense. With an open source product, Qualcomm can just do whatever they need to in order to bring Android to Qualcomm devices. The cat is already out of the bag. Google has no say in the matter.
Google could always break the Play Store / Google Play Services on Qualcomm devices, I suppose, but that would directly hurt Google, Google’s OEM partners, and the developers who build for the Google Play Store. More importantly, a move like this would make Samsung push much harder on their own App Store, and Google would attract the scrutiny of unhappy regulators. Google has no incentive to do any of that.
> Are you perhaps suggesting Google should suddenly and unilaterally prevent Android from working on Qualcomm devices until Qualcomm agrees to support it for X years?
Yes, literally
> Which, if it were even possible, would hurt the other OEMs more than it would hurt Google?
Seriously? What's the revenue share of Qualcomm's sales from Android vendors, 99.9%? The negotiation would last one meeting.
The only thing it would hurt is the Qualcomm CEO's head when delivering the quarterly report.
That would be a massive violation of the Android open-source license, the kind of thing this site constantly rants about when Google merely moves a new component behind closed doors.
People scream at Google about "monopoly abuse" when they move the SMS app into closed source, and now you want them to do what, exactly? Ban OEMs from using Apache-licensed Android until they bow down to their business demands? How do you expect that to work?
Like, do you people even think for a second when you write this stuff?
Money can't just make anything happen for any reason. Please re-read my first comment to you. Google has no apparent legal authority to stand on to do what you're suggesting. You're not even pretending to explain where they would get that legal authority instead of getting sued into oblivion.
You're just shouting at Google to go stomp over to Qualcomm and throw a tantrum, since they have no authority to do more. It makes no sense.
You’re the one making the extraordinary claim that Google has the authority to force Qualcomm into a far reaching support agreement. That kind of claim deserves evidence from you if you want it to be taken seriously. I never claimed to be a lawyer.
Google can certainly ask Qualcomm to provide more support, but I would be very surprised if they haven’t already done that.
Android is open source. If Google overstepped, then Samsung and Qualcomm would fork Android. They don’t need Google, but I’m sure they prefer to mutually benefit with Google.
Google is not Microsoft. Android is not Windows. Google does not “own” Android in the way that Microsoft owns Windows.
I have attempted to have a good faith conversation about your original reply, but you don’t seem interested. This conversation is going nowhere, so again… good luck with that.
So they've paid for that insight, and then proceeded not to do the thing you think they're able to and should do. How exactly is this supposed to support your position? Isn't it much more a sign that any lawyer would tell them "hell no" when presented with your plan?
I wonder if there's a coordination problem where no individual Qualcomm customer wants to pay the entire cost of extended support, but if they all shared that cost it would be reasonable. If extended support cost, say, $20M a year (a made-up number), no single OEM would foot that bill alone, but ten OEMs chipping in $2M each might.
Fairphone made some claims about this when they launched the Fairphone 5. That's the sole reason it's not using a normal consumer-facing Snapdragon chip, and instead using some kind of industrial Qualcomm chip. Fairphone pointing this out wasn't exactly news to anyone who has been paying attention to this stuff.
I also wouldn't be too surprised if Qualcomm starts to change their tune under pressure from other manufacturers who want to stay competitive with Google.
"Business decision" and "artificial limitation" are frequently synonymous. I think @MishaalRahman was correct to say "If OEMs keep buying a particular chipset [...] then it'll continue being supported", especially with Qualcomm, if the sales volume remains high enough. I also agree it is generally correct to say you can pay an SoC vendor to continue supporting a chipset, but that doesn't seem to have been correct for Qualcomm.
If a particular mid-range chip was still selling well years after release, then Qualcomm may have seen fit to continue supporting it so that those new phones could continue selling well. Qualcomm primarily cares about selling chips. Supporting old chips that have already sold (and aren't continuing to sell well) does not help Qualcomm sell more chips, so they haven't wanted to do it. I think Qualcomm may come around to supporting chips better (maybe even at their next announcement in a few days), but their historical behavior in this regard has been less than ideal.
In fact, Qualcomm cares so much about selling chips that they've allegedly nearly sunk the Oryon SoC in the process.[0] They apparently saw an opportunity to force-bundle more chips, in this case the PMICs, which were so unsuitable that the manufacturers wanted to buy them and literally throw them in the garbage just to keep Qualcomm happy, while using alternative PMICs instead. But, Qualcomm supposedly baked the decision into Oryon so deeply that only Qualcomm's PMICs are compatible. Apparently, this is causing manufacturers to consider abandoning Oryon entirely. Most people would logically think that Qualcomm cares primarily about making money, but instead their first priority actually seems to be selling chips, even if it means leaving money on the table, regardless of whether that is logical or not.
Why did Fairphone choose to use an industrial SoC if they could have just paid Qualcomm a little more money to extend the life of the Snapdragon 8 Gen 2?
Qualcomm is happy to enter a support contract for the BSP they provide. It will be frozen at that already-outdated LTS kernel with whatever mainline backports they feel the need to apply, and the drivers will be the same binary blobs shipped on day one because Android HAL.
This, understandably, runs into Google's CTS policies regarding minimum kernel versions and others, which is why 2 OS releases + 1 year security became the norm.
Still, Projects Treble and Mainline just reached the point where you can stick with 5.x kernels and have an extended support schedule. This involved revving the HAL interfaces among other changes that are just not (economically) feasible to backport. OEM BSPs that are derived from Android Common Kernel releases can feasibly be supported for many years; for example, 5.10 shipped to AOSP in 2021 and will be supported through 2026 [1].
The Google Pixel 8 Pro can run generative AI models locally on the device, but not all of them.
Google announced at the Made by Google event in October 2023 that the Pixel 8 Pro's custom-built Tensor G3 chip can run "distilled" versions of Google's text- and image-generating models. These models can power a range of applications on the phone, such as image editing and smart replies in Gboard.
However, some generative AI tasks, such as running large language models like Bard, still require too much computing power to run locally on a smartphone. These tasks are offloaded to the cloud, where Google has access to more powerful servers.
Here are some examples of generative AI models that can run locally on the Google Pixel 8 Pro:
Magic Eraser
Zoom Enhance
Best Take
Audio Magic Eraser
Gboard Smart Replies
AI summaries in Google Recorder
Google is still working on developing new ways to run generative AI models locally on devices. As the Tensor chip continues to improve, we can expect to see more and more generative AI features running on-device in future Pixel phones.
ʘ ‿ ʘ
Which latest high quality AI models? MacBooks have unified memory which means they are well suited for running many of them. (As long as it's the 16GB+ models.)
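For example, here's roughly what running a quantized 7B model out of unified memory looks like with llama-cpp-python, which targets Metal on Apple silicon. The model file is a placeholder; any GGUF model that fits in RAM works.

```python
# Quantized 7B model on Apple silicon via llama-cpp-python (Metal backend).
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # placeholder GGUF file
    n_gpu_layers=-1,  # offload every layer; unified memory makes this cheap
    n_ctx=4096,
)

out = llm("Q: Why does unified memory help local inference? A:", max_tokens=128)
print(out["choices"][0]["text"])
```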
> For all of this generative AI stuff, for anything that actually has to use AI to create things like the AI Wallpaper making, the Magic Editor needs a permanent internet connection... It feels so sluggish, that you are constantly reminded that it's not running on-device...
This is fine.
Generative AI has gotten good by being beastly huge, consuming gobs of resources. It's not particularly fast even on >$1000 consumer GPUs.
It feels like it would be an enormous waste of time and effort to try to scale these generative tasks down to small on-device footprints, where they can fit and run in reasonable time, knowing how much worse the results would be.
I don't see what the alternative is here. But it certainly leaves open the question of what exactly the great tensor tasks for the edge are. With chipmakers like AMD, Qualcomm, and Arm all shipping sizable neural/tensor cores too, the question is a somewhat concerning one.
Literally anything latency-sensitive is appropriate for the edge. Off-device voice recognition is a miserable experience any time you have less than a perfect connection. Imagine clicking an object in Photoshop, using AI to determine the edges… and every time you click, Photoshop bundles the image, uploads it to the cloud, and waits five seconds for each selection click to be scheduled, dispatched, and returned. Imagine running the camera app with ML object-recognition focusing, except the cloud is focusing your camera and tracking objects in the viewfinder, with a three-second lag.
It's fine, but then this means you have a poor processor that is not even capable of doing what it was advertised for. The G3 is worse than the Snapdragon 8 Gen 2 in terms of performance, maybe close to the Gen 1 (and I'm not sure the battery consumption is comparable), and Qualcomm is about to release the Gen 3.
Speech recognition, wakeword detection, object recognition, auto complete, ...
Wakeword detection especially is a continuously running process that needs a relatively small-footprint CNN over a fixed window, where power is extremely important.
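For a sense of scale, here's a toy version of that kind of model in PyTorch. The architecture and sizes are purely illustrative, not any vendor's actual wakeword network.

```python
# Tiny CNN over a fixed window of audio features (~1 s of 40-band log-mels).
import torch
import torch.nn as nn

class WakewordCNN(nn.Module):
    """Expects (batch, 1, n_mels=40, n_frames=100) feature windows."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, 2),  # background vs. wakeword
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = WakewordCNN()
window = torch.randn(1, 1, 40, 100)  # one 1-second feature window
print(model(window).softmax(-1))     # P(background), P(wakeword)
# ~5k parameters: small enough to poll continuously in a low-power DSP domain.
```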
It's not fine when this "magic" is being advertised as on-device.
After reading (and attempting to quickly implement the model ensembles within) both the RealFill[0] and Break-A-Scene[1] papers published by Google researchers just prior to the Pixel 8 launch, I was expecting either a leap in their G3 tensor core akin to the 2013 Moto X's NLP and contextual-awareness cores[2] (which provided better implementations of Active Display, gesture recognition, and voice recognition in loud environments than 95% of current mobile devices), or something like Coral[3], the edge TPU they developed that got shockingly good inference performance (though hardware production was handed off to ASUS in 2022, thanks to the chip shortage, the generally arbitrary nature of the company, and their wholesale divestment from IoT). I expected more.
All that to say this: your assumptions about inference performance on >$1000 hardware are fundamentally flawed (the fact that you reach for the buzzy "generative" prefix suggests they're erroneously informed by Twitter influencers attempting to deploy current LLMs).
Custom hardware can be, and has been, developed in the past (on mobile devices) that could have been tailored to the task at hand. If they failed to meet performance, power-draw, or processing-time requirements, they should have reframed their pitch instead of exposing themselves to what is likely going to be yet another class-action suit focused on their hardware.
> It's not fine when this "magic" is being advertised as on-device.
I can't help but notice that you included a lot of references, but none for this claim.
Neither of the features mentioned in the article is claimed to be on-device in the official Pixel 8 Pro announcement blog[0]. The only feature that the blog post claims is on-device is the Best Take feature, which the article does not say requires an internet connection.
But of course that's just one bit of marketing material, and I'm sure you've seen these features advertised as happening on-device. Maybe you could post a link?
> Custom hardware can and has been developed in the past (on mobile devices) that could've been tailored to the task at hand.
You think Google doesn't know that? Are you aware of what's inside Google's phones?
I'm not sure what performance benefit you expect out of custom hardware. How many orders of magnitude? You're probably going to need at least a few, likely more, to make generative AI work well in the palm of your hand.
Oh, and if you've figured that out, Apple, Google, OpenAI, and other AI companies would like a word.
I'm aware that what's in Google's phones isn't capable of doing the on-device ML inference they claim. You might want to actually read what both I and the article are addressing in particular, beyond the broad "generative AI" umbrella that you and other philistines new to the field imagine can't be performed on-device.
> But of course, on-device generative AI is really complex: 150 times more complex than the most complex model on Pixel 7 just a year ago. Tensor G3 is up for the task, with its efficient architecture co-designed with Google Research.
This is a direct quote from an official press release [0]. They claimed Tensor G3 is "up for the task" that is "on-device generative AI".
I'd say "if you can't do it, simply don't promise it", but the fact is, this is the third time Tensor has been outright incapable of what was promised. People pointing that out are more than justified.
Prior to the launch of the Pixel 6, with their first generation of Tensor SOC, they made big promises about HDR video performance [1], implying heavily or outright stating (depending on how generous you want to be) that they'd finally be on par with Apple. They weren't, by a lot. Pixel 6 video performance was neither on par with Apple's nor did it exceed the Pixel 5's on an upper-mid-range SD765G. Still, first gen, and a bit of overhyping happens to the best of us.
During the Pixel 7 launch [2], they claimed Tensor G2 enabled users to finally get computational photography for high-quality videos. Spoiler alert: it didn't. Fool me once...
Now, on the Pixel 8 with their third generation of Tensor, they finally have a solution that gets their nighttime video processing competitive with the current iPhone, in the form of Video Boost. Instead of doing that processing on their amazing Tensor SOC, though, they offload it to the cloud [3]. At least they didn't promise on-device processing improvements to video with the G3, only a ton of GenAI capabilities...
I have followed Tensor extensively, and I am happy to see that they are at least using their control over the silicon to provide a longer update cycle. But few of their local-processing promises have held water, and even fewer appear to be impossible on contemporary SOCs from competitors such as Qualcomm (who are by no means angels and need all the competition the market can provide).
If the Pixel team were more honest about their SOC's capabilities and proactively transparent about what they run locally versus what they offload to data centers, that'd be appreciated. With Video Boost they did just that, though I fear that was mainly because of the upload times...
The days of pining for the highest geekbench score are over.
It's all about utility and how much I can do with my phone. I don't care about how many cpu megarams my phone has. Can I fix a picture I took of my wife and kids as they passed by on the Hulk rollercoaster?
There's a huge difference in battery life between Snapdragon and Exynos devices though. It's so clear because Samsung makes the same device in both configurations.
For Tensor we don't have such an exact comparison, but it does seem that Samsung's fab plays a role in this.
It's going to be an interesting conundrum for Apple, with their drive to keep everything on the phone.
Both the compute requirements and the data requirements of modern AI make it a challenging fit. So it will be interesting to see how Apple balances a feature set that they can power on the phone yet still takes advantage of modern AI capabilities.
Apple is super stingy about RAM capacity on their devices, which is a problem for GenAI
Also, the M1 NPU (Apple's on-device answer to what Google offloads to the cloud) is not very fast in Apple's own Stable Diffusion port, and I'm not aware of any community LLM framework that even uses the NPU; they all run Metal implementations on the M1's GPU.
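For anyone curious what "using the NPU" even looks like: in Core ML, the layer Apple's Stable Diffusion port sits on, you only get to request compute units, and the framework decides per-op whether the Neural Engine actually runs anything. A minimal sketch, with a placeholder model path:

```python
# Loading a compiled Core ML model with a preferred compute unit.
import coremltools as ct

model = ct.models.MLModel(
    "StableDiffusionUnet.mlpackage",          # placeholder compiled model
    compute_units=ct.ComputeUnit.CPU_AND_NE,  # prefer CPU + Neural Engine
)
# Alternatives: ct.ComputeUnit.CPU_AND_GPU (Metal) or ct.ComputeUnit.ALL.
# Ops the Neural Engine can't handle silently fall back, which is one reason
# community LLM frameworks just target the GPU directly via Metal.
```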
Everything old is new again. After spending the last 20 years delivering technology to seamlessly offload arbitrary computation to a distributed cluster of dedicated hardware, managed entirely transparently by people you've never met, yes, let's just go back to slow, local compute because that's "cooler" now or something.
Until now, "your compute is offloaded to the cloud" was a lie we told people. User-centric heavy compute over the past 20 years simply didn't happen in "the cloud." Monetization and engagement tricks, yes (ad bidding, behavior analysis, friend/content recommendations, content recompression), but not user-centric work.
It's strange, as the Magic Editor feature, whilst showing promise, turned out to be massively underwhelming... It's kind of slow, so you can't really experiment freely, and the results usually aren't that great (maybe with practice it would be easier to get good results, but that's not happening given the clunky speed).
Google's smartphone/AI announcements are used to portray them as being massively ahead of their smartphone competitors. The actual product on the other hand is not what they advertise.
Take Duplex, for example. It was meant to be this flexible and natural AI that even the store wouldn't notice. In 2023 it's a clunky chatbot experience: it has a patchwork of features depending on the user's location and sounds so much like a robocall that many stores simply hang up on it.
If I remember correctly, Duplex was intentionally modified this way after controversy from the initial demo, when, during a call, it didn't obviously identify itself as an automated system.
Having to identify as a chatbot and provide a callback number was a requirement under some states' laws. While this seems like something they should have looked into before advertising it, my comment was scoped more broadly to how it performs: it's not flexible and it doesn't sound natural.