> And in our new Tensor G3 chip, every major subsystem has been upgraded, paving the way for on-device generative AI.
This is definitely weasel-worded, but note that this does not actually have to mean that all the generative AI stuff can be done on Tensor G3. They could (credibly, but, again, weasel-y) claim that the stuff they've done on G3 is prep work that is "paving the way" for a future chip to be able to do it all.
But either way, I'm not terribly surprised? Generative AI on a mobile SoC (even with special-purpose hardware) still sounds like a bit of a stretch at this point, no? At least, doing it on-device with acceptable performance and power consumption seems a little unlikely?
I would say it's also done for copyright reasons. Google will not do Assistant read-aloud if the article is behind a paywall. But the new read-aloud that is done on-device allows for reading almost any website.
Google has an enormous advantage over Apple in data center compute. For one thing, they don't have to pay the Nvidia monopoly tax to do inference. It's silly to color this as a disadvantage.
> As we covered exclusively earlier this week, in an extraordinary move, Google went out of its way to block reviewers from being able to easily install popular benchmark apps through its Play Store during the review embargo period. This actually also extended into the post-launch period too, however Google lifted the ban after our article went live. Tests using Primate Labs’ popular cross-platform benchmark Geekbench 6 showed that -- despite having quite new CPU architecture -- Tensor G3 performance is closer to the mid-range Qualcomm Snapdragon 7+ Gen 2 than it is to its current flagship chip the Snapdragon 8 Gen 2.
> Our work with Tensor has never been about speeds and feeds, or traditional performance metrics. It’s about pushing the mobile computing experience forward. And in our new Tensor G3 chip, every major subsystem has been upgraded, paving the way for on-device generative AI. It includes the latest generation of Arm CPUs, an upgraded GPU, new ISP and Imaging DSP and our next-gen TPU, which was custom-designed to run Google’s AI models.
Then why the internet requirement? Were they trying something that failed at the last minute, forcing them to ship it the way it is? Or was this always just advertising material?
> Our work with Tensor has never been about speeds and feeds, or traditional performance metrics
Meanwhile, on [1] they say:
> Pixel 8 and Pixel 8 Pro are equipped with Google Tensor G3, Google’s fastest, most efficient, and most secure chip yet. Every major component of the chip has been upgraded for enhanced performance and efficiency. Not only has the number of machine learning models on device more than doubled since 2021, but their complexity and sophistication have increased as well. And Tensor is running many of these complex models at the same time.
Sounds to me like it was very much about performance...
> Were they trying something that failed last minute
Seems likely. Your chip underperforms? Just block people from using benchmarking apps, and also tell the press you never cared about speed. Problem solved.
This kind of gaming is rife across the whole benchmarking industry. Lies, damned lies, and benchmarks.
The massive problem, which has particularly afflicted Qualcomm and Intel in the past, is that the gap between peak and sustained performance is absolutely huge: running at peak triggers thermal conditions that necessitate dynamic down-clocking to below-normal speeds, which then persists for a surprisingly long time before normal performance resumes. Certain groups would detect known benchmarking code and alter the acceptable thermal parameters for the duration of the benchmark run (i.e. allow the device to become unusually hot).
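In pseudocode, the trick amounts to something like the sketch below. The package names and thresholds are invented for illustration; real implementations lived in closed-source power/thermal HALs, not app-level Python.

```python
# Hypothetical sketch of benchmark whitelisting in a thermal governor.
# Package names and temperature limits below are made up for illustration.
BENCHMARK_PACKAGES = {
    "com.primatelabs.geekbench6",            # hypothetical example entries
    "com.futuremark.dmandroid.application",
}

NORMAL_LIMIT_C = 42.0  # skin-temperature ceiling enforced for ordinary apps
BENCH_LIMIT_C = 48.0   # relaxed ceiling quietly applied to known benchmarks

def throttle_limit(foreground_package: str) -> float:
    """Return the skin-temperature ceiling the governor enforces."""
    if foreground_package in BENCHMARK_PACKAGES:
        # Let the device run unusually hot so the benchmark sees peak clocks.
        return BENCH_LIMIT_C
    return NORMAL_LIMIT_C
```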
As others have mentioned, this has all the hallmarks of an execution failure, whether by Google, Samsung, both, or some unknown third party. Google has a cultural problem of believing way too much in theoretical, untested solutions instead of aiming for the boring approach that is known to work but is only 95% as good on paper, only to find when building it that reality is different enough to more than nullify the theoretical advantages. It has bitten them in this area before, and it more than likely will again.
Now that you mention it... I have only seen one workstation review that benchmarked machines for 24 hours to make sure they didn't thermally saturate and throttle, and I'm wondering why that isn't more of a standard metric for non-mobile gear. Even mobile workstations may be put under heavy load for long stretches, and this would show whether their cooling keeps up or falls short.
No, Apple is not the same here. Their devices do not suffer anything like the same prolonged throttling after running at peak, but they can do nothing to stop developers from draining the battery, and that's what is going on. If they were to try, developers would throw a giant hissy fit.
I have worked with prototypes from other manufacturers that had to be recalled after burning people. I have also found bugs in SoCs that their manufacturers were lying to OEMs about, and had to explain to different OEMs what was going on, after they had shipped millions of units.
The simple truth is that modern developers are irresponsible, and so the manufacturers get to play everyone off against each other, spreading as much self-serving nonsense as possible.
The notebookcheck.net article states:
"In a highly unusual move, it has been revealed that Google has blocked reviewers of the Pixel 8 and Pixel 8 Pro from installing popular benchmarks including Geekbench and 3D Mark."
There's an embedded URL linking to a YouTube video, which shows that the Google Play Store won't allow the user to install Geekbench or 3DMark on the pre-release Pixel 8 review devices.
I saw some phone geeks suspecting a last-minute addition of thermal dispersion under the LCD, as well as a frequency limiter to manage heat. Not sure where the suspicion came from, but it sounds plausible.
There's a bit of a difference between an on-device chip for some AI tasks (like all modern smartphones and some computers have) and high-end server-class graphics cards/GPUs/accelerators that take hundreds of watts to run generative AI.
State of the art for low end generative AI is still running on some sort of desktop class GPU with optimised models and getting not-so-great results. A phone can't do that.
Lots of ML happens on device, more is moving to the device. Generative AI is not ready for that yet from what I can tell (based on external experience, not Google specific).
"...The result of this full-stack optimization is running Stable Diffusion on a smartphone under 15 seconds for 20 inference steps to generate a 512x512 pixel image — this is the fastest inference on a smartphone and comparable to cloud latency. User text input is completely unconstrained."
Hasn't shown up in a shipping cellphone yet AFAIK.
There's a big difference between tech demos and production software for a general audience. Generative AI is only barely production software at the best of times, so almost any compromises mean it's not shippable.
This blog post is interesting, but there are so many caveats to it. It's a 1B-parameter model, which is tiny. Inference takes ~15 seconds, which sucks for UX, but that's also 15 s of sustained, extremely high load on the device, meaning battery drain that would probably make this unshippable as well. It's also worth noting that, as far as I can tell, the images in the blog post were not generated by this process; they're just stock Stable Diffusion examples.
It's good this research is being done of course, but I think we're a few years away from this being shipped in any real form on device.
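For a concrete sense of the workload being discussed, here's roughly what the 20-step, 512x512 setting looks like with the open diffusers library on a desktop GPU. Qualcomm's on-device stack is proprietary, so this is only meant to make the compute budget tangible, not to reproduce their result.

```python
# Same class of workload as the Qualcomm demo: ~1B-parameter SD 1.x model,
# 20 denoising steps, 512x512 output. Runs on a CUDA GPU with ~8 GB VRAM.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a photo of a mountain lake at sunrise",  # unconstrained text input
    num_inference_steps=20,                   # the sub-15s figure is 20 steps
    height=512,
    width=512,
).images[0]
image.save("out.png")
```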
You stated, "State of the art for low end generative AI is still running on some sort of desktop class GPU with optimised models and getting not-so-great results. A phone can't do that."
The Qualcomm article states, "Read on to learn how Qualcomm AI Research performed full-stack AI optimizations using the Qualcomm AI Stack to deploy Stable Diffusion on an Android smartphone for the very first time. "
OK, I guess it depends on what you consider state of the art. I was thinking in terms of the best option that is somewhat generally available, i.e. the best thing I could do with the tools and techniques out there. Qualcomm's research is better, but at an earlier stage. Maybe that is state of the art! Not quite as useful as what I had in mind, though.
The problem is that all of these mobile AI chips are custom and proprietary, with custom SDKs, and usually don't even have meaningful market share (looking at you, Qualcomm, Microsoft, and Google). So beyond the OEM, is it worth any third party's time to invest in developing for them? A general-purpose GPU seems like a better pick.
It's pretty obvious - some ML tasks are easy enough to do on the phone (e.g. speech recognition); some are not (e.g. inpainting).
Yeah, they are being a bit slippery about it, but I don't think it's really a surprise that ML models that require an RTX 4090 to run aren't going to work on a phone.
From the keynote, the only thing I remember being called out as offloaded to the cloud was the HDR video - a feature that will show up eventually on the phone.
All of the AI photo-related tasks that the Google employees demoed, i.e. inpainting et al, happened in realtime.
Rewatching the keynote just now, I see some of the Magic Editor-related scenes have the following disclaimer in tiny white text, "Features simulated and sequences shortened throughout ad. Magic Editor coming soon" - see https://www.youtube.com/watch?v=pxlaUCJZ27E&t=2416s
While I mostly agree with this, apparently some AI features run fine offline on the Pixel 7 Pro but not on the 8/Pro (as per a comment on the article). And I'm not sure the computation for AI wallpapers is heavy enough to need to be offloaded either. But then again, I think this was more of a bug than an intended feature.
I wonder if they can get there. Apple did bet on on-device ML, with varied results. I doubt Google can get past their "we run everything in the cloud" mindset.
Sort of. They were early to use vector databases for semantic search. Now we use embeddings from a neural net as features for semantic search; back then, I think they used lots of labels and analyzed features like beats per minute, chord progression, etc. Happy to be corrected, of course.
Has anyone looked at the network traffic while using the Magic Editor to see if it's actually offloading the processing? Or is this just an inferred conclusion based on it needing an internet connection? Is it possible that it requires an internet connection to fetch new models or something, but the processing is still on-device? Even if certificate pinning blocks inspecting the payloads, watching per-app data usage during an edit would at least distinguish a one-time model download from a per-edit upload.
All sorts of stuff in photos breaks in degoogled phones.
It wouldn’t surprise me at all if they decided to disable on device features unless you sent them your data so they could monetize it, share with oppressive regimes, etc.
This makes some amount of sense: generative AI is _very_ expensive.
But generative is only one class of AI workload; prediction/inference is probably what the Tensor is used for most. I would not be shocked to find out that the "find the object in this photo" part of Magic Eraser runs on-device while the "figure out what to put in the picture after the object is removed" bit is done server-side.
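A hypothetical sketch of that split is below. The endpoint and the segmentation helper are invented for illustration; this is not Google's actual API, just the architecture being speculated about.

```python
# Hybrid split: cheap segmentation on-device, heavy inpainting in the cloud.
import requests

def run_local_segmenter(image_bytes: bytes, x: int, y: int) -> bytes:
    # Stand-in for a small on-device model (e.g. TFLite via NNAPI) that
    # returns a binary mask of the object the user tapped.
    raise NotImplementedError("on-device segmentation model goes here")

def magic_erase(image_bytes: bytes, tap_x: int, tap_y: int) -> bytes:
    # Step 1 (on-device, fast): find the object mask locally.
    mask = run_local_segmenter(image_bytes, tap_x, tap_y)

    # Step 2 (cloud, heavy): a server-side generative model fills in
    # plausible background where the object was.
    resp = requests.post(
        "https://example.invalid/inpaint",  # placeholder endpoint
        files={"image": image_bytes, "mask": mask},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.content  # the inpainted image
```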
The Tensor G1 was probably pretty close to that, but I think subsequent generations have been diverging more. They are still made at Samsung's fabs, and I wouldn't be surprised if Samsung is still helping to design the chips.
Either way, I haven't seen anything to be impressed by when it comes to the Tensor SoCs themselves. The performance is lackluster, and in spite of Google's marketing, the AI acceleration isn't clearly better than what Qualcomm or Apple are offering.
If that's the case, why do phones with the Tensor G1 and G2 only receive 3 major updates? That even applies to the Pixel Fold, a $1800 phone released a couple months ago. There's no Qualcomm to blame here, it appears to be purely a business decision by Google.
> Google is not a naive poor company that isn't capable to have a lawyer company setting up proper support contracts.
Set up contracts with whom, exactly? Google doesn’t control the other OEMs, and the other OEMs are the ones negotiating with Qualcomm, not Google. Unless you’re specifically talking about the Pixel line, which doesn’t have enough market share to force Qualcomm to agree to anything.
Are you perhaps suggesting Google should suddenly and unilaterally prevent Android from working on Qualcomm devices until Qualcomm agrees to support it for X years? Which, if it were even possible, would hurt the other OEMs more than it would hurt Google?
Or are you suggesting that Google stop working with the other OEMs until they agree to negotiate with Qualcomm to support their devices longer?
Google can’t prevent Qualcomm or the other OEMs from compiling Android for their devices; it’s not closed source like Windows. Your comment is confusing.
If Android were closed source, and if Google were negotiating this from the beginning, your comment would make sense. With an open source product, Qualcomm can just do whatever they need to in order to bring Android to Qualcomm devices. The cat is already out of the bag. Google has no say in the matter.
Google could always break the Play Store / Google Play Services on Qualcomm devices, I suppose, but that would directly hurt Google, Google’s OEM partners, and the developers who build for the Google Play Store. More importantly, a move like this would make Samsung push much harder on their own App Store, and Google would attract the scrutiny of unhappy regulators. Google has no incentive to do any of that.
> Are you perhaps suggesting Google should suddenly and unilaterally prevent Android from working on Qualcomm devices until Qualcomm agrees to support it for X years?
Yes, literally
> Which, if it were even possible, would hurt the other OEMs more than it would hurt Google?
Seriously? What's the revenue share of Qualcomm's sales from Android vendors, 99.9%? The negotiation would last one meeting.
The only thing it would hurt is the Qualcomm CEO's head when delivering the quarterly report.
That would be a massive violation of the Android open-source license, the kind of thing this site constantly rants about when Google merely moves a new component behind closed doors.
People scream at Google about "monopoly abuse" when they move the SMS app into closed source, and now you want them to do what, exactly? Ban OEMs from using Apache-licensed Android until they bow down to their business demands? How do you expect that to work?
Like, do you people even think for a second when you write this stuff?
Money can't just make anything happen for any reason. Please re-read my first comment to you. Google has no apparent legal authority to stand on to do what you're suggesting. You're not even pretending to explain where they would get that legal authority instead of getting sued into oblivion.
You're just shouting at Google to go stomp over to Qualcomm and throw a tantrum, since they have no authority to do more. It makes no sense.
You’re the one making the extraordinary claim that Google has the authority to force Qualcomm into a far reaching support agreement. That kind of claim deserves evidence from you if you want it to be taken seriously. I never claimed to be a lawyer.
Google can certainly ask Qualcomm to provide more support, but I would be very surprised if they haven’t already done that.
Android is open source. If Google overstepped, then Samsung and Qualcomm would fork Android. They don’t need Google, but I’m sure they prefer to mutually benefit with Google.
Google is not Microsoft. Android is not Windows. Google does not “own” Android in the way that Microsoft owns Windows.
I have attempted to have a good faith conversation about your original reply, but you don’t seem interested. This conversation is going nowhere, so again… good luck with that.
So they've paid for that insight, and then proceeded not to do the thing you think they're able to and should do. How exactly is this supposed to support your position? Isn't it much more a sign that any lawyer would tell them "hell no" when presented with your plan?
I wonder if there's a coordination problem where no individual Qualcomm customer wants to pay the entire cost of extended support, but if they all shared that cost it would be reasonable. If extended support cost, say, $20M a year (a made-up number), no single OEM would foot that bill alone, but ten OEMs chipping in $2M each might.
Fairphone made some claims about this when they launched the Fairphone 5. That's the sole reason it's not using a normal consumer-facing Snapdragon chip, and instead using some kind of industrial Qualcomm chip. Fairphone pointing this out wasn't exactly news to anyone who has been paying attention to this stuff.
I also wouldn't be too surprised if Qualcomm starts to change their tune under pressure from other manufacturers who want to stay competitive with Google.
"Business decision" and "artificial limitation" are frequently synonymous. I think @MishaalRahman was correct to say "If OEMs keep buying a particular chipset [...] then it'll continue being supported", especially with Qualcomm, if the sales volume remains high enough. I also agree it is generally correct to say you can pay an SoC vendor to continue supporting a chipset, but that doesn't seem to have been correct for Qualcomm.
If a particular mid-range chip was still selling well years after release, then Qualcomm may have seen fit to continue supporting it so that those new phones could continue selling well. Qualcomm primarily cares about selling chips. Supporting old chips that have already sold (and aren't continuing to sell well) does not help Qualcomm sell more chips, so they haven't wanted to do it. I think Qualcomm may come around to supporting chips better (maybe even at their next announcement in a few days), but their historical behavior in this regard has been less than ideal.
In fact, Qualcomm cares so much about selling chips that they've allegedly nearly sunk the Oryon SoC in the process.[0] They apparently saw an opportunity to force-bundle more chips, in this case the PMICs, which were so unsuitable that the manufacturers wanted to buy them and literally throw them in the garbage just to keep Qualcomm happy, while using alternative PMICs instead. But, Qualcomm supposedly baked the decision into Oryon so deeply that only Qualcomm's PMICs are compatible. Apparently, this is causing manufacturers to consider abandoning Oryon entirely. Most people would logically think that Qualcomm cares primarily about making money, but instead their first priority actually seems to be selling chips, even if it means leaving money on the table, regardless of whether that is logical or not.
Why did Fairphone choose to use an industrial SoC if they could have just paid Qualcomm a little more money to extend the life of the Snapdragon 8 Gen 2?
Qualcomm is happy to enter a support contract for the BSP they provide. It will be frozen at that already-outdated LTS kernel with whatever mainline backports they feel the need to apply, and the drivers will be the same binary blobs shipped on day one because Android HAL.
This, understandably, runs into Google's CTS policies regarding minimum kernel versions and others, which is why 2 OS releases + 1 year security became the norm.
Still, Projects Treble and Mainline just reached the point where you can stick with 5.x kernels and have an extended support schedule. This involved revving the HAL interfaces among other changes that are just not (economically) feasible to backport. OEM BSPs that are derived from Android Common Kernel releases can feasibly be supported for many years; for example, 5.10 shipped to AOSP in 2021 and will be supported through 2026 [1].
The Google Pixel 8 Pro can run generative AI models locally on the device, but not all of them.
Google announced at the Made by Google event in October 2023 that the Pixel 8 Pro's custom-built Tensor G3 chip can run "distilled" versions of Google's text- and image-generating models. These models can power a range of applications on the phone, such as image editing and smart replies in Gboard.
However, some generative AI tasks, such as running large language models like Bard, still require too much computing power to run locally on a smartphone. These tasks are offloaded to the cloud, where Google has access to more powerful servers.
Here are some examples of generative AI models that can run locally on the Google Pixel 8 Pro:
Magic Eraser
Zoom Enhance
Best Take
Audio Magic Eraser
Gboard Smart Replies
AI summaries in Google Recorder
Google is still working on developing new ways to run generative AI models locally on devices. As the Tensor chip continues to improve, we can expect to see more and more generative AI features running on-device in future Pixel phones.
ʘ ‿ ʘ
Which latest high quality AI models? MacBooks have unified memory which means they are well suited for running many of them. (As long as it's the 16GB+ models.)
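For example, here's roughly what running a quantized 7B model out of unified memory looks like with llama-cpp-python, which targets Metal on Apple silicon. The model file is a placeholder; any GGUF model that fits in RAM works.

```python
# Quantized 7B model on Apple silicon via llama-cpp-python (Metal backend).
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # placeholder GGUF file
    n_gpu_layers=-1,  # offload every layer; unified memory makes this cheap
    n_ctx=4096,
)

out = llm("Q: Why does unified memory help local inference? A:", max_tokens=128)
print(out["choices"][0]["text"])
```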
> For all of this generative AI stuff, for anything that actually has to use AI to create things like the AI Wallpaper making, the Magic Editor needs a permanent internet connection... It feels so sluggish, that you are constantly reminded that it's not running on-device...
This is fine.
Generative AI has gotten good by being beastly huge, consuming gobs of resources. It's not particularly fast even on >$1000 consumer GPUs.
It feels like it would be an enormous waste of time and effort to try to scale these generative tasks down to small on-device footprints, where they can fit and run in reasonable time, knowing how much worse the results would be.
I don't see what the alternative is here. But it certainly leaves open the question of what exactly the great tensor tasks for the edge are. With chipmakers like AMD, Qualcomm, and Arm all shipping sizable neural/tensor cores too, the question is a somewhat concerning one.
Literally anything latency-sensitive is appropriate for the edge. Off-device voice recognition is a miserable experience any time you have less than a perfect connection. Imagine clicking an object in Photoshop, using AI to determine the edges… and every time you click, Photoshop bundles the image, uploads it to the cloud, and waits five seconds for each selection click to be scheduled, dispatched, and returned. Imagine running the camera app with ML object-recognition focusing, except the cloud is focusing your camera and tracking objects in the viewfinder, with a three-second lag.
It's fine, but then this means you have a poor processor that is not even capable of doing what it was advertised for. The G3 is worse than the Snapdragon 8 Gen 2 in terms of performance, maybe close to the Gen 1 (and I'm not sure the battery consumption is comparable), and Qualcomm is about to release the Gen 3.
Speech recognition, wakeword detection, object recognition, auto complete, ...
Wakeword detection especially is a continuously running process that needs a relatively small-footprint CNN over a fixed window, where power is extremely important.
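For a sense of scale, here's a toy version of that kind of model in PyTorch. The architecture and sizes are purely illustrative, not any vendor's actual wakeword network.

```python
# Tiny CNN over a fixed window of audio features (~1 s of 40-band log-mels).
import torch
import torch.nn as nn

class WakewordCNN(nn.Module):
    """Expects (batch, 1, n_mels=40, n_frames=100) feature windows."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, 2),  # background vs. wakeword
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = WakewordCNN()
window = torch.randn(1, 1, 40, 100)  # one 1-second feature window
print(model(window).softmax(-1))     # P(background), P(wakeword)
# ~5k parameters: small enough to poll continuously in a low-power DSP domain.
```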
It's not fine when this "magic" is being advertised as on-device.
After reading (and attempting to quickly implement the model ensembles within) both the RealFill[0] and Break-A-Scene[1] papers published by Google researchers just prior to the Pixel 8 launch, I was expecting either a leap in their G3 tensor core akin to the 2013 Moto X's NLP and contextual-awareness cores[2] (which provided better implementations of Active Display, gesture recognition, and voice recognition in loud environments than 95% of current mobile devices), or something like Coral[3], the edge TPU they developed that got shockingly good inference performance (though hardware production was handed off to ASUS in 2022, thanks to the chip shortage, the generally arbitrary nature of the company, and their wholesale divestment from IoT). I expected more.
All that to say this: your assumptions about inference performance on >$1000 hardware are fundamentally flawed (the fact that you reach for the buzzy "generative" prefix suggests they're erroneously informed by Twitter influencers attempting to deploy current LLMs).
Custom hardware can be, and has been, developed in the past (on mobile devices) that could have been tailored to the task at hand. If they failed to meet performance, power-draw, or processing-time requirements, they should have reframed their pitch instead of exposing themselves to what is likely going to be yet another class-action suit focused on their hardware.
> It's not fine when this "magic" is being advertised as on-device.
I can't help but notice that you included a lot of references, but none for this claim.
Neither of the features mentioned in the article is claimed to be on-device in the official Pixel 8 Pro announcement blog[0]. The only feature that the blog post claims is on-device is the Best Take feature, which the article does not say requires an internet connection.
But of course that's just one bit of marketing material, and I'm sure you've seen these features advertised as happening on-device. Maybe you could post a link?
> Custom hardware can and has been developed in the past (on mobile devices) that could've been tailored to the task at hand.
You think Google doesn't know that? Are you aware of what's inside Google's phones?
I'm not sure what performance benefit you expect out of custom hardware. How many orders of magnitude? You're probably going to need at least a few, likely more, to make generative AI work well in the palm of your hand.
Oh, and if you've figured that out, Apple, Google, OpenAI, and other AI companies would like a word.
I'm aware that what's in Google's phones isn't capable of doing the on-device ML inference they claim. You might want to actually read what both I and the article are addressing in particular, beyond the broad "generative AI" umbrella that you and other philistines new to the field imagine can't be performed on-device.
> But of course, on-device generative AI is really complex: 150 times more complex than the most complex model on Pixel 7 just a year ago. Tensor G3 is up for the task, with its efficient architecture co-designed with Google Research.
This is a direct quote from an official press release [0]. They claimed Tensor G3 is "up for the task" that is "on-device generative AI".
I'd say "if you can't do it, simply don't promise it", but the fact is, this is the third time Tensor has been outright incapable of what was promised. People pointing that out are more than justified.
Prior to the launch of the Pixel 6, with their first generation of Tensor SOC, they made big promises about HDR video performance [1], implying heavily or outright stating (depending on how generous you want to be) that they'd finally be on par with Apple. They weren't, by a lot. Pixel 6 video performance was neither on par with Apple's nor did it exceed the Pixel 5's on an upper-mid-range SD765G. Still, first gen, and a bit of overhyping happens to the best of us.
During the Pixel 7 launch [2], they claimed Tensor G2 enabled users to finally get computational photography for high-quality videos. Spoiler alert: it didn't. Fool me once...
Now, on the Pixel 8 with their third generation of Tensor, they finally have a solution that gets their nighttime video processing competitive with the current iPhone, in the form of Video Boost. Instead of doing that processing on their amazing Tensor SOC, though, they offload it to the cloud [3]. At least they didn't promise on-device processing improvements to video with the G3, only a ton of GenAI capabilities...
I have followed Tensor extensively, and I am happy to see that they are at least using their control over the silicon to provide a longer update cycle. But few of their local-processing promises have held water, and even fewer appear to be impossible on contemporary SOCs from competitors such as Qualcomm (who are by no means angels and need all the competition the market can provide).
If the Pixel team were more honest about their SOC's capabilities and proactively transparent about what they run locally versus what they offload to data centers, that'd be appreciated. With Video Boost they did just that, though I fear that was mainly because of the upload times...
The days of pining for the highest geekbench score are over.
It's all about utility and how much I can do with my phone. I don't care about how many cpu megarams my phone has. Can I fix a picture I took of my wife and kids as they passed by on the Hulk rollercoaster?
There's a huge difference in battery life between Snapdragon and Exynos devices though. It's so clear because Samsung makes the same device in both configurations.
For Tensor we don't have such an exact comparison, but it does seem that Samsung's fab plays a role in this.
It's going to be an interesting conundrum for Apple, with their drive to keep everything on the phone.
Both the compute requirements and the data requirements of modern AI make it a challenging fit. So it will be interesting to see how Apple balances a feature set that they can power on the phone yet still takes advantage of modern AI capabilities.
Apple is super stingy about RAM capacity on their devices, which is a problem for GenAI
Also, the M1 NPU (Apple's on-device answer to what Google offloads to the cloud) is not very fast in Apple's own Stable Diffusion port, and I'm not aware of any community LLM framework that even uses the NPU; they all run Metal implementations on the M1's GPU.
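For anyone curious what "using the NPU" even looks like: in Core ML, the layer Apple's Stable Diffusion port sits on, you only get to request compute units, and the framework decides per-op whether the Neural Engine actually runs anything. A minimal sketch, with a placeholder model path:

```python
# Loading a compiled Core ML model with a preferred compute unit.
import coremltools as ct

model = ct.models.MLModel(
    "StableDiffusionUnet.mlpackage",          # placeholder compiled model
    compute_units=ct.ComputeUnit.CPU_AND_NE,  # prefer CPU + Neural Engine
)
# Alternatives: ct.ComputeUnit.CPU_AND_GPU (Metal) or ct.ComputeUnit.ALL.
# Ops the Neural Engine can't handle silently fall back, which is one reason
# community LLM frameworks just target the GPU directly via Metal.
```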
Everything old is new again. After spending the last 20 years delivering technology to seamlessly offload arbitrary computation to a distributed cluster of dedicated hardware, managed entirely transparently by people you've never met, yes, let's just go back to slow, local compute because that's "cooler" now or something.
Until now, "your compute is offloaded to the cloud" was a lie we told people. User-centric heavy compute over the past 20 years simply didn't happen in "the cloud." Monetization and engagement tricks, yes (ad bidding, behavior analysis, friend/content recommendations, content recompression), but not user-centric work.
It's strange, as the Magic Editor feature, whilst showing promise, turned out to be massively underwhelming... It's kind of slow, so you can't really experiment freely, and the results usually aren't that great (maybe with practice it would be easier to get good results, but that's not happening given the clunky speed).
Google's smartphone/AI announcements are used to portray them as being massively ahead of their smartphone competitors. The actual product on the other hand is not what they advertise.
Take Duplex, for example. It was meant to be this flexible and natural AI that even the store wouldn't notice. In 2023 it's a clunky chatbot experience: it has a patchwork of features depending on the user's location and sounds so much like a robocall that many stores simply hang up on it.
If I remember correctly, Duplex was intentionally modified this way after controversy from the initial demo, when, during a call, it didn't obviously identify itself as an automated system.
Having to identify as a chatbot and provide a callback number was a requirement under some states' laws. While this seems like something they should have looked into before advertising it, my comment was scoped more broadly to how it performs: it's not flexible and it doesn't sound natural.