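The technique was popularised in 2015 by Leon Gatys, Alexander Ecker, and Matthias Bethge in a paper with the unglamorous title "A Neural Algorithm of Artistic Style". They noticed that the layers inside an image-recognition network separate fairly cleanly into two kinds of information: content (which objects are in the picture and how they are arranged) and style (textures, colours, and brushwork, largely independent of where they appear).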
If you optimise an image so its content matches one source and its style matches another, you get a re-rendering of the first picture in the visual language of the second. That was the original method.
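To make "content matches one source, style matches another" concrete, here is a minimal sketch of the two losses that the optimisation balances. It assumes the features have already been extracted from a network and uses tiny hand-written lists in their place. The function names and the default weights are illustrative, not taken from the paper.

```python
# Illustrative sketch: "features" stand in for network activations.
# Each feature map is a list of channels; each channel is a flat list of values.

def gram_matrix(features):
    """Channel-to-channel inner products, averaged over spatial positions."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]

def content_loss(gen, content):
    """Mean squared difference between activations at matching positions."""
    total = sum((g - c) ** 2
                for gch, cch in zip(gen, content)
                for g, c in zip(gch, cch))
    return total / sum(len(ch) for ch in gen)

def style_loss(gen, style):
    """Mean squared difference between Gram matrices (position is ignored)."""
    gg, gs = gram_matrix(gen), gram_matrix(style)
    c = len(gg)
    return sum((gg[i][j] - gs[i][j]) ** 2
               for i in range(c) for j in range(c)) / (c * c)

def total_loss(gen, content, style, content_weight=1.0, style_weight=1000.0):
    """Weighted sum that the optimiser would minimise (weights are assumptions)."""
    return (content_weight * content_loss(gen, content)
            + style_weight * style_loss(gen, style))

# Two channels, four spatial positions each.
content_feats = [[1.0, 0.0, 0.0, 1.0], [0.0, 2.0, 2.0, 0.0]]
# Same values, shuffled in space: layout changes, channel statistics do not.
shuffled = [[0.0, 1.0, 1.0, 0.0], [2.0, 0.0, 0.0, 2.0]]

print(content_loss(shuffled, content_feats))  # layout differs, so this is non-zero
print(style_loss(shuffled, content_feats))    # statistics match, so this is zero
```

The shuffled example shows why the split works: the style term only looks at how channels co-occur, so it cannot tell the two feature maps apart, while the content term compares position by position and can.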
Gatys-style transfer required running an optimisation loop for every output image. Each picture took minutes on a fast GPU. Useful as a demo, hopeless as a phone app.
The next wave (Johnson, Ulyanov, Dumoulin) trained fast style networks: one network per style (or, in Dumoulin's case, one network for a small fixed set of styles), where running it forward once on your photo gives you the stylised result. That dropped the cost to milliseconds. Then came the universal style transfer family, where a single network handles any style, given a reference image at run time. Modern apps use a mix of these techniques, plus diffusion-based variants for higher-fidelity results.
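One well-known member of the universal family is adaptive instance normalisation (AdaIN). The article does not name it, so treat it here as an illustrative example, not as the method any particular app uses. The core step is small: for each feature channel, shift and rescale the content activations so their mean and spread match those of the style reference. The sketch below assumes features are already extracted and leaves out the encoder and decoder networks entirely.

```python
from statistics import fmean, pstdev

def adain(content_feats, style_feats, eps=1e-5):
    """Per channel: normalise the content activations, then give them
    the style channel's mean and standard deviation."""
    out = []
    for c_ch, s_ch in zip(content_feats, style_feats):
        c_mean, c_std = fmean(c_ch), pstdev(c_ch)
        s_mean, s_std = fmean(s_ch), pstdev(s_ch)
        out.append([s_std * (x - c_mean) / (c_std + eps) + s_mean
                    for x in c_ch])
    return out

# One channel of made-up activations for each image.
content = [[0.0, 2.0, 4.0, 6.0]]
style = [[10.0, 10.0, 20.0, 20.0]]

stylised = adain(content, style)
print(stylised[0])          # keeps the content's ordering of values
print(fmean(stylised[0]))   # close to the style channel's mean
print(pstdev(stylised[0]))  # close to the style channel's spread
```

Because the style enters only through per-channel statistics computed at run time, nothing has to be retrained when the reference image changes, which is what lets one network cover any style.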
Cartoonization (like the kind in our Photo Cartoonizer app) is a relative of style transfer with a twist. Pure style transfer copies textures from a single reference image. Cartoonization is usually trained on many cartoon images at once and learns the distribution of cartoon-ness: outlines, flat colour regions, simplified shading.
The CartoonGAN and AnimeGAN families, for example, learn to map a photo into a cartoon style end-to-end. They are not picking textures from one painting; they are learning what makes something "look like a cartoon" and then applying that look to your photo.
Practically, the difference is that cartoonizers tend to handle faces, landscapes, and pet photos more consistently, while traditional style transfer is better for matching a very specific painter's style.
These models used to need a GPU and 8+ GB of memory. Modern phones changed that. Apple's Neural Engine (the dedicated ML hardware in iPhone, iPad, and Apple Silicon Macs) can run a typical style-transfer or cartoonization model in a few seconds per image, using a quantised on-device build.
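"Quantised" here means the model's weights are stored as small integers instead of 32-bit floats. The sketch below shows one common scheme, 8-bit affine quantisation, on a handful of made-up weights. It is an assumption for illustration only: the article does not say which scheme is used, and real on-device toolchains handle this for you.

```python
def quantise(weights, bits=8):
    """Map floats onto integers in [0, 2**bits - 1] (affine quantisation)."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 if all weights are equal
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantise(q, scale, lo):
    """Recover approximate floats from the stored integers."""
    return [v * scale + lo for v in q]

weights = [-0.51, -0.1, 0.0, 0.24, 0.51]  # made-up example values
q, scale, lo = quantise(weights)
restored = dequantise(q, scale, lo)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)          # integers that fit in one byte each
print(max_error)  # at most half a quantisation step
```

Each weight now takes one byte instead of four, at the cost of a rounding error of at most half a step. That trade is a large part of why a model that once needed a desktop GPU can fit in a phone's memory.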
Cloud-only services still exist because some styles (especially the diffusion-based ones) are too heavy for a phone. But for most "turn this into a cartoon / anime / oil painting" use cases, on-device is fast enough and considerably more private.
Not every "AI cartoonizer" is actually using AI. Some apps slap an edge-detect filter over your photo, flatten the colour palette, and call it a day. Tells:
A real model treats portraits, landscapes, and screenshots differently, and its output responds to the strength slider in a non-linear way.
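For comparison, the edge-detect-and-flatten shortcut can be sketched in a few lines. This is a toy version on a small greyscale grid, with made-up pixel values and thresholds, not the code of any particular app.

```python
def posterise(gray, levels=4):
    """Snap each 0-255 value to one of `levels` evenly spaced tones."""
    step = 255 / (levels - 1)
    return [[round(round(p / step) * step) for p in row] for row in gray]

def edges(gray, threshold=60):
    """Flag a pixel when it differs sharply from its right or lower neighbour."""
    h, w = len(gray), len(gray[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(gray[y][x] - gray[y][x + 1]) if x + 1 < w else 0
            dy = abs(gray[y][x] - gray[y + 1][x]) if y + 1 < h else 0
            out[y][x] = max(dx, dy) > threshold
    return out

def fake_cartoon(gray, levels=4, threshold=60):
    """Flattened tones with black lines drawn over the detected edges."""
    flat, mask = posterise(gray, levels), edges(gray, threshold)
    return [[0 if m else p for p, m in zip(frow, mrow)]
            for frow, mrow in zip(flat, mask)]

# A mid-grey region on the left, a light region on the right.
image = [
    [90, 100, 200, 210],
    [95, 105, 205, 209],
    [92, 102, 202, 212],
]
for row in fake_cartoon(image):
    print(row)
```

The same fixed rule is applied to every pixel of every photo. Nothing in it knows whether it is looking at a face, a mountain, or a screenshot, which is why the output of this kind of filter looks the same regardless of subject.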
Turn photos into cartoons, anime, oil paintings, and pop-art posters with on-device AI on iPhone, iPad, and Mac.