Can a Photo Become Something You Feel Again?
I used Google’s model suite to explore whether specialised models can work together inside one emotional product.
Photos are strange things.
We take them because something matters: a birthday, a quiet morning, a face, a joke, a small moment that somehow feels worth keeping. But once they land in the camera roll, most of them flatten into files. They become thumbnails in an endless scroll. Things we meant to come back to, but often don’t.
That was the starting point for my next build, Encore.
I wasn’t trying to build “an AI app” for the sake of it. I was trying to solve a more human problem: how do you stop a memory becoming static? How do you turn it into something you can feel again?
That led me to a second question, which ended up being just as interesting: could Google’s new model suite work together inside one emotional product flow, with different models doing different jobs, rather than one model trying to do everything?
Encore became the experiment: an app I built that turns memories into music.

Encore. Turn memories into music.
What Encore actually does
Encore takes a photo and turns it into three things: an original music track, generated album art, and a saved memory card you can come back to later.
You upload an image. The app interprets the mood and visual cues in the photo, generates a track title, artist identity, and genre, creates album art, then produces a short song inspired by the moment. If you want, you can save it to your personal collection and revisit it later.

From memory to music: the original photo, the generated cover, and the final playable track.
One detail I particularly like is that the app is multilingual. In Settings, you can choose your preferred language, and the lyrics will be generated in that language (starting with the eight shown below). That makes the app feel less like a novelty and more like something personal. The same memory can be reinterpreted in a different linguistic and emotional register depending on how you want to hear it.

Your song can be generated in different languages via Settings
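
Under the hood, this doesn’t need to be complicated. Here’s a minimal sketch of threading the Settings preference into the lyric prompt; buildLyricPrompt and moodSummary are hypothetical names for illustration, not Encore’s actual code:

```ts
// Thread the user's language preference from Settings into the lyric prompt.
// Everything here is illustrative: Encore's real prompt and names may differ.
function buildLyricPrompt(moodSummary: string, language: string): string {
  return [
    `Write short, evocative song lyrics in ${language}.`,
    `They should capture this moment: ${moodSummary}.`,
    `Make the lyrics feel native to ${language}, not a literal translation.`,
  ].join("\n");
}

// The same memory, reinterpreted in a different linguistic register:
const prompt = buildLyricPrompt(
  "a quiet birthday morning, warm light, two coffees on the table",
  "Portuguese",
);
```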
On the surface, that sounds like one neat AI trick. In reality, it only started to feel compelling once I stopped treating “AI” as one monolithic capability and started thinking in terms of orchestration.
That was the real shift.
Why orchestration mattered
What I found with Encore is that the product improved the moment I stopped treating “AI” as one capability and started giving different models different jobs.
That’s what I mean by orchestration.
Not jargon for its own sake. Not an architecture diagram nobody asked for. Just a practical product decision: which model should do which part of the work, and how do those handoffs create something that feels coherent to the user?
People may never use the word orchestration, but they absolutely feel the difference between a product where one model is being stretched too far and one where specialised models are doing the right jobs. The user doesn’t need to know what sits underneath. They just need the experience to feel like it hangs together.
The Google model flow
In practice, Encore uses four Google models in sequence, each with a very specific job it’s suited to (a code sketch of the handoffs follows below).
Gemini 3 Flash Preview handles the image safety check.
Gemini 3.1 Pro Preview interprets the photo and builds the musical profile.
Gemini 3.1 Flash Image Preview generates the album art.
Lyria 3 Clip Preview generates the music itself.

How Encore works: one photo, multiple models, one emotional product flow
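
To make those handoffs concrete, here’s a rough sketch of the sequence in TypeScript. The function names, the MusicalProfile shape, and the error handling are my illustration of the flow, not Encore’s actual implementation:

```ts
// One photo in, one coherent result out. Each model gets exactly one job.
interface MusicalProfile {
  title: string;
  artist: string;
  genre: string;
  mood: string;
}

// Each declared function wraps a single model call; bodies omitted here.
declare function checkImageSafety(photo: Blob): Promise<boolean>;                          // Gemini 3 Flash Preview
declare function interpretPhoto(photo: Blob, language: string): Promise<MusicalProfile>;   // Gemini 3.1 Pro Preview
declare function generateAlbumArt(profile: MusicalProfile): Promise<Blob>;                 // Gemini 3.1 Flash Image Preview
declare function generateMusicClip(profile: MusicalProfile): Promise<Blob>;                // Lyria 3 Clip Preview

async function createEncoreTrack(photo: Blob, language: string) {
  // 1. Fast, cheap safety gate before anything expensive runs.
  if (!(await checkImageSafety(photo))) {
    throw new Error("Photo did not pass the safety check");
  }

  // 2. The interpretation layer: decide what this moment actually is.
  const profile = await interpretPhoto(photo, language);

  // 3 & 4. Art and music are both driven by the same profile, which is
  // what makes the end result feel like one product rather than four.
  const coverArt = await generateAlbumArt(profile);
  const track = await generateMusicClip(profile);

  return { profile, coverArt, track };
}
```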
That mix ended up being the whole point.
The breakthrough was not “AI can make a song”. It was seeing how much stronger the product became once each model had a distinct role inside the same flow. Safety, interpretation, cover art, music. Different jobs, different models, one experience.
That feels much closer to how real AI products will be built than the “one model does everything” framing we still see a lot of.
The interpretation layer mattered more than I expected
If you’d asked me at the start which part would matter most, I probably would have said the music generation. That’s the output people notice first. It’s the flashy bit.
But the deeper product quality sits one layer earlier.
Before the song is generated, the system has to decide what this moment actually is. What kind of music fits it. What emotional weight it carries. Which details matter, and which don’t. That interpretive step is where the experience stops feeling random.
If Encore had simply generated “AI music inspired by your photo”, it would have worn thin quickly. It only started to feel believable when the interpretation layer could pull out a musical identity from the image itself.
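Part of what makes that interpretation layer hold up is constraining the model’s answer into a strict shape before anything downstream runs. Here’s a small sketch of that contract; the field names are my assumption, not Encore’s actual schema:

```ts
// Ask the interpreter for strict JSON, then refuse to continue if the
// shape doesn't hold. Field names here are illustrative.
const INTERPRET_PROMPT = `Look at this photo and reply ONLY with JSON:
{
  "genre": string,        // one genre that fits the moment
  "mood": string,         // the emotional weight, in a few words
  "tempoBpm": number,     // a rough tempo matching the energy
  "title": string,        // a track title drawn from the scene
  "keyDetails": string[]  // 2-4 visual details worth keeping
}
Ignore details that don't change how the moment feels.`;

interface InterpretedMoment {
  genre: string;
  mood: string;
  tempoBpm: number;
  title: string;
  keyDetails: string[];
}

function parseProfile(raw: string): InterpretedMoment {
  const parsed = JSON.parse(raw);
  if (typeof parsed.genre !== "string" || typeof parsed.tempoBpm !== "number") {
    throw new Error("Interpretation did not match the expected shape");
  }
  return parsed as InterpretedMoment;
}
```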
That was a useful reminder for me: in AI products, the obvious output isn’t always the most important part. Sometimes the hidden interpretive layer is where the real product quality lives.
From prototype to production
Encore wasn’t the first app I’ve built, but it was a useful test of a different kind of build workflow. Google AI Studio helped me get from concept to believable prototype quickly, which mattered because this was the kind of idea you need to feel, not just explain.
Once the experience clicked, the real work shifted. Authentication. Storage. Firestore. Secrets. Rules. Indexes. Deletion behaviour. Mobile playback quirks. Social previews. Custom domains. All the things that sit just outside the prototype, but ultimately decide whether something feels like a real product.
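To give a flavour of that wiring, here’s a minimal sketch of the deletion behaviour using the Firebase modular SDK. The path layout (memories/{id}, audio/, covers/) is an assumption for illustration, not Encore’s actual schema:

```ts
import { getFirestore, doc, deleteDoc } from "firebase/firestore";
import { getStorage, ref, deleteObject } from "firebase/storage";

// "Delete" should mean delete: remove the stored audio and cover art as
// well as the Firestore document the UI lists. Assumes initializeApp()
// has already run; the paths below are illustrative.
async function deleteMemory(memoryId: string): Promise<void> {
  const db = getFirestore();
  const storage = getStorage();

  // Remove the binary assets first, so a failed document delete can't
  // leave orphaned files behind without the record that points at them.
  await deleteObject(ref(storage, `audio/${memoryId}.mp3`));
  await deleteObject(ref(storage, `covers/${memoryId}.png`));

  // Then remove the metadata record.
  await deleteDoc(doc(db, "memories", memoryId));
}
```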
That was probably the clearest reminder from this build: the prototype is not the product. The interesting skill now is not just being able to prompt something into existence. It’s being able to take an AI-native idea and wire the rest of the system around it until it behaves like something people can use.
What surprised me
A few things stood out.
First, the emotional framing mattered more than the technical trick. “Turn a photo into a song” works because people already treat photos as containers for feeling. The product just changes the format.
Second, orchestration really was the product. Not because users care about the word, but because they feel the difference between a product with clear model roles and one where everything is being forced through the same layer.
Third, production changes what matters. Once something is live, you stop caring about abstract cleverness and start caring about whether authentication works, whether delete really deletes, whether the image gets crushed, whether mobile playback lies about being active, whether the app survives being shared. That shift is healthy. It forces honesty.
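
That mobile playback point deserves one concrete example. On mobile browsers, play() can be rejected or silently blocked, so the audio element’s own state has to be the source of truth rather than a flag you set yourself. A small sketch, assuming a plain HTMLAudioElement:

```ts
// Treat the element, not your own state, as the truth about playback.
// play() returns a promise that rejects when the browser blocks autoplay.
async function playTrack(audio: HTMLAudioElement): Promise<boolean> {
  try {
    await audio.play();
  } catch {
    // Blocked: show a tap-to-play control instead of pretending it's playing.
    return false;
  }
  return !audio.paused;
}
```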
The bigger takeaway
Encore is a small app, but it made something clearer for me.
We may be moving away from “AI as one feature” and towards AI as a coordinated system of specialised capabilities. The product question becomes less “which model is best?” and more “which model should do which job, and how do they work together inside one coherent experience?”
That’s the part I find most interesting.
The user never needs to know that one model handled safety, another interpretation, another imagery, and another music. They just need the end result to feel coherent. They need it to feel like one product.
That, to me, is the standard worth building towards.
So, can a photo become something you feel again?
I think it can.
Not every time. Not perfectly. But enough to be interesting. Enough to suggest there’s a product category here somewhere between memory, media, and generative creation.
And that’s the real lesson I’m taking from Encore. The most compelling AI products may not come from one model doing everything. They may come from orchestration, where specialised models each do one part of the job, and the product turns that complexity into something simple, human, and worth coming back to.
If you want to try Encore, it’s live at musicfromphoto.com.
Pick your language. Upload a photo. See what kind of song your memory becomes.
If a photo can become something you feel again, I think there’s a whole category of products still waiting to be built.

See you next week.

Faisal
P.S. Know someone else who’d benefit from this? Share this issue with them.
Received this from a friend? Subscribe below.
The Atomic Builder is written by Faisal Shariff, Human Productivity Lead at Tomoro AI. Views are my own.
