21.【Generative AI Experiment】How Much “Kyoto” Can Change

Comparing 10 Prompt Directive Variations in Image Generation


🎯 Introduction

When using image-generation AI, it’s easy to fall into this situation:

“The image looks plausible, but I don’t know why it turned out that way.”

In this article,
we fix the theme to “Kyoto”
and change only the “directive words” inside the prompt,
then observe how the generated image changes.

🎯 Goal
👉 To understand prompts
not as “magic spells,” but as control inputs.


🧠 Why Choose “Kyoto”?

“Kyoto” is an excellent subject for observing image-generation behavior.

In short:

Small wording differences produce visually obvious changes

This article is
not about finding the “correct” image, but about observing change.


🔬 Experimental Conditions (Fixed Rules)

The following conditions are fixed throughout the experiment:

🔬 Ensure that “what was changed” is always unambiguous.


⚙️ Base Prompt (Common to All Tests)

The following prompt is used as the common base,
and only the differences are modified in each experiment.

Kyoto landscape,
clean composition,
natural lighting,
no text, no logo, no watermark

⚙️ The goal here is not to generate a “good” image,
but to make differences easy to observe.
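As a minimal sketch of this setup, the base prompt can be held as a fixed list of lines, with only the first (subject) line swapped per experiment. The names `BASE_SUBJECT`, `FIXED_LINES`, and `build_prompt` are assumptions made for illustration; they are not part of any image-generation API.

```python
# Hypothetical helper names; the prompt text itself is the base prompt above.
BASE_SUBJECT = "Kyoto landscape"
FIXED_LINES = [
    "clean composition",
    "natural lighting",
    "no text, no logo, no watermark",
]


def build_prompt(subject: str = BASE_SUBJECT) -> str:
    """Return the full prompt with only the subject line replaced."""
    return ",\n".join([subject, *FIXED_LINES])


if __name__ == "__main__":
    print(build_prompt())                            # the unchanged base prompt
    print(build_prompt("dramatic Kyoto landscape"))  # one of the ten variations
```

Keeping the fixed lines in one place means a variation cannot accidentally alter the composition, lighting, or exclusion terms.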


🔬 “Kyoto” Fixed — 10-Pattern Experiment Table

| No. | Change Type | Prompt Difference | Main Visual Change | Observation Focus |
|-----|-------------|-------------------|--------------------|-------------------|
| 1 | Noun | Kyoto landscape photograph | Photo / tourism-like | Attraction to real locations |
| 2 | Noun | Kyoto landscape illustration | Poster / artwork style | Color and composition cleanup |
| 3 | Abstraction | abstract visualization of Kyoto landscape | Buildings dissolve | Atmosphere over meaning |
| 4 | Specificity | realistic Kyoto street scene | Everyday life, alleys | Increased realism |
| 5 | Emotion | calm and quiet Kyoto landscape | Static, low saturation | Light and color suppression |
| 6 | Emotion | dramatic Kyoto landscape | Overstaged visuals | Strong clouds and sunsets |
| 7 | Era | Kyoto scenery in Edo period style | Nihonga / woodblock style | Style overrides realism |
| 8 | Era | modern Kyoto cityscape | Contemporary urban feel | Deviation from “tourist Kyoto” |
| 9 | Viewpoint | aerial view of Kyoto | Bird’s-eye composition | Emphasis on geography |
| 10 | Viewpoint | street-level view of Kyoto | Human-eye perspective | Depth and narrative |
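
To keep “what was changed” explicit, the table rows can be held as data and compared word by word against the base subject `Kyoto landscape`. The sketch below uses Python's standard `difflib` module; the `PATTERNS` structure and the `directive_diff` helper are assumptions made for illustration.

```python
from difflib import SequenceMatcher

BASE_SUBJECT = "Kyoto landscape"

# (No., change type, prompt difference) copied from the table above.
PATTERNS = [
    (1, "Noun", "Kyoto landscape photograph"),
    (2, "Noun", "Kyoto landscape illustration"),
    (3, "Abstraction", "abstract visualization of Kyoto landscape"),
    (4, "Specificity", "realistic Kyoto street scene"),
    (5, "Emotion", "calm and quiet Kyoto landscape"),
    (6, "Emotion", "dramatic Kyoto landscape"),
    (7, "Era", "Kyoto scenery in Edo period style"),
    (8, "Era", "modern Kyoto cityscape"),
    (9, "Viewpoint", "aerial view of Kyoto"),
    (10, "Viewpoint", "street-level view of Kyoto"),
]


def directive_diff(variant: str, base: str = BASE_SUBJECT) -> dict:
    """Return the words added to and removed from the base subject."""
    base_words, variant_words = base.split(), variant.split()
    added, removed = [], []
    matcher = SequenceMatcher(a=base_words, b=variant_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(base_words[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(variant_words[j1:j2])
    return {"added": added, "removed": removed}


if __name__ == "__main__":
    for number, change_type, subject in PATTERNS:
        assert "Kyoto" in subject  # the theme stays fixed in every pattern
        diff = directive_diff(subject)
        print(number, change_type, "added:", diff["added"], "removed:", diff["removed"])
```

Some patterns only add words (for example `photograph`), while others also drop `landscape`. Logging both lists makes it easier to see which variations are true single-word changes.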

🔬 Generated Image Comparison (10 Patterns)

[Image: Kyoto 10-pattern experiment]

Figure 1: Ten generated results with “Kyoto” fixed,
changing only directive words in the prompt.


🧠 Observations and Findings

① The First Noun Defines the Worldview

A single noun in the subject phrase, such as “photograph” (pattern 1) or “illustration” (pattern 2),
almost completely determines the image’s direction.

🧠 The most important word for locking the visual world.


② Abstract Words Dissolve “Meaning”

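Adding words like “abstract visualization” (pattern 3)
introduces intentional ambiguity:
buildings dissolve, and atmosphere takes priority over literal meaning.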

👉 Ambiguity is not a bug, but an exploration parameter.


③ Emotion Words Control Color and Light

🧠 Emotion words do not control emotions;
they control visual staging.
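
One hedged way to check this observation is to compare the average saturation and brightness of the generated images, for example pattern 5 (calm and quiet) against pattern 6 (dramatic). The sketch below uses only the standard-library `colorsys` module on a list of RGB pixels. Loading pixels from an image file is left out, the function name `mean_saturation_value` is an assumption for illustration, and the sample pixels are made up rather than measured from the experiment.

```python
import colorsys


def mean_saturation_value(pixels):
    """Average HSV saturation and value (each 0.0-1.0) over 8-bit RGB pixels."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    total_s = total_v = 0.0
    for r, g, b in pixels:
        # colorsys expects channel values in the 0.0-1.0 range.
        _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_s += s
        total_v += v
    return total_s / len(pixels), total_v / len(pixels)


if __name__ == "__main__":
    # Made-up sample pixels, only to show the call; not real measurements.
    muted = [(120, 125, 130), (200, 200, 205)]
    vivid = [(255, 80, 0), (30, 0, 200)]
    print("muted:", mean_saturation_value(muted))
    print("vivid:", mean_saturation_value(vivid))
```

If the “calm” images consistently score lower on saturation than the “dramatic” ones, that supports reading emotion words as staging controls rather than emotional ones.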


⚠️ ④ Era Specification Is the Strongest Parameter

Words like “Edo period style” or “modern” (patterns 7 and 8)
instantly override
architecture, clothing, color palette, and composition.

⚠️ Extremely powerful, but it often causes other instructions in the prompt to be ignored.


⑤ Viewpoint Fixes the Composition

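Viewpoint phrases such as “aerial view” and “street-level view” (patterns 9 and 10)
almost forcibly determine the composition:
a bird’s-eye layout that emphasizes geography, or a human-eye perspective with depth and narrative.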

👉 If composition control is the goal, this should be prioritized.


⚙️ Practical Implications


✅ Summary

Start with:

“Fixed theme × one-word difference”

This is the fastest way to truly understand prompt behavior.

