Techno Blender

Word Embeddings Align with Kandinsky’s Theory of Color | by Danielle Boccelli | Dec, 2022



Photo by Ross Sokolovski on Unsplash

A quick experiment at the intersection of Art and Science

In this article, I describe a quick experiment that aims to quantify the associations between two seemingly orthogonal sets — basic shapes and primary colors — by using word embedding representations and cosine similarity to test Wassily Kandinsky’s theory of form and color. While the use of cosine similarity to measure associations among words is broadly accepted in the natural language processing literature, and while the results of this experiment are positive (i.e., they align with the underlying theory), this article is intended neither as a confirmation of the theory nor as a tutorial on good experimentation practices in data science, but rather as a call to question the value of cosine similarity in a qualitative context, as described in more detail in the Discussion.

Over the summer, while researching the Bauhaus (I had wanted to try to think in forms rather than words for a change), I came across the Getty Research Institute’s Kandinsky Form and Color Exercise. An appreciator of the various exercises, quizzes, and puzzles that can be found online, and genuinely intrigued by the question this exercise in particular posed (i.e., “What shape is color?”), I decided to give it a go.

The exercise simply asks participants to put {circle, triangle, square} into one-to-one correspondence with {red, blue, yellow}. After giving the task some thought, I decided first that triangles were red (they are the pointiest, and red has its sharpness), then that circles were yellow (perhaps too influenced by the yellow circles of children’s books’ suns), and finally that squares must be blue (a sea against the horizon, conveniently); excitedly, I proceeded to the next screen to see if I was correct (willing to assume that one could be correct with respect to the question posed).

Submitted model (source: author)

My model was not correct. According to Kandinsky’s theory, triangles are yellow (yellow being the sharp color for which I mistook red), circles are blue (aligning their roundness with blue’s depth and calm), and squares are red (“forceful” and “assertive”, mirroring the inherent “tension” of the squareness of a plane). (Having read them, I am fully on board with these explanations and accept them as correct — they are fine explanations — but I do have a fondness for my own, even if they overextend, overly rely on secondary representations, and rely on post hoc reasoning, respectively.)

Kandinsky’s model (source: author)

To test his theory, Kandinsky presented a survey (similar to the exercise above) to his students, asking them to match form with color; whether he had let his model slip during his teaching, I do not know, but the majority of responses did align with Kandinsky’s model.

In contrast to Kandinsky’s students, who may have been influenced by his model prior to completing the survey (they learned from him, after all), I see no real reason to assume that the model itself has influenced a large random sample of English speakers (such as that on whose text word2vec embeddings [3] were trained) to such an extent that there is an alignment of these sets of forms and colors within the English language itself due to the model specifically (i.e., the usage of the words in standard English seems independent of the model).

So I did an experiment with pretrained word2vec embeddings to see if there was any pattern worth noting in the cosine similarities between paired shapes and colors, the results of which are printed below; nicely, they also align with Kandinsky’s model (!?), with ‘triangle’ being closest to ‘yellow’, ‘circle’ to ‘blue’, and ‘square’ to ‘red’.

Testing Kandinsky’s model with word embeddings (source: author)
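The computation behind those numbers can be sketched in a few lines. Cosine similarity is just the normalized dot product; the toy 3-d vectors below are hypothetical stand-ins for real embeddings (the article uses pretrained word2vec vectors but does not name the exact model — gensim’s `word2vec-google-news-300` is one plausible choice, shown only in a comment):

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between u and v: (u . v) / (|u| |v|)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# With real vectors one would load a pretrained model instead, e.g.:
#   import gensim.downloader as api
#   wv = api.load("word2vec-google-news-300")  # assumed model, not named in the article
#   wv.similarity("triangle", "yellow")

# Toy 3-d vectors, for illustration only:
shapes = {"triangle": np.array([0.9, 0.1, 0.2]),
          "circle":   np.array([0.1, 0.8, 0.3]),
          "square":   np.array([0.2, 0.3, 0.9])}
colors = {"yellow": np.array([0.8, 0.2, 0.1]),
          "blue":   np.array([0.2, 0.9, 0.2]),
          "red":    np.array([0.1, 0.2, 0.8])}

# Winner-takes-all: pair each shape with its most similar color.
for shape, sv in shapes.items():
    best = max(colors, key=lambda c: cosine_similarity(sv, colors[c]))
    print(f"{shape} -> {best}")
```

With the real embeddings, the same winner-takes-all loop produces the pairing reported above.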

Although I would not necessarily call Kandinsky’s model confirmed from my little experiment (What does it mean for the shape of a circle to have an “inner resonance” with the color blue? How much has Kandinsky’s work influenced our associations and thus our language?), the results — even if they are only lucky correlations — are interesting to consider alongside word embeddings as a way to quantify conceptual information [1, 2, 4].

Consider as a further example that, while the ranks for the set of colors {red, blue, yellow} correspond with Kandinsky’s model for the set of shapes {circle, triangle, square}, extending the list of colors to {pink, crimson, red, maroon, brown, rose, salmon, coral, chocolate, orange, ivory, gold, yellow, olive, chartreuse, lime, green, aquamarine, turquoise, azure, aqua, cyan, teal, tan, beige, blue, navy, smoke, slate, violet, purple, plum, indigo, lavender, magenta, white, silver, gray, black} changes the picture: ‘square’, which had the lowest similarity with its top color from the original set, is actually more ‘crimson’ or ‘coral’ than ‘red’ (similarities of 0.153 and 0.171, respectively, compared to 0.140 for ‘red’), and more ‘green’ and ‘orange’ as well (0.171 and 0.155). (Interestingly, however, ‘yellow’ and ‘blue’ are still the ruling colors of ‘triangle’ and ‘circle’.)
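The extended-list comparison amounts to ranking every candidate color by its similarity to a shape and reading off the top of the list. A self-contained sketch, with random vectors standing in for real embeddings (the similarity values quoted in the text come from pretrained word2vec, not from these toys):

```python
import numpy as np

rng = np.random.default_rng(42)
extended_colors = ["pink", "crimson", "red", "coral", "orange",
                   "green", "yellow", "blue", "violet", "teal"]

# Hypothetical 50-d embeddings; a real run would look these words up
# in a pretrained model instead of generating them randomly.
emb = {w: rng.standard_normal(50) for w in extended_colors + ["square"]}

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Rank all candidate colors by similarity to 'square', highest first.
ranking = sorted(((cos(emb["square"], emb[c]), c) for c in extended_colors),
                 reverse=True)
for score, color in ranking[:3]:
    print(f"{color}: {score:.3f}")
```

Under this ranking view, “winner-takes-all” is just `ranking[0]`, which makes it easy to see how sensitive the winner is to the candidate list.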

So while I may have had some fun with this experiment, and while I was surprised that the results aligned with the theory, I would not exactly say that I have confirmed the theory, as there are still questions worth asking about the meaning of cosine similarity in this context (e.g., Where is the boundary between association and non-association? Is a winner-takes-all strategy of association enough?) and about the mechanism through which these correlations become present within language (e.g., Is there more to the link between triangles and yellow than their sharpness? Are circles blue for a smoothness which mimics calm or for their curve like a lake viewed from above? Are squares the green of colorless ideas sleeping furiously or that most furious red of rage or crimson blood?).

Despite the narrow scope of my experiment and the (perhaps strange) question on which it is based (i.e., “What shape is color?”), which raise the questions listed above, I believe that the surprise I felt at the results presented herein mirrors the surprise I have seen expressed at language models such as ChatGPT; by extension, I believe that the skepticism with which I considered the results of my experiment should be mirrored in the consideration of the outputs of language models more generally.

Our language (roughly, our system of concepts) is clearly well-structured enough for models trained on it to pick up on its patterns in ways that can impress (especially when what the models do is poorly understood). However, when evaluating such models (when deciding whether we are impressed by their capabilities or whether we trust their outputs), I think we may overvalue the importance of our intention to speak, relative to that of the patterns of our speech, in our understanding of ourselves as speaking beings. In other words, while humans speak with intention (some complex of biology and learning) and according to linguistic patterns, models generate text only according to linguistic patterns (I see no reason to assume intention). This capability seems to be enough to produce coherent texts (at least short ones), but it also raises questions about why one text was generated over another.

Returning to the experiment presented in this article, perhaps Kandinsky’s theory is correct, in that it captures something about forms and colors that is truly there, and perhaps what is truly there is then captured within our language (perhaps through the very statement of the theory); but without the mechanism being known (without combing the data or dissecting the embeddings to trace back to some reason why the results align with the theory), what do we have but what we already had, decorated by some numbers?

  1. Garg, Nikhil, Londa Schiebinger, Dan Jurafsky, and James Zou. 2018. “Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes.” Proceedings of the National Academy of Sciences 115 (16). https://doi.org/10.1073/pnas.1720347115.
  2. Kozlowski, Austin C., Matt Taddy, and James A. Evans. 2019. “The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings.” American Sociological Review 84 (5): 905–49. https://doi.org/10.1177/0003122419877135.
  3. Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. “Efficient Estimation of Word Representations in Vector Space,” January. http://arxiv.org/abs/1301.3781.
  4. Stoltz, Dustin S., and Marshall A. Taylor. 2021. “Cultural Cartography with Word Embeddings.” Poetics 88 (October): 101567. https://doi.org/10.1016/j.poetic.2021.101567.

