
Recurrence, Rashomon, and Arbitrary Labels

By Danielle Boccelli | December 2022

Readings in computational social science

This article comments on three papers associated with computational social science research: (1) Do Cascades Recur? (Cheng et al., 2016); (2) Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts (Kruk et al., 2019); and (3) Statistical Modeling: The Two Cultures (Breiman, 2001); these papers focus on (1) recurrence in content sharing on social media (here, Facebook); (2) the use of multimodal data (image–caption pairs) from social media (here, Instagram) to improve models of authorial intent; and (3) the shift from data modeling to algorithmic modeling within the field of statistics.

Below, rather than provide a summary of these works, all of which I highly recommend, I focus on the following topics: (1) the fuzziness of similarity (as opposed to identicality) within datasets; (2) the role of modeling in understanding (and affecting) a system; and (3) the stability of labels as cultural categories. In combination, these topics aim to explore sources of ambiguity within data science.

In the paper that asks through its title Do Cascades Recur?, Cheng et al. [1] examine long-term patterns of image sharing on Facebook to answer concomitant questions such as Can once-viral content regain its virality? and Which factors affect recurrence? Within the work, the authors explore phenomena such as (1) the recurrence of sharing cascades (even after periods of quiescence) for both original content (i.e., posts) and copied content (i.e., reposts), (2) the saturation of an audience with content (after which interest drops off), and (3) the moderating effect of content’s broadness of appeal (as approximated by the magnitude of the content’s initial burst of shares) on recurrence behavior.

As mentioned, the study considers both original posts and reposts that copy or imitate previous posts. To study them together, exact and nearly identical copies are grouped with the original content, and cascade recurrence is measured over the group as a whole. By grouping across (re)posts, the authors were able to broadly study the journey of a (low-variance) content type as it is shared across the network; however, the method of grouping (in this case, binary k-means clustering) is not a neutral choice and thus raises many questions about what it means for content to be similar. More specifically, while this study was primarily concerned with near-identical images (including images with differing overlaid text), content could also be similar along other dimensions, such as symbolism or style.
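
As a concrete (and heavily simplified) sketch of what such a grouping step can look like, assuming images have already been reduced to binary fingerprint vectors (a stand-in for whatever featurization the authors actually used, with an arbitrary choice of k), near-identical content can be clustered with k-means:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical stand-in for the grouping step: each image is reduced
# to a binary fingerprint (e.g., a perceptual hash unrolled into a
# 0/1 vector); this is an illustration, not the paper's exact pipeline.
rng = np.random.default_rng(0)
fingerprints = rng.integers(0, 2, size=(10_000, 64)).astype(float)

# Cluster fingerprints so that near-identical images share a cluster;
# the number of clusters is a tuning choice, not given by the data.
kmeans = KMeans(n_clusters=500, n_init=10, random_state=0)
labels = kmeans.fit_predict(fingerprints)

# Cluster sizes hint at which pieces of content were copied most often.
sizes = np.bincount(labels)
print(sizes.max(), sizes.min())
```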

In the study, image clusters are randomly sampled to confirm (via manual inspection) that they contain sufficiently similar images (94% of the sampled clusters contained near-identical images, and the remaining 6% varied in terms of their overlaid text). However, a follow-up study with this or similar data could question the dimensions along which posts are similar (e.g., Do they share themes? Do they transform images of different people in the same way? Do they convey similar messages or inspire similar emotions?) and how these dimensions can be discerned via machine learning. Questions around content mimicry — how different users play on the same themes and their intentions when doing so — may be interesting to ask in and of themselves, as they can be tied to creativity and social behavior; however, these questions are also interesting from a technical perspective, as they can be viewed in light of the clustering algorithm, which may be able to capture only certain dimensions of similarity while largely missing others.
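
A minimal sketch of that audit step, with stand-in cluster assignments in place of real ones (the sample sizes here are arbitrary, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in cluster assignments; in practice these would come from the
# clustering step sketched above.
labels = rng.integers(0, 500, size=10_000)

# Sample clusters at random and pull a few members of each for human
# review, mirroring the paper's manual-inspection check.
sampled_clusters = rng.choice(np.unique(labels), size=50, replace=False)
for c in sampled_clusters:
    members = np.flatnonzero(labels == c)
    picks = rng.choice(members, size=min(5, members.size), replace=False)
    print(f"cluster {c}: review items {picks}")  # hand off to annotators
```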

In Statistical Modeling: The Two Cultures [2], Leo Breiman explains how different models can produce predictions of roughly the same accuracy (with respect to the response being considered), resulting in different yet equally compelling depictions of the studied system; this effect is known as the Rashomon Effect (after the movie Rashomon, wherein one story is told from the perspectives of multiple characters). Extending this idea (that multiple depictions of a system can be derived from different models of the same predictive value), the interpretations of these models can be said to form one narrative space that tells the complete story available to them; in other words, if each model can tell the story from only one perspective (paralleling the characters in Rashomon), then by combining these stories (in some overlapping way), one narrative space containing all equally compelling stories is formed (paralleling the movie itself).
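
A toy illustration of the effect (my example, not Breiman's): several fits with near-identical test accuracy can still disagree about which features matter most, which is the Rashomon Effect in miniature.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic task with redundant informative features, so that many
# different fits can achieve similar accuracy.
X, y = make_classification(n_samples=2000, n_features=10,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Refit with different seeds: accuracy barely moves, but the ranking
# of "important" features may shuffle across equally accurate models.
for seed in range(3):
    model = RandomForestClassifier(random_state=seed).fit(X_tr, y_tr)
    top3 = np.argsort(model.feature_importances_)[::-1][:3]
    print(f"seed={seed}  acc={model.score(X_te, y_te):.3f}  top features={top3}")
```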

(There seems to be something worth saying about models of differing accuracy (~stories that are differently compelling) as well, but to do so, I think, requires another dimension of consideration: It is not only that each character has his version of the story to tell but also that some characters are more reliable narrators than others, which affects the viewer’s (or analyst’s) interpretation of the narrative; adding dimensions in this way, however, makes interpretation more complex, and thus, for simplicity, it might make more sense to consider only models that obtain competitive performance. In addition, it seems possible that a so-called narrative space could be mapped as a density in some number of dimensions such that the aspects of the story (or features of the model) that are interpretable from the perspectives of many models are most prominent. [As a side note to an aside, I recently watched a presentation on conformal prediction given by Emmanuel Candès at the Conference on Neural Information Processing Systems (NeurIPS), and it seems that there could be nice connections (and perhaps no differences other than shortcomings) between the work around prediction intervals outlined therein and the idea of a density that I present here.])
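
For what it is worth, a minimal split-conformal sketch (one simple form of the prediction-interval machinery from the conformal-prediction literature; the data and model here are invented for illustration) looks like this:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Split conformal prediction: fit on one half of the data, compute
# absolute residuals on a held-out calibration half, and use their
# adjusted quantile as a distribution-free interval half-width.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=1000)

fit_idx, cal_idx = slice(0, 500), slice(500, 1000)
model = LinearRegression().fit(X[fit_idx], y[fit_idx])

alpha = 0.1  # target 90% coverage
scores = np.abs(y[cal_idx] - model.predict(X[cal_idx]))  # conformity scores
n = scores.size
q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))

x_new = rng.normal(size=(1, 3))
pred = model.predict(x_new)[0]
print(f"~90% prediction interval: [{pred - q:.2f}, {pred + q:.2f}]")
```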

While the idea of narrative space is well-aligned with ideas from statistics (e.g., prediction interval, uncertainty, ensemble), under the paradigm of Big Data and many-parameter models trained to maximize predictive accuracy, model interpretation (or application) can be less about finding ~truth (i.e., locating high-density areas in narrative space) and more about regulating decision-making without having to find ~truth; as such, rather than interpret a model as a representation of the studied system to gain an understanding of the system, the model can be evaluated according to its role as decision-maker and modified to limit which narratives (within the narrative space) can be furthered (~magnified) via prediction.

As an example, data may show that the same response is always produced given a certain set of inputs to a model, and the model may learn that these inputs should always predict that response, but that does not necessarily mean that the response must always be produced given those inputs to the system (there may be no natural law prescribing the response, the sample may be biased to include only certain types of cases that occur within the system, or the response may be downright undesirable to repeat); thus, with the model acting as decision-maker, prediction may maintain the status quo unless the narrative tellable by the model is limited, and for the tellable narrative to be limited, the model must be either interpretable and tunable or adequately checkable and overridable.
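
One hypothetical shape for such a checkable-and-overridable arrangement (the names and structure here are mine, not drawn from any of the papers): the model proposes a decision, but an explicit policy check can veto it and route the case to a human instead of silently replaying what the data happened to contain.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class GuardedDecider:
    model: Callable[[dict], str]            # proposes a decision
    policy_ok: Callable[[dict, str], bool]  # limits which narratives are furthered

    def decide(self, case: dict) -> str:
        proposal = self.model(case)
        if self.policy_ok(case, proposal):
            return proposal
        return "escalate_to_human"  # the override path

# Example policy: never let the model deny automatically; a person
# must review every proposed denial.
decider = GuardedDecider(
    model=lambda case: "deny" if case.get("score", 0.0) < 0.5 else "approve",
    policy_ok=lambda case, decision: decision != "deny",
)
print(decider.decide({"score": 0.2}))  # -> "escalate_to_human"
```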

To explore a semiotic concept called meaning multiplication, Kruk et al. [3] consider a multimodal dataset of image–caption pairs collected from social media. Captions are neither pure transcriptions of images, nor are images pure depictions of captions, and so instead of assuming such a direct and asymmetric relationship to exist between the two data types, images and captions can be viewed as having a complex relationship that depends on the message the author wants to convey through the combination of the two. The authors annotate a fairly small dataset (n=1299) of Instagram posts with three sets of labels (one capturing authorial intent, one capturing contextual relationship, and one capturing semiotic relationship) and then build a model for annotating posts according to this taxonomy; they show that the model performs better when given both image and caption than when given only one or the other (and that the lift is greatest when the image and caption diverge semantically), which supposedly shows that meaning is multiplied but also sounds like something of an inversion of the case of Linda the Bank Teller.
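
A schematic version of that comparison, on synthetic data constructed so that each modality is partially informative on its own (the real study uses learned image and text representations, not random features, and a richer label taxonomy):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Construct a label that depends on one image feature plus one caption
# feature, so either modality alone predicts it imperfectly while the
# fused representation predicts it well.
rng = np.random.default_rng(0)
n = 1299  # dataset size reported in the paper
img = rng.normal(size=(n, 8))
txt = rng.normal(size=(n, 8))
y = (img[:, 0] + txt[:, 0] > 0).astype(int)

for name, X in [("image only", img), ("caption only", txt),
                ("image+caption", np.hstack([img, txt]))]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {acc:.3f}")
```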

From what I gather, this line of work was in its early stages at the time of this paper’s writing (2019), and so there are many ways to slice and extend this study; for example, it might be interesting to know which ~features of the image–caption pairs explain the lift in accuracy under the combination of data types, the extent to which these ~features act as interaction terms rather than individually, and whether these ~features are culturally stable within society. More specifically regarding the last point: Because labels (as flat symbols that are attached to data rather than deep concepts as defined in culture) are inherently arbitrary (defined implicitly through a path from label to data rather than explicitly through predefined features and relative to other labels), it could be interesting to consider whether explicit cultural definitions of the concepts used as labels in this paper are stably related to the ~features selected by the model, rather than the features forming merely a convenient pathway from data to (arbitrary) label with no stable basis in culture; after all, what is currently provocative or controversial (two categories used to annotate the dataset) may be merely expressive or entertaining (two others) elsewhere or at a different time.
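
As a toy probe of the interaction question (my construction, not the paper's), consider a label that depends only on the product of an image feature and a caption feature: neither main effect helps a linear model, but the explicit interaction term does.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Label depends on the sign of a product of features from the two
# modalities, an interaction that main effects alone cannot express.
rng = np.random.default_rng(0)
n = 4000
img0 = rng.normal(size=n)
txt0 = rng.normal(size=n)
y = (img0 * txt0 > 0).astype(int)

X_main = np.column_stack([img0, txt0])                # main effects only
X_inter = np.column_stack([img0, txt0, img0 * txt0])  # plus interaction

for name, X in [("main effects", X_main), ("with interaction", X_inter)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {acc:.3f}")  # ~0.5 without the interaction, high with it
```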

  1. Cheng, Justin, Lada A. Adamic, Jon M. Kleinberg, and Jure Leskovec. 2016. “Do Cascades Recur?” In Proceedings of the 25th International Conference on World Wide Web, 671–81. Montréal, Québec, Canada: International World Wide Web Conferences Steering Committee. https://doi.org/10.1145/2872427.2882993.
  2. Breiman, Leo. 2001. “Statistical Modeling: The Two Cultures.” Statistical Science 16 (3): 199–231.
  3. Kruk, Julia, Jonah Lubin, Karan Sikka, Xiao Lin, Dan Jurafsky, and Ajay Divakaran. 2019. “Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts.” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 4621–31. Hong Kong, China: Association for Computational Linguistics. https://doi.org/10.18653/v1/D19-1469.

