Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation
This is the repository for the preprint paper: Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation.
The webpage can be visited at https://xymou.github.io/social_simulation/.
Social media has emerged as a cornerstone of social movements, wielding significant influence in driving societal change. Simulating the response of the public and forecasting the potential impact has become increasingly important. However, existing methods for simulating such phenomena encounter challenges concerning their efficacy and efficiency in capturing the behaviors of social movement participants. In this paper, we introduce a hybrid framework for social media user simulation, wherein users are categorized into two types. Core users are driven by Large Language Models, while numerous ordinary users are modeled by deductive agent-based models. We further construct a Twitter-like environment to replicate their response dynamics following trigger events. Subsequently, we develop a multi-faceted benchmark SoMoSiMu-Bench for evaluation and conduct comprehensive experiments across real-world datasets. Experimental results demonstrate the effectiveness and flexibility of our method.
To handle the challenges in conducting large-scale online social movement simulation, this paper introduces a novel hybrid framework for social media user simulation.
User engagement in social networks often follows a Pareto distribution: the bulk of content originates from a small fraction of individuals. The more active and influential users, such as opinion leaders, should therefore be modeled in fine detail, while the silent majority can be driven by simpler models. The overall framework is illustrated in the Figure, where social media users are divided into core users and ordinary users. The two types of users are driven by different models, which addresses the cost and efficiency issues of running thousands of LLMs.
- Core users are empowered by large language models.
- Ordinary users are driven by conventional agent-based models, such as the Bounded Confidence Model.
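To make the ordinary-user side concrete, here is a minimal sketch of one common bounded confidence variant, the pairwise Deffuant-style update. It is an illustration only: this README does not state which variant or parameters the paper uses, so the function names, the confidence threshold `epsilon`, the convergence rate `mu`, and the opinion range of -1 to 1 are all assumptions. It also uses plain Python rather than the mesa library.

```python
import random


def bounded_confidence_step(opinions, epsilon=0.3, mu=0.5, rng=random):
    """Apply one pairwise Deffuant-style update in place.

    Two distinct users are drawn at random. They move toward each other
    only if their opinions differ by less than `epsilon` (assumed value).
    """
    i, j = rng.sample(range(len(opinions)), 2)
    diff = opinions[j] - opinions[i]
    if abs(diff) < epsilon:
        # Symmetric update: the pair's mean opinion is preserved.
        opinions[i] += mu * diff
        opinions[j] -= mu * diff
    return opinions


def simulate(n_users=200, n_steps=20000, epsilon=0.3, mu=0.5, seed=0):
    """Run the toy model from uniformly random initial opinions in [-1, 1]."""
    rng = random.Random(seed)
    opinions = [rng.uniform(-1.0, 1.0) for _ in range(n_users)]
    for _ in range(n_steps):
        bounded_confidence_step(opinions, epsilon, mu, rng)
    return opinions


if __name__ == "__main__":
    final = simulate()
    print(min(final), max(final))
```

Because each update is a few arithmetic operations, thousands of ordinary users can be simulated cheaply, which is the motivation for not giving every user an LLM.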
We build an agent architecture by empowering LLMs with the necessary capabilities for core user simulation. An overview of the agent's architecture is illustrated in the left part of the Figure. Empowered by LLMs, the agent is equipped with a profile module, a memory module, and an action module.
- Profile Module: each agent's profile contains Demographics, Social Traits and Communication Roles initialized from real data.
- Memory Module: a memory module manages each agent's memories through three operations: memory writing, memory retrieval, and memory reflection.
- Action Module: We consider actions that are highly related to information and attitude propagation, including: post, retweet, reply, like, and do nothing.
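The three modules above can be sketched as plain Python data structures. This is a hypothetical illustration, not the repository's code and not the AgentVerse API: the class names, the word-overlap retrieval rule, and the `llm` and `summarize` callables (stand-ins for real model calls) are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Action(Enum):
    POST = "post"
    RETWEET = "retweet"
    REPLY = "reply"
    LIKE = "like"
    DO_NOTHING = "do nothing"


@dataclass
class Profile:
    demographics: str
    social_traits: str
    communication_role: str


@dataclass
class Memory:
    records: List[str] = field(default_factory=list)

    def write(self, text: str) -> None:
        self.records.append(text)

    def retrieve(self, query: str, k: int = 3) -> List[str]:
        # Toy relevance score (assumption): number of shared lowercase words.
        words = set(query.lower().split())
        ranked = sorted(
            self.records,
            key=lambda r: len(words & set(r.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def reflect(self, summarize: Callable[[List[str]], str]) -> None:
        # Store a higher-level summary produced by a caller-supplied function.
        if self.records:
            self.write(summarize(self.records))


@dataclass
class CoreUserAgent:
    profile: Profile
    memory: Memory = field(default_factory=Memory)

    def act(self, observation: str, llm: Callable[[str], str]) -> Action:
        context = self.memory.retrieve(observation)
        prompt = (
            f"Profile: {self.profile}\n"
            f"Relevant memories: {context}\n"
            f"New tweet: {observation}\n"
            f"Choose one of: {[a.value for a in Action]}"
        )
        reply = llm(prompt).strip().lower()
        self.memory.write(observation)
        for action in Action:
            if action.value == reply:
                return action
        # Fall back to inaction when the reply is not a known action.
        return Action.DO_NOTHING
```

For a quick check without any model, a stub such as `lambda prompt: "retweet"` can be passed as `llm`.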
To comprehensively evaluate the effectiveness of simulation, we consider both micro-level evaluation and macro-level evaluation, focusing on individual user alignment and systemic outcomes respectively.
- Micro Alignment Evaluation: simulate in single rounds by providing authentic contextual information to each core user agent and assess their decision-making, in terms of stance, content and behavior.
- Macro System Evaluation: quantify the attitude distribution in a complete multi-round simulation, considering both static attitude distribution and time series of the average attitude.
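As a rough illustration of the macro-level quantities mentioned above, the sketch below computes a static attitude distribution and a time series of the average attitude, then compares a simulated series with a reference one. The benchmark's actual metrics are not listed in this README, so the function names, the attitude range of -1 to 1, the bin count, and the mean absolute gap are illustrative stand-ins, not the SoMoSiMu-Bench definitions.

```python
from statistics import mean
from typing import List, Sequence


def attitude_histogram(attitudes: Sequence[float], n_bins: int = 5,
                       low: float = -1.0, high: float = 1.0) -> List[float]:
    """Normalized histogram of attitudes over [low, high] (assumed range)."""
    counts = [0] * n_bins
    for a in attitudes:
        idx = int((a - low) / (high - low) * n_bins)
        # Clamp so that a == high falls into the last bin.
        counts[min(max(idx, 0), n_bins - 1)] += 1
    total = len(attitudes)
    return [c / total for c in counts]


def average_attitude_series(rounds: Sequence[Sequence[float]]) -> List[float]:
    """Mean attitude of all users in each simulation round."""
    return [mean(r) for r in rounds]


def mean_abs_gap(simulated: Sequence[float], real: Sequence[float]) -> float:
    """Average absolute difference between two equally long series."""
    if len(simulated) != len(real):
        raise ValueError("series must have the same length")
    return mean(abs(s - r) for s, r in zip(simulated, real))


if __name__ == "__main__":
    sim_rounds = [[-0.2, 0.4, 0.6], [0.0, 0.5, 0.7]]
    real_series = [0.3, 0.35]
    sim_series = average_attitude_series(sim_rounds)
    print(attitude_histogram(sim_rounds[-1]))
    print(mean_abs_gap(sim_series, real_series))
```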
To comply with Twitter's terms of service, we cannot publish the raw data. Instead, we disclose only the original tweet ids, from which you can filter out the users you want to study, to minimize the privacy risk.
- Metoo: from the #metoo Digital Media Collection. We keep only the tweets posted during the events; the ids can be downloaded at metoo_link.
- Roe: from #RoeOverturned. We keep only the tweets posted during the events; the ids can be downloaded at roe_link.
- BLM: from blm_twitter_corpus. We keep only the tweets posted during the events; the ids can be downloaded at blm_link.
For the user list used in our paper, we can only provide the ids: id_link.
Coming soon. Our paper is under review, and we will release the code as soon as possible once it is accepted.
- For LLM-empowered core users, part of the implementation is based on AgentVerse. Many thanks to THUNLP for the open-source resource.
- For ordinary users supported by conventional ABMs, we use the mesa library to implement the agent-based models.
Please consider citing this paper if you find this repository useful:
@article{mou2024unveiling,
title={Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation},
author={Xinyi Mou and Zhongyu Wei and Xuanjing Huang},
year={2024},
journal={arXiv preprint arXiv:2402.16333},
}