Finding Collusive Spam in Community Question Answering Platforms: A Pattern and Burstiness Based Method
Author
Abstract

Community question answering (CQA) websites have become very popular platforms attracting numerous participants to share and acquire knowledge and information in Internet However, with the rapid growth of crowdsourcing systems, many malicious users organize collusive attacks against the CQA platforms for promoting a target (product or service) via posting suggestive questions and deceptive answers. These manipulate deceptive contents, aggregating into multiple collusive questions and answers (Q&As) spam groups, can fully control the sentiment of a target and distort the decision of users, which pollute the CQA environment and make it less credible. In this paper, we propose a Pattern and Burstiness based Collusive Q&A Spam Detection method (PBCSD) to identify the deceptive questions and answers. Specifically, we intensively study the campaign process of crowdsourcing tasks and summarize the clues in the Q&As’ vocabulary usage level when collusive attacks are launched. Based on the clues, we extract the Q&A groups using frequent pattern mining and further purify them by the burstiness on posting time of Q&As. By designing several discriminative features at the Q&A group level, multiple machine learning based classifiers can be used to judge the groups as deceptive or ordinary, and the Q&As in deceptive groups are finally identified as collusive Q&A spam. We evaluate the proposed PBCSD method in a real-world dataset collected from Baidu Zhidao, a famous CQA platform in China, and the experimental results demonstrate the PBCSD is effective for collusive Q&A spam detection and outperforms a number of state-of-art methods.

Year of Publication
2022
Conference Name
2021 Ninth International Conference on Advanced Cloud and Big Data (CBD)
Google Scholar | BibTeX