Readability Analysis of Privacy Policies for Large-Scale Websites: A Perspective from Deep Learning and Linguistics
Author
Abstract

Privacy policy statements are an essential approach to self-regulation by website operators in the area of personal privacy protection. However, these policies are often lengthy and difficult to understand, and users appear to actually read them in only a few cases. To address these obstacles, we propose a framework, Privacy Policy Analysis Framework for Automatic Annotation and User Interaction (PPAI), that stores, classifies, and categorizes queries on natural language privacy policies. At the core of PPAI is a privacy-centric language model that consists of a smaller fine-grained dataset of privacy policies and a new hierarchy of neural network classifiers that take into account privacy practices with both high-level aspects and fine-grained details. Our experimental results show that the eight readability metrics of the dataset exhibit a strong correlation. Furthermore, PPAI’s neural network classifier achieves an accuracy of 0.78 in the multi-classification task. In the robustness experiments, the classifier reached higher accuracy than the baseline and remained robust even with a small amount of labeled data.

Year of Publication
2022
Date Published
December
Publisher
IEEE
Conference Location
Haikou, China
ISBN Number
9798350346558
URL
https://ieeexplore.ieee.org/document/10189656/
DOI
10.1109/SmartWorld-UIC-ATC-ScalCom-DigitalTwin-PriComp-Metaverse56740.2022.00249