Analyzing Metadata in PDF Files Published by Police Agencies in Japan
Author
Abstract

In recent years, a new type of cyber attack called the targeted attack has been observed. Unlike ordinary large-scale attacks, which do not focus on particular targets, a targeted attack aims at specific organizations or individuals. Organizations publish many Word and PDF files on their websites. These files may provide a starting point for targeted attacks if they include hidden data unintentionally generated during the authoring process. Adhatarao and Lauradoux analyzed hidden data found in PDF files published by security agencies in many countries and showed that many of these files potentially leak information such as author names and details of the information system and computer architecture. In this study, we analyze hidden data in PDF files published on the websites of police agencies in Japan and compare the results with those of Adhatarao and Lauradoux. We gathered 110,989 PDF files. Of these, 56% contain personal names, organization names, usernames, or numbers that appear to be IDs within the organizations, and 96% contain software names.
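
As an illustration of the kind of hidden data discussed in the abstract, the following minimal sketch reads a few document information fields (Author, Creator, Producer, Title) from raw PDF bytes. It is not the method used in the study, which the abstract does not describe. The function name `extract_info_fields` and the field list are assumptions made for this example, and the sketch only handles unencrypted, uncompressed literal strings; hex strings, object streams, and XMP metadata are ignored.

```python
import re

# Info-dictionary keys that commonly carry author or software names.
# This selection is an assumption made for illustration.
FIELDS = ("Author", "Creator", "Producer", "Title")


def extract_info_fields(pdf_bytes: bytes) -> dict:
    """Return literal-string values of selected Info keys found in raw PDF bytes.

    Simplification: only the first "/Key (value)" occurrence of each key is
    read, and only when the value is a plain, uncompressed literal string.
    """
    found = {}
    for key in FIELDS:
        # Match "/Key (value)", allowing backslash-escaped characters in value.
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\()])*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            raw = match.group(1)
            # Undo only the simplest escapes: \( \) and \\
            raw = re.sub(rb"\\([()\\])", rb"\1", raw)
            found[key] = raw.decode("latin-1")
    return found
```

A file could be checked with `extract_info_fields(open(path, "rb").read())`, where `path` is a placeholder. A real survey would more likely rely on a full PDF parser, since many files store metadata in compressed streams or as XMP that this sketch does not see.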

Year of Publication
2022
Conference Name
2022 IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (QRS-C)