PhishBENCH: A benchmark dataset for legal tasks related to phishing attacks

Title	PhishBENCH: A benchmark dataset for legal tasks related to phishing attacks
Publication Type	Conference Proceedings
Year of Conference	2025
Authors	Katoikos, I, Kosmopoulos, D, Iglezakis, I, Kiortsi, P, Thomopoulos, G, Fidas, C
Conference Name	International Conference on Artificial Intelligence Tools
Keywords	benchmark, large language model, phishing law, retrieval augmented generation
Abstract	Phishing attacks constitute a significant cyber-threat, causing financial losses and undermining digital trust. Despite advancements in awareness campaigns and regulatory efforts, many individuals and organizations struggle to understand their legal rights and obligations when victimized by phishing. This paper presents a benchmark framework designed to support researchers in the analysis of legal texts and cases, with the goal of helping users navigate the complex legal landscape surrounding phishing attacks. We evaluate BERT-based models, large language models, and retrieval-augmented generation using a curated knowledge base of phishing laws, regulations, and common scenarios. We provide baseline evaluations across multiple task categories, including true-false questions, phishing attack cases, penal code analysis, legal cases with judicial decisions, and question-answering tasks, all within the Greek legal jurisdiction.
