L2IR: Revealing Latent Intent in Graph Fraud Detection

What the paper is about

Graph fraud detection has long depended on Graph Neural Networks (GNNs) to propagate and aggregate information across relational data. A critical obstacle in practice, however, is that fraudsters frequently disguise themselves by forging numerous connections with benign users, causing fraud signals to be progressively diluted during neighborhood aggregation and undermining detection reliability. While recent efforts have used Large Language Models (LLMs) to provide rich semantic cues for fraud detection, the underlying intent behind suspicious connections remains insufficiently explored. Compounding this issue, the scarcity of annotated fraud samples makes it difficult to train detectors that remain robust under heavy camouflage. To address these gaps, we propose L2IR, an LLM-driven Latent Intent Revealing framework for graph fraud detection. By uncovering latent intent from both user behaviors and suspicious connections, L2IR extracts intent-aware representations from raw behavioral traces and reasons about the true purpose behind individual connections, effectively distinguishing supportive links from misleading ones. It further incorporates adaptive self-training to enhance robustness under limited supervision. Evaluations on two real-world datasets characterized by pervasive camouflage demonstrate that L2IR surpasses strong baselines and can function as a plug-in enhancement for a range of GNN-based detectors, improving AUPRC by up to 8.27%.

What it covers

Authors: Jinsheng Guo, Zhenhao Weng, Yibo Liu, Yan Qiao, and Meng Li (all Hefei University of Technology, Hefei, China)
Keywords: Fraud Detection, Graph Neural Networks, Large Language Models

CCS Concepts: Computing methodologies → Artificial intelligence; Information systems → Data mining; Information systems → Social networks

1. Introduction

Modern online information systems, such as e-commerce platforms, review communities, social networks, and financial services, generate rich relational data among users, items, content, and transactions (Akoglu et al., 2015; Rayana and Akoglu, 2015; Weng et al., 2019; Wang et al., 2019a). Fraudulent activities in these systems reduce service quality and weaken system trust, making fraud detection an important task (Rayana and Akoglu, 2015; Lu et al., 2022). In response, graph-based methods have become an effective paradigm. Graph-based methods model entities and interactions as graphs and exploit both node attributes and relational structure for detection (Akoglu et al., 2015; Hooi et al., 2016). Among them, Graph Neural Networks (GNNs) are widely used, since they learn center node representations by aggregating information from connected neighbors, thereby capturing patterns from neighboring nodes that are critical for distinguishing fraudulent entities from benign ones (Kipf and Welling, 2016; Hamilton et al., 2017; Veličković et al., 2018a).

Figure 1. Comparison of our proposed L2IR with GNN-based and LLM enhanced GNN methods under camouflage. (a) Standard GNN-based methods rely solely on graph structure. (b) LLM enhanced GNN methods incorporate dataset-level semantic descriptions to improve performance. (c) Our L2IR analyzes the intent behind each user interaction, enabling effective identification of camouflaged fraud.

However, existing GNN-based fraud detection methods mainly rely on neighborhood aggregation over observed graph structures (as illustrated in Fig. 1(a) (Huang et al., 2025; Yang et al., 2025b)) and insufficiently explore fraudsters’ camouflage behaviors (Hooi et al., 2016; Dou et al., 2020). In relation camouflage scenarios (Dou et al., 2020), fraudsters build many connections with benign users while conducting fraudulent activities on only a small fraction of them, so the benign-dominated neighborhood dilutes fraud signals during aggregation, leading to less reliable detection and increased false negatives (Dou et al., 2020; Liu et al., 2020).

Recently, driven by the context reasoning and semantic comprehension capabilities of Large Language Models (LLMs) (He et al., 2024; Tang et al., 2024; Zhu et al., 2024), existing studies leverage LLMs to incorporate rich semantics to assist graph fraud detection (Huang et al., 2025; Yang et al., 2025b; Li et al., 2026). These studies have revealed potential opportunities for detecting camouflaged fraud by integrating descriptive semantic characterization with graph structural information (as illustrated in Fig. 1(b)). However, existing methods primarily restrict LLMs to generating global semantic descriptions of the dataset, without reasoning about the intent behind connections. Consequently, they fail to fully exploit the potential of LLMs for recognizing camouflaged fraud, still suffering from notable misclassifications. Even worse, real-world camouflaged fraud data is often constrained by limited supervision: the strong stealthiness of camouflaged fraud makes it extremely difficult to label such cases in datasets, leaving insufficient supervision for robust training and generalization (Wang et al., 2019a; Liu et al., 2021; Yu et al., 2024).

To tackle these challenges, we propose L2IR, an LLM-driven Latent Intent Revealing framework for graph fraud detection. Specifically, L2IR leverages LLMs to analyze the behavioral semantics of each user to infer the intent behind individual connections, and then incorporates this inferred intent information into node features, thereby helping to uncover deeply camouflaged fraudulent nodes (as illustrated in Fig. 1(c)). In addition, L2IR adopts an adaptive self-training mechanism to augment supervision with reliable signals from previous stages, thereby improving robustness under limited fraud labels. Extensive experiments on two real-world datasets demonstrate the effectiveness of L2IR in fraud detection, with clear gains under camouflage and limited supervision scenarios.

In summary, our main contributions are as follows:

• We introduce a novel LLM-driven perspective for graph fraud detection under camouflage by inferring the intent behind suspicious connections with LLMs, which effectively improves the reliability of graph fraud detection.

• We propose L2IR, an intent modeling framework for graph fraud detection, which jointly models behavior intent and connection intent to extract deep semantic features. It also integrates adaptive self-training to address the scarcity of fraud labels, which enables accurate detection of heavily camouflaged fraud even under limited supervision.

• We evaluate L2IR against nine representative baselines on two real-world datasets. The results demonstrate that L2IR achieves superior empirical performance and can also serve as a plug-in to improve existing GNN-based fraud detectors.

2. Related Works

2.1. Graph Neural Networks for Fraud Detection

GNNs provide a powerful framework for capturing fraudulent patterns by aggregating neighborhood information through message propagation mechanisms (Liu et al., 2018, 2019; Shi et al., 2022; Wang et al., 2020). Early works directly applied classic GNNs like Graph Convolutional Networks (GCN) (Kipf and Welling, 2016) and Graph Attention Networks (GAT) (Veličković et al., 2018a) to fraud detection. Subsequent works introduced various enhancements. GraphSAGE (Hamilton et al., 2017) introduces an inductive framework that learns to generate node embeddings by sampling and aggregating features from a node’s local neighborhood, allowing effective embedding creation for unseen nodes in dynamic or new graphs. RGTAN (Xiang et al., 2025) models transaction sequences as temporal graphs and uses gated attention to capture temporal fraud patterns. It also learns neighbor risk representations to detect multi-hop fraud structures. DiffGraph (Li et al., 2025) employs a latent diffusion paradigm to filter noise and captures relation transitions through a bidirectional diffusion process for better node representations. To combat label scarcity, some unsupervised methods employ mutual information maximization or contrastive learning (Veličković et al., 2018b; Zhu et al., 2020), while SemiGNN (Wang et al., 2019a) links labeled and unlabeled users via social relations and integrates heterogeneous data sources with hierarchical attention for fraud detection. However, these methods do not exploit the rich textual information and struggle to cope with the challenges posed by fraudsters deliberately camouflaging themselves as normal nodes (Hooi et al., 2016; Liu et al., 2021; Wang et al., 2019b).

2.2. Integrating LLMs with GNNs for Fraud Detection

In recent years, LLMs have demonstrated powerful language understanding (Minaee et al., 2024; Thirunavukarasu et al., 2023; Chang et al., 2024; Achiam et al., 2023), opening a new path for graph learning by exploiting the rich textual semantics of nodes (Wei et al., 2024; Tan et al., 2023; Zhao et al., 2022). For instance, one paradigm utilizes GNNs to enhance LLMs, where LLMs serve as the primary predictors. GraphGPT (Tang et al., 2024) employs graph instruction tuning to translate graph structures into LLM-compatible representations, while InstructGLM (Ye et al., 2024) relies on natural language prompts for the same purpose. In both cases, the goal is to enable direct graph learning with LLMs, eliminating the need for task-specific GNN fine-tuning. DGP (Li et al., 2026) proposes a dual granularity prompting framework retaining fine-grained text for target nodes while summarizing neighbors into coarse-grained prompts, reducing input size while preserving key fraud semantics. However, employing LLMs as the final predictor imposes significant computational resource demands and suffers from low inference efficiency.

Conversely, an alternative paradigm employs LLMs to enhance GNNs, with GNNs serving as the primary predictors. TAPE (He et al., 2024) pioneers this by using an LLM to generate explanations as augmented node features. TouchUp-G (Zhu et al., 2024) further enhances node features from pre-trained models through graph-centric finetuning to better align them with the graph structure. MLED (Huang et al., 2025) uses type-level and relation-level enhancers to integrate text knowledge with graph structure for better fraud-benign distinction. FLAG (Yang et al., 2025b) employs semantic similarity sampling to filter camouflaged neighbors and leverages LLMs to extract discriminative textual features, thereby improving fraud detection, particularly on text-rich graphs. Although LLM enhanced GNNs mitigate the high cost of the LLM-as-Predictor paradigm, they fail to explicitly capture the true intent behind camouflaged connections, leaving misleading links unrecognized and contributing to elevated false negatives. Moreover, these methods still struggle under label scarcity due to limited supervision. In contrast to prior work, L2IR leverages LLMs to infer the intent behind behaviors and connections, enhancing GNNs’ capacity to detect camouflaged fraud.

Figure 2. The overall framework of L2IR. It consists of three key modules: (b) Behavior Intent Profiling generates intent profiles; (c) Connection Intent Reasoning infers the intent behind each connection; (d) Adaptive Self-Training progressively augments training labels to handle label scarcity.

3. Preliminaries

We consider graph-based fraud detection on a heterogeneous graph $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathcal{R})$, where $\mathcal{V}$ is the set of $N$ nodes, $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V} \times \mathcal{R}$ is the set of typed edges, and $\mathcal{R}$ is the set of relation types. Each node $v$ is associated with a statistical feature vector $x_v \in \mathbb{R}^{d}$ derived from metadata, and a raw textual sequence $T_v$ representing its historical behavior traces (e.g., textual reviews, item ratings, and interaction timestamps). These records are key to uncovering semantic intent hidden behind camouflaged connections. The neighborhood of node $v$ under relation $r$ is denoted as $\mathcal{N}_r(v) = \{u \mid (u, v, r) \in \mathcal{E}\}$.
We further define the homogeneous projection $\mathcal{G}_{\text{homo}} = (\mathcal{V}, \mathcal{E}_{\text{homo}})$, where $\mathcal{E}_{\text{homo}} = \bigcup_{r \in \mathcal{R}} \{(u, v) \mid (u, v, r) \in \mathcal{E}\}$ collapses all specific relations (e.g., reviewing the same products or giving identical ratings) into an untyped edge set to represent basic structural connectivity.

We formulate our target scenario as a graph fraud detection task operating under annotation scarcity. Formally, let $\mathcal{V}_L \subset \mathcal{V}$ be a small set of labeled nodes with ground-truth labels $y_v \in \{0, 1\}$ (where $1$ denotes a fraudster and $0$ a benign entity), and $\mathcal{V}_U = \mathcal{V} \setminus \mathcal{V}_L$ be the remaining unlabeled nodes. In this setting, the model observes the complete topology $\mathcal{G}$ and all node statistical features $\mathcal{X} = \{x_v\}_{v \in \mathcal{V}}$ during training, but is supervised only by $\mathcal{V}_L$. Our objective is to learn a predictive function $f_\theta : \mathcal{V} \to \{0, 1\}$ that identifies hidden fraudsters while avoiding false positives on incidentally connected benign users.

4. Methodology

In this section, we present the technical details of the proposed L2IR framework. It integrates LLM-driven semantic reasoning with graph structural information by modeling both node behavior intent and edge connection intent. We first outline the overall architecture, and then provide a detailed description of the constituent modules.

4.1. Framework Overview

At a high level, L2IR uncovers camouflaged fraudsters by aligning graph structure with intent features. As illustrated in Figure 2, L2IR consists of three key modules:

1) Behavior Intent Profiling: it prompts an LLM to digest raw chronological behavior traces and retrieved exemplars, and yields node-level behavior intent representations ($h_v^{\text{node}}$) that complement statistical features with behavior semantic evidence.

2) Connection Intent Reasoning: this module applies a preliminary GNN to score nodes and flag suspicious connections, then prompts an LLM to cross-audit each one, reasoning about the true intent behind the interaction. This generates edge-level connection intent representations ($h_v^{\text{edge}}$) that help distinguish supportive connections from misleading ones.

3) Adaptive Self-Training: under severe supervision scarcity, this module deploys an iterative self-training loop tailored to fraud detection. Using asymmetric confidence thresholds, it progressively adds high-confidence pseudo-labels to the training set, preventing early-round errors from cascading into later ones.

Through this pipeline, semantic reasoning and structural learning reinforce each other to robustly identify camouflaged fraudsters.

4.2. Behavior Intent Profiling

To infer behavior intent from raw traces $T_v$, we prompt an LLM to analyze each target node against a group of retrieved exemplars of labeled fraud and benign nodes, allowing the model to recognize latent intent by identifying which behavioral patterns align with these exemplars.

Dynamic Exemplar Retrieval. To provide the LLM with references for behavior intent profiling, we dynamically retrieve a few exemplars from the labeled nodes for each target node $v$. Specifically, we compute the similarity between $v$ and each labeled node $v_l \in \mathcal{V}_L$ using a combined metric:

$$\operatorname{sim}(v, v_l) = \alpha \cdot \operatorname{sim}_{\text{text}}(q_v, q_{v_l}) + (1 - \alpha) \cdot \operatorname{sim}_{\text{inter}}(I_v, I_{v_l}), \tag{1}$$

where $q_v$ denotes the textual vector constructed from $T_v$, $I_v$ denotes the historically interacted item set of node $v$, $\operatorname{sim}_{\text{text}}$ is the cosine similarity between textual vectors, and $\operatorname{sim}_{\text{inter}}$ is the Jaccard index between item sets, where a higher value indicates greater overlap between the two nodes' interacted items. The coefficient $\alpha \in [0, 1]$ balances semantic content and behavioral patterns.

Based on the similarities, we construct an exemplar pool for target node $v$. For each class $c \in \{0, 1\}$, we rank nodes $v_l \in \mathcal{V}_L^{c} \setminus \{v\}$ according to the similarity score $\operatorname{sim}(v, v_l)$, and retrieve the top-$k$ most similar nodes to form the class-specific exemplar set $\mathcal{S}_v^{c}$. The integrated exemplar set $\mathcal{S}_v = \mathcal{S}_v^{0} \cup \mathcal{S}_v^{1}$ serves as a contrastive basis, which provides behaviorally grounded references for the LLM to effectively distinguish benign from fraudulent behavior intent.

Behavior Intent Profiling with LLM. We design a prompt to guide a pre-trained LLM, denoted by $\mathcal{M}_{\text{LLM}}$, in reasoning over the target node’s behavior traces $T_v$ and the labeled exemplars $\mathcal{S}_v$, yielding a structured profile $b_v = \mathcal{M}_{\text{LLM}}(\texttt{Prompt}_{\text{profile}}(v))$. As shown in Table 1, the prompt consists of four components: Role, Exemplars, Target Node and Output.
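The exemplar retrieval of Eq. (1) can be illustrated with a small pure-Python sketch. It is only a sketch under stated assumptions, not the authors' implementation: the function names, the dictionary layout of a node (`id`, `q`, `items`, `label`), and the default values of `alpha` and `k` are all hypothetical, and the textual vectors are assumed to be already computed.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(s, t):
    """Jaccard index between two item sets (0.0 if both are empty)."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def combined_sim(target, other, alpha):
    """Eq. (1): alpha * sim_text + (1 - alpha) * sim_inter."""
    return (alpha * cosine(target["q"], other["q"])
            + (1 - alpha) * jaccard(target["items"], other["items"]))


def retrieve_exemplars(target, labeled_nodes, k=2, alpha=0.5):
    """Return {class: [ids of the top-k most similar labeled nodes]} for classes 0 and 1.

    The node layout (keys 'id', 'q', 'items', 'label') is a hypothetical choice
    made for this sketch.
    """
    pools = {0: [], 1: []}
    for node in labeled_nodes:
        if node["id"] == target["id"]:
            continue  # the target itself is excluded from its exemplar pool
        pools[node["label"]].append((combined_sim(target, node, alpha), node["id"]))
    # Sort each class pool by descending similarity and keep the first k ids.
    return {
        c: [node_id for _, node_id in sorted(pool, key=lambda p: -p[0])[:k]]
        for c, pool in pools.items()
    }
```

Calling `retrieve_exemplars(target, labeled_nodes, k=1)` returns one benign and one fraud exemplar id, which corresponds to the class-specific sets $\mathcal{S}_v^{0}$ and $\mathcal{S}_v^{1}$ described above.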
The Role anchors the LLM in the fraud detection domain, reducing irrelevant reasoning. The Exemplars provides $k$ pairs of fraud-benign cases with consistent metadata, offering contrastive behavioral references that guide the LLM toward patterns relevant to intent inference. The Target Node supplies the node’s statistical data, graph relation context, and chronological behavior traces, providing the LLM with concrete behavioral evidence to analyze. The Output enforces a structured schema progressing from profile summary to overall assessment, making the resulting profile $b_v$ suitable for downstream encoding. The complete template for behavior intent profiling is provided in Appendix A.

Table 1. Core format of prompt for Behavior Intent Profiling.

[Role] Domain-expert persona: senior fraud detection analyst specializing in review behavior and relation camouflage. The core task is to infer the target node’s behavior intent by strictly following all provided constraints and analytical requirements.
[Exemplars] > >
[Target Node] Node ID: > Total reviews: > Avg. rating: >
    Graph Relation Context: [Neighbor Metadata | Behavior Similarities | Risk Distribution]
    Chronological Behavior Traces: [Product | Star Rating | Text Content | Helpfulness Score]
[Output] 1) Node Profile Summary → 2) Behavior Pattern Analysis → 3) Fraud Signal Analysis → 4) Overall Assessment

Semantic Encoding. To integrate the captured behavior intent into node features, we encode the structured output $b_v$ into a dense vector using a frozen pre-trained language encoder $\mathcal{M}_{\text{Enc}}$:

$$h_v^{\text{node}} = \mathcal{M}_{\text{Enc}}(b_v) \in \mathbb{R}^{d_s}. \tag{2}$$
The resulting embedding $h_v^{\text{node}}$ complements the statistical features $x_v$ as a node-level semantic prior, integrating behavior intent before neighborhood aggregation and equipping the downstream module with richer evidence for connection intent reasoning.

4.3. Connection Intent Reasoning

To reason about the connection intent in the graph structure, we first apply a preliminary GNN to score nodes and flag suspicious connections, then prompt an LLM to cross-audit each one by reasoning over both endpoints’ interaction histories, allowing the model to identify intent signals that distinguish supportive connections.

Suspicious Connection Detection. We train a preliminary GNN on the enhanced features $[x_v \,\|\, h_v^{\text{node}}]$ to produce fraud risk scores for all nodes. To enhance the robustness of scoring, we split the training set into $K$ distinct folds and train a separate model for each fold, holding that fold out for validation. For labeled nodes $\mathcal{V}_L$, we adopt an out-of-fold (OOF) strategy. Specifically, each node falls into exactly one validation fold and is scored exclusively by that fold’s model; this prevents the bias from the remaining $K-1$ models that observed its label during training, thereby yielding an unbiased prediction $\hat{p}_v^{\text{oof}}$. For unlabeled nodes $\mathcal{V}_U$, since no ground-truth label is available, all $K$ fold models provide unbiased predictions; we therefore denote by $\hat{p}_v^{(k)}$ the prediction from the $k$-th fold model, and average these $K$ predictions to improve stability. The final risk score $Z_v$ is defined as:

$$Z_v = \begin{cases} \hat{p}_v^{\text{oof}}, & v \in \mathcal{V}_L \\ \dfrac{1}{K} \displaystyle\sum_{k=1}^{K} \hat{p}_v^{(k)}, & v \in \mathcal{V}_U \end{cases} \tag{3}$$

Table 2. Core format of prompt for Connection Intent Reasoning.

[Role] Senior fraud audit analyst specializing in review graphs and relation camouflage. Node $v$ / $u$ roles (Suspected Fraud Node or Suspected Benign Node) are dynamically assigned based on their relative preliminary risk scores $Z$, with the higher-scored node explicitly designated as the suspected side to analyze intent and assess evidence.
[Target Connection] Connection Metadata: [Node IDs & Roles | Risk scores: $\langle Z_v \rangle$ / $\langle Z_u \rangle$ | Contradictory Magnitude: $\langle c_{uv} \rangle$]
    Chronological Interaction History ($v$ & $u$): [Product | Date | Star Rating | Text Content | Helpfulness Score]
[Output] 1) Connection Overview (risk scores, suspicious, fraud signals) → 2) Behavior Difference ($v$ / $u$ rating and review divergence) → 3) Connection Intent Analysis → 4) Counter Evidence and Uncertainty → 5) Risk Verdict (Low/Med/High + confidence + key evidence)

Given the node risk scores $Z_v$, we partition the nodes into a suspected fraud set $\mathcal{V}_{+}$ and a suspected benign set $\mathcal{V}_{-}$ using dual thresholds $\tau_h$ and $\tau_l$ ($\tau_h > \tau_l$):

$$\mathcal{V}_{+} = \{v \in \mathcal{V} \mid Z_v > \tau_h\}, \qquad \mathcal{V}_{-} = \{v \in \mathcal{V} \mid Z_v < \tau_l\}. \tag{4}$$

Based on this partition, we identify all edges from $\mathcal{E}_{\text{homo}}$ whose two endpoints belong to different node sets (one in $\mathcal{V}_{+}$ and the other in $\mathcal{V}_{-}$) and define them as the contradictory edge set, denoted by $\mathcal{E}_{\text{contra}}$:

$$\mathcal{E}_{\text{contra}} = \mathcal{E}_{\text{homo}} \cap \Big( (\mathcal{V}_{+} \times \mathcal{V}_{-}) \cup (\mathcal{V}_{-} \times \mathcal{V}_{+}) \Big). \tag{5}$$
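As a rough illustration of Eqs. (4) and (5), the dual-threshold partition and the contradictory edge set can be written in a few lines of Python. This is a sketch under assumptions rather than the paper's code: the function names are hypothetical, risk scores are assumed to be given as a dictionary from node id to $Z_v$, and the homogeneous edge set is assumed to be a list of `(u, v)` pairs.

```python
def partition_nodes(risk, tau_h, tau_l):
    """Eq. (4): split nodes into suspected fraud (Z > tau_h) and suspected benign (Z < tau_l).

    Nodes whose score lies in [tau_l, tau_h] belong to neither set.
    """
    if tau_h <= tau_l:
        raise ValueError("tau_h must be strictly greater than tau_l")
    suspected_fraud = {v for v, z in risk.items() if z > tau_h}
    suspected_benign = {v for v, z in risk.items() if z < tau_l}
    return suspected_fraud, suspected_benign


def contradictory_edges(edges_homo, suspected_fraud, suspected_benign):
    """Eq. (5): keep edges with one endpoint in each of the two sets, in either order."""
    kept = []
    for u, v in edges_homo:
        fraud_to_benign = u in suspected_fraud and v in suspected_benign
        benign_to_fraud = u in suspected_benign and v in suspected_fraud
        if fraud_to_benign or benign_to_fraud:
            kept.append((u, v))
    return kept
```

For example, with scores `{"a": 0.9, "b": 0.1, "c": 0.5}`, thresholds `tau_h=0.8` and `tau_l=0.2`, and edges `[("a", "b"), ("a", "c")]`, only `("a", "b")` is contradictory, because `c` falls between the two thresholds and is assigned to neither set.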
Next, we compute the contradictory magnitude $c_{uv} = |Z_u - Z_v|$ for each contradictory edge and retain the top-$s$ edges to form $\mathcal{E}_{\text{suspicious}}$. The budget $s$ strictly bounds the LLM computational cost, focusing our reasoning directly on the deceptive boundaries that trigger false positives/negatives during GNN message passing.

LLM Cross-Audit. For each suspicious edge $(u, v) \in \mathcal{E}_{\text{suspicious}}$, we design a prompt incorporating the interaction histories of both endpoints alongside their risk scores $Z_u$, $Z_v$ and contradictory magnitude $c_{uv}$. As shown in Table 2, the prompt consists of three components: Role, Target Connection and Output. The Role is dynamically assigned based on relative risk scores, so that the LLM can focus its reasoning on the right target. The Target Connection provides both the quantitative risk context and the chronological interaction histories, grounding the audit in structural evidence. The Output progresses from connection overview to a final risk verdict, ensuring the reasoning is systematic and the resulting report $r_{uv}$ captures the assessed connection intent. The complete template is provided in Appendix A.

Connection Intent Encoding. We encode the audit report $r_{uv}$ into a continuous representation using the language encoder $\mathcal{M}_{\text{Enc}}$:

$$e_{uv} = \mathcal{M}_{\text{Enc}}(r_{uv}) \in \mathbb{R}^{d_s}. \tag{6}$$

This maps the LLM’s edge-level reasoning into the GNN’s embedding space, enabling downstream aggregation to suppress deceptive message propagation.

Feature Fusion. To incorporate the edge-level intent embeddings $e_{uv}$ into node-level features, we perform mean pooling over all suspicious edges connected to $v$:

$$h_v^{\text{edge}} = \begin{cases} \dfrac{1}{|\mathcal{N}_{\text{suspicious}}(v)|} \displaystyle\sum_{u \in \mathcal{N}_{\text{suspicious}}(v)} e_{uv}, & \text{if } \mathcal{N}_{\text{suspicious}}(v) \neq \emptyset \\ \mathbf{0}, & \text{otherwise} \end{cases} \tag{7}$$

where $\mathcal{N}_{\text{suspicious}}(v) = \{u \mid (u, v) \in \mathcal{E}_{\text{suspicious}}\}$. This mean pooling distills the relational intent from suspicious connections, while for nodes without suspicious edges, we set $h_v^{\text{edge}} = \mathbf{0}$ to avoid introducing noise. Finally, we fuse the multi-view signals via concatenation:

$$H_v = [x_v \,\|\, h_v^{\text{node}} \,\|\, h_v^{\text{edge}}] \in \mathbb{R}^{d + 2 d_s}. \tag{8}$$

The three components $x_v$, $h_v^{\text{node}}$, and $h_v^{\text{edge}}$ together cover statistical features, node-level behavior intent, and edge-level connection intent, forming a comprehensive node representation.

4.4. Adaptive Self-Training

To address supervision scarcity under class imbalance, we deploy a self-training loop that progressively expands the training set with high-confidence pseudo-labels. At each round $t$, we first re-initialize the GNN, then train it on the current labeled set $\mathcal{V}_L^{(t)}$ with the enriched features $H_v$, and use it to generate pseudo-labels for the next round.

Re-Initialization of GNN. At the start of the $t$-th training round, the GNN is re-initialized from scratch. Early models trained on limited labels tend to misclassify camouflaged fraudsters as benign due to structural camouflage. If the model parameters $\theta$ are carried over iteratively, these biases can propagate through multi-hop message passing and compound across rounds. Re-initializing at each round forces the model to relearn from the updated labeled set, allowing it to progressively correct earlier misclassifications as more reliable pseudo-labels are incorporated.

GNN Training. With freshly initialized parameters, the GNN is trained on $\mathcal{G}_{\text{homo}}$ using the enriched features $H_v$ and the current labeled set $\mathcal{V}_L^{(t)}$. The model minimizes the binary cross-entropy loss:

$$\mathcal{L} = -\frac{1}{|\mathcal{V}_L^{(t)}|} \sum_{v \in \mathcal{V}_L^{(t)}} \left[ y_v \log \hat{p}_v + (1 - y_v) \log(1 - \hat{p}_v) \right], \tag{9}$$

where $\hat{p}_v$ is the fraud probability for node $v$ predicted by the current model. Once training converges, the model produces $\hat{p}_v$ for all nodes in $\mathcal{V}_U$ for pseudo-label generation.

Algorithm 1. The overall procedure of L2IR framework.

Input: Heterogeneous graph $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathcal{R})$, statistical features $\mathcal{X}$, textual traces $\{T_v\}_{v \in \mathcal{V}}$, labeled set $\mathcal{V}_L$, risk thresholds $\tau_h, \tau_l$, confidence thresholds $\tau_{\text{fraud}}, \tau_{\text{benign}}$, maximum self-training rounds $T$
Output: Trained model $f_\theta$
Behavior Intent Profiling:
for each node $v \in \mathcal{V}$ do
    Retrieve exemplars $\mathcal{S}_v$ via textual and interaction similarity
    Prompt $\mathcal{M}_{\text{LLM}}$ with $T_v$ and $\mathcal{S}_v$ to generate profile $b_v$
    Encode $b_v$ to semantic embedding $h_v^{\text{node}} \leftarrow \mathcal{M}_{\text{Enc}}(b_v)$
end for
Connection Intent Reasoning:
Train preliminary GNN on $[x_v \,\|\, h_v^{\text{node}}]$ to get risk scores $Z_v$
Partition nodes into $\mathcal{V}_{+}$ and $\mathcal{V}_{-}$ via thresholds $\tau_h, \tau_l$
Retain top-$s$
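The bookkeeping of one self-training round can be sketched as follows. The loss follows Eq. (9), but the pseudo-label rule is an assumption made for illustration: the text names the confidence thresholds $\tau_{\text{fraud}}$ and $\tau_{\text{benign}}$ without spelling out how they are applied, so the sketch assumes that an unlabeled node becomes a pseudo-fraud node when its predicted probability is at least $\tau_{\text{fraud}}$ and a pseudo-benign node when it is at most $\tau_{\text{benign}}$. All function names are hypothetical, and GNN re-initialization and training are outside the scope of the sketch; predicted probabilities are taken as given.

```python
import math


def bce_loss(labels, probs, eps=1e-12):
    """Eq. (9): mean binary cross-entropy over the current labeled set.

    labels: {node: 0 or 1}; probs: {node: predicted fraud probability}.
    Probabilities are clamped to [eps, 1 - eps] to keep the logarithms finite.
    """
    total = 0.0
    for v, y in labels.items():
        p = min(max(probs[v], eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


def select_pseudo_labels(probs_unlabeled, tau_fraud, tau_benign):
    """ASSUMED rule: p >= tau_fraud -> label 1, p <= tau_benign -> label 0, else skip."""
    pseudo = {}
    for v, p in probs_unlabeled.items():
        if p >= tau_fraud:
            pseudo[v] = 1
        elif p <= tau_benign:
            pseudo[v] = 0
    return pseudo


def expand_labeled_set(labeled, probs_unlabeled, tau_fraud, tau_benign):
    """Return the labeled set for the next round; existing labels are never overwritten."""
    expanded = dict(select_pseudo_labels(probs_unlabeled, tau_fraud, tau_benign))
    expanded.update(labeled)
    return expanded
```

With `tau_fraud=0.95` and `tau_benign=0.05`, a node scored 0.97 is added as fraud, a node scored 0.02 is added as benign, and a node scored 0.5 stays unlabeled for the next round.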
