Paper Abstract

Sequential or time-stamped interaction logs provide objective records of digital application usage, yet their granularity and noise often obscure meaningful insights into people's work. Such insights are essential for improving digital products in ways grounded in real-world user interactions. Prior research has applied deep learning models to cluster user actions into high-level activities, but these approaches are highly sensitive to noise and struggle to generalize across applications. To address this limitation, we introduce WorkflowView, a framework that uses large language models (LLMs) to abstract low-level action sequences into high-level activities. We establish the effectiveness and generality of our approach across three distinct, challenging sequential tasks and diverse domains: (a) zero-shot task description reconstruction from browser logs (achieving high semantic similarity, $\mu_{sim} = 0.91$), (b) few-shot student dropout prediction using MOOC interaction logs (reaching weighted $F_1 = 0.90$ with only five few-shot examples), and (c) anonymized, privacy-preserving analysis of AI tool integration within document workflows in Microsoft Word. Our work demonstrates that LLM-based abstraction is a robust and efficient path forward for transforming low-level behavioral data into high-level, interpretable, and actionable insights. We also discuss practical considerations for deploying LLM-based inferences within logging infrastructures, including computational efficiency and user privacy.

Abstracting Cross-Domain Action Sequences into Interpretable Workflows
Digital applications generate massive amounts of interaction logs: records of every click, scroll, and keystroke a user makes. While these logs are objective, they are often too noisy and granular to reveal the intent behind a user's behavior. This paper introduces WorkflowView, a framework that uses large language models (LLMs) to transform these raw, low-level logs into high-level, human-interpretable activities. By moving from individual events to meaningful workflows, the researchers aim to help developers understand how people use their products and identify where improvements are needed.

How WorkflowView Works

The framework uses a hierarchical, multi-layer approach to process data. Instead of trying to interpret raw logs in one step, WorkflowView breaks the process down:

  • Layer 1: Converts raw, timestamped UI events into a natural language description of the actions.

  • Layer 2: Infers the high-level activity (e.g., "reviewing comments") from those descriptions.

  • Layer 3 (Optional): Categorizes these activities into specific classes, such as predicting whether a student is likely to drop out of a course.

This modular design allows the system to "denoise" data, filtering out irrelevant clicks or interface explorations to focus on the user's core goal.
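
The three layers above can be sketched as a chain of small functions. This is a minimal illustration in Python, not the authors' implementation: the `UIEvent` fields, the prompt wording, the function names, and the `fake_llm` stand-in are all assumptions made so the sketch runs offline. A real deployment would replace `fake_llm` with an actual model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical event record; the field names are illustrative assumptions.
@dataclass
class UIEvent:
    timestamp: float  # seconds since the session started
    action: str       # e.g. "click", "type"
    target: str       # e.g. "Comments pane"

# An LLM is modeled as any function from prompt text to completion text.
LLM = Callable[[str], str]

def describe_actions(events: List[UIEvent], llm: LLM) -> str:
    """Layer 1: turn timestamped UI events into a natural language description."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    lines = [f"{e.timestamp:.1f}s {e.action} {e.target}" for e in ordered]
    return llm("Describe these UI actions in plain language:\n" + "\n".join(lines))

def infer_activity(description: str, llm: LLM) -> str:
    """Layer 2: infer the high-level activity from the description."""
    return llm("Name the high-level activity for this description:\n" + description)

def classify_activity(activity: str, labels: List[str], llm: LLM) -> str:
    """Layer 3 (optional): map the activity onto one of a fixed set of classes."""
    answer = llm(f"Choose one label from {labels} for this activity:\n{activity}").strip()
    return answer if answer in labels else "unknown"

def run_pipeline(events: List[UIEvent], llm: LLM,
                 labels: Optional[List[str]] = None) -> Dict[str, Optional[str]]:
    description = describe_actions(events, llm)
    activity = infer_activity(description, llm)
    label = classify_activity(activity, labels, llm) if labels else None
    return {"description": description, "activity": activity, "label": label}

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, keyed on the prompt prefixes used above."""
    if prompt.startswith("Describe"):
        return "The user opened the comments pane and read a comment."
    if prompt.startswith("Name"):
        return "reviewing comments"
    return "review"

events = [UIEvent(2.0, "click", "Comment 1"), UIEvent(0.5, "click", "Comments pane")]
result = run_pipeline(events, fake_llm, labels=["review", "authoring"])
```

Keeping each layer as a separate function means the optional third layer can be skipped by omitting `labels`, and each intermediate output stays available for inspection.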

Versatility Across Domains

The researchers evaluated WorkflowView in three distinct scenarios to assess its generality:

  • Web Browsing: In a zero-shot setting, the framework reconstructed task descriptions from browser logs, achieving a semantic similarity score of $\mu_{sim} = 0.91$.

  • Education: Using interaction logs from Massive Open Online Courses (MOOCs), the framework predicted student dropout with a weighted F1 score of 0.90, given only five few-shot examples.

  • Document Workflows: The framework supported an anonymized, privacy-preserving analysis of how users integrate AI tools into document workflows in Microsoft Word, categorizing behaviors such as collaborative editing and document formatting.
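
The weighted F1 score reported for the dropout task is not defined on this page. Assuming it follows the common definition (per-class F1, averaged with weights proportional to each class's number of true instances), a minimal Python sketch looks like the following. The labels are toy values chosen for illustration, not data from the paper.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    support = Counter(y_true)  # number of true instances per class
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1  # weight each class by its support
    return total / len(y_true)

# Toy example: 8 students who stayed, 2 who dropped out.
y_true = ["stay"] * 8 + ["dropout"] * 2
y_pred = ["stay"] * 7 + ["dropout"] + ["dropout", "stay"]
score = weighted_f1(y_true, y_pred)
```

Here the "stay" class has F1 = 0.875 and the "dropout" class has F1 = 0.5, so the support-weighted average is (8 × 0.875 + 2 × 0.5) / 10 = 0.8. Weighting by support matters for dropout data because the two classes are rarely balanced.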

Key Advantages and Considerations

Traditional methods often require extensive task-specific training data and manual labeling. Because WorkflowView leverages the broad knowledge already encoded in LLMs, it can perform well in "zero-shot" or "few-shot" settings, meaning it can adapt to new tasks from instructions alone or from a handful of labeled examples.
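
The difference between the zero-shot and few-shot settings can be made concrete with a small prompt builder. This is a hedged sketch: the `build_prompt` function, the "Log:" and "Label:" layout, and the example texts are assumptions for illustration, and the page does not show the prompts the authors actually used.

```python
def build_prompt(task, query, examples=()):
    """Assemble a prompt; an empty `examples` sequence yields a zero-shot prompt."""
    parts = [task.strip()]
    # Few-shot: each labeled example is shown to the model before the query.
    for log_summary, label in examples:
        parts.append(f"Log: {log_summary}\nLabel: {label}")
    # The query is left unlabeled so the model completes the final "Label:".
    parts.append(f"Log: {query}\nLabel:")
    return "\n\n".join(parts)

task = "Predict whether the student will drop out. Answer 'dropout' or 'stay'."
query = "Watched one video in week 3, no quiz attempts."

zero_shot = build_prompt(task, query)
few_shot = build_prompt(task, query, examples=[
    ("Completed all quizzes in weeks 1 to 3.", "stay"),
    ("No activity after week 1.", "dropout"),
])
```

No model weights change in either case; the labeled examples only extend the prompt, which is why a handful of them can be enough.
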
However, the authors note that deploying this technology requires careful consideration. Because the framework relies on LLM-based inference, developers must account for computational costs, latency, and, most importantly, user privacy. The researchers emphasize that their approach is designed to provide aggregated, anonymized insights, so that product teams can learn from user workflows while individual user privacy remains protected.
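
The page does not describe how the aggregated, anonymized insights are produced. As one common pattern, and explicitly not the authors' mechanism, inferred activities can be reported only as counts of distinct users, with rare activities suppressed. The `aggregate_activities` function and the `min_users` threshold below are hypothetical names and values.

```python
from collections import defaultdict

def aggregate_activities(records, min_users=5):
    """Count distinct users per activity and drop activities seen for
    fewer than `min_users` users. The output contains no user identifiers."""
    users_by_activity = defaultdict(set)
    for user_id, activity in records:
        users_by_activity[activity].add(user_id)
    return {
        activity: len(users)
        for activity, users in users_by_activity.items()
        if len(users) >= min_users
    }

# Toy records of (user_id, inferred activity); not data from the paper.
records = [(f"u{i}", "reviewing comments") for i in range(6)]
records += [("u1", "document formatting"), ("u1", "reviewing comments")]
summary = aggregate_activities(records)
```

In this toy run, "document formatting" is suppressed because only one user performed it, and the repeated record for `u1` does not inflate the count for "reviewing comments".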
