Abstracting Cross-Domain Action Sequences into Interpretable Workflows
Digital applications generate massive amounts of interaction logs: records of every click, scroll, and keystroke a user makes. These logs are objective, but they are often too noisy and granular to reveal the intent behind a user's behavior. This paper introduces WorkflowView, a framework that uses Large Language Models (LLMs) to transform raw, low-level logs into high-level, human-interpretable activities. By moving from isolated events to meaningful workflows, the researchers aim to help developers understand how people use their products and identify where improvements are needed.
How WorkflowView Works
The framework uses a hierarchical, multi-layer approach to process data. Instead of trying to interpret raw logs in one step, WorkflowView breaks the process down:
Layer 1: Converts raw, timestamped UI events into a natural language description of the actions.
Layer 2: Infers the high-level activity (e.g., "reviewing comments") from those descriptions.
Layer 3 (Optional): Classifies these activities into task-specific categories, for example whether a student is likely to drop out of a course.
This modular design allows the system to "denoise" data, filtering out irrelevant clicks or interface explorations to focus on the user's core goal.
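The layered flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `UIEvent` fields, the prompt wording, the function names, and the `fake_llm` stand-in are all hypothetical, and a real deployment would replace `fake_llm` with a call to an actual model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical interface: any function that maps a prompt string to model text.
LLM = Callable[[str], str]


@dataclass
class UIEvent:
    """One raw log record (field names are assumed for illustration)."""
    timestamp: str
    action: str
    target: str


def describe_events(events: List[UIEvent], llm: LLM) -> str:
    """Layer 1: turn timestamped UI events into a natural language description."""
    lines = "\n".join(f"{e.timestamp} {e.action} {e.target}" for e in events)
    return llm("Describe these UI events in plain language:\n" + lines)


def infer_activity(description: str, llm: LLM) -> str:
    """Layer 2: infer the high-level activity from the description."""
    return llm("Name the high-level activity behind these actions:\n" + description)


def classify_activity(activity: str, labels: List[str], llm: LLM) -> Optional[str]:
    """Layer 3 (optional): map the activity to one of the given labels."""
    answer = llm(f"Choose one label from {labels} for this activity:\n{activity}").strip()
    # Reject answers outside the label set instead of trusting the model blindly.
    return answer if answer in labels else None


def run_pipeline(events: List[UIEvent], llm: LLM,
                 labels: Optional[List[str]] = None) -> Dict[str, Optional[str]]:
    """Run Layers 1 and 2, plus Layer 3 when labels are supplied."""
    description = describe_events(events, llm)
    activity = infer_activity(description, llm)
    result: Dict[str, Optional[str]] = {"description": description, "activity": activity}
    if labels:
        result["label"] = classify_activity(activity, labels, llm)
    return result


def fake_llm(prompt: str) -> str:
    """Canned responses standing in for a real model (illustration only)."""
    if prompt.startswith("Describe"):
        return "The user opened a document and scrolled through its comments."
    if prompt.startswith("Name"):
        return "reviewing comments"
    return "review"


events = [
    UIEvent("10:00:01", "click", "Open document"),
    UIEvent("10:00:05", "scroll", "Comments pane"),
]
result = run_pipeline(events, fake_llm, labels=["review", "edit"])
print(result["activity"], result["label"])
```

Keeping each layer as a separate function means the intermediate description and activity can be inspected on their own, which is what makes the output interpretable rather than a single opaque prediction.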
Versatility Across Domains
The researchers evaluated WorkflowView in three distinct scenarios to demonstrate its flexibility:
Web Browsing: The model reconstructed user tasks from browser logs, achieving a semantic similarity score of 0.91.
Education: By analyzing interaction logs from Massive Open Online Courses (MOOCs), the model predicted student dropout with a weighted F1 score of 0.90, using only five examples as few-shot guidance.
Document Workflows: The framework was used to analyze how users integrate AI tools into their work in Microsoft Word, successfully categorizing complex behaviors like collaborative editing and document formatting.
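For readers unfamiliar with the weighted F1 score reported for the dropout task, the sketch below shows how the metric is commonly computed: a per-class F1, averaged with each class weighted by its share of the true labels. The toy labels are invented for illustration and are not data from the paper, and the summary does not state which implementation the authors used.

```python
from collections import Counter
from typing import List


def weighted_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Per-class F1, averaged with weights equal to each class's true support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label in support:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += f1 * support[label] / total
    return score


# Toy example (made-up labels): three students who stay, one who drops out.
y_true = ["stay", "stay", "stay", "drop"]
y_pred = ["stay", "stay", "drop", "drop"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Weighting by support matters for dropout prediction because the classes are often imbalanced; an unweighted average would give a small class the same influence as a large one.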
Key Advantages and Considerations
WorkflowView offers a significant advantage over traditional methods, which often require extensive, task-specific training data and manual labeling. Because it leverages the broad knowledge already encoded in LLMs, it can perform well in "zero-shot" or "few-shot" settings, meaning it can adapt to new tasks with few or no labeled examples and without task-specific training.
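To make the zero-shot versus few-shot distinction concrete, here is a minimal sketch of how labeled examples might be placed into a prompt. The `Log:`/`Label:` layout, the function name, and the sample logs are assumptions made for illustration; the paper's actual prompt format is not given in this summary.

```python
from typing import List, Tuple


def build_few_shot_prompt(instruction: str,
                          examples: List[Tuple[str, str]],
                          query: str) -> str:
    """Assemble a prompt from an instruction, labeled examples, and a new log.

    An empty `examples` list yields a zero-shot prompt.
    """
    parts = [instruction, ""]
    for log_text, label in examples:
        parts += [f"Log: {log_text}", f"Label: {label}", ""]
    # The final block leaves the label blank for the model to complete.
    parts += [f"Log: {query}", "Label:"]
    return "\n".join(parts)


instruction = "Classify whether the student is likely to drop out."
# Hypothetical labeled examples; not taken from the paper's dataset.
examples = [
    ("watched 9 of 10 lectures, submitted all quizzes", "stay"),
    ("no logins for three weeks", "drop"),
]
query = "watched 1 lecture, skipped two quizzes"

few_shot = build_few_shot_prompt(instruction, examples, query)
zero_shot = build_few_shot_prompt(instruction, [], query)
print(few_shot)
```

The only difference between the two settings is the number of labeled examples in the prompt; the model's weights are not updated in either case.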
However, the authors note that deploying this technology requires careful consideration. Because the framework relies on LLM-based inference, developers must account for computational costs, latency, and, most importantly, user privacy. The researchers emphasize that their approach is designed to provide aggregated, anonymized insights, ensuring that while product teams can learn from user workflows, individual user privacy remains protected.