
Researching contextual AI frameworks to evaluate user-AI interactions and support better creative outcomes

My Role

UX Researcher

Team

1 PhD Lead
4 UX Researchers

Skills

Interaction Design
Prototyping
UX Research

Timeline

10 months
Dec. 2023 - Sep. 2024

01 - Solution

Design smarter: An adaptive plugin that guides, challenges, and strengthens your creative problem solving

POV: Knee-deep in brainstorming novel solutions? Imagine how our AI-powered plugin can leave your idea with a sharper value prop, more resilience against edge cases, and more!

Scoping the Problem

Engage with different perspectives by generating reflective questions.


Consolidate your line of reasoning by generating a root cause of the problem.


Researching the Space

Get a focused snapshot of how competitors in the market stack up on each key factor.


Uncover more market players and factors by expanding your table with AI-suggested competitors and dimensions.

02 - Background

AI tools are major disrupters in creative work

Cursor AI
Adobe Generative Fill

It's true. They're not just accelerating productivity, but redefining how ideas are produced and refined altogether. Embedding these features directly into the workflow can reduce friction and spark new directions for creative problem-solving.

The Current Space

Although these features are changing the game…

They also raise questions about their long-term usefulness.

Questions about AI tools

Today’s first pick might not be the same a year from now. Take image generation, for example: first it was DALL-E, then Midjourney took over, and now Adobe Firefly is built right into the Adobe Suite.

Tools come and go fast, and creatives have learned to be more selective with their toolkit. Adopting a new tool can be a significant decision, especially when it could mean changing habits, workflows, or sometimes even their own creative voice.

Identifying a Gap

Speed and seamlessness aren’t enough to secure users' trust

Criticisms about creative AI tools

If AI products in this space want to secure a foothold, they have to help people become more creative instead of offering a shortcut around it. For this reason, it's risky to be concerned with only AI performance or user satisfaction.

While so many evaluations focus on outputs, fewer study how the structure of user-AI interaction shapes the experience in terms of cognitive engagement and creative depth.

Research Question

How does the positioning of AI within a creative workflow influence creative outcomes, cognitive effort, and how users perceive their own agency and the value of the AI?


Our Goals

To understand how the framing and placement of AI support shapes users’ creative thinking.

To evaluate whether different forms of AI positioning can help users produce more creative results.

03 - Design

Redesigning the original plugin prototype

In reviewing the PhD lead’s earlier version of the plugin, we identified two major limitations in how it integrated AI into the problem-solving workflow:

Lacks user guidance

Users got stuck on how to reflect on AI-generated responses and extract key insights.

Not immediately usable

Users found AI-generated responses to be long-winded, inaccurate, repetitive, and hard to leverage.

To serve our research question, we needed the interactions to keep users thinking so we could focus on how the tool affected them. Here are the major adjustments we made to the interface:

Interactions with AI should be easy to comprehend and actionable.

Our approach to actionable, in-situ template insights.

Original outputs were long and repetitive, which left users skimming instead of thinking.

We restructured responses into shorter insights embedded directly in the template so we could see how users actually applied them rather than just skimming them.


The plugin should guide users to keep thinking and exploring across sections without dictating exactly what to do next.

A snippet of additional guidance screens we added.

Early participants often lost momentum because they weren’t sure what to do next.

We added lightweight guidance screens to provide context to each interactive element in the template, keeping flow without dictating what to do next.


Keep users focused on their main goals by ensuring every interaction directly supports their progress.

Our approach to keeping the process seamless using copywriting.

There was a feature for users to mark AI responses as “valid” or “invalid.” In practice, it was ignored because it felt like busywork.

For AI conditions, we replaced it with subtle language nudges (e.g. “Review and edit if needed”) so reflection happened naturally within the task itself.

Interested in seeing my full design rationale?

Design Decisions

Interactions with AI should be easy to comprehend and actionable.


The initial plugin used AI-generated Q&As to prompt reflection. However, previous participants felt they were repetitive summaries that offered no specific angles or concrete ways to explore a competitor further.

Before picture of old plugin, showing AI generation features for the Competitive Analysis exercise. Before and after picture of generated insights.

We found that users struggled to act on AI insights because they had to sift through long, repetitive text before rewriting their own takeaways in the template.

To close that gap, we shifted to concise, targeted insights embedded directly in the template—letting users cut straight to the action.

For us as researchers, this also made it easier to observe how they synthesized and applied AI input, rather than just skimming it.

Design Decisions

The plugin should guide users to keep thinking and exploring across sections without dictating exactly what to do next.


As mentioned previously, there was no smooth handoff between reading AI insights and applying them. Users were often left unsure of what to do next, breaking their momentum instead of advancing their analysis.

Before picture showing the gap between interacting with the plugin and the template. Before and after: we added more guidance screens.

So we decided to expand the interface to include guidance screens for every interactive element in the table, not just competitor headers or add-buttons.

These screens provide light context on what each element is, how users can interact with it, and what role it plays in the overall exercise, keeping exploration fluid without defining a fixed path.

Design Decisions

Keep users focused on their main goals by ensuring every interaction directly supports their progress.


The previous researchers included a feature for users to mark AI-generated content as “valid” or “invalid.” In practice, most skipped it—seeing it as an extra task that broke their flow and pulled focus away from research and synthesis.

Before picture of old feature 'Actions to Take'. After picture of copywriting.

We decided not to move forward with this feature and instead embedded simple, contextual language like “Review and edit if needed” or “This may or may not reflect your own analysis.”

This subtly encourages reflection without pulling users out of the task.

Rejected Explorations

Possible actions to keep the flow going

We explored adding an idle state to the plugin that suggested possible actions.

Something we considered was creating an idle state that suggested next steps for exploration. Because the exercise is so non-linear, we wanted to prevent decision paralysis by offering possible actions to keep users actively engaged.

We realized later this risked narrowing or biasing their decision-making—especially in an open-ended task where we want to observe how AI organically influences their thinking.

Rejected Explorations

Populate competitors as well

We explored allowing users to populate competitors as well.

In addition to populating dimensions, we explored letting users auto-generate entire rows for new competitors. The idea was to see whether giving users a full view of a single competitor might shift how they approach comparisons or structure their analysis.

We realized this risked over-structuring the analysis, turning it into a one-by-one review of each competitor (just like the old plugin) instead of encouraging comparison and synthesis.

04 - Methods

How we constructed our research methodology

User Study Setup

Defining our conditions

We ran a between-subjects study with N=47 university participants, most with little to no design thinking experience, randomly assigned to one of three conditions:

No-AI

Users approach a problem/solution without LLM assistance, manually filling out the templates based only on their current knowledge.


Co-Led

Users gain access to LLM generation features in specific parts of the templates, assisting with reflection or proposing alternative ideas.


AI-Led

Templates are already filled out by AI. Users don't have to initiate any writing; they only read and process what was generated.

User Study Setup

Why design templates?

5 Whys Template
Competitive Analysis Template

To explore how participants in each condition think and engage in creative work, we integrated an LLM into design templates. Their key advantage is that they are a familiar, task-oriented format that guides users into a problem and toward a solution.

This let us observe participants' thinking at each stage and see which content and factors inspired their final deliverables.

Data Analysis

How we measured user outcomes across conditions


We were curious how our plugin impacted our users’

  • Reflective thinking
  • Creative quality
  • Cognitive load
  • Usability

After conducting all the sessions, the research team and I began coding timestamps for key task activities and running thematic analyses on user interviews and survey responses to pinpoint patterns.
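
To make the timestamp coding concrete, here's a minimal sketch of how coded session segments could be rolled up into time-on-activity per condition. This is not our actual analysis script; the column names and activity codes are illustrative.

```python
# Illustrative sketch: aggregate coded session segments into time-on-activity
# per condition. Column names and activity codes are hypothetical.
import pandas as pd

# Each row is one coded segment of a session (start/end in seconds).
codes = pd.DataFrame({
    "participant": ["P01", "P01", "P02", "P03"],
    "condition":   ["No-AI", "No-AI", "Co-Led", "AI-Led"],
    "activity":    ["editing_earlier_responses", "writing",
                    "responding_to_ai", "reading_ai_output"],
    "start_s":     [120, 300, 90, 60],
    "end_s":       [260, 420, 210, 400],
})

codes["duration_s"] = codes["end_s"] - codes["start_s"]

# Total and mean time spent on each activity, broken out by condition.
summary = (
    codes.groupby(["condition", "activity"])["duration_s"]
         .agg(["sum", "mean"])
         .reset_index()
)
print(summary)
```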

05 - Results

Results & Takeaways

The Golden Question

Can AI empower users to be more creative?

AI conditions had more idea units.

To a certain extent, yes. As expected, AI integration exposed participants to more topics and perspectives. Hence, Co-Led and AI-Led users tended to cover more categories in their understanding of the problem and their written solutions.

But quantity wasn’t the whole story.

Behavioral Insights

Differences emerged in how participants engaged with the template.

No-AI

They spent the most time revisiting and editing earlier responses, refining half-formed ideas as their understanding evolved.

No-AI participants' engagement behavior

Co-Led

In contrast, these participants treated the AI as a dialogue partner. Their focus was on shaping responses in the moment, responding rather than revising.

Co-Led participants' engagement behavior

AI-Led

On the other hand, they spent their time digesting generated content. Their process leaned less on imagination and more on remixing the information already provided, rather than constructing a new line of reasoning.

AI-Led participants' engagement behavior

As a result,

While No-AI participants expressed more confidence and ownership,

No-AI participants' feeling about the process

They were burdened with keeping track of everything as their understanding evolved and as they gathered more context, which left less time for actually synthesizing ideas.

While AI-led participants had exposure to more perspectives and topics early on,

AI-Led participants' feeling about the process

Users found it hard to explore beyond the AI’s suggestions because they seemed complete and convincing. And so, they spent more time deciding on their idea’s direction, leaving less room for depth and creativity.

And due to the AI’s perceived comprehensiveness, some users even accepted surface-level ideas without fully questioning them.

Co-Led participants struck more of a happy medium, but...

Co-Led participants' feeling about the process

Users expanded their thinking with AI while staying active in shaping/challenging ideas without being overwhelmed by information.

However, a more complex understanding also made them more self-critical of their ideas, as they grappled with unanswered “what-ifs” they felt weren't easy to resolve.

Product Strategy

With that in mind, what might creative AI support look like moving forward?


The patterns we saw in our study echoed a broader trend in today’s AI tools.

Much of the market still leans into one of two extremes:

1️⃣

Generate fast, connect later

AI can flood users with insights instantly, but this “instant gratification” risks leaving them prematurely satisfied with surface-level ideas or stunting deeper exploration.

2️⃣

Exclusively serve a supporting role

Respecting users’ agency is crucial for aligning with their real needs. But AI that only reacts to user input can be limiting at times, especially if the user gets stuck or leaves their own assumptions unspoken.

Are we truly doing enough with AI tools?


My initial thought going forward was, “Oh, we just need to structure AI to invite users into deeper engagement, expose them to more ideas, all while empowering their creative control.” But then I felt a tinge of hesitation. Is that advice really enough to fuel a better experience with creative AI tools?

History may not repeat, but it often rhymes

Compilation of old dot com era websites: MySpace, YouTube, Google, eBay

In many ways, the AI bubble is this generation’s dot com bubble. Early internet tools weren’t just a way to help people search, create, and have fun. They fundamentally changed the way we approach those things.

So what?

AI has the same opportunity with creativity


If we want to both improve creative outcomes and imagine creative thinking in a new light, we need to build past the cliché problems that come with the territory.

Justine Du captured this well in her journey designing Microsoft’s Copilot:

“Well, why aren’t we doing more? ... Aren’t we trying to make the Outlook experience easier so users don’t have to go into Settings? We have the freedom to put whatever we want in there.”

With this amount of potential, why stop at just doing?

AI Design Principles

We distilled three principles for future design


From our findings and a closer look at the AI industry’s trajectory, we distilled three design principles. Across any domain, they point to an opportunity to guide users into new territory step by step, helping them bring out the best in their ideas:

Feed-Forward Prompting

Proactively prime users with reflective prompts to guide exploration

In order to better align with users’ real needs, AI could proactively prime users with reflective questions that guide exploration in specific, fruitful directions.

A pop-up modal that appeared when a user fixated on a solution before understanding the problem and audience.

Catering to Evolving Needs

Say bye to context switching, hello to contextually-aware support

Adapt wherever the user is—gather inspiration during ideation, help users decide during comparison, or step back when users need to focus.

AI serving in multiple contexts seamlessly (in this case, gathering sources for the user & comparing two designs).

Gentle Troubleshooting

Ensure users feel confident and in control while navigating complexity

If users are stuck with big ideas and bigger unknowns, AI could offer gentle scaffolding by helping users test their ideas against constraints.

User asks a question using cursor chat to help troubleshoot the viability of their idea.

06 - Our Journey

The trials of storytelling in research

Iteration

How our research question continually evolved

Evolution of our research question

In the early stages of data analysis, we were primarily concerned with which conditions expanded more, who had the more defined angle, who went deeper, and so on.

This lens captured differences in behavior—yet it didn’t explain why they happened, or how they connected to the experience of being creative.

So we revisited the data. Again. And again.


Each pass gave us a new angle on what really shaped participants’ behavior. AI wasn’t just a well of extra ideas—it actively shifted how people approached the task (for better or for worse). Same data, different interpretations.

An Unexpected Turn

Reconciling with unexpected results


In a pre-survey, we asked users if they had any design thinking background. Then, experts blindly rated user responses across Likert-scale benchmarks, which we graphed against experience levels.

We initially anticipated a clear correlation—but the data suggested something more complex.

Solution quality is not black and white – what qualities define a strong solution?

An example of a participant's solution and its blind expert rating.

Most expert ratings aligned with the quality of the participants’ solutions—but a few outliers reminded us that strong ideas don’t always look polished.

One participant, for instance, proposed a loosely structured solution around adaptive cultural change in education.

Experts blindly rated their solution low in several areas, citing a lack of detail on “how” the solution would be implemented and “how” it would affect its audience.

Their interview response explaining their process.

Yet in their interview, they demonstrated clear signs of deep reflection: they questioned the AI’s logic, merged multiple ideas, and grounded their decisions in a human-centered way.

The Lightbulb Moment

Our most meaningful findings were about how users learned and felt during the process—not just what they produced.


Cases like those highlighted a key nuance in evaluating creativity: strong thinking isn’t always neatly packaged.

This reinforced our final thesis: AI tools should go beyond generating outputs. Instead, they should anchor their value propositions in structures for nuanced thinking and fluid ways to elevate users' creative workflows.

That’s the real value.

07 - Reflection

Informing design guidelines for future AI-assisted creativity tools

This project taught me about discovering the invisible experiences that shape how people create. It also reinforced how design and research are inseparable in crafting effective user studies. Thank you UCSD Design Lab for bringing me on!

Learning

Humanizing our key findings

I thought research was only about pushing the envelope and finding novelty. Over these 10 months, I realized that results only matter if they reflect something about us, the people who use and are influenced by this type of technology.

Challenge

Doing academic research for the first time

It’s no secret that academic research demands rigor and wearing many hats. While it took time to get up to speed with the literature and methods, the biggest lesson was staying present, being a self-starter, and staying hungry for more.

What I would have done differently

Starting with a storytelling perspective

In hindsight, I see how easy it was to get lost in patterns and numbers without a clear anchor. We had coding schemes, but next time I’d lean less on proving absolutes and more on using data to tell stories about how people think and act.

Have more questions about our paper?


I’m happy to walk through the process in more depth or talk about creative support tools; feel free to reach out to me at [email protected] or on LinkedIn. Click the image below to read our paper!

Link to our paper