Discussion about this post

Alex Gorischek

I’ve been exploring the opportunities in Generative UI as well. Some thoughts:

- UI generator as agent: If the AI has some set of (configurable) components to work with, it has similarities to modern semantic agents (e.g. ChatGPT with Plugins), which choose from a set of tools based on their declared purposes. Agent frameworks might even be directly usable for generative UI.

- Application state as agent goal: Taking the above further, the application itself can be conceptualized as an agent whose goal is to reach some desired internal state (e.g. a form is completely filled out). In that case the UI components aren’t the whole “tools”, but rather just the contracts; the human user is the actor behind those contracts.

- Markup format & generation efficiency: Users tend to have limited patience for UI rendering, and token generation is going to take time. Additionally, common markup languages for this purpose (e.g. JSX) may be particularly poorly suited, e.g. because their many opening and closing brackets and tags all have to be generated. There may be “abbreviated” formats, i.e. JSON is to YAML as JSX is to ___? Something like Pug comes to mind.

- Streaming support: The way chat interfaces stream text is important for perceived UI responsiveness. But UI markup languages that require a nesting structure aren’t ideal for this, because you can’t fully parse the markup until the final closing tag arrives (unless someone builds a fuzzy parser?). There may be alternate markup formats, e.g. I can imagine a format where the elements are generated in a completely flat list, one per line, where each one declares its immediate parent. As soon as a line is generated, that element is attached to its parent. This UI could even be interactive while still being generated, and the model could be told to generate the (contextually) most likely options first.

- Division of labor: I suspect there will be some elements that we want to display deterministically. As an easy example, the logic to display loading UI should possibly be mechanical; if we had to *generate* the loading UI via LLM, what do we show while the loading UI is loading?
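
To make the flat, parent-declaring format from the streaming bullet a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the `id|parent|kind|text` line layout, the `StreamingTree` and `Node` names, and the rule that a parent must appear before its children. It is not an existing format or library, just one way the idea might look.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One UI element (hypothetical shape: an id, a kind, and a label)."""
    id: str
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


class StreamingTree:
    """Builds a UI tree one line at a time from 'id|parent|kind|text' lines."""

    def __init__(self) -> None:
        self.root = Node(id="root", kind="root")
        self.nodes: Dict[str, Node] = {"root": self.root}

    def feed(self, line: str) -> Optional[Node]:
        """Attach the element described by one line; blank lines are skipped."""
        line = line.strip()
        if not line:
            return None
        # Assumed layout: exactly four fields; text may itself contain '|'.
        node_id, parent_id, kind, text = line.split("|", 3)
        parent = self.nodes.get(parent_id)
        if parent is None:
            # Assumption: the generator always emits parents before children.
            raise ValueError(f"unknown parent {parent_id!r} for {node_id!r}")
        node = Node(id=node_id, kind=kind, text=text)
        parent.children.append(node)
        self.nodes[node_id] = node
        return node


def render(node: Node, depth: int = 0) -> List[str]:
    """Return an indented text outline of the tree built so far."""
    lines = [f"{'  ' * depth}{node.kind}: {node.text}".rstrip()]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Simulate a model streaming its output line by line.
tree = StreamingTree()
for streamed_line in [
    "form1|root|form|Sign up",
    "name|form1|input|Name",
    "submit|form1|button|Create account",
]:
    tree.feed(streamed_line)
    # The partial tree is usable after every single line.
    print("\n".join(render(tree.root)), end="\n\n")
```

Because each line is complete on its own, there is no closing tag to wait for, so the tree can be re-rendered after every `feed` call. The cost is that every element carries an explicit id and parent reference, which adds tokens back in.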

Vincent Pulling

Cool idea. I think giving up the reins entirely to AI is folly, but tasteful addition of rehash is a nice thought.

