Side Track: AI Painter – Painting as a Metaphor to Interact with AI Systems

Here is a quick personal side experiment I implemented a while ago to explore an idea I had: using a painting application as the interface for interacting with an AI system. It was also a good excuse to play with PySide6 a little. Please note that the AI itself is not impressive at all and is not meant to be; I only built the bare minimum needed to investigate the interaction metaphor.

  • Facilitate symbol grounding and building visual inventories, e.g. to train agents on these representations rather than raw pixels
  • Define relevant regions
  • Interact with the environment via keys and record user input, e.g. for imitation learning
  • Works against local windows, containers and stepped environments, in both 2D and 3D
  • Pointer:
    • Execute click in environment
    • Select point in environment, e.g. to have the agent explain what it is according to its belief state, or to select and extract the underlying object
  • Draw / Erase:
    • Define map, e.g. for interactive segmentation or to explain scene or object parts to agent
  • Rectangular Selection:
    • Initialize object tracking (ball, paddles)
    • Define game areas (score, game field)
    • Build visual inventory
  • Color Selection:
    • Initialize color segmentation
  • Extensible widget approach, e.g. it is easy to render depth, segmentation and RGB maps as well as point clouds, occupancy grids and floor maps simultaneously, and to add new components as they become desirable
  • Can easily pull in data from outside sources, e.g. the agent can log its observations and belief state to an in-memory store like Redis (I successfully tried both MsgPack and JSON in parallel), and the developer can then explore both via the tool
  • The application is containerized and can be run from a container (via X passthrough), making it easy to deploy and share without compilation
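
To make the Rectangular Selection and Color Selection tools above a bit more concrete, here is a minimal sketch of what they might hand over to the AI side. This is not the actual AI Painter code: the function names `crop`, `color_mask` and `bounding_box` are hypothetical, and frames are modelled as plain nested lists of RGB tuples instead of whatever image type the application really uses.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]  # frame[y][x] -> (r, g, b); an assumed representation


def crop(frame: Frame, x: int, y: int, w: int, h: int) -> Frame:
    """Return the sub-image under a rectangular selection (e.g. for a visual inventory)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def color_mask(frame: Frame, seed: Pixel, tol: int = 10) -> List[List[bool]]:
    """Mark every pixel whose channels all lie within `tol` of the picked seed color."""
    return [
        [all(abs(c - s) <= tol for c, s in zip(px, seed)) for px in row]
        for row in frame
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Smallest (x, y, w, h) box around all marked pixels, or None if nothing is marked."""
    points = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not points:
        return None
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1


# Tiny 4x3 frame: black background with a two-pixel white "ball" in the middle row.
BLACK, WHITE = (0, 0, 0), (255, 255, 255)
frame = [
    [BLACK, BLACK, BLACK, BLACK],
    [BLACK, WHITE, WHITE, BLACK],
    [BLACK, BLACK, BLACK, BLACK],
]
mask = color_mask(frame, seed=WHITE)   # color picked with the Color Selection tool
print(bounding_box(mask))              # (1, 1, 2, 1)
print(crop(frame, 1, 1, 2, 1))         # the selected patch itself
```

The bounding box derived from the color mask could then seed an object tracker in the same way a hand-drawn rectangle would.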

Should I decide to extend this at some point, there are a few obvious ideas to incorporate:

  • Use an arrow tool to position the agent in the map and to define curriculum learning steps (e.g. to help an RL agent train successfully on Montezuma’s Revenge)
  • Add other obviously useful AI visualizations like saliency maps or integrate with things like the various GAN painting applications, e.g. my colleagues’ amazing GAN Paint Studio

Of course, it would be straightforward to extend this with all kinds of object trackers, pose estimators, segmentation algorithms etc.

A final interesting extension is that I have added a simple slider with a play button to the dashboard, which extends the painting application metaphor to a video editor metaphor and allows playing back entire agent runs as depicted below. The actual agent is a much more sophisticated piece of software that i) is part of my day job rather than my free time (which means it is confidential for now) and ii) we will publish on. I therefore unfortunately cannot yet show its architecture or probabilistic programming models in this blog post, but I am very much looking forward to writing about it once the publications are out. [For the record: before writing this blog post I made sure that we have no intention of using AI Painter for any publications, and I even got approval from my manager to make the entire repository public even though it was just a private side experiment. Since it is very hacked together, I have decided not to do so for now, but in the unlikely case that I find time to clean things up, I will revisit this.]
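
For readers curious how the slider and play button could drive a recorded run, here is a small GUI-free sketch of the playback logic. It is an assumption about one reasonable design, not the actual implementation: `RunPlayer` and its methods are hypothetical names. In a PySide6 front end, one would presumably connect a slider's `valueChanged` signal to `seek` and a timer's `timeout` signal to `tick`.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class RunPlayer:
    """Steps through the recorded frames of one agent run (hypothetical helper)."""
    frames: List[Any]
    position: int = 0
    playing: bool = False

    def __post_init__(self) -> None:
        if not self.frames:
            raise ValueError("a run needs at least one recorded frame")

    def seek(self, index: int) -> Any:
        """Jump to a frame, clamping to the valid range (slider moved)."""
        self.position = max(0, min(index, len(self.frames) - 1))
        return self.frames[self.position]

    def toggle(self) -> None:
        """Start or pause playback (play button pressed)."""
        self.playing = not self.playing

    def tick(self) -> Any:
        """Advance by one frame while playing; stop automatically at the end."""
        if self.playing:
            if self.position < len(self.frames) - 1:
                self.position += 1
            else:
                self.playing = False
        return self.frames[self.position]


player = RunPlayer(frames=["obs0", "obs1", "obs2"])
player.toggle()
print(player.tick())    # obs1
print(player.seek(0))   # obs0
```

Keeping this state separate from the widgets means the same player can feed every view on the dashboard with the current frame.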

