InfoCon: Concept Discovery with Generative and Discriminative Informativeness

1Peking University, 2University of Hong Kong

InfoCon endeavors to discover manipulation concepts that characterize the goal an agent aims to fulfill at a specific juncture while interacting with the environment.

Given a dataset of demonstration trajectories for manipulation tasks, we segment each trajectory into distinct stages and assign each stage a unique label drawn from a learnable codebook. This codebook serves as the symbolic representation of the goals, encapsulating the manipulation concepts we aim to uncover.
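As a rough illustration of the labeling step described above, the following sketch assigns each time step to its nearest codebook entry and merges consecutive equal labels into stages. It is a minimal toy under stated assumptions, not the InfoCon implementation: the names `nearest_code` and `segment`, the squared-L2 distance, and the 2-D toy features are all hypothetical, and the codebook here is fixed rather than learned.

```python
from itertools import groupby


def nearest_code(feature, codebook):
    """Return the index of the codebook entry closest to `feature` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(feature, codebook[k]))


def segment(features, codebook):
    """Label each time step, then merge runs of equal labels into stages.

    Returns a list of (label, start, end) tuples with `end` exclusive.
    """
    labels = [nearest_code(f, codebook) for f in features]
    stages, t = [], 0
    for label, run in groupby(labels):
        n = len(list(run))
        stages.append((label, t, t + n))
        t += n
    return stages


# Toy data (hypothetical): 2-D per-step features and a 3-entry codebook.
codebook = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
features = [(0.1, 0.0), (0.2, 0.1), (0.9, 0.1), (1.0, 0.2), (0.9, 0.9)]
stages = segment(features, codebook)
print(stages)
```

Each tuple gives a concept label and the half-open range of time steps it covers, so the last step of a stage can serve as a candidate key state.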

Abstract

We focus on the self-supervised discovery of manipulation concepts that can be adapted and reassembled to address various robotic tasks. We propose that the decision to conceptualize a physical procedure should not depend on how we name it (semantics) but rather on the significance of the informativeness in its representation regarding the low-level physical state and state changes.

We model manipulation concepts – discrete symbols – as generative and discriminative goals and derive metrics that can autonomously link them to meaningful sub-trajectories from noisy, unlabeled demonstrations. Specifically, we employ a trainable codebook containing encodings (concepts) capable of synthesizing the end-state of a sub-trajectory given the current state – generative informativeness. Moreover, the encoding corresponding to a particular sub-trajectory should differentiate the states within and outside it and confidently predict the subsequent action based on the gradient of its discriminative score – discriminative informativeness.

These metrics, which do not rely on human annotation, can be seamlessly integrated into a VQ-VAE framework, enabling the partitioning of demonstrations into semantically consistent sub-trajectories, fulfilling the purpose of discovering manipulation concepts and the corresponding sub-goal (key) states.

We evaluate the effectiveness of the learned concepts by training policies that utilize them as guidance, demonstrating superior performance compared to other baselines. Additionally, our discovered manipulation concepts compare favorably to human-annotated ones while saving much manual effort.

Model of Goal

Generative Goal

The Generative Goal indicates the state that signifies attainment of the goal.
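One way to read the generative goal is as a reconstruction objective: given the current state and a concept label, a predictor should synthesize the end-state of the current stage. The sketch below measures this with a mean squared error. It is only an assumption-laden illustration: `generative_loss`, the `predict_goal` callable, and the toy states are hypothetical, and the actual loss used by InfoCon may differ.

```python
def generative_loss(states, stages, predict_goal):
    """Mean squared error between predicted and actual stage end-states.

    states: list of state vectors, one per time step.
    stages: list of (label, start, end) tuples with `end` exclusive.
    predict_goal: callable (state, label) -> predicted end-state (hypothetical).
    """
    total, count = 0.0, 0
    for label, start, end in stages:
        target = states[end - 1]  # last state of the stage acts as the key state
        for t in range(start, end):
            pred = predict_goal(states[t], label)
            total += sum((p - q) ** 2 for p, q in zip(pred, target))
            count += 1
    return total / count


# Toy trajectory (hypothetical): move right, then move up.
states = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]
stages = [(0, 0, 3), (1, 3, 5)]
goals = {0: (1.0, 0.0), 1: (1.0, 1.0)}

# A predictor that knows each concept's end-state versus one that ignores the concept.
perfect = generative_loss(states, stages, lambda s, k: goals[k])
identity = generative_loss(states, stages, lambda s, k: s)
print(perfect, identity)
```

A concept that carries information about where the stage ends yields a lower loss than a predictor that simply echoes the current state, which is the intuition behind generative informativeness.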

Discriminative Goal

The Discriminative Goal assesses the appropriateness and completeness of a goal, thereby offering directional guidance.
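The directional guidance can be pictured as following the gradient of a score that rises as the state approaches goal completion. The sketch below uses a hypothetical score (negative squared distance to a fixed goal state), a central-difference gradient, and cosine similarity to check whether the gradient direction agrees with a demonstrated action. The score function, `numeric_grad`, `cosine`, and the toy values are all assumptions for illustration; in InfoCon the score is learned.

```python
import math


def numeric_grad(f, x, eps=1e-5):
    """Central-difference gradient of the scalar function f at point x."""
    grad = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


goal = (1.0, 1.0)  # hypothetical goal state for one concept


def score(state):
    """Hypothetical discriminative score: higher means closer to completion."""
    return -sum((s - g) ** 2 for s, g in zip(state, goal))


state = (0.0, 0.0)
demo_action = (0.5, 0.5)  # hypothetical demonstrated state change
direction = numeric_grad(score, state)
alignment = cosine(direction, demo_action)
print(direction, alignment)
```

An alignment near 1 means the score gradient points in the same direction as the demonstrated action, which is the sense in which a discriminative goal can guide the next step.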

Experiments

The partition of trajectories produced by InfoCon is similar to our intuition.

Pick Cube

Stack Cube

Turn Faucet

Peg Insertion

The discovered concepts are also more helpful than others when used with CoTPC.
