Title: Unsupervised learning of object structure and dynamics from videos

The paper proposes a new model for video prediction with a structured representation based on object keypoints. The approach is novel, and the experimental methodology is interesting and generalizable. The reviewers initially raised many questions, and the rebuttal was convincing, at least for the majority of them. In accordance with their discussion, the area chair recommends acceptance.