Introduction

Learning and imitating behavioral intelligence from human demonstrations is a promising approach towards intuitive robot programming for enhanced dynamic dexterity. However, no publicly available dataset has existed in this domain. To address this gap, we introduce the first large-scale dataset and recording framework specifically designed for studying human collaborative dynamic dexterity in throw&catch tasks. The dataset, named H2TC, contains 15,000 multi-view, multi-modal, synchronized recordings of diverse Human-Human Throw&Catch activities. It currently covers 34 healthy human subjects and 52 objects commonly manipulated in human throw&catch tasks. The dataset is accompanied by a hierarchy of manually annotated semantic and dense labels, such as ground-truth human hand and object motions, making it well suited for a wide range of robot studies, from low-level motor skill learning to high-level action recognition. We envision that the proposed dataset and recording framework will enable learning pipelines that extract insights into how humans coordinate, both intra- and interpersonally, to throw and catch objects, ultimately leading to more capable and collaborative robots.

About the Webpage

On this project webpage, as well as in the accompanying GitHub repository, you will find comprehensive documentation, tutorials, and examples to help you use the dataset effectively. The webpage also introduces a suite of utility tools designed to assist you in preprocessing, visualizing, and analyzing the dataset, and it documents the data collection process, preprocessing steps, and data structure, making it easy to get started.
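
As a rough illustration of how one might peek at a single recording before turning to the official utilities, the minimal Python sketch below loads an RGB frame and prints an annotation file. All paths, folder names, file names, and keys in it are hypothetical placeholders rather than the dataset's actual layout; please follow the GitHub repository for the real tools and structure.

```python
# Minimal preview sketch -- NOT the official toolkit API.
# All paths and file names below are hypothetical placeholders;
# consult the GitHub repository for the actual layout and utilities.
import json
from pathlib import Path

import cv2  # pip install opencv-python

RECORDING_DIR = Path("h2tc_data/recordings/000001")  # hypothetical recording folder


def preview_recording(recording_dir: Path) -> None:
    """Show the first RGB frame of a recording and print its annotation, if present."""
    # Grab the first RGB frame from a (hypothetical) per-recording image folder.
    rgb_frames = sorted(recording_dir.glob("rgb/*.png"))
    if rgb_frames:
        frame = cv2.imread(str(rgb_frames[0]))
        if frame is not None:
            cv2.imshow("first RGB frame", frame)
            cv2.waitKey(0)
            cv2.destroyAllWindows()

    # Print the manually annotated labels stored alongside the frames (hypothetical file name).
    annotation_file = recording_dir / "annotation.json"
    if annotation_file.exists():
        with open(annotation_file) as f:
            print(json.dumps(json.load(f), indent=2))


if __name__ == "__main__":
    preview_recording(RECORDING_DIR)
```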

The dataset has been carefully processed and is hosted on Dropbox to ensure high quality and usability, and all the code and resources required to work with it are available in the GitHub repository. Please feel free to explore these resources and let us know if you have any questions or need further assistance.

The Dataset in Numbers

Some key facts about our dataset (an illustrative loading sketch follows the list):

Subjects: 34 subjects (29 male, 5 female, aged 20-31)
Objects: 52 objects (21 rigid, 16 soft, 15 3D-printed)
Actions: each pair of subjects performs 10 random actions (5 throws, 5 catches)
Recordings: 15K recordings
Visual modalities: RGB, depth, event
Views: egocentric, static third-person (side), static third-person (back)
Annotations: human hand and body motion, object motion, average object velocity, human grasp mode, etc.
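
To make the multi-view, multi-modal structure above concrete, here is a minimal sketch of how one recording's streams could be indexed per view and modality. The folder names mirror the views and modalities listed above, but they are assumptions rather than the dataset's actual on-disk layout; refer to the documentation and toolkit in the GitHub repository for the real structure.

```python
# Illustrative index of one recording's multi-view, multi-modal streams.
# The folder names ("ego", "side", "back", "rgb", "depth", "event") mirror the
# views and modalities listed above, but the actual on-disk layout may differ;
# treat this as a hypothetical sketch, not the dataset's real structure.
from pathlib import Path
from typing import Dict, List

VIEWS = ["ego", "side", "back"]          # egocentric + two static third-person views
MODALITIES = ["rgb", "depth", "event"]   # visual modalities listed above


def index_recording(recording_dir: Path) -> Dict[str, Dict[str, List[Path]]]:
    """Collect the frame files of each (view, modality) pair into a nested dict."""
    index: Dict[str, Dict[str, List[Path]]] = {}
    for view in VIEWS:
        index[view] = {}
        for modality in MODALITIES:
            # Hypothetical layout: <recording>/<view>/<modality>/*.png
            index[view][modality] = sorted((recording_dir / view / modality).glob("*.png"))
    return index


if __name__ == "__main__":
    idx = index_recording(Path("h2tc_data/recordings/000001"))  # hypothetical path
    for view, streams in idx.items():
        for modality, frames in streams.items():
            print(f"{view}/{modality}: {len(frames)} frames")
```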


Cite

If you use our dataset, please consider citing:

@article{lipeng2023h2tc,
  title={Advancing Robots with Greater Dynamic Dexterity: A Large-Scale Multi-View and Multi-Modal Dataset of Human-Human Throw\&Catch of Arbitrary Objects},
  author={Chen*, Lipeng and Qiu*, Jianing and Li*, Lin and Luo, Xi and Chi, Guoyi and Zheng, Yu},
  journal={arXiv},
  year={2023}
}