UnrealEgo Benchmark Challenge



Introduction

We provide datasets for stereo egocentric 3D human pose estimation. The datasets capture a wide variety of human motions that occur in daily activities.

  • UnrealEgo2 is a new large-scale synthetic dataset with 1.25M stereo images and 72 body joint annotations (32 for the body and 40 for the hands).

  • UnrealEgo-RW is a real-world dataset captured with our portable device, providing 130k stereo images and 68 body joint annotations (28 for the body and 40 for the hands).

Following best practices for evaluation benchmarks in the literature, we withhold the ground truth of the test sets to prevent overfitting to and tuning on them. To evaluate your method on the test sets, please see the instructions in the Evaluation section.


Project page

See our project pages and papers for more details on the proposed datasets.


Contact

For questions, please contact the first author directly.

Hiroyasu Akada: hakada@mpi-inf.mpg.de


Citing the dataset

If you use our datasets, please cite all of the papers below.

@inproceedings{hakada2022unrealego,
    title = {UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture},
    author = {Akada, Hiroyasu and Wang, Jian and Shimada, Soshi and Takahashi, Masaki and Theobalt, Christian and Golyanik, Vladislav},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = {2022}
}

@inproceedings{hakada2024unrealego2,
    title = {3D Human Pose Perception from Egocentric Stereo Videos},
    author = {Akada, Hiroyasu and Wang, Jian and Golyanik, Vladislav and Theobalt, Christian},
    booktitle = {Computer Vision and Pattern Recognition (CVPR)},
    year = {2024}
}



Example data of UnrealEgo2 and UnrealEgo-RW


UnrealEgo2

  • Stereo fisheye images with 3D-to-2D reprojections of ground-truth human poses
Example images of UnrealEgo2


  • Synthetic 3D environments with virtual human characters performing motions


UnrealEgo-RW

  • Stereo fisheye images with 3D-to-2D reprojections of ground-truth human poses
Example images of UnrealEgo-RW


  • Real-world 3D environment with human subjects performing motions


Download


UnrealEgo2 Dataset

You can use our download scripts to download UnrealEgo2. Please make sure you have ~3TB of free space for storing UnrealEgo2.
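Before starting a download, it can help to confirm that the target disk has enough room. The following is a minimal sketch using only the Python standard library; the byte counts are decimal-unit approximations of the ~3TB (UnrealEgo2) and ~170GB (UnrealEgo-RW) figures quoted on this page, and the function name `has_enough_space` is hypothetical.

```python
import shutil

# Approximate sizes quoted on this page (decimal units are an assumption).
REQUIRED_BYTES = {
    "UnrealEgo2": 3 * 10**12,     # ~3 TB
    "UnrealEgo-RW": 170 * 10**9,  # ~170 GB
}


def has_enough_space(target_dir: str, dataset: str) -> bool:
    """Return True if the filesystem holding target_dir has enough free space."""
    free_bytes = shutil.disk_usage(target_dir).free
    return free_bytes >= REQUIRED_BYTES[dataset]


if __name__ == "__main__":
    for name in REQUIRED_BYTES:
        status = "OK" if has_enough_space(".", name) else "not enough free space"
        print(f"{name}: {status}")
```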


  • Training/validation splits

Script for RGB data

Script for pose data

Script for depth data (optional)

    bash ./download_unrealego2_train_val_rgb.sh
    bash ./download_unrealego2_train_val_pose.sh
    bash ./download_unrealego2_train_val_depth.sh


  • Testing splits

Script for RGB data

Script for depth data (optional)

    bash ./download_unrealego2_test_rgb.sh
    bash ./download_unrealego2_test_depth.sh


The data is stored as follows:

UnrealEgoData2 [_train_val_rgb, _train_val_pose, _train_val_depth, _test_rgb, _test_depth]
├── Environment Name 1 (ArchVisInterior_ArchVis_RT, etc.)
│   ├── Day
│   │   ├── Human Model Name 1
│   │   │   └── Glasses Name [SKM_MenReadingGlasses_Shape_01]
│   │   │       ├── Motion Category 1
│   │   │       │   ├── Motion Sequence 1
│   │   │       │   │   ├── fisheye_final_image (RGB data)
│   │   │       │   │   ├── fisheye_depth_image (Depth data)
│   │   │       │   │   └── json (keypoint, etc.)
│   │   │       │   │
│   │   │       │   ├── Motion Sequence 2
│   │   │       │   └──...
│   │   │       │
│   │   │       ├── Motion Category 2
│   │   │       └──...
│   │   │
│   │   ├── Human Model Name 2
│   │   └──...
│   │
│   └── (Night)
│
├── Environment Name 2
├── Environment Name 3
└──...
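Because every motion sequence sits at the same depth and contains a fisheye_final_image directory, the sequences can be enumerated without hard-coding environment or model names. The sketch below assumes the downloaded root is named by appending one of the listed suffixes (for example UnrealEgoData2_train_val_rgb); `iter_sequences` is a hypothetical helper, not part of the released code.

```python
from pathlib import Path
from typing import Iterator


def iter_sequences(root: str, data_dir: str = "fisheye_final_image") -> Iterator[Path]:
    """Yield every motion-sequence directory below root that contains data_dir."""
    for sub in sorted(Path(root).rglob(data_dir)):
        if sub.is_dir():
            # The parent of the data directory is the motion-sequence directory.
            yield sub.parent


if __name__ == "__main__":
    # Assumed root name; adjust to wherever the download script stored the data.
    for seq in iter_sequences("UnrealEgoData2_train_val_rgb"):
        print(seq)
```

Passing `data_dir="json"` would walk the pose data in the same way.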


  • Data split txt files (Updated on Sep 16th, 2024)

We provide txt files that divide UnrealEgo2 into training, validation, and testing splits.


  • Calibration file for the camera intrinsic parameters

We provide the calibration file of the cameras in UnrealEgo2 based on the OpenCV and Scaramuzza fisheye camera models.

Note that UnrealEgo and UnrealEgo2 datasets use the same camera. Also, the left and right cameras share the same intrinsic parameters.


UnrealEgo-RW Dataset

You can use our download scripts to download UnrealEgo-RW. Please make sure you have ~170GB of free space for storing UnrealEgo-RW.


  • Training/validation splits

Script for RGB data

Script for pose data

    bash ./download_rw_train_val_rgb.sh
    bash ./download_rw_train_val_pose.sh


  • Testing splits

Script for RGB data

    bash ./download_rw_test_rgb.sh


The data is stored as follows:

    UnrealEgoData_rw [_train_val_rgb, _train_val_pose, _test_rgb]
    └── studio
        ├── Date 1 (2023_01_17, etc.)
        │   ├── Human Model 1 (S1, etc.)
        │   │   ├── Motion Sequence 1
        │   │   │   ├── fisheye_final_image (RGB data)
        │   │   │   └── json (keypoint, etc.)
        │   │   │
        │   │   ├── Motion Sequence 2
        │   │   └──...
        │   │
        │   ├── Human Model 2
        │   └──...
        │
        ├── Date 2
        └──...


  • Data split txt files (Updated on Sep 16th, 2024)

We provide txt files that divide UnrealEgo-RW into training, validation, and testing splits.


  • Calibration file for the camera intrinsic parameters

We provide the calibration file of the cameras in UnrealEgo-RW based on the OpenCV and Scaramuzza fisheye camera models.


Metadata

We provide metadata for each frame:

  • fisheye_final_image: RGB images, 8-bit PNG, 1024 × 1024
  • fisheye_depth_image: depth images, 8-bit PNG, 1024 × 1024 (only for UnrealEgo2)
  • json: json files with camera and body poses
    • trans: global translation (X, Y, Z) in the UE4 coordinate system (only for UnrealEgo2)
    • rot: global rotation (P, Y, R) in the UE4 coordinate system (only for UnrealEgo2)
    • camera_[right/left]_pts3d: camera-relative 3D keypoint location (X, Y, Z) in the OpenCV coordinate system
    • camera_[right/left]_pts2d: 2D keypoint location (X, Y) in image coordinates
    • device_pts3d: 3D keypoint location (X, Y, Z) relative to the midpoint between stereo cameras

Note that device_pts3d is used as the ground truth for the device-relative pose estimation task.
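As an illustration of how the per-frame metadata might be read, the sketch below loads one json file and returns its device_pts3d entry. It assumes that device_pts3d is a top-level key mapping joint names to [X, Y, Z] lists; this layout is not spelled out on this page, so please check it against an actual file. The helper name `load_device_pose` is hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_device_pose(json_path: str) -> Dict[str, List[float]]:
    """Read one per-frame json file and return its device_pts3d entry."""
    with Path(json_path).open("r", encoding="utf-8") as f:
        frame = json.load(f)
    # Assumption: device_pts3d maps joint names to [X, Y, Z] lists.
    return frame["device_pts3d"]
```

The same pattern applies to the other keys listed above, such as camera_left_pts3d or camera_right_pts2d.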


Evaluation

This page provides instructions on how to prepare your predictions for evaluation on the UnrealEgo2 and UnrealEgo-RW datasets.

Predictions should be emailed to [hakada at mpi-inf.mpg.de] as a zip file.

At most four submissions are allowed for the same approach, and any two submissions must be at least 72 hours apart.

Please also check our inference code and exemplar prediction files stored in the directory explained here.


Prepare the directory for predictions

The predictions should be stored in the following directory structure.

In each motion sequence directory, please create a single json directory for each motion.

Please also check the provided test.txt file for the names of the environments, human models, and motions.

(Note that this structure is similar to that of the RGB data shown in the Download page.)


  • Directory structure for the pose predictions from UnrealEgo2 test RGB dataset
UnrealEgoData2_test_pose
├── Environment Name 1 (ArchVisInterior_ArchVis_RT, etc.)
│   ├── Day
│   │   ├── Human Model Name 1
│   │   │   └── Glasses Name [SKM_MenReadingGlasses_Shape_01]
│   │   │       ├── Motion Category 1
│   │   │       │   ├── Motion Sequence 1
│   │   │       │   │   └── json
│   │   │       │   │
│   │   │       │   ├── Motion Sequence 2
│   │   │       │   └──...
│   │   │       │
│   │   │       ├── Motion Category 2
│   │   │       └──...
│   │   │
│   │   ├── Human Model Name 2
│   │   └──...
│   │
│   └── (Night)
│
├── Environment Name 2
├── Environment Name 3
└──...


  • Directory structure for the pose predictions from UnrealEgo-RW test RGB dataset
UnrealEgoData_rw_test_pose
└── studio
    ├── Date 1 (2023_01_17, etc.)
    │   ├── Human Model 1 (S1, etc.)
    │   │   ├── Motion Sequence 1
    │   │   │   └── json
    │   │   │
    │   │   ├── Motion Sequence 2
    │   │   └──...
    │   │
    │   ├── Human Model 2
    │   └──...
    │
    ├── Date 2
    └──...
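Since the prediction tree mirrors the test RGB tree, one way to prepare it is to walk the downloaded RGB data and create a json directory for each motion sequence under a new root. This is a minimal sketch under that assumption; `make_prediction_dirs` is a hypothetical helper, and the root names in the usage example follow the naming shown above.

```python
from pathlib import Path
from typing import List


def make_prediction_dirs(rgb_root: str, pred_root: str,
                         image_dir: str = "fisheye_final_image") -> List[Path]:
    """Create <pred_root>/<sequence path>/json for every sequence in rgb_root."""
    created = []
    rgb_root_path = Path(rgb_root)
    for img_dir in sorted(rgb_root_path.rglob(image_dir)):
        if not img_dir.is_dir():
            continue
        # Path of the motion sequence relative to the RGB root.
        rel_seq = img_dir.parent.relative_to(rgb_root_path)
        json_dir = Path(pred_root) / rel_seq / "json"
        json_dir.mkdir(parents=True, exist_ok=True)
        created.append(json_dir)
    return created


if __name__ == "__main__":
    dirs = make_prediction_dirs("UnrealEgoData_rw_test_rgb",
                                "UnrealEgoData_rw_test_pose")
    print(f"created {len(dirs)} json directories")
```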


Save predictions as json files.

In the created json directory, the predicted 3D joint position for each frame should be stored as a single json file named frame_[index].json.

The index in the json file name should match that of the corresponding RGB image, e.g., frame_13.json for final_13.png.

 └── json
      ├── frame_0.json
      ├── frame_1.json
      ├── frame_2.json
      └──...


The json file should contain a dictionary of the 3D positions (X, Y, Z) of the following 16 joints.


  • UnrealEgo2

    • head: head
    • neck_01: neck
    • upperarm_l: left upper arm
    • upperarm_r: right upper arm
    • lowerarm_l: left lower arm
    • lowerarm_r: right lower arm
    • hand_l: left hand
    • hand_r: right hand
    • thigt_l: left thigh
    • thigt_r: right thigh
    • calf_l: left calf
    • calf_r: right calf
    • foot_l: left foot
    • foot_r: right foot
    • ball_l: left ball
    • ball_r: right ball
{
    "head": [
        0.04534833335854688,
        -10.98678477881302,
        6.916525121237093
    ],
    "neck_01": [
        4.9283838141698135,
        -13.806552234190605,
        16.961396946737494
    ],
    "upperarm_l": [
        15.086385778590072,
        -1.2434797844483088,
        18.904893206924573
    ],

    ...

}


  • UnrealEgo-RW

    • Head: head
    • Neck: neck
    • LeftArm: left upper arm
    • RightArm: right upper arm
    • LeftForeArm: left lower arm
    • RightForeArm: right lower arm
    • LeftHand: left hand
    • RightHand: right hand
    • LeftUpLeg: left upper leg
    • RightUpLeg: right upper leg
    • LeftLeg: left leg
    • RightLeg: right leg
    • LeftFoot: left foot
    • RightFoot: right foot
    • LeftToeBase: left toe
    • RightToeBase: right toe
{
    "Head": [
        0.04534833335854688,
        -10.98678477881302,
        6.916525121237093
    ],
    "Neck": [
        4.9283838141698135,
        -13.806552234190605,
        16.961396946737494
    ],
    "LeftArm": [
        15.086385778590072,
        -1.2434797844483088,
        18.904893206924573
    ],

    ...

}
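The naming rule and the dictionary layout described above can be sketched as follows. The joint keys are copied verbatim from the two lists above and should be double-checked against the exemplar prediction files; the function names `frame_json_name` and `save_prediction` are hypothetical, and the expected unit of the coordinates is not stated here, so no conversion is applied.

```python
import json
import re
from pathlib import Path
from typing import List, Sequence

# Keys copied verbatim from the joint lists on this page.
UNREALEGO2_JOINTS = [
    "head", "neck_01", "upperarm_l", "upperarm_r", "lowerarm_l", "lowerarm_r",
    "hand_l", "hand_r", "thigt_l", "thigt_r", "calf_l", "calf_r",
    "foot_l", "foot_r", "ball_l", "ball_r",
]
UNREALEGO_RW_JOINTS = [
    "Head", "Neck", "LeftArm", "RightArm", "LeftForeArm", "RightForeArm",
    "LeftHand", "RightHand", "LeftUpLeg", "RightUpLeg", "LeftLeg", "RightLeg",
    "LeftFoot", "RightFoot", "LeftToeBase", "RightToeBase",
]


def frame_json_name(image_name: str) -> str:
    """Map an RGB file name such as 'final_13.png' to 'frame_13.json'."""
    match = re.fullmatch(r"final_(\d+)\.png", image_name)
    if match is None:
        raise ValueError(f"unexpected image name: {image_name}")
    return f"frame_{match.group(1)}.json"


def save_prediction(json_dir: str, image_name: str,
                    joints_xyz: Sequence[Sequence[float]],
                    joint_names: List[str]) -> Path:
    """Write one frame's prediction; joints_xyz is ordered like joint_names."""
    if len(joints_xyz) != len(joint_names):
        raise ValueError("expected one (X, Y, Z) triple per joint")
    pose = {name: [float(v) for v in xyz]
            for name, xyz in zip(joint_names, joints_xyz)}
    out_path = Path(json_dir) / frame_json_name(image_name)
    out_path.write_text(json.dumps(pose, indent=4), encoding="utf-8")
    return out_path
```

For example, `save_prediction(json_dir, "final_13.png", prediction, UNREALEGO_RW_JOINTS)` would write frame_13.json into the given json directory.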


Submit the predictions

Submit zip files of the prepared directory to [hakada at mpi-inf.mpg.de].

Please use an institutional email address and include the following information:

  • Your first and last name
  • Your affiliation
  • Names of the datasets used to train your method (See Results section.)
  • Name of your method (See Results section. This can be modified later on request.)
  • URL link to the corresponding paper or GitHub if applicable (This can be added later on request.)
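One possible way to package a prepared prediction directory is the standard-library `shutil.make_archive`. The sketch below keeps the top-level directory name (for example UnrealEgoData2_test_pose) inside the archive; whether the organizers expect that exact layout inside the zip is an assumption, and `zip_predictions` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def zip_predictions(pred_root: str) -> str:
    """Create <pred_root>.zip next to the directory and return the archive path."""
    pred_path = Path(pred_root).resolve()
    return shutil.make_archive(
        base_name=str(pred_path),         # archive is written to <pred_root>.zip
        format="zip",
        root_dir=str(pred_path.parent),   # archive paths are relative to the parent
        base_dir=pred_path.name,          # keep the top-level directory in the zip
    )


if __name__ == "__main__":
    print(zip_predictions("UnrealEgoData2_test_pose"))
```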


Device-relative 3D human pose estimation


Overall performance

- UnrealEgo2 test dataset

| Method | Training data | MPJPE (mm) | PA-MPJPE (mm) | 3D PCK | AUC |
|---|---|---|---|---|---|
| Akada et al., CVPR 2024 (T=5) | UnrealEgo2 | 30.53 | 26.72 | 97.22 | 80.75 |
| Akada et al., ECCV 2022 | UnrealEgo2 | 72.82 | 52.90 | 91.32 | 55.81 |
| Zhao et al., 3DV 2021 * | UnrealEgo2 | 79.64 | 58.22 | 88.50 | 53.82 |

Methods with * are re-implemented by Akada et al., CVPR 2024


- UnrealEgo-RW test dataset

| Method | Training data | MPJPE (mm) | PA-MPJPE (mm) | 3D PCK | AUC |
|---|---|---|---|---|---|
| Akada et al., CVPR 2024 (T=5) | UnrealEgo2, UnrealEgo-RW | 72.89 | 56.19 | 90.29 | 57.19 |
| Akada et al., ECCV 2022 | UnrealEgo2, UnrealEgo-RW | 92.48 | 67.15 | 84.25 | 48.04 |
| Zhao et al., 3DV 2021 * | UnrealEgo2, UnrealEgo-RW | 97.86 | 69.92 | 81.53 | 46.32 |
| Akada et al., CVPR 2024 (T=5) | UnrealEgo-RW | 104.14 | 82.18 | 80.20 | 46.22 |

Methods with * are re-implemented by Akada et al., CVPR 2024
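For readers who want to sanity-check their method on the validation split before submitting, MPJPE is the mean Euclidean distance between predicted and ground-truth joint positions. The sketch below computes it for a single frame from two joint dictionaries. It is not the official evaluation code: the unit handling and the way errors are averaged over frames are not described on this page, and PA-MPJPE would additionally require a Procrustes alignment step that is omitted here.

```python
import math
from typing import Dict, Sequence

Pose = Dict[str, Sequence[float]]


def per_joint_error(pred: Pose, gt: Pose) -> Dict[str, float]:
    """Euclidean distance between prediction and ground truth for each joint."""
    return {name: math.dist(pred[name], gt[name]) for name in gt}


def mpjpe(pred: Pose, gt: Pose) -> float:
    """Mean per-joint position error for one frame, in the unit of the inputs."""
    errors = per_joint_error(pred, gt)
    return sum(errors.values()) / len(errors)


if __name__ == "__main__":
    # Toy two-joint example with made-up coordinates.
    gt = {"head": [0.0, 0.0, 0.0], "neck_01": [0.0, 0.0, 10.0]}
    pred = {"head": [3.0, 4.0, 0.0], "neck_01": [0.0, 0.0, 10.0]}
    print(mpjpe(pred, gt))  # 2.5
```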




Performance by joint (MPJPE in mm)

- UnrealEgo2 test dataset

| Method | Training data | Upper body (avg) | Lower body (avg) | Head | Neck | Left upperarm | Right upperarm | Left lowerarm | Right lowerarm | Left hand | Right hand | Left thigh | Right thigh | Left calf | Right calf | Left foot | Right foot | Left ball | Right ball |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Akada et al., CVPR 2024 (T=5) | UnrealEgo2 | 17.10 | 43.97 | 1.85 | 4.85 | 10.49 | 10.17 | 19.11 | 18.71 | 36.67 | 34.96 | 17.92 | 18.14 | 33.09 | 32.74 | 59.93 | 58.33 | 66.53 | 65.09 |
| Akada et al., ECCV 2022 | UnrealEgo2 | 50.09 | 95.52 | 16.71 | 21.21 | 39.14 | 37.56 | 62.19 | 63.94 | 84.00 | 75.98 | 51.77 | 50.52 | 95.83 | 91.68 | 116.38 | 110.53 | 128.08 | 119.36 |
| Zhao et al., 3DV 2021 * | UnrealEgo2 | 55.67 | 103.97 | 21.50 | 26.67 | 45.02 | 43.85 | 67.66 | 67.12 | 89.55 | 83.99 | 61.33 | 62.41 | 101.96 | 99.37 | 123.11 | 118.52 | 138.09 | 126.97 |

Methods with * are re-implemented by Akada et al., CVPR 2024


- UnrealEgo-RW test dataset

| Method | Training data | Upper body (avg) | Lower body (avg) | Head | Neck | Left upperarm | Right upperarm | Left lowerarm | Right lowerarm | Left hand | Right hand | Left thigh | Right thigh | Left calf | Right calf | Left foot | Right foot | Left ball | Right ball |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Akada et al., CVPR 2024 (T=5) | UnrealEgo2, UnrealEgo-RW | 46.23 | 99.55 | 18.66 | 22.37 | 27.68 | 33.24 | 55.42 | 58.45 | 75.45 | 78.55 | 56.41 | 57.14 | 84.81 | 84.14 | 122.52 | 119.35 | 139.61 | 132.41 |
| Akada et al., ECCV 2022 | UnrealEgo2, UnrealEgo-RW | 60.00 | 125.04 | 20.29 | 29.63 | 46.26 | 39.50 | 73.83 | 75.11 | 94.11 | 101.29 | 62.20 | 62.91 | 113.13 | 107.11 | 162.13 | 147.31 | 182.68 | 162.85 |
| Zhao et al., 3DV 2021 * | UnrealEgo2, UnrealEgo-RW | 64.87 | 130.85 | 25.54 | 34.33 | 50.21 | 45.77 | 77.87 | 79.23 | 100.16 | 105.85 | 67.45 | 67.09 | 118.20 | 112.94 | 168.92 | 151.77 | 188.27 | 172.16 |

Methods with * are re-implemented by Akada et al., CVPR 2024
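The "Upper body (avg)" and "Lower body (avg)" columns appear to be plain means over eight joints each (head through hands, and thighs through balls). This grouping is inferred from the numbers rather than stated on this page; the short check below reproduces the averages from the per-joint values of the Akada et al., CVPR 2024 (T=5) row on the UnrealEgo2 test dataset.

```python
# Per-joint MPJPE (mm) copied from the CVPR 2024 (T=5) row, UnrealEgo2 test dataset.
upper = {
    "Head": 1.85, "Neck": 4.85,
    "Left upperarm": 10.49, "Right upperarm": 10.17,
    "Left lowerarm": 19.11, "Right lowerarm": 18.71,
    "Left hand": 36.67, "Right hand": 34.96,
}
lower = {
    "Left thigh": 17.92, "Right thigh": 18.14,
    "Left calf": 33.09, "Right calf": 32.74,
    "Left foot": 59.93, "Right foot": 58.33,
    "Left ball": 66.53, "Right ball": 65.09,
}


def mean(values):
    """Arithmetic mean of a non-empty collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


upper_avg = mean(upper.values())                             # table reports 17.10
lower_avg = mean(lower.values())                             # table reports 43.97
overall = mean(list(upper.values()) + list(lower.values()))  # overall MPJPE: 30.53

print(f"upper {upper_avg:.3f}  lower {lower_avg:.3f}  overall {overall:.3f}")
```

The results agree with the reported 17.10, 43.97, and the overall MPJPE of 30.53 to within the rounding of the per-joint entries.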