Datasets

Multi-view Facial Image Dataset Based on LFW: Using software based on the code accompanying this paper, a set of synthetically generated multi-view facial images was created by Aristotle University of Thessaloniki within the OpenDR H2020 research project, based on the LFW image dataset, a facial image dataset consisting of 13,233 facial images in the wild for 5,749 person identities collected from the Web. The resulting set, named AUTH-OpenDR Augmented LFW (AUTH-OpenDR ALFW), covers the same 5,749 person identities. From each of the 13,233 images of these subjects, 13 synthetic images are generated by yaw axis camera rotation in the interval [0°: +60°] with step +5°. Moreover, 10 synthetic images generated by pitch axis camera rotation in the interval [0°: +45°] with step +5° are also created for each facial image of the aforementioned dataset.

The dataset structure is as follows. Two folders exist for every person identity, with his/her name as the folder name: one contains the aligned and the other the original facial images (this is also indicated in the folder name). The file names of the images in these folders follow the pattern {Name of the person}_{current ID in case of multiplicity}_{yaw/pitch direction}_{angle in rad}.jpg, where the ID distinguishes between the various images of the same person if more than one is available. Examples: "Alicia_Witt_0001_pitch_0.26.jpg" or "Alicia_Witt_0002_yaw_0.52.jpg". The ALFW dataset is covered by a Creative Commons Attribution-NonCommercial 4.0 International license and can be downloaded from this FTP site.
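The naming convention can be parsed programmatically. Below is a minimal sketch (the helper name and the example path are illustrative, not part of the dataset tooling), assuming the underscore-separated pattern described above; it also converts the encoded angle from radians to degrees.

    import math
    import os

    def parse_alfw_filename(path):
        """Parse an ALFW file name such as 'Alicia_Witt_0001_pitch_0.26.jpg'."""
        stem = os.path.splitext(os.path.basename(path))[0]
        parts = stem.split("_")
        angle_rad = float(parts[-1])      # rotation angle stored in radians
        direction = parts[-2]             # 'yaw' or 'pitch'
        image_id = parts[-3]              # e.g. '0001', distinguishes images of the same person
        person = " ".join(parts[:-3])     # the remaining fields form the person's name
        return person, image_id, direction, math.degrees(angle_rad)

    print(parse_alfw_filename("Alicia_Witt_0001_pitch_0.26.jpg"))
    # ('Alicia Witt', '0001', 'pitch', 14.89...)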

Table 1: Summary of the ALFW Multi-view Facial Image Dataset

Samples from the ALFW dataset

Multi-view Facial Image Dataset Based on CelebA: For performance evaluation or training of face recognition methods, a dataset of facial images from several viewing angles was created by Aristotle University of Thessaloniki based on the CelebA image dataset, using the software created in the OpenDR H2020 research project based on this paper and the respective code provided by its authors. CelebA is a large-scale facial dataset consisting of 202,599 facial images of 10,177 celebrities captured in the wild. The new dataset is named AUTH-OpenDR Augmented CelebA (AUTH-OpenDR ACelebA) and was generated from a subset of CelebA comprising 140,000 facial images of 9,161 persons. For each CelebA image used, 13 synthetic images were generated by yaw axis camera rotation in the interval [0°: +60°] with step +5°, and 10 synthetic images were generated by pitch axis camera rotation in the interval [0°: +45°] with step +5°. Since the CelebA license does not allow the distribution of derivative work, we do not make ACelebA directly available; instead, we provide instructions and scripts for recreating it. To reproduce the Augmented CelebA (ACelebA) dataset, follow the steps below (a consolidated sketch of these commands is given after the list):

  • Download the publicly available code of the GitHub repo Rotate-and-Render to a local folder.
  • Download the CelebA facial image dataset to a local folder.
  • Download the Python script Do_main_Person_Identity.py and identity_CelebA.csv from this FTP site.
  • Create the folder 3ddfa/Pre-processing.
  • Add the script and the .csv file to this folder.
  • Set the appropriate input paths in the script (rootdir: the folder of the CelebA dataset).
  • Execute the command python3 Do_main_Person_Identity_ACelebA.py.
  • Execute bash experiments/v100_test.sh with appropriate parameters (e.g. gpu_ids), after modifying its last two lines as follows:

       --yaw_poses 0 5 10 15 20 25 30 35 40 45 50 55 60 \

       --pitch_poses 5 10 15 20 25 30 35 40 45 \
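For convenience, the listed steps can be scripted end to end. The fragment below is only a rough sketch, assuming the Rotate-and-Render repository has been cloned locally and the script and .csv file have been placed as described above; it adds nothing beyond the commands already listed.

    import subprocess
    from pathlib import Path

    REPO = Path("Rotate-and-Render")   # local clone of the Rotate-and-Render repository

    # Step 1: identity-based pre-processing of CelebA
    # (rootdir inside the script must point to the local CelebA folder).
    subprocess.run(
        ["python3", "Do_main_Person_Identity_ACelebA.py"],
        cwd=REPO / "3ddfa" / "Pre-processing",
        check=True,
    )

    # Step 2: render the rotated views. gpu_ids and other parameters are set in the
    # script, whose last two lines should end with:
    #   --yaw_poses 0 5 10 15 20 25 30 35 40 45 50 55 60 \
    #   --pitch_poses 5 10 15 20 25 30 35 40 45 \
    subprocess.run(["bash", "experiments/v100_test.sh"], cwd=REPO, check=True)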

 

Table 2: Summary of the ACelebA Multi-view Facial Image Dataset

Human Body Models Dataset Based on SMPL+D: SMPL is a parametric statistical body shape model, and SMPL+D is an extension of SMPL that can encode shape deformations from clothes and hair as per-vertex displacements. The dataset, generated by Aristotle University of Thessaloniki (AUTH), contains 2,928 human models in various shapes and textures. At its core, the dataset consists of 183 unique SMPL+D bodies, which were generated through non-rigid shape registration of manually created MakeHuman models; the rest were generated by applying shape and texture alterations to those models. In addition, code is provided for converting these human models to the FBX format, which is supported by a wide range of simulators, including Webots. Note, however, that pose-dependent deformations are not applied to the human models. To convert the models to the FBX format, the original SMPL body model must be downloaded from the authors' website. Finally, instructions for setting up a demo project in the Webots simulator are provided, in which one of the SMPL+D bodies in FBX format can perform an animation from AMASS.
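To make the SMPL+D formulation concrete: the final surface is the (shaped and posed) SMPL mesh plus a per-vertex displacement field D. The snippet below is only an illustrative sketch, assuming the third-party smplx Python package and locally downloaded SMPL model files; the displacement file name is hypothetical.

    import numpy as np
    import torch
    import smplx

    # The SMPL model files must be obtained from the authors' website (not distributed here).
    model = smplx.create("models", model_type="smpl", gender="neutral")

    # Neutral shape and pose; output.vertices has shape (1, 6890, 3).
    output = model(betas=torch.zeros(1, 10), body_pose=torch.zeros(1, 69))

    # Hypothetical per-vertex displacements encoding clothes and hair, shape (6890, 3).
    D = np.load("vertex_displacements.npy")

    # SMPL+D surface: the SMPL mesh deformed by shape/pose, plus the displacement field.
    vertices_smpl_d = output.vertices.detach().cpu().numpy()[0] + D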

 

SMPL+D bodies generated through non-rigid shape registration of manually created MakeHuman models

 

The dataset is available through the official GitHub repository of the OpenDR toolkit here. Detailed instructions on how to download the dataset are provided in the repository.

The dataset is licensed under the Creative Commons Attribution 4.0 International License. It should be noted that the SMPL-Model is needed for generating the SMPL+D bodies in the FBX format. The SMPL-Model is distributed only by its authors and is copyrighted under a separate license. Thus, we do not provide the SMPL-Model in our dataset in any form, and we encourage users, through our instructions, to download it from the webpage of the authors [here]. However, the SMPL-Body, which is directly related to our dataset, is licensed by its authors under the Creative Commons Attribution 4.0 International License [here].

Mixed (Real and Synthetic) Image Dataset: The dataset was generated through a mixed (real and synthetic) image data generation method that combines real background images with DL-generated human models. The method was developed by Aristotle University of Thessaloniki (AUTH) within the H2020 OpenDR project. The dataset is suitable for training and evaluating (a) pose estimation, (b) person detection, and (c) identity recognition methods. The 3D human models required by the method were generated using the Pixel-aligned Implicit Function (PIFu) and full-body images of people from the Clothing Co-Parsing (CCP) dataset; a subset of the Cityscapes dataset was used for the background images. The Cityscapes license prohibits the distribution of modified versions of the dataset, so code that can re-generate the exact same dataset is provided through the official GitHub repository of the OpenDR toolkit, given that Cityscapes is downloaded from the website of its authors. The set of 3D human models, however, is directly available through OpenDR's GitHub repository.
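Conceptually, each mixed image is produced by rendering a 3D human model and compositing it onto a real background image. The fragment below only illustrates that idea with Pillow's alpha compositing; it is not the actual generation code, and the file names and placement are placeholders.

    from PIL import Image

    background = Image.open("cityscapes_frame.png").convert("RGBA")   # real background image
    person = Image.open("pifu_render.png").convert("RGBA")            # rendered PIFu human with alpha channel

    # Paste the rendered person into the scene at a chosen pixel location.
    background.alpha_composite(person, dest=(400, 250))
    background.convert("RGB").save("mixed_sample.jpg")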

 

Image from the mixed image dataset. Ground truth keypoints for human pose estimation and bounding boxes for person detection are drawn.

 

 

The following annotations are provided for the mixed image dataset (a loading sketch is given after the list):

  • 2D Bounding Boxes of humans

                - A .csv file is provided for each image, specifying the 2D bounding box of each depicted human.

 

  • Human identity labels

                - The same .csv file also specifies the identity label of each depicted human.

 

  • 2D keypoints of human poses

                - The .csv file provided for each image also specifies the image coordinates of the keypoints (COCO format) of each depicted human.

                - The standard COCO JSON annotation format is also provided.
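A minimal loading sketch for these annotations is given below. The column and file names are assumptions for illustration only; the exact header layout is defined by the dataset files themselves.

    import csv
    import json

    # Per-image .csv annotation: bounding box, identity label and keypoints of each human.
    with open("image_0001.csv", newline="") as f:
        for row in csv.DictReader(f):
            identity = row["identity"]                            # identity label (assumed column name)
            bbox = [float(row[k]) for k in ("x", "y", "w", "h")]  # 2D bounding box (assumed column names)
            print(identity, bbox)

    # The standard COCO JSON annotations can be read directly as well.
    with open("annotations.json") as f:
        coco = json.load(f)
    print(len(coco["annotations"]), "annotated person instances")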

 

The following annotations are provided for the 3D human models used to create the image dataset (a loading sketch is given after the list):

  • 3D bounding boxes and 3D keypoints

                 - Each human model contains a pickle file (.pkl) that specifies the locations of the 8 vertices of its 3D bounding box.

                 - Each human model contains a pickle file (.pkl) that specifies the 3D locations of its keypoints (COCO format).
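The per-model .pkl files can be read with Python's pickle module; a short sketch follows, in which the file name and dictionary keys are assumptions, since the actual field names are defined by the provided files.

    import pickle

    with open("human_model_0001.pkl", "rb") as f:
        annotation = pickle.load(f)

    bbox_3d = annotation["bbox_3d"]            # 8 vertices of the 3D bounding box (assumed key)
    keypoints_3d = annotation["keypoints_3d"]  # 3D keypoint locations in COCO format (assumed key)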

 

3D human models in various views generated by PIFu. Full-body images from the CCP dataset were used as input.

 

Code and instructions for re-generating the dataset are provided here. Instructions for downloading the 3D human models are also provided. The code, the annotations and the 3D human models are licensed under the Apache 2.0 License. The final dataset is subject to the original license of the authors of the Cityscapes dataset [here].