Dataset Structure#

For ease of use, TensorBay defines a uniform dataset format. This topic explains the related concepts. The format is organized as follows:

dataset
├── notes
├── catalog
│   ├── subcatalog
│   ├── subcatalog
│   └── ...
├── segment
│   ├── data
│   ├── data
│   └── ...
├── segment
│   ├── data
│   ├── data
│   └── ...
└── ...

dataset#

Dataset is the topmost concept in the TensorBay dataset format. Each dataset contains a catalog and a number of segments.

The corresponding class of dataset is Dataset.

notes#

Notes contains basic information about a dataset, including:

the time continuity of the data inside the dataset
the fields of bin point cloud files inside the dataset

The corresponding class of notes is Notes.
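
As a rough illustration, the two pieces of information listed above could be modelled as a small record. The class name `NotesSketch`, its field names, and the point cloud field names are assumptions made for this sketch; it is a stand-in written with the standard library, not the TensorBay Notes class.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NotesSketch:
    """Illustrative stand-in for dataset notes (hypothetical, not the TensorBay class)."""

    # Whether the data inside the dataset is continuous in time.
    is_continuous: bool = False
    # Field names of the bin point cloud files, or None if the dataset has none.
    bin_point_cloud_fields: Optional[List[str]] = None


# Notes for a dataset of time-ordered point cloud frames (field names are made up).
notes = NotesSketch(
    is_continuous=True,
    bin_point_cloud_fields=["X", "Y", "Z", "Intensity"],
)
print(notes.is_continuous, notes.bin_point_cloud_fields)
```

A dataset of unrelated images would simply keep the defaults: not continuous, with no point cloud fields.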

catalog#

Catalog stores label meta information. It collects the meta information of all the labels in a dataset. A catalog can contain one or more subcatalogs (see Label Format). Each subcatalog stores the label meta information of only one label type, including whether the corresponding annotations carry tracking information.

Table 6 lists catalog examples for datasets with different label types, including a dataset with tracking annotations.

Table 6 Catalogs#
Catalogs	Description
elpv Catalog	Catalog of the elpv Dataset, which has Classification labels.
BSTLD Catalog	Catalog of the BSTLD Dataset, which has Box2D labels.
Neolix OD Catalog	Catalog of the Neolix OD Dataset, which has Box3D labels.
Leeds Sports Pose Catalog	Catalog of the Leeds Sports Pose Dataset, which has Keypoints2D labels.
NightOwls Catalog	Catalog of the NightOwls Dataset, which has tracking Box2D labels.

Note that a catalog is not needed if a dataset contains no label information.
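
To make the relationship between a catalog and its subcatalogs more concrete, the sketch below represents a catalog as a plain Python dictionary with one entry per label type. The key names (`CLASSIFICATION`, `BOX2D`, `isTracking`, `categories`), the category names, and the helper `tracked_label_types` are assumptions for illustration only; the catalog examples referenced in Table 6 show the actual format.

```python
from typing import Any, Dict, List

# Hypothetical catalog: each top-level key plays the role of one subcatalog,
# holding the meta information of a single label type.
catalog: Dict[str, Dict[str, Any]] = {
    "CLASSIFICATION": {
        "categories": [{"name": "functional"}, {"name": "defective"}],
    },
    "BOX2D": {
        # Marks that annotations of this label type carry tracking information.
        "isTracking": True,
        "categories": [{"name": "pedestrian"}, {"name": "cyclist"}],
    },
}


def tracked_label_types(catalog: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the label types whose subcatalog marks annotations as tracked."""
    return sorted(
        label_type
        for label_type, subcatalog in catalog.items()
        if subcatalog.get("isTracking", False)
    )


print(tracked_label_types(catalog))
```

An unlabeled dataset would correspond to having no catalog at all, which the helper treats the same as an empty dictionary.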

segment#

A dataset may consist of several parts. In the TensorBay format, each part of the dataset is stored in one segment. For example, all training samples of a dataset can be organized in a segment named “train”.

The corresponding class of segment is Segment.

data#

Data is the structural level directly below segment. Each data object contains one dataset sample and its related labels, along with any other information such as a timestamp.

The corresponding class of data is Data.
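
The dataset, segment, and data levels described in this topic can be sketched with standard-library dataclasses. All names below (`DatasetSketch`, `SegmentSketch`, `DataSketch`, `create_segment`, and the file paths) are hypothetical stand-ins chosen for this example. They only mirror the nesting shown in the tree at the top of this page and are not the TensorBay Dataset, Segment, and Data classes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DataSketch:
    """One dataset sample with its labels and optional extra information."""

    path: str
    labels: Dict[str, Any] = field(default_factory=dict)
    timestamp: Optional[float] = None


@dataclass
class SegmentSketch:
    """A named part of a dataset, holding an ordered list of data."""

    name: str
    data: List[DataSketch] = field(default_factory=list)


@dataclass
class DatasetSketch:
    """Topmost level: a dataset name plus its segments, keyed by segment name."""

    name: str
    segments: Dict[str, SegmentSketch] = field(default_factory=dict)

    def create_segment(self, name: str) -> SegmentSketch:
        """Create an empty segment, register it under its name, and return it."""
        segment = SegmentSketch(name)
        self.segments[name] = segment
        return segment


dataset = DatasetSketch("ExampleDataset")

# All training samples go into a segment named "train".
train = dataset.create_segment("train")
train.data.append(DataSketch("0001.png", labels={"CLASSIFICATION": "cat"}))
train.data.append(DataSketch("0002.png", labels={"CLASSIFICATION": "dog"}, timestamp=1.5))

# A second part of the dataset lives in its own segment.
test = dataset.create_segment("test")
test.data.append(DataSketch("0003.png"))

# Walk the hierarchy: dataset -> segment -> data.
for segment in dataset.segments.values():
    print(segment.name, [d.path for d in segment.data])
```

Keeping labels and the timestamp on the data object, rather than on the segment, reflects the description above: each sample carries its own labels and extra information.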