For ease of use, TensorBay defines a uniform dataset format. This topic explains the related concepts. The TensorBay dataset format looks like:
dataset
├── notes
├── catalog
│   ├── subcatalog
│   ├── subcatalog
│   └── ...
├── segment
│   ├── data
│   ├── data
│   └── ...
├── segment
│   ├── data
│   ├── data
│   └── ...
└── ...
Dataset is the topmost concept in the TensorBay dataset format. Each dataset includes a catalog and a certain number of segments.
The corresponding class of dataset is
Notes contains the basic information of a dataset, including the time continuity of the data inside the dataset and the fields of bin point cloud files inside the dataset.
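As a rough illustration of the two pieces of information that notes carry, the following sketch models them with a plain Python dataclass. `NotesSketch` and its attribute names are hypothetical stand-ins chosen for this example; they are not the TensorBay class, and the point cloud field names are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NotesSketch:
    """Hypothetical stand-in for dataset notes; not the TensorBay class."""

    # True when the data inside the dataset is continuous in time,
    # for example consecutive frames of one recording.
    is_continuous: bool = False
    # Names of the fields stored for each point in bin point cloud files;
    # None when the dataset contains no such files.
    bin_point_cloud_fields: Optional[List[str]] = None


# Illustrative values only.
notes = NotesSketch(
    is_continuous=True,
    bin_point_cloud_fields=["X", "Y", "Z", "Intensity"],
)
```

A dataset of unrelated still images would simply keep the defaults, `NotesSketch()`.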
The corresponding class of notes is
Catalog is used for storing label meta information. It collects all the labels corresponding to a dataset. There can be one or several subcatalogs (Label Format) under one catalog. Each subcatalog stores the label meta information of exactly one label type, including whether the corresponding annotation has tracking information.
Here are some catalog examples of datasets with different label types and a dataset with tracking annotations (Table 6).
Note that catalog is not needed if there is no label information in a dataset.
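The relationship between a catalog and its subcatalogs can be pictured as a mapping from label type to that type's meta information. The sketch below is only an assumption about shape: the keys `"BOX2D"`, `"CLASSIFICATION"`, `"categories"`, and `"is_tracking"`, as well as the helper `tracked_label_types`, are invented for illustration and do not reproduce the actual catalog file format.

```python
from typing import Any, Dict, List

# Hypothetical catalog: one subcatalog per label type.
catalog: Dict[str, Dict[str, Any]] = {
    "BOX2D": {
        "categories": ["car", "pedestrian"],
        "is_tracking": True,  # annotations of this type carry tracking information
    },
    "CLASSIFICATION": {
        "categories": ["day", "night"],
        "is_tracking": False,
    },
}


def tracked_label_types(catalog: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the label types whose subcatalog is marked as tracking."""
    return sorted(
        label_type
        for label_type, subcatalog in catalog.items()
        if subcatalog.get("is_tracking", False)
    )
```

A dataset without labels corresponds to an empty mapping here, for which the helper returns an empty list.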
There may be several parts in a dataset. In TensorBay format, each part of the dataset is stored in one segment. For example, all training samples of a dataset can be organized in a segment named “train”.
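To make the idea of named parts concrete, here is a minimal sketch that groups sample file names into segments using a plain dictionary. The function `split_into_segments`, the file names, and the second segment name "test" are assumptions made for this example; only the "train" segment name comes from the text above.

```python
from typing import Dict, List


def split_into_segments(
    sample_paths: List[str], train_count: int
) -> Dict[str, List[str]]:
    """Group samples into two named segments (hypothetical helper).

    The first `train_count` samples go to "train", the rest to "test".
    """
    return {
        "train": sample_paths[:train_count],
        "test": sample_paths[train_count:],
    }


# Ten made-up sample files, eight of which are used for training.
paths = [f"{index:04d}.jpg" for index in range(10)]
segments = split_into_segments(paths, train_count=8)
```

Each key of the resulting dictionary plays the role of one segment, and each list holds the data belonging to it.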
The corresponding class of segment is
Data is the structural level directly below segment. Each data item contains one dataset sample and its related labels, as well as any other information such as a timestamp.
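One way to picture a single data item is as a small record that ties a sample file to its labels and optional extra information. `DataSketch` below is a hypothetical dataclass written for this explanation, not the TensorBay class, and the label key and values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DataSketch:
    """Hypothetical stand-in for one data item; not the TensorBay class."""

    # Location of the sample file, for example an image.
    path: str
    # Labels of the sample, keyed by label type.
    labels: Dict[str, Any] = field(default_factory=dict)
    # Optional extra information attached to the sample.
    timestamp: Optional[float] = None


sample = DataSketch(
    path="0001.jpg",
    labels={"CLASSIFICATION": "day"},
    timestamp=1614667532.0,
)
```

An unlabeled sample needs only a path: `DataSketch("0002.jpg")` has empty labels and no timestamp.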
The corresponding class of data is