Glossary

accesskey

An accesskey is an access credential for identification when using TensorBay to operate on your dataset.

To obtain an accesskey, log in to Graviti AI Service (GAS) and visit the developer page to create one.

For the usage of an accesskey via the TensorBay SDK or CLI, see SDK authorization or CLI configuration.

basehead

The basehead is a string that records two versions (commits or drafts) in the format “base…head”.

The basehead parameter consists of two parts: base and head. Each must be a revision or a draft number in the dataset. The terms “head” and “base” are used as they normally are in Git.

The head is the version that contains the changes. The base is the version on which those changes are based.
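As a rough illustration of the two-part structure, the sketch below splits a basehead string into its base and head. It is not part of the TensorBay SDK: the function name split_basehead is hypothetical, and it assumes the separator is written as three ASCII periods, as in Git's range notation (a typographic ellipsis character is normalized to that form first).

```python
from typing import Tuple


def split_basehead(basehead: str) -> Tuple[str, str]:
    """Split a "base...head" string into its (base, head) parts.

    Hypothetical helper for illustration only; not a TensorBay function.
    """
    # Assumption: the separator is three ASCII periods. A single
    # ellipsis character (U+2026) is treated as the same separator.
    normalized = basehead.replace("\u2026", "...")
    base, separator, head = normalized.partition("...")
    if not separator or not base or not head:
        raise ValueError(f"expected 'base...head', got {basehead!r}")
    return base, head
```

For example, with the hypothetical branch names main and dev, split_basehead("main...dev") yields the base "main" and the head "dev".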

branch

Similar to Git, a branch is a lightweight pointer to one of the commits.

Every time a commit is submitted, the main branch pointer moves forward automatically to the latest commit.

commit

Similar to Git, a commit is a version of a dataset that contains the changes made since the previous commit.

Each commit has a unique commit ID, which is a UUID in a 36-byte hexadecimal string. A specific commit of a dataset can be accessed by passing the corresponding commit ID or another form of revision.

A commit is readable but not writable, so only read operations such as getting the catalog, files and labels are allowed. To change a dataset, create a new commit. See draft for details.

“Commit” also refers to the action of saving the changes inside a draft as a commit.

continuity

Continuity is a characteristic that describes the data within a dataset or a fusion dataset.

A dataset is continuous if the data in each segment of the dataset is collected over a continuous period of time and the collection order is indicated by the data paths or frame indexes.

The continuity can be set in notes.

Only continuous datasets can have tracking labels.

dataloader

A function that can organize files within a formatted folder into a Dataset instance or a FusionDataset instance.

The only input of the function should be a str indicating the path to the folder containing the dataset, and the return value should be the loaded Dataset or FusionDataset instance.

Here are some dataloader examples for datasets with different label types and continuity (Table 5).

Table 5 Dataloaders

Dataloaders                    Description
-----------------------------  ----------------------------------------------------------------
LISA Traffic Light Dataloader  Dataloader of the LISA Traffic Light Dataset, a continuous dataset with Box2D labels.
Dogs vs Cats Dataloader        Dataloader of the Dogs vs Cats Dataset, a dataset with Classification labels.
BSTLD Dataloader               Dataloader of the BSTLD Dataset, a dataset with Box2D labels.
Neolix OD Dataloader           Dataloader of the Neolix OD Dataset, a dataset with Box3D labels.
Leeds Sports Pose Dataloader   Dataloader of the Leeds Sports Pose Dataset, a dataset with Keypoints2D labels.

Note

The name of the dataloader function uniquely identifies the dataset. It is in upper camel case and is generally obtained by removing special characters from the dataset name.

Taking the Dogs vs Cats dataset as an example, the name of its dataloader function is DogsVsCats().
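The naming rule can be sketched as a small helper. This is only an illustration of the convention described in the note: the function name dataloader_name is hypothetical, and the actual names in tensorbay.opendataset are chosen by their authors and may not follow this rule in every case.

```python
import re


def dataloader_name(dataset_name: str) -> str:
    """Derive an upper-camel-case name from a dataset name.

    Hypothetical helper for illustration only; not a TensorBay function.
    """
    # Split on every run of characters that is not a letter or a digit,
    # which drops spaces and special characters.
    words = [word for word in re.split(r"[^0-9A-Za-z]+", dataset_name) if word]
    # Upper-case only the first letter of each word so that acronyms
    # such as "BSTLD" keep their original casing.
    return "".join(word[0].upper() + word[1:] for word in words)
```

Calling dataloader_name("Dogs vs Cats") returns "DogsVsCats", which matches the example in the note.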

See more dataloader examples in tensorbay.opendataset.

dataset

A uniform dataset format defined by TensorBay, which contains only one type of data, collected from one sensor or without sensor information. According to the time continuity of the data inside the dataset, a dataset can be a discontinuous dataset or a continuous dataset. Notes can be used to specify whether a dataset is continuous.

The corresponding class of dataset is Dataset.

See Dataset Structure for more details.

diff

TensorBay can show the status differences of the relevant resources between commits or drafts in the form of a diff.

draft

Similar to Git, a draft is a workspace in which changing the dataset is allowed.

A draft is created based on a branch, and the changes inside it will be made into a commit.

There are scenarios in which a dataset needs to be modified, such as correcting errors, enlarging the dataset or adding more types of labels. In these cases, create a draft, edit the dataset and commit the draft.

fusion dataset

A uniform dataset format defined by TensorBay, which contains data collected from multiple sensors.

According to the time continuity of data inside the dataset, a fusion dataset can be a discontinuous fusion dataset or a continuous fusion dataset. Notes can be used to specify whether a fusion dataset is continuous.

The corresponding class of fusion dataset is FusionDataset.

See Fusion Dataset Structure for more details.

revision

Similar to Git, a revision is a reference to a single commit. Many methods in the TensorBay SDK take a revision as an argument.

Currently, a revision can be in the following forms:

  1. A full commit ID.

  2. A tag.

  3. A branch.
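The idea that all three forms refer to a single commit can be sketched with a toy lookup. This is a conceptual model only, not how TensorBay resolves revisions: the function resolve_revision, the lookup order and the sample IDs, tags and branches are all assumptions made for illustration.

```python
from typing import Dict, Set


def resolve_revision(
    revision: str,
    commit_ids: Set[str],
    tags: Dict[str, str],
    branches: Dict[str, str],
) -> str:
    """Return the commit ID that a revision refers to (toy model).

    Hypothetical helper for illustration only; not a TensorBay function.
    """
    # Form 1: the revision is already a full commit ID.
    if revision in commit_ids:
        return revision
    # Form 2: the revision is a tag, which names one specific commit.
    if revision in tags:
        return tags[revision]
    # Form 3: the revision is a branch, which points to its latest commit.
    if revision in branches:
        return branches[revision]
    raise KeyError(f"unknown revision: {revision!r}")


# Made-up sample data: two commits, one tag and one branch.
COMMIT_IDS = {"commit-a", "commit-b"}
TAGS = {"v1.0": "commit-a"}
BRANCHES = {"main": "commit-b"}
```

With this sample data, the tag v1.0 resolves to commit-a and the branch main resolves to commit-b, while a full commit ID resolves to itself.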

tag

The TensorBay SDK can tag a specific commit in a dataset’s history as being important. Typically, this functionality is used to mark release points (v1.0, v2.0 and so on).

TBRN

TBRN is the abbreviation for TensorBay Resource Name, which uniquely identifies a piece of data or a collection of data stored in TensorBay.

Note that TBRN is only used in the CLI.

TBRN begins with tb:, followed by the dataset name, the segment name and the file name.

The following is the general format for TBRN:

tb:[dataset_name]:[segment_name]://[remote_path]

Suppose there is an image 000000.jpg in the train segment of a dataset named example. The TBRN of this image is:

tb:example:train://000000.jpg
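To make the three parts of the general format concrete, the sketch below splits a full TBRN into its dataset name, segment name and remote path. It is not the parser used by the TensorBay CLI: the names ParsedTBRN and parse_tbrn are hypothetical, and the sketch only handles the full form shown above, with all three parts present.

```python
from typing import NamedTuple


class ParsedTBRN(NamedTuple):
    """The three parts of a full TBRN (hypothetical container)."""

    dataset_name: str
    segment_name: str
    remote_path: str


def parse_tbrn(tbrn: str) -> ParsedTBRN:
    """Parse "tb:[dataset_name]:[segment_name]://[remote_path]".

    Hypothetical helper for illustration only; not a TensorBay function.
    """
    prefix = "tb:"
    if not tbrn.startswith(prefix):
        raise ValueError(f"a TBRN must begin with {prefix!r}: {tbrn!r}")
    # Everything before "://" names the dataset and the segment;
    # everything after it is the remote path of the file.
    names, separator, remote_path = tbrn[len(prefix):].partition("://")
    if not separator:
        raise ValueError(f"missing '://' in TBRN: {tbrn!r}")
    dataset_name, colon, segment_name = names.partition(":")
    if not colon or not dataset_name:
        raise ValueError(f"missing dataset or segment name in TBRN: {tbrn!r}")
    return ParsedTBRN(dataset_name, segment_name, remote_path)
```

Applied to the example above, parse_tbrn("tb:example:train://000000.jpg") gives the dataset name "example", the segment name "train" and the remote path "000000.jpg".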

tracking

Tracking is a characteristic that describes the labels within a dataset or a fusion dataset.

The labels of a dataset are tracking labels if they contain tracking information, such as a tracking ID, which is used for tracking tasks.

The tracking characteristic is stored in the catalog. See Label Format for more details.