Update Dataset
This topic describes how to update a dataset, including its meta information, notes, labels, and data.
The following scenario demonstrates how to update data and labels:
Upload a dataset.
Update the dataset’s labels.
Add some data to the dataset.
Update Dataset Meta
TensorBay SDK provides a method to update the meta information of a dataset.
gas.update_dataset("DATASET_NAME", alias="alias", is_public=True)
Update Dataset Notes
TensorBay SDK provides a method to update dataset notes. A dataset can be changed into a continuous dataset by setting is_continuous to True.
dataset_client = gas.get_dataset("DATASET_NAME")
dataset_client.create_draft("draft-1")
dataset_client.update_notes(is_continuous=True)
dataset_client.commit("update notes")
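The draft, change, commit sequence above recurs in every update on this page. The following is a minimal sketch of a wrapper for it, assuming only the create_draft and commit methods shown above. The commit_change helper and the FakeDatasetClient stand-in are hypothetical names introduced for illustration and are not part of TensorBay SDK.

```python
from typing import Callable, List


def commit_change(dataset_client, draft_title: str, change: Callable, message: str) -> None:
    """Open a draft, apply one change, then commit it.

    Hypothetical helper: it relies only on create_draft() and commit(),
    the two dataset client methods used in the example above.
    """
    dataset_client.create_draft(draft_title)
    change(dataset_client)
    dataset_client.commit(message)


class FakeDatasetClient:
    """Stand-in that records calls so the sketch runs without a TensorBay account."""

    def __init__(self) -> None:
        self.calls: List[tuple] = []

    def create_draft(self, title: str) -> None:
        self.calls.append(("create_draft", title))

    def update_notes(self, **notes) -> None:
        self.calls.append(("update_notes", notes))

    def commit(self, message: str) -> None:
        self.calls.append(("commit", message))


client = FakeDatasetClient()
commit_change(client, "draft-1", lambda c: c.update_notes(is_continuous=True), "update notes")
```

With a real dataset client, the object returned by gas.get_dataset("DATASET_NAME") would take the place of the stand-in. If the change raises an exception, this sketch leaves the draft open and uncommitted.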
Update Label
TensorBay SDK provides methods to update labels; the new labels overwrite the previous ones.
Get a previously uploaded dataset and create a draft:
dataset_client = gas.get_dataset("DATASET_NAME")
dataset_client.create_draft("draft-2")
Update the catalog if needed:
dataset_client.upload_catalog(dataset.catalog)
Overwrite the previous labels with the new labels in the dataset:
for segment in dataset:
    segment_client = dataset_client.get_segment(segment.name)
    for data in segment:
        segment_client.upload_label(data)
Commit the dataset:
dataset_client.commit("update labels")
Important
Uploading labels overwrites all types of labels in the data.
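The overwrite rule can be pictured with a small plain-Python model. This is a sketch under one assumption: that overwriting all types of labels means the stored labels of a data item are replaced as a whole rather than merged per label type. The model_upload_label function and the label values are hypothetical and do not call TensorBay SDK.

```python
def model_upload_label(remote_labels: dict, data_name: str, new_labels: dict) -> None:
    """Model of the overwrite rule: the new labels replace the stored ones wholesale."""
    remote_labels[data_name] = dict(new_labels)


remote_labels = {"a.png": {"classification": "cat", "box2d": [(0, 0, 10, 10)]}}

# The new label carries only a classification, so the old box2d entry is not kept.
model_upload_label(remote_labels, "a.png", {"classification": "dog"})
```

Under that assumption, every label type that should survive an update needs to be present on the data before the labels are uploaded again.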
Update Data
Add new data to the dataset:
gas.upload_dataset(dataset, jobs=8, skip_uploaded_files=True)
Set skip_uploaded_files=True to skip data that has already been uploaded.
Overwrite uploaded data in the dataset:
gas.upload_dataset(dataset, jobs=8)
The default value of skip_uploaded_files is False; leave it at the default to overwrite uploaded data.
Note
Data is identified by its segment name and data name: two items with the same segment name and data name are treated as the same data.
Important
Uploading a dataset only adds or overwrites data. Data uploaded before will not be deleted.
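The rules above (identity by segment name and data name, skipping versus overwriting, and no deletion) can be summarized in a small plain-Python model. This is an illustrative sketch, not SDK code: model_upload and the sample keys and contents are hypothetical, and only the skip_uploaded_files parameter name comes from the text above.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (segment name, data name) identifies one piece of data


def model_upload(remote: Dict[Key, str], local: Dict[Key, str], skip_uploaded_files: bool = False) -> None:
    """Model of the upload rules described above; it never deletes remote data."""
    for key, content in local.items():
        if skip_uploaded_files and key in remote:
            continue  # already uploaded: leave the remote copy untouched
        remote[key] = content  # add new data or overwrite existing data


remote = {("train", "a.png"): "v1", ("train", "b.png"): "v1"}
local = {("train", "a.png"): "v2", ("train", "c.png"): "v2"}

# Upload with skip_uploaded_files=True: a.png keeps its remote content.
skipped = dict(remote)
model_upload(skipped, local, skip_uploaded_files=True)

# Upload with the default: a.png is overwritten.
overwritten = dict(remote)
model_upload(overwritten, local)
```

In both cases b.png stays in the remote copy even though the local dataset no longer contains it, which is why removal needs the explicit delete calls.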
Delete a segment by its name:
dataset_client.create_draft("draft-3")
dataset_client.delete_segment("SegmentName")
Delete data by its remote path:
segment_client = dataset_client.get_segment("SegmentName")
segment_client.delete_data("a.png")
For a fusion dataset, TensorBay SDK supports deleting a frame by its id.
segment_client.delete_frame("00000000003W09TEMC1HXYMC74")