This topic describes how to use the cache while opening remote data on TensorBay.
When using online data, it is sometimes necessary to iterate over the entire dataset multiple times, such as when training a model.
This causes redundant requests and responses between the local machine and TensorBay, and costs extra time.
Therefore, the TensorBay SDK provides caching to speed up data access and reduce repeated requests.
Get Remote Dataset#
To use the cache, first get the remote dataset on TensorBay.
from tensorbay import GAS
from tensorbay.dataset import Dataset

# Please visit `https://gas.graviti.com/tensorbay/developer` to get the AccessKey.
gas = GAS("<YOUR_ACCESSKEY>")
dataset = Dataset("<DATASET_NAME>", gas)
Call enable_cache() to start using the cache for this dataset.
The cache path is set in the temporary directory by default, which differs according to the system.
It’s also feasible to pass a custom cache path to the function as below.
Please make sure there is enough free storage space to cache the dataset.
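Before enabling the cache, it can help to confirm how much free space is available under the intended cache directory. The sketch below uses the standard library to check free space in the system temporary directory (the default cache location); the `dataset.enable_cache()` calls in the trailing comments assume the `dataset` object created above.

```python
import shutil
import tempfile

# The default cache location is the system temporary directory.
cache_root = tempfile.gettempdir()

# shutil.disk_usage reports total/used/free bytes for the filesystem
# containing the given path.
free_gib = shutil.disk_usage(cache_root).free / 1024**3
print(f"free space under {cache_root}: {free_gib:.1f} GiB")

# With enough space available, enable the cache (sketch, assuming the
# `dataset` object from the snippet above):
# dataset.enable_cache()                # default: system temporary directory
# dataset.enable_cache("<CACHE_PATH>")  # or a custom cache path
```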
Use cache_enabled to check whether the cache is in use.
print(dataset.cache_enabled) # True
Cache is not available for datasets in draft status. dataset.cache_enabled will remain False for datasets in draft status, even if the cache has already been set by enable_cache().
After enabling the cache, use the data as desired. Note that the cache takes effect when the data.open() method is called, and only data and mask labels will be cached.
segment = dataset[0]

MAX_EPOCH = 100
for epoch in range(MAX_EPOCH):
    for data in segment:
        data.open()
        # code using the opened data here
Delete Cache Data#
After use, the cached data under the cache path can be deleted as needed.
Note that if the default cache path is used, the cache will be removed automatically when the computer restarts.
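If a custom cache path was used, the cached files can be removed with the standard library. The sketch below creates a stand-in cache directory and deletes it; the directory name is hypothetical and should be replaced with the path passed to enable_cache().

```python
import os
import shutil
import tempfile

# A stand-in custom cache directory (assumption: replace with the
# path you passed to enable_cache()).
cache_path = os.path.join(tempfile.gettempdir(), "tensorbay_cache_demo")
os.makedirs(cache_path, exist_ok=True)

# Remove the whole cache directory and its contents.
shutil.rmtree(cache_path)
print(os.path.exists(cache_path))  # False
```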