Datasets
- class indico.queries.datasets.CreateDataset(name, files, wait=True, dataset_type='TEXT', from_local_images=False, image_filename_col='filename', batch_size=20, ocr_engine=None, omnipage_ocr_options=None, read_api_ocr_options=None)
Create a dataset and upload the associated files.
- Parameters
name (str) – Name of the dataset
files (List[str]) – List of pathnames to the dataset files
- Options:
dataset_type (str): Type of dataset to create [TEXT, DOCUMENT, IMAGE]
wait (bool, default=True): Wait for the dataset to upload and finish
- Returns
Dataset object
- Raises
IndicoError
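The following is a minimal sketch of how CreateDataset might be used to upload a folder of documents. It assumes that `client` is an `IndicoClient` instance whose `call` method executes query objects; that is not stated in this section. `collect_files` and `create_document_dataset` are hypothetical helpers introduced only for illustration.

```python
from pathlib import Path


def collect_files(folder, suffixes=(".pdf",)):
    """Return sorted path strings for files in `folder` with a matching suffix."""
    root = Path(folder)
    return sorted(
        str(p)
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )


def create_document_dataset(client, name, folder):
    """Hypothetical helper: upload every PDF in `folder` as a DOCUMENT dataset."""
    # Imported lazily so collect_files stays usable without the client library.
    from indico.queries.datasets import CreateDataset

    files = collect_files(folder)
    if not files:
        raise ValueError(f"no matching files found in {folder!r}")
    # Assumption: `client` is an IndicoClient and `call` runs the query.
    return client.call(
        CreateDataset(name=name, files=files, dataset_type="DOCUMENT", wait=True)
    )
```

With `wait=True` the call should block until the upload finishes, so the returned Dataset object can be used right away.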
- class indico.queries.datasets.GetDataset(id)
Retrieve a dataset description object
- Parameters
id (int) – id of the dataset to query
- Returns
Dataset object
- class indico.queries.datasets.GetDatasetStatus(id)
Get the status of a dataset
- Parameters
id (int) – id of the dataset to query
- Returns
COMPLETE or FAILED
- Return type
status (str)
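When a dataset is created with `wait=False`, the status has to be polled separately. The sketch below shows one way to do that. It treats COMPLETE and FAILED (the two values documented above) as terminal and any other value as "still in progress", which is an assumption. `wait_for_dataset` is a hypothetical helper, and the commented usage assumes an `IndicoClient` instance named `client`.

```python
import time


def wait_for_dataset(fetch_status, timeout=600.0, interval=5.0, sleep=time.sleep):
    """Poll `fetch_status()` until it returns COMPLETE or FAILED.

    `fetch_status` is any zero-argument callable returning the status string.
    Raises TimeoutError once roughly `timeout` seconds of sleeping have passed.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in ("COMPLETE", "FAILED"):
            return status
        if waited >= timeout:
            raise TimeoutError(f"dataset still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Possible usage (assumes `client` and `dataset_id` already exist):
# from indico.queries.datasets import GetDatasetStatus
# status = wait_for_dataset(lambda: client.call(GetDatasetStatus(id=dataset_id)))
```

Passing the fetch step in as a callable keeps the polling loop independent of the client, which also makes it easy to test.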
- class indico.queries.datasets.GetDatasetFileStatus(id)
Get the upload status of a dataset's files
- Parameters
id (int) – id of the dataset to query
- Returns
DOWNLOADED or FAILED
- Return type
status (str)
- class indico.queries.datasets.ListDatasets(*, limit=100)
List all of your datasets
- Options:
limit (int, default=100): Max number of datasets to retrieve
- Returns
List[Dataset]
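A common use of ListDatasets is to look up a dataset ID from its name. The sketch below assumes that each returned Dataset object exposes a `name` attribute and that `client` is an `IndicoClient` instance; neither is documented in this section. `find_datasets_by_name` and `lookup_dataset` are hypothetical helpers.

```python
def find_datasets_by_name(datasets, name):
    """Return the datasets whose `name` attribute equals `name` exactly."""
    # getattr with a default tolerates objects that lack a `name` attribute.
    return [d for d in datasets if getattr(d, "name", None) == name]


def lookup_dataset(client, name, limit=100):
    """Hypothetical helper: list up to `limit` datasets and filter by name."""
    # Imported lazily so the filter above works without the client library.
    from indico.queries.datasets import ListDatasets

    datasets = client.call(ListDatasets(limit=limit))
    return find_datasets_by_name(datasets, name)
```

Because `limit` caps the number of datasets retrieved, a match beyond the first `limit` entries would not be found; raising `limit` is the simplest workaround.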
- class indico.queries.datasets.DeleteDataset(id)
Delete a dataset
- Parameters
id (int) – ID of the dataset
- Returns
The success of the operation
- Return type
success (bool)
- class indico.queries.datasets.AddDatasetFiles(dataset_id, files, autoprocess=False, wait=True, batch_size=20)
Add files to a dataset.
- Parameters
dataset_id (int) – ID of the dataset
files (List[str]) – List of pathnames to the dataset files
- Options:
autoprocess (bool, default=False): Automatically process new dataset files
wait (bool, default=True): Block while polling for status of files
batch_size (int, default=20): Batch size for uploading files
- Returns
Dataset
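Since `files` is a list of local pathnames, it can help to check that every path exists before starting an upload. The sketch below is one way to do that, assuming `client` is an `IndicoClient` instance. `split_existing` and `add_files_checked` are hypothetical helpers, not part of the library.

```python
import os


def split_existing(paths):
    """Partition `paths` into (existing files, missing paths), preserving order."""
    existing, missing = [], []
    for p in paths:
        (existing if os.path.isfile(p) else missing).append(p)
    return existing, missing


def add_files_checked(client, dataset_id, paths, batch_size=20):
    """Hypothetical helper: validate paths, then add them to the dataset."""
    # Imported lazily so split_existing works without the client library.
    from indico.queries.datasets import AddDatasetFiles

    existing, missing = split_existing(paths)
    if missing:
        raise FileNotFoundError(f"missing files: {missing}")
    # autoprocess=True asks for the new files to be processed after upload.
    return client.call(
        AddDatasetFiles(
            dataset_id=dataset_id,
            files=existing,
            autoprocess=True,
            wait=True,
            batch_size=batch_size,
        )
    )
```

Failing before the upload starts avoids a partially uploaded batch when one path is mistyped.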
- class indico.queries.datasets.RemoveDatasetFile(dataset_id, file_id)
Remove a file from a dataset by ID. To retrieve a list of files in a dataset, see GetDatasetFileStatus.
- Parameters
dataset_id (int) – Dataset ID
file_id (int) – Datafile ID (returned by GetDatasetFileStatus)
- Returns
Dataset object
- Raises
IndicoError