Dataset Catalog

Browse all available Paper Analytical Device (PAD) datasets. Each dataset is available in Croissant-compliant format with metadata and split definitions.

Dataset Name	Description	Records	Files	Version	Published
FHI2020_Stratified_Sampling	Enhanced approach to selecting training/test sets for the FHI2020 dataset	8001	2	1.0	2025-03-24
FHI2021	Dataset: FHI2021 from PaperAnalyticalDeviceND/pad_dataset_registry	0	0	b26340e	2024-11-15
FHI2022	Dataset: FHI2022 from PaperAnalyticalDeviceND/pad_dataset_registry	0	0	70d4ad1	2024-11-15
FHI360_FHI2020-FHI2022_MidTrainingSet_Good_v1.0	Dataset: FHI360_FHI2020-FHI2022_MidTrainingSet_Good_v1.0 from PaperAnalyticalDeviceND/pad_dataset_registry	5924	3	fc7ff27	2025-03-24
FHI360_FHI2020_MidTrainingSet-Zero_Good_v1.1	Dataset: FHI360_FHI2020_MidTrainingSet-Zero_Good_v1.1 from PaperAnalyticalDeviceND/pad_dataset_registry	9027	3	fc7ff27	2025-03-24
FHI360_FHI2020_MidTrainingSet_Good_v1.0	Dataset: FHI360_FHI2020_MidTrainingSet_Good_v1.0 from PaperAnalyticalDeviceND/pad_dataset_registry	8792	2	fc7ff27	2025-03-24
FHI360_FHI360-FHI2020_BalancedData_v1.0	Dataset: FHI360_FHI360-FHI2020_BalancedData_v1.0 from PaperAnalyticalDeviceND/pad_dataset_registry	10483	4	ec795cc	2025-03-24
Leiberman-Lab_ChemoPADNNtraining2024_Partial-Drug-Set_v1.0	The ChemoPADNNtraining2024 Dataset is a curated collection of Paper Analytical Device (PAD) images used for chemotherapy drug identification and analysis.	3609	2	v1.0	2025-03-31
TFDA_MSH-Tanzania_v2.0	New version of the dataset based on the MSH Tanzania project.	2949	2	2.0	2025-03-24
Veripad_ChemoPAD-idPAD2.4_v1.0	Dataset: Veripad_ChemoPAD-idPAD2.4_v1.0 from PaperAnalyticalDeviceND/pad_dataset_registry	1424	3	ec795cc	2025-03-24

How to Access Datasets

All datasets are available through our GitHub repository and through our Croissant-compliant API.

You can access the dataset files directly in our GitHub repository, PaperAnalyticalDeviceND/pad_dataset_registry.

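Croissant metadata for an individual dataset is served at the following URL, where {dataset-name} is replaced with a name from the Dataset Name column of the catalog above: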

https://paperanalyticaldevicend.github.io/pad_dataset_registry/api/datasets/{dataset-name}.json

A catalog of all available datasets can be accessed at:

https://paperanalyticaldevicend.github.io/pad_dataset_registry/api/catalog.json
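
The two endpoints above can be queried with nothing more than the Python standard library. The following is a minimal sketch, not an official client: the helper names (`catalog_url`, `dataset_url`, `fetch_json`) are hypothetical, and the only assumption about the responses is that they are JSON documents. Their internal structure is not described here, so the sketch does not rely on any particular field.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path shared by both endpoints listed above.
API_BASE = "https://paperanalyticaldevicend.github.io/pad_dataset_registry/api"


def catalog_url() -> str:
    """Return the URL of the catalog of all available datasets."""
    return f"{API_BASE}/catalog.json"


def dataset_url(dataset_name: str) -> str:
    """Return the metadata URL for one dataset.

    Assumes {dataset-name} is the value from the Dataset Name column.
    The name is percent-encoded so unusual characters cannot break the path.
    """
    return f"{API_BASE}/datasets/{quote(dataset_name, safe='')}.json"


def fetch_json(url: str, timeout: float = 30.0):
    """Download a URL and parse the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_json(dataset_url("TFDA_MSH-Tanzania_v2.0"))` would download the metadata for that dataset, and `fetch_json(catalog_url())` would download the full catalog. Both calls need network access, and it is worth inspecting the returned object before depending on specific keys.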

The datasets in this registry are designed to be used with machine learning frameworks such as TensorFlow, PyTorch, and scikit-learn.

Each dataset includes:

To use these datasets, you can either: