Open Research Datasets
I am committed to open science and have released three public histopathology datasets for the research community:
1. CrowdGleason Dataset
Type: Public prostate cancer dataset with crowdsourced annotations
Repository: Zenodo
Related Publication: The CrowdGleason dataset: Learning the Gleason grade from crowds and experts
Description: Prostate cancer histological images labeled by multiple non-expert crowd annotators, with ground-truth labels from expert pathologists, intended for learning from crowdsourced annotations.
2. Fusocelular Dataset
Type: Public skin cancer dataset with multiple annotators
Repository: Figshare
Related Publication: A fusocelular skin dataset with whole slide images for deep learning models
Description: Skin cancer dataset with fusocelular cell types annotated by multiple resident physicians, providing diverse perspectives on histological classification.
3. CR-AI4SkIN Dataset
Type: Public crowdsourced skin cancer dataset
Repository: Zenodo
Related Publication: Annotation protocol and crowdsourcing multiple instance learning classification of skin histological images: The CR-AI4SkIN dataset
Description: Skin cancer histological images labeled by multiple non-expert crowd annotators, designed for multiple instance learning approaches.
Dataset Characteristics
All released datasets feature:
- Labels from multiple annotators per image, which makes it possible to assess annotation quality
- High-resolution histological images suitable for deep learning
- Comprehensive metadata including hospital source, tissue types, and pathology information
- Open licenses that support reproducible research
- Detailed documentation and usage guidelines
These datasets have supported the development of machine learning methods that learn from crowdsourced annotations and handle differing opinions among multiple experts.
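As a rough illustration of how multi-annotator labels like those described above might be used, the sketch below aggregates per-image votes by majority and measures how often each annotator matches the consensus. It is a minimal baseline and not the method used in the related publications. The data layout (image id mapped to annotator id mapped to label), the label names, and the function names are all hypothetical and do not reflect the actual file formats of the released datasets.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the smallest label."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def aggregate(annotations):
    """Map each image id to the majority label across its annotators."""
    return {
        image_id: majority_vote(list(votes.values()))
        for image_id, votes in annotations.items()
    }


def annotator_agreement(annotations, reference):
    """Fraction of each annotator's labels that match the reference labels."""
    hits, totals = Counter(), Counter()
    for image_id, votes in annotations.items():
        for annotator, label in votes.items():
            totals[annotator] += 1
            hits[annotator] += int(label == reference[image_id])
    return {a: hits[a] / totals[a] for a in totals}


# Hypothetical toy data: image id -> {annotator id -> label}.
# Not every annotator labels every image.
annotations = {
    "img_001": {"a1": "G3", "a2": "G3", "a3": "G4"},
    "img_002": {"a1": "G4", "a2": "G5", "a3": "G4"},
    "img_003": {"a1": "G5", "a2": "G5"},
}

consensus = aggregate(annotations)
agreement = annotator_agreement(annotations, consensus)
```

Where expert ground truth is available, it could be passed as `reference` in place of the consensus, so that each annotator is scored against the experts instead of against the crowd.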
