Welcome to Semantic Feature Net - Healthcare, a collection of tabular datasets for machine learning predictive tasks in healthcare. The features of every dataset are annotated with concepts from the SNOMED-CT ontology that capture their semantic meaning. Each dataset comes with a set of feature annotations, which can be used to share insights between diverse predictive tasks.

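| Dataset | Category | Instances | Features | Annotations | Annotations % |
|---|---|---|---|---|---|
| Cardiovascular Study | Survey | 4238 | 16 | 15 | 94% |
| Diagnosis of COVID-19 (Subset) | EHR | 603 | 19 | 18 | 95% |
| Diabetes Health Indicators | Survey | 253680 | 22 | 21 | 95% |
| Diabetes 130 US | EHR | 101766 | 49 | 38 | 78% |
| GOSSIS-1-eICU Model Ready | EHR | 131051 | 68 | 60 | 88% |
| Stroke Prediction | Survey | 5110 | 11 | 11 | 100% |
| Heart Disease Indicators | Survey | 253680 | 22 | 21 | 95% |
| Heart Disease (Comprehensive) | EHR | 1190 | 12 | 11 | 92% |
| HCV data | EHR | 615 | 13 | 13 | 100% |
| Hepatitis | EHR | 142 | 20 | 19 | 95% |
| HiRID Preprocessed | EHR | 33905 | 18 | 17 | 94% |
| Pima Indians Diabetes | EHR | 768 | 9 | 8 | 89% |
| ILPD | EHR | 583 | 11 | 11 | 100% |
| Breast Cancer | EHR | 286 | 10 | 9 | 90% |
| metaMIMIC | EHR | 34925 | 184 | 175 | 95% |
| Thyroid Disease | EHR | 3772 | 30 | 27 | 90% |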
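
The annotations % column appears to be the number of annotated features divided by the total number of features, rounded to the nearest whole percent. The sketch below checks that reading against a few rows copied from the table; the function name `annotation_coverage` and the rounding rule are assumptions made for illustration, not part of the dataset's tooling.

```python
def annotation_coverage(features: int, annotations: int) -> int:
    """Return the share of annotated features as a whole-number percentage."""
    if features <= 0:
        raise ValueError("features must be positive")
    # Round half up explicitly (assumed rounding rule).
    return int(100 * annotations / features + 0.5)


# (features, annotations) pairs copied from the table above.
rows = {
    "Cardiovascular Study": (16, 15),
    "Diabetes 130 US": (49, 38),
    "Stroke Prediction": (11, 11),
    "metaMIMIC": (184, 175),
}

coverage = {name: annotation_coverage(f, a) for name, (f, a) in rows.items()}

for name, pct in coverage.items():
    print(f"{name}: {pct}%")
```

For these four rows the computed values match the percentages listed in the table.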

Annotated Features

The graphs below visualize the proximity of the annotated features. The heatmap shows the similarity between each pair of features. Features are ordered by cluster membership, and the clusters themselves are illustrated in the scatter plot.
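
Ordering a heatmap by cluster can be illustrated with a small sketch. The similarity values, the cluster labels, and the helper `order_by_cluster` are all hypothetical; this page does not state how the similarities or clusters were actually computed, so the code only shows the reordering step.

```python
# Toy symmetric similarity matrix for four hypothetical features.
similarity = [
    [1.0, 0.2, 0.9, 0.1],
    [0.2, 1.0, 0.3, 0.8],
    [0.9, 0.3, 1.0, 0.2],
    [0.1, 0.8, 0.2, 1.0],
]

# Hypothetical cluster label for each feature (same order as the matrix rows).
labels = [0, 1, 0, 1]


def order_by_cluster(matrix, labels):
    """Permute rows and columns so features of the same cluster are adjacent."""
    # sorted() is stable, so the original order is kept inside each cluster.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    reordered = [[matrix[i][j] for j in order] for i in order]
    return order, reordered


order, reordered = order_by_cluster(similarity, labels)

print(order)
for row in reordered:
    print(row)
```

After reordering, high similarities sit in blocks along the diagonal, which is what makes cluster structure visible in a heatmap.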

Annotated terms visualization
The graph above shows the relationships among the annotated terms according to the SNOMED-CT taxonomy.

Citation