Welcome to Semantic Feature Net - Healthcare, a comprehensive collection of tabular datasets for machine learning prediction tasks in healthcare. The features of every dataset are annotated with concepts from the SNOMED-CT ontology that capture their semantic meaning. These feature annotations can be used to share insights across different predictive tasks.
dataset | category | instances | features | annotations | annotations % |
---|---|---|---|---|---|
Cardiovascular Study | Survey | 4238 | 16 | 15 | 94% |
Diagnosis of COVID-19 (Subset) | EHR | 603 | 19 | 18 | 95% |
Diabetes Health Indicators | Survey | 253680 | 22 | 21 | 95% |
Diabetes 130 US | EHR | 101766 | 49 | 38 | 78% |
GOSSIS-1-eICU Model Ready | EHR | 131051 | 68 | 60 | 88% |
Stroke Prediction | Survey | 5110 | 11 | 11 | 100% |
Heart Disease Indicators | Survey | 253680 | 22 | 21 | 95% |
Heart Disease (Comprehensive) | EHR | 1190 | 12 | 11 | 92% |
HCV data | EHR | 615 | 13 | 13 | 100% |
Hepatitis | EHR | 142 | 20 | 19 | 95% |
HiRID Preprocessed | EHR | 33905 | 18 | 17 | 94% |
Pima Indians Diabetes | EHR | 768 | 9 | 8 | 89% |
ILPD | EHR | 583 | 11 | 11 | 100% |
Breast Cancer | EHR | 286 | 10 | 9 | 90% |
metaMIMIC | EHR | 34925 | 184 | 175 | 95% |
Thyroid Disease | EHR | 3772 | 30 | 27 | 90% |
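
The annotations % column appears to be the number of annotated features divided by the total number of features, rounded to the nearest whole percent. The following minimal sketch reproduces that calculation for a few rows copied from the table; the function name `annotation_coverage` and the dictionary layout are illustrative assumptions, not part of the collection.

```python
# (features, annotations) pairs copied from a few rows of the table above.
datasets = {
    "Cardiovascular Study": (16, 15),
    "Diabetes 130 US": (49, 38),
    "Stroke Prediction": (11, 11),
    "metaMIMIC": (184, 175),
}


def annotation_coverage(features: int, annotations: int) -> int:
    """Share of annotated features as a whole-number percentage (assumed formula)."""
    return round(100 * annotations / features)


coverage = {name: annotation_coverage(f, a) for name, (f, a) in datasets.items()}

for name, pct in coverage.items():
    print(f"{name}: {pct}%")
```

Under this assumption the computed values match the percentages listed in the table for these rows.
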
Annotated Features
The graphs below visualize the proximity of the annotated features. The heatmap shows the similarity between each pair of features, with the features ordered by cluster membership as illustrated in the scatter plot.
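
To make the heatmap ordering concrete, here is a minimal sketch of how a pairwise similarity matrix can be arranged so that features from the same cluster sit next to each other. Everything in it is an assumption for illustration: the feature names, the embedding vectors, the cluster labels, and the use of cosine similarity are hypothetical, since the measure and clustering method behind the actual graphs are not stated here.

```python
import math

# Hypothetical feature embeddings and cluster labels, for illustration only.
embeddings = {
    "glucose": [0.9, 0.1, 0.0],
    "age": [0.0, 0.2, 0.9],
    "hba1c": [0.8, 0.2, 0.1],
    "sex": [0.1, 0.1, 0.8],
}
clusters = {"glucose": 0, "age": 1, "hba1c": 0, "sex": 1}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Order features so that members of the same cluster are adjacent.
order = sorted(embeddings, key=lambda name: (clusters[name], name))

# Pairwise similarity matrix in cluster order; a heatmap would display this grid.
similarity = [[cosine(embeddings[a], embeddings[b]) for b in order] for a in order]
```

With this ordering, similar features form blocks along the diagonal of the matrix, which is the pattern a cluster-ordered heatmap is meant to expose.
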
The graph above shows the relationships among the annotated terms, based on the SNOMED-CT taxonomy.
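
As a rough illustration of what a taxonomy-based relationship between two terms can look like, the sketch below counts the is-a edges separating two terms through their closest common ancestor. The hierarchy is a toy with placeholder names, not actual SNOMED-CT content, and it assumes a single parent per term, whereas SNOMED-CT allows a concept to have several parents. The helper names `ancestors` and `path_distance` are likewise hypothetical.

```python
# Toy "is-a" hierarchy (child -> parent). Placeholder names, not SNOMED-CT content.
parent = {
    "glucose test": "lab test",
    "HbA1c test": "lab test",
    "lab test": "procedure",
    "blood pressure check": "procedure",
    "procedure": "root",
}


def ancestors(term):
    """Return the path from a term up to the root, starting with the term itself."""
    path = [term]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def path_distance(a, b):
    """Number of is-a edges between two terms via their closest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {term: i for i, term in enumerate(path_b)}
    for i, term in enumerate(path_a):
        if term in depth_in_b:
            return i + depth_in_b[term]
    raise ValueError("terms do not share an ancestor")


print(path_distance("glucose test", "HbA1c test"))
print(path_distance("glucose test", "blood pressure check"))
```

Terms that share a close ancestor get a small distance, so in this toy hierarchy the two lab tests are closer to each other than either is to the blood pressure check.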