Editorial Summary:
In part 1 of this series, I provided an overview of the data mining platform Orange, which focuses on data science education. In part 2, I will focus on some unique Orange educational features that users should find helpful. Most of these features are not discussed on the Orange website or in the YouTube videos. For classification, the Test and Score widget reports AUC, classification accuracy (CA), F1 score, recall (sensitivity), and precision. You can also add the training and test times, specificity, and LogLoss. You can select a node in a decision tree, any category in a confusion matrix, a data point in a scatter plot, a box plot or histogram, or a section of the mosaic plot. The Constant widget is a baseline algorithm that bases its prediction solely on class frequency. The example involves class imbalance: there are many more patients without cancer than with cancer. Class imbalance creates multiple problems that we won’t cover in this series, but suffice it to say that any algorithm you test will have to be better than this baseline. Orange provides a customizable and interactive nomogram that you can use with logistic regression and Naive Bayes to see how the prediction probabilities change as you change the predictors. The default baseline probability is 40% for colored (calcified coronary arteries), but when you slide the blue icon to the right to 3 (indicating 3 calcified arteries), the probability goes to 91%. The Receiver Operating Characteristic (ROC) curve is created by plotting the true positive rate against the false positive rate at multiple thresholds. The vertical slider can be moved to see what happens at different thresholds.
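The ROC construction described above can be sketched in plain Python. This is only an illustration and not how Orange implements it: the function names `roc_points` and `auc`, the labels, and the scores are all made up for the example.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    y_true holds 1 for the positive class and 0 for the negative class;
    scores holds the predicted probability of the positive class.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    # Lower the threshold step by step; each distinct score is one threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
print(f"AUC={auc(points):.3f}")
```

Moving the threshold slider in Orange corresponds to moving from one of these points to the next: lowering the threshold raises the true positive rate, usually at the cost of a higher false positive rate.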
Key Highlights:
- Part 1 of this series provided an overview of the data mining platform Orange, which focuses on data science education.
- Part 2 focuses on some unique Orange educational features that users should find helpful.
- The Constant widget is a baseline algorithm that bases its prediction solely on class frequency.
- The Stacking widget combines multiple algorithms to improve performance.
- The Select Rows widget can be used to filter data.
- Orange provides a customizable and interactive nomogram that you can use with logistic regression and Naive Bayes.
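To see why a frequency-only baseline matters under class imbalance, here is a minimal sketch in plain Python. It does not use Orange. The 95/5 class split and the helper name `fit_majority` are assumptions chosen for illustration, not figures from the article.

```python
from collections import Counter


def fit_majority(labels):
    """Return the most frequent class label (a frequency-only baseline)."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical imbalanced data: 95 patients without cancer, 5 with cancer.
labels = ["no cancer"] * 95 + ["cancer"] * 5

majority = fit_majority(labels)
predictions = [majority] * len(labels)  # the baseline ignores all predictors

# Accuracy looks high because the majority class dominates.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall for the minority class shows what the baseline misses.
true_positives = sum(
    p == "cancer" and y == "cancer" for p, y in zip(predictions, labels)
)
cancer_recall = true_positives / labels.count("cancer")

print(f"baseline predicts: {majority}")
print(f"accuracy: {accuracy:.2f}")
print(f"recall for cancer: {cancer_recall:.2f}")
```

With these made-up numbers the baseline reaches high accuracy while never identifying a single cancer case, which is why accuracy alone is a weak yardstick and why any real model should be compared against this baseline.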
The editorial is based on the content sourced from medium.com