AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment

  1. Sharing workflows and integrated training/validation?

    Egon Willighagen, Karolinska Institutet

    7 September 2011

    When reading the paper I was wondering if the visual aspects of a workflow system can be used to visualize how model validation is performed. The current Figure 3 shows clearly that the model parameter optimization is done, and Figure 5 clearly shows that this is done prior to external validation. I assume the workflows of Figures 3 and 5 can be combined without the parameter optimization affecting the external validation. But I was wondering if the same integrated workflow could also visualize how the test set extraction is done (for which you use stratified random sampling) and perhaps the method that is used for the cross-validation? I can understand that the diversity of modeling methods may make this hard.
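
To make the ordering in question concrete, here is a minimal sketch in plain Python. It is not AZOrange code and uses none of its API: the function names, the one-dimensional toy data, and the k-nearest-neighbour learner are all assumptions chosen for illustration. It shows a stratified random test set being set aside first, parameter optimization by cross-validation on the remaining training data only, and external validation performed last.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx), drawing test_fraction of each class at random."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training points (1-D descriptor)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(sorted(set(votes)), key=votes.count)

def accuracy(train_x, train_y, eval_x, eval_y, k):
    """Fraction of evaluation points whose class is predicted correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(eval_x, eval_y))
    return hits / len(eval_y)

def cv_select_k(xs, ys, candidates, folds=5):
    """Choose k by cross-validation; only training data is passed in here."""
    best_k, best_score = None, -1.0
    for k in candidates:
        scores = []
        for f in range(folds):
            val = [i for i in range(len(xs)) if i % folds == f]
            trn = [i for i in range(len(xs)) if i % folds != f]
            scores.append(accuracy([xs[i] for i in trn], [ys[i] for i in trn],
                                   [xs[i] for i in val], [ys[i] for i in val], k))
        mean_score = sum(scores) / folds
        if mean_score > best_score:
            best_k, best_score = k, mean_score
    return best_k, best_score

# Synthetic toy data (hypothetical): two classes with shifted means.
data_rng = random.Random(42)
xs = ([data_rng.gauss(0, 1) for _ in range(60)]
      + [data_rng.gauss(2, 1) for _ in range(40)])
ys = [0] * 60 + [1] * 40

# Step 1: extract the external test set by stratified random sampling.
train_idx, test_idx = stratified_split(ys, test_fraction=0.2)
train_x, train_y = [xs[i] for i in train_idx], [ys[i] for i in train_idx]
test_x, test_y = [xs[i] for i in test_idx], [ys[i] for i in test_idx]

# Step 2: optimize the parameter using the training set only.
best_k, cv_score = cv_select_k(train_x, train_y, candidates=[1, 3, 5, 7])

# Step 3: external validation on data the optimization never saw.
external_score = accuracy(train_x, train_y, test_x, test_y, best_k)
print("selected k:", best_k)
print("cross-validated accuracy:", cv_score)
print("external accuracy:", external_score)
```

Because `cv_select_k` only ever receives the training indices, the parameter optimization cannot influence the external validation, which is the property I would hope an integrated workflow could make visible.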

    Another thing I was wondering is how easy it is to share AZOrange workflows with others. Can workflows be shared online, for example? Has this been explored yet?

    A minor question is about the GPL license. This is bound to be the GNU GPL license, but which version (v2 or v3)? The GitHub repository lists only the GNU LGPL v3 license, so it does not answer that question either.

    That leaves me to thank you for this interesting work. After Taverna and KNIME (and others), AZOrange may indeed have a high impact, as Python is a popular language among scientists.

