Does anybody have advice on reporting ANN model performance for publication in a journal? It seems to me the r values for the train, test, and validation subsets would be the most useful, but not many publications report all three. Also, any advice on which correlation to report: r or r²? Thanks
Tom Hill has authored books and journal articles, so I asked for his opinion on this topic. He says:
It depends on the traditions and expectations of the journal. But universally, to gauge the performance of a neural network, and in particular an ANN, as a reader I would want to see the performance on the training, testing, and validation samples. If they are about the same, I would have confidence that the population I am dealing with is homogeneous and, given a good R-square, that the model lift is good (of course, for classification problems there are numerous other statistics).
There are other ways to look at model performance besides r or R-square (I wouldn't have a specific preference for either of those two), such as mean deviation, average relative deviation, etc. (see the Goodness of Fit module and documentation). In some applications, not only does the deviation from predicted values matter, but also the distribution of the predictions, which should match the observed distribution of the outputs of interest. This is relevant when building simulation models of manufacturing processes, for example.
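The statistics mentioned above (r, R-square, mean deviation, average relative deviation) are straightforward to compute for each subset. A minimal sketch in NumPy, assuming you already have observed and predicted values for a given partition (the example arrays are purely illustrative, not real data):

```python
import numpy as np

def fit_statistics(observed, predicted):
    """Simple goodness-of-fit measures for one data subset."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    r = np.corrcoef(observed, predicted)[0, 1]       # Pearson correlation
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination
    mad = np.mean(np.abs(observed - predicted))      # mean absolute deviation
    ard = np.mean(np.abs(observed - predicted) / np.abs(observed))  # avg relative deviation
    return {"r": r, "R2": r2, "mean_abs_dev": mad, "avg_rel_dev": ard}

# hypothetical observed vs. predicted values for one subset
obs = np.array([2.0, 4.0, 6.0, 8.0])
pred = np.array([2.1, 3.9, 6.2, 7.8])
stats = fit_statistics(obs, pred)
```

The same function would be called once per subset (train, test, validation), so the reported statistics are directly comparable across partitions. Note that r² (the squared correlation) and R² (one minus the residual-to-total sum-of-squares ratio) coincide only under ordinary least-squares fitting; for a neural network's predictions they can differ, which is one reason to state clearly which you report.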
Hope this is useful.
Thomas Hill, Ph.D.
Executive Director, Analytics
Dell | Information Management Group, Dell Software
Thank you to both of you for your advice. If I could push it a little further: do you have any suggestions about which subset (train, test, or validation) is most appropriate to report in the Abstract?
I have reported the correlation for each subset in my Results, but in my Abstract I want to describe the model performance succinctly, so I reported the r for the train subset. However, one of my co-authors is suggesting we report the correlation for the test subset.
I am perplexed. My understanding is that for a regression ANN the data are partitioned: the model uses the train subset to ‘learn’, while the test subset is used to compare error (computed on the train and test subsets in parallel) in an attempt to avoid overfitting. Once the train and test errors plateau, the network stops cycling through epochs. I then thought the validation subset was used to re-run the learned networks to check whether there are aberrations on independent data. Perhaps, rather than reporting the train correlation in the Abstract, I should be reporting the correlation for the validation subset?
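The training scheme you describe can be sketched as a generic early-stopping loop: fit on the train subset, monitor error on the test subset each epoch, and stop once the test error stops improving. Everything below (function names, the synthetic error curve) is a hypothetical illustration, not the API of any particular package:

```python
def train_with_early_stopping(step_fn, eval_fn, max_epochs=500, patience=10):
    """Run step_fn (one pass over the train subset) each epoch and track
    eval_fn (error on the held-out test subset); stop when the test error
    has not improved for `patience` epochs."""
    best_err = float("inf")
    best_epoch = 0
    for epoch in range(max_epochs):
        step_fn()                 # one training pass
        test_err = eval_fn()      # error on the test subset
        if test_err < best_err:
            best_err, best_epoch = test_err, epoch
        elif epoch - best_epoch >= patience:
            break                 # test error has plateaued or risen
    return best_epoch, best_err

# hypothetical demo: a test-error curve that falls, bottoms out at
# epoch 30, then rises again (simulating the onset of overfitting)
state = {"epoch": 0}
def step():
    state["epoch"] += 1           # stand-in for a real training pass
def test_error():
    e = state["epoch"] - 1
    return (e - 30) ** 2 / 100.0

best_epoch, best_err = train_with_early_stopping(step, test_error)
```

Under this scheme the test subset actively steers when training stops, so its error is not fully independent of model selection; only the validation subset, which plays no role in the loop, gives a truly independent estimate. That is the usual argument for quoting the validation-subset statistic as the headline figure.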
I realise this is quite a minor issue, but in my field (ecological modelling) the literature using neural networks is not as vast as in other fields, and most papers report only a single correlation statistic, which I have assumed to be from the train subset.
Thanks again for your advice, it has been very helpful.