Using Genstat in the ‘sustainable intensification’ of food production
I have been an avid supporter of Genstat since starting my PhD in 1985 (gulp!). As an applied plant biologist, it is perhaps unsurprising that I found procedures that had been developed with colleagues at Rothamsted Research to be highly relevant to the analysis of my own designed experiments. During my thirty-five years in research, teaching and management at higher education institutions, I have continued to rely on Genstat as my core statistical programme both as an investigator and an educator.
My own research is in crop science and, over the last decade or so, has been justified (to research funders at least) on the basis of the need to ‘sustainably intensify’ our food production \[1\]. For many the phrase is an imperative, i.e. we have to use our land more efficiently and the land will have to remain capable of that production; for others the phrase is an oxymoron, e.g. if land-use efficiency means producing ever higher yields per unit area, this will inevitably degrade the very ecosystems and resources on which agricultural production by future generations will depend; a third group would argue that sustainable intensification is unnecessary – if (big if) we were each satisfied with the per capita calorific availability that was the average for humans in 1960, we are already producing enough calories to feed 10 billion people. Hopefully my research has been relevant whichever ‘camp’ you may belong to.
In my early days as a researcher I was primarily interested in optimising the rates of synthetic inputs (such as fertilisers and crop protection chemicals) to achieve yield and grain quality targets in wheat, and so improve economic and environmental sustainability. Experiments were usually complete multi-factorial designs (genetic × management), replicated in randomised blocks, often with some split plotting. Given Genstat’s origins, it is unsurprising that the treatment structure and block structure routines allow a very large degree of flexibility, coping admirably with all types of designed field experiment. Of particular use to me was the ability to split treatment effects into polynomial contrasts within ANOVA \[2, 3, 4\]. ‘Life is not a polynomial’, but this approach provides an initial, statistically sensitive test of whether there is a response to increasing rates of an input, and some idea of the complexity of that response. Guided by the ANOVA, the FITCURVE routine provides a straightforward step for exploring more biologically meaningful and parsimonious fits to the responses to inputs \[5, 6\]. The standard curves available in Genstat are highly relevant to the applied biologist, and further development and modification of responses are supported by FITNONLINEAR \[7, 8\]. FITCURVE and FITNONLINEAR have also been useful in fitting responses over time.
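As a rough illustration of what splitting a treatment effect into polynomial contrasts involves, the sketch below partitions the treatment sum of squares for four equally spaced input rates into linear, quadratic and cubic components. It is written in plain Python rather than Genstat, and the nitrogen rates, yields and function names are all invented for the example. In a full ANOVA each component would then be tested against the residual mean square, which this sketch does not compute.

```python
# Hypothetical wheat yields (t/ha) at four equally spaced nitrogen rates
# (kg N/ha), one value per block. These numbers are invented for illustration.
yields_by_rate = {
    0:   [4.1, 4.4, 3.9],
    80:  [6.0, 6.3, 5.8],
    160: [7.2, 7.5, 7.0],
    240: [7.6, 7.9, 7.3],
}

# Standard orthogonal polynomial coefficients for four equally spaced levels.
CONTRASTS = {
    "linear":    [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic":     [-1, 3, -3, 1],
}


def contrast_sums_of_squares(data):
    """Return the sum of squares (1 d.f. each) for every polynomial contrast."""
    rates = sorted(data)
    reps = len(data[rates[0]])  # assumes equal replication at every rate
    means = [sum(data[r]) / reps for r in rates]
    result = {}
    for name, coeffs in CONTRASTS.items():
        estimate = sum(c * m for c, m in zip(coeffs, means))
        result[name] = reps * estimate ** 2 / sum(c * c for c in coeffs)
    return result


def treatment_sum_of_squares(data):
    """Return the overall treatment sum of squares (3 d.f. for four rates)."""
    rates = sorted(data)
    reps = len(data[rates[0]])
    means = [sum(data[r]) / reps for r in rates]
    grand_mean = sum(means) / len(means)
    return reps * sum((m - grand_mean) ** 2 for m in means)


ss = contrast_sums_of_squares(yields_by_rate)
total = treatment_sum_of_squares(yields_by_rate)
for name, value in ss.items():
    share = 100 * value / total
    print(f"{name:9s} SS = {value:7.3f} ({share:5.1f}% of treatment SS)")
```

Because the three contrasts are mutually orthogonal, their sums of squares add up to the treatment sum of squares. A response dominated by the linear and quadratic components, as in these invented data, would suggest that a simple diminishing-returns curve is worth exploring next.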
Treatment effects (such as those of fungicide) on disease or green leaf area on individual days are often accompanied by effects on the errors around the treatment means. This variance heterogeneity is easily identified in the residual plots provided by ANOVA and can be rectified by transformations. Transformed data, however, can be difficult to interpret, whereas fitting responses over time (e.g. logistic or Gompertz) can generate new variables, such as time scalars, which are readily interpretable and for which the residuals are often normally distributed and lacking variance heterogeneity \[7\].
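The idea of a time scalar can be sketched with a logistic decline in green leaf area, where the midpoint parameter is the time at which half of the green area has been lost. The example below is a simplified stand-in, not the FITCURVE method: it assumes the upper asymptote is known to be 100% and the lower asymptote is zero, so that a logit transform reduces the fit to ordinary linear regression. The treatment names, sampling days and curve parameters are hypothetical, and the data are generated without noise from known curves so that the recovered midpoints can be checked.

```python
import math


def logistic_decline(t, rate, midpoint, upper=100.0):
    """Green leaf area (%) at time t for a logistic decline from `upper` to 0."""
    return upper / (1.0 + math.exp(rate * (t - midpoint)))


def fit_midpoint(times, values, upper=100.0):
    """Estimate (rate, midpoint) by regressing logit-transformed values on time.

    Assumes the upper asymptote is known and that every value lies strictly
    between 0 and `upper`; otherwise the logit is undefined.
    """
    logits = [math.log((upper - v) / v) for v in values]
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(logits) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, logits))
    sxx = sum((t - t_bar) ** 2 for t in times)
    rate = sxy / sxx
    intercept = y_bar - rate * t_bar
    # logit = rate * t - rate * midpoint, so midpoint = -intercept / rate
    return rate, -intercept / rate


# Hypothetical sampling days (after anthesis) and "true" curve parameters.
days = [10, 15, 20, 25, 30, 35]
true_params = {"untreated": (0.25, 20.0), "fungicide": (0.25, 27.0)}

fitted = {}
for treatment, (rate, midpoint) in true_params.items():
    observed = [logistic_decline(d, rate, midpoint) for d in days]
    fitted[treatment] = fit_midpoint(days, observed)

delay = fitted["fungicide"][1] - fitted["untreated"][1]
for treatment, (rate, midpoint) in fitted.items():
    print(f"{treatment:10s} rate = {rate:.3f}, midpoint = {midpoint:.1f} days")
print(f"Delay in senescence midpoint due to fungicide: {delay:.1f} days")
```

The fitted midpoint for each plot is a single, directly interpretable number (days to 50% green area) that could itself be carried into an analysis of variance in place of day-by-day percentages. With real, noisy data and unknown asymptotes, a nonlinear fit would be needed rather than this linearised shortcut.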
Over the years the teams of PhD students and researchers I’ve been involved with have continued to rely heavily on ANOVA, FITCURVE and FITNONLINEAR, although the responses these days are more likely to be to climate variables (genetic × environment) \[9\]. As datasets and experiments have become larger and less complete, residual maximum likelihood (REML) has proved more flexible than ANOVA \[10, 11, 12\], and principal components analysis (PCP) ably condenses large variate × experimental unit matrices \[13, 14\]. The current portfolio of procedures is well matched to advances in applied biology, whether these involve meta-analyses, ‘big data’, statistics to underpin machine learning, and/or precision agriculture.
As an educator I have taught BSc and MSc statistics and research methods modules with Genstat, and have enabled level 5, 6 and 7 students to use the programme to assess class-generated data. The interface is intuitive, and it is easy to demonstrate variations in approach to highlight the importance of, for example, replication (and the curse of pseudo-replication), blocking structures, covariance, randomisation and the like. Genstat has been the main statistical programme used by the 40 PhD students who have been gracious enough to allow me to be involved in their supervision.
1 Reaping the benefits: science and the sustainable intensification of global agriculture ISBN: 978-0-85403-784-1;
2 _J. Agric. Sci._ **138**, 317-331, doi: 10.1017/S0021859602002137;
3 _J. Cereal Sci._ **37**, 295-309, doi: 10.1006/jcrs.2002.0501;
4 _Plant Soil_ **360**, 93-107, doi: 10.1007/s11104-012-1203-x;
5 _J. Sci. Food Agric._ **85**, 727-742, doi: 10.1002/jsfa.2025;
6 _J. Agron. Crop Sci._ **200**, 36-45, doi: 10.1111/jac.12038;
7 _Annals Appl. Biol._ **136**, 77-84, doi: 10.1111/j.1744-7348.2000.tb00011.x;
8 _Field Crops Res._ **95**, 49-63, doi: 10.1016/j.fcr.2005.02.001;
9 _Frontiers Plant Science_ **8**:51, doi: 10.3389/fpls.2017.00051;
10 _J. Agric. Sci._ **128**, 135-142, doi: 10.1017/S0021859696004054;
11 _Agric. Forest Meteor._ **94**, 159-170, doi: 10.1016/S0168-1923(99)00020-9;
12 _PLoS One_ **11**(5), e0156056, doi: 10.1371/journal.pone.0156056;
13 _J. Sci. Food Agric._ **84**, 227-236, doi: 10.1002/jsfa.1657;
14 _Euphytica_ **166**, 249-263, doi: 10.1007/s10681-008-9838-7
Mike Gooding FRSB
(Mike has headed agriculture and been Professor of Crop Science at The University of Reading, Aberystwyth University, and the Royal Agricultural University).