The Northern Ireland group held a meeting at 3pm on Wednesday, November 18th, 2015 in the David Bates Building in the Queen’s University of Belfast. The speaker was Professor Peter McCullagh, FRS, University of Chicago, USA.
The talk dealt with Ewens’ processes and how a particular application, involving species abundance counts, had been analysed by Fisher (Fisher, Corbet and Williams, 1943). The data comprised the abundances of Corbet’s Malayan butterflies and Williams’ moths. They arose as counts, where $m_r$ ($m$ for multiplicity) is the number of species having exactly $r$ specimens in a sample, and can be represented by a vector $m = (m_1, \ldots, m_k)$. For example, if we have a maximum of $k = 3$ specimens per species, then a possible value for $m$ is $(2, 4, 3) = (1^2, 2^4, 3^3) = (1, 1, 2, 2, 2, 2, 3, 3, 3)$, so that the data may be viewed as a composition with $k = 3$ compartments and $\sum_r m_r = 9$ species. Another quantity of interest was $N = \sum_r r\,m_r$, the total number of specimens.
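The multiplicity notation can be made concrete with a short sketch. The function name `multiplicities` and the illustrative counts are assumptions introduced here, not part of the talk; the code simply tallies how many species have exactly $r$ specimens and recovers the number of species and the total number of specimens.

```python
from collections import Counter


def multiplicities(counts):
    """Return [m_1, ..., m_k], where m_r is the number of species
    with exactly r specimens and k is the largest count observed."""
    tally = Counter(counts)
    k = max(counts)
    return [tally.get(r, 0) for r in range(1, k + 1)]


# Illustrative per-species specimen counts matching the example in the text.
counts = [1, 1, 2, 2, 2, 2, 3, 3, 3]

m = multiplicities(counts)  # [2, 4, 3]
n_species = sum(m)  # total number of species
n_specimens = sum(r * m_r for r, m_r in enumerate(m, start=1))  # total specimens
```

For the example vector, the sketch gives 9 species and 19 specimens in total.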
Professor McCullagh compared butterflies and moths by plotting multiplicities against $r$ and showed that the plot of $r\,m_r$ against $r$ was approximately linear for both insects. Fisher had based the analysis on the logarithmic distribution, which the speaker derived via a Poisson-gamma mixture: when the shape parameter of the mixture tends to zero, the one-parameter logarithmic distribution emerges. Fisher analysed the multiplicities by assuming that $m_r$ followed a Poisson distribution with parameter $\alpha\theta^r/r$. Professor McCullagh noted that this two-parameter model lay in the exponential family with complete sufficient statistics $(\sum_r m_r, \sum_r r\,m_r)$. Fisher argued that $\alpha$ was the parameter of interest when comparing the “richness” of species and that $\theta$ could be treated as a nuisance parameter. This was not obvious, but Professor McCullagh showed why Fisher was correct. He then showed the connection with Ewens’ distribution, which has probability function
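The limiting argument behind the logarithmic distribution can be sketched as follows. This is the standard textbook route and not necessarily the one presented in the talk; the symbols $\kappa$ (the gamma shape parameter) and $\theta$ are notation assumed here. A Poisson count with a gamma-distributed mean is negative binomial; conditioning on the species being observed at least once and letting $\kappa \to 0$ gives the logarithmic distribution.

```latex
% Negative binomial from a Poisson-gamma mixture (shape \kappa, 0 < \theta < 1):
\[
  P(X = r) = \frac{\Gamma(\kappa + r)}{\Gamma(\kappa)\, r!}\,(1-\theta)^{\kappa}\,\theta^{r},
  \qquad r = 0, 1, 2, \ldots
\]
% Condition on X >= 1, since unobserved species are not recorded:
\[
  P(X = r \mid X \ge 1)
  = \frac{\Gamma(\kappa + r)}{\Gamma(\kappa)\, r!}\,
    \frac{(1-\theta)^{\kappa}\,\theta^{r}}{1 - (1-\theta)^{\kappa}},
  \qquad r = 1, 2, \ldots
\]
% As \kappa \to 0:
%   \Gamma(\kappa + r)/\Gamma(\kappa) \sim \kappa\,(r-1)!,
%   1 - (1-\theta)^{\kappa} \sim -\kappa \log(1-\theta),
% so the factors of \kappa cancel and
\[
  \lim_{\kappa \to 0} P(X = r \mid X \ge 1)
  = \frac{\theta^{r}}{-r \log(1-\theta)},
  \qquad r = 1, 2, \ldots
\]
```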
$$P(m \mid N; \alpha) = \frac{N!}{\alpha^{\uparrow N}} \prod_{r=1}^{k} \frac{(\alpha/r)^{m_r}}{m_r!}, \qquad \alpha^{\uparrow N} = \alpha(\alpha+1)\cdots(\alpha+N-1);$$
note the conditioning on the second sufficient statistic $N$. The results of analysing the data with Fisher’s models and Ewens’ model were similar: Fisher and Ewens’ 40.146 and 8.067, respectively. However, in the celebrated 1943 paper, Fisher had reported a value that was in error. The reason for this error was unclear, especially when Fisher could easily have exploited the conditional arguments which led to Ewens’ distribution.
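As an illustration of the Ewens probability function mentioned above, the sketch below evaluates it in its standard form, $N!/\alpha^{\uparrow N}\,\prod_r (\alpha/r)^{m_r}/m_r!$, where $\alpha^{\uparrow N}$ is the ascending factorial. The function name `ewens_pmf` and the parameter name `alpha` are assumptions made for this example; this is not the code used for the analysis reported in the talk.

```python
from math import factorial, prod


def ewens_pmf(m, alpha):
    """Probability of the multiplicity vector m = [m_1, ..., m_k] under
    the Ewens distribution with parameter alpha > 0, conditional on
    the total number of specimens N = sum_r r * m_r."""
    n = sum(r * m_r for r, m_r in enumerate(m, start=1))
    # Ascending factorial alpha * (alpha + 1) * ... * (alpha + n - 1).
    rising = prod(alpha + i for i in range(n))
    weight = prod(
        (alpha / r) ** m_r / factorial(m_r) for r, m_r in enumerate(m, start=1)
    )
    return factorial(n) * weight / rising


# The three partitions of N = 3, written as multiplicity vectors:
# 1+1+1, 1+2 and 3. Their probabilities should sum to one.
partitions_of_3 = [[3, 0, 0], [1, 1, 0], [0, 0, 1]]
total = sum(ewens_pmf(m, alpha=1.5) for m in partitions_of_3)
```

Because the distribution conditions on $N$, it depends on a single parameter, so no nuisance parameter needs to be estimated.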
Professor McCullagh went on to extol the virtues of Ewens’ distribution, generalising it to a Ewens’ process and, en passant, making the connection with the well-known Chinese restaurant process. After describing the properties of the process, he explained that its importance lay in its extension to partitions involving sets rather than merely integers. This generalisation extended the scope of the model considerably, and he remarked that the process was not as well known as it should be, being rather under-studied in undergraduate curricula.
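The Chinese restaurant construction can be sketched as a sequential seating rule: with `seated` customers already placed, the next customer opens a new table with probability `alpha / (seated + alpha)` and otherwise joins an existing table with probability proportional to its size. The function name `chinese_restaurant` and its parameters are assumptions made for this illustration, which is a minimal simulation rather than anything shown in the talk.

```python
import random


def chinese_restaurant(n, alpha, seed=None):
    """Seat n customers one at a time and return the list of table sizes.
    The table sizes form a random partition of n."""
    rng = random.Random(seed)
    tables = []
    for seated in range(n):
        u = rng.random() * (seated + alpha)
        if not tables or u < alpha:
            tables.append(1)  # open a new table
            continue
        u -= alpha
        for i, size in enumerate(tables):
            if u < size:
                tables[i] += 1  # join table i, chosen proportionally to size
                break
            u -= size
        else:
            tables[-1] += 1  # guard against floating-point round-off
    return tables


sizes = chinese_restaurant(n=200, alpha=5.0, seed=1)
n_tables = len(sizes)  # plays the role of the number of species
```

Reading tables as species and customers as specimens, the multiplicities of the table sizes follow the Ewens distribution with parameter `alpha`.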
He concluded his talk by drawing attention to the search for empirical laws such as the finding described above and Taylor’s variance-to-mean power law.
The talk was received with acclaim and provoked a short discussion. The meeting then retired to a drinks reception held in the speaker’s honour.