CAVEAT ON THE ERROR ANALYSIS FOR STEREOLOGICAL ESTIMATES

A frequently asked question is how large a sample, or how much measurement, is needed to achieve an accurate stereological estimate. The observed total error of a stereological estimate arises from inter-individual difference (i.e. inter-animal or inter-organ difference, or biological variation) and intra-individual variation (the stereological error). The statistical methods for error analysis familiar to most biological researchers are based on independent random sampling; in practice, however, systematic random sampling, which is usually more efficient, is almost always performed. In this paper, several methods of error analysis were applied to a number of model and actual studies to demonstrate, from a practical point of view, the pros and cons of each. As the studies show, assuming independence for a systematic sample results in overestimation of the stereological error. The simple and practical approach recommended in this paper is to divide the systematic sample from an organ into two systematic sub-samples, regard them as two independent sub-samples, and compare the two sub-sample means.


INTRODUCTION
The prerequisite for a meaningful error analysis is that an unbiased method, which in stereological or morphometric practice normally means random sampling, is used to obtain the estimate. The total error of a stereological estimate, usually expressed as the CE (coefficient of error, equal to the SEM, the standard error of the mean, divided by the mean), tells whether the overall estimate is satisfactory. Because much effort is often invested in setting up an experiment and obtaining a sample for measurement, precise measurement is often sought. One of the most significant recent findings in stereology is considered to be the "Do more less well!" rule (Gundersen and Østerby, 1981), that is, the fact that manual measurement, with or without the help of an image system, rather than automatic image analysis is often the choice in practice, or the rule of experience that the number of measurements per organ (or biological unit) does not have to be very large (e.g. no more than 100~200 point-sampled intercepts, or test points hitting the structure of interest, measured or counted per organ, as suggested by Gundersen et al., 1988a,b) if uniform (systematic) random sampling with a proper spacing of test probes is adopted (Miles, 1987). This is surprising to potential or new users of stereology, and actual error analyses are needed to convince them. The statistical methods for error analysis familiar to most biological researchers are based on independent (simple) random sampling, whereas the more efficient systematic sampling is almost always used in practice. In this paper, several available methods of error analysis were applied to a number of model and actual studies to demonstrate, from a practical point of view, the virtues and defects of each method.

METHODS FOR ERROR ANALYSIS
Three methods for error analysis are the main concern of this paper. The error of an estimate is expressed as the percentage of the total error contributed by the intra-individual (or intra-organ, or intra-biological-unit) stereological error, calculated as the ratio between the squared intra-individual and total SEMs or CEs.
Method 1. Consider all the n_in measurements (e.g. intercept lengths, or field measurements such as point fractions) sampled from an individual as one independent random sample, and calculate the stereological error by Eq. (1).
Method 2. Consider the intra-individual multi-stage sampling as an independent nested sampling and calculate the error according to the classical method described by Shay (1975) and Gundersen and Østerby (1981).
Method 3. Record the systematic intra-individual measurements from an individual in order, divide them into two systematic sub-samples, regard these as two independent random sub-samples, and then calculate the stereological error by Eq. (2), where x̄_1 and x̄_2 are the two sub-sample means. A similar method was tentatively used for the error analysis of the fractionator estimator by Geiser et al. (1989).
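Methods 1 and 3 can be sketched in a few lines. The sketch below rests on two assumptions that are not spelled out in the text: that Method 1 uses the ordinary SEM (standard deviation divided by the square root of the number of measurements), and that Method 3 forms its two systematic sub-samples by taking alternate measurements and uses the SEM of two means, |x̄_1 - x̄_2| / 2. The function names and the data are hypothetical.

```python
import math
from statistics import mean, stdev

def method1_sem(values):
    """Method 1 (assumed form): treat all measurements from one individual
    as a single independent random sample and return SD / sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))

def method3_sem(values):
    """Method 3 (assumed form): split the ordered systematic measurements
    into two interleaved systematic sub-samples, treat their means as two
    independent observations, and return the SEM of those two means."""
    sub1 = values[0::2]  # 1st, 3rd, 5th, ... measurement
    sub2 = values[1::2]  # 2nd, 4th, 6th, ... measurement
    return abs(mean(sub1) - mean(sub2)) / 2.0

# Hypothetical ordered measurements with a smooth trend along the organ.
values = [float(i) for i in range(1, 21)]

sem1 = method1_sem(values)
sem3 = method3_sem(values)
ce1 = sem1 / mean(values)
ce3 = sem3 / mean(values)
print(f"Method 1: SEM = {sem1:.3f}, CE = {ce1:.1%}")
print(f"Method 3: SEM = {sem3:.3f}, CE = {ce3:.1%}")
```

With a trend in the data, the two interleaved sub-samples track each other closely, so the split-sample error comes out smaller than the error computed as if every measurement were independent.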
Model Study 1 (an independent random sampling model for particle size estimation)
Suppose there is a population of ORGAN bags in each of which there are billions of particles with sizes uniformly distributed between 1 and 99999 (u). And suppose that each ORGAN consists of a great number of SECTION bags and each SECTION consists of a great number of FIELD bags, in each of which there are a great number of particles. To estimate the mean particle size, nested sampling is presumed: 5 ORGANs are sampled from the population, and then 5 SECTIONs from each ORGAN, 5 FIELDs from each SECTION and 5 particles from each FIELD are sampled step by step, all in an independent random manner. That is, a total of 125 particles from each ORGAN (625 from the population) are sampled. 625 5-digit random numbers are chosen from a random number table to represent the particle sizes sampled from the 5 ORGANs in order.
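The nested sampling scheme of this model can be simulated as follows. This is only an illustration: a seeded pseudo-random generator stands in for the random number table, so the numbers produced will not match the values reported in this paper, and all names are hypothetical.

```python
import random
from statistics import mean

random.seed(0)  # arbitrary seed chosen for repeatability, not from the paper

N_ORGANS = N_SECTIONS = N_FIELDS = N_PARTICLES = 5

def sample_organ():
    """Return one ORGAN as nested lists: sections -> fields -> particle sizes.
    Every particle size is drawn independently and uniformly from 1..99999."""
    return [
        [
            [random.randint(1, 99999) for _ in range(N_PARTICLES)]
            for _ in range(N_FIELDS)
        ]
        for _ in range(N_SECTIONS)
    ]

def flatten(organ):
    """List all particle sizes sampled from one ORGAN."""
    return [size for section in organ for field in section for size in field]

organs = [sample_organ() for _ in range(N_ORGANS)]
organ_means = [mean(flatten(organ)) for organ in organs]
grand_mean = mean(organ_means)

print("particles per ORGAN:", [len(flatten(organ)) for organ in organs])
print("ORGAN means:", [round(m) for m in organ_means])
print("estimated mean particle size:", round(grand_mean))
```

Because every ORGAN is drawn from the same distribution, any spread among the ORGAN means in this model comes from intra-ORGAN sampling alone.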

Model Study 2 (a systematic random sampling model for volume fraction estimation)
Suppose there is a population of spherical cells, each with one spherical nucleus at its center. 5 cells are randomly sampled and a set of 4 systematic random parallel sections through each cell is obtained. The diameters of the sampled cells are arbitrarily presumed to be 11, 12, 13, 12 and 11 (u), with nuclear diameters of 7, 8, 9, 8 and 7 (u), respectively. The distance between the parallel sections is exactly 1/4 of each cell's diameter, and the distances from the cell end to the first section are determined, using a random number table, to be 1.04, 2.23, 2.41, 2.89 and 0.94 (u) for the 5 cells, respectively.
The areas of the nuclear and cell profiles on the sections are calculated according to their diameters. Thus, (i) a consistent estimate of the nuclear volume fraction for each of the 5 cells is obtained by dividing the total area of the nuclear profiles by the total area of the cell profiles on the 4 sections (Mayhew and Cruz-Orive, 1974). (ii) The 4 sections through each cell are considered independent, and the nuclear volume fraction for each cell is also estimated by averaging the 4 nuclear area fractions (area of nuclear profile / area of cell profile). This is an inconsistent and biased estimator of the volume fraction (Mayhew and Cruz-Orive, 1974). (iii) Matheron's transitive method for one-dimensional systematic sampling, as described by Gundersen and Jensen (1987) and Cruz-Orive (1989), is also tentatively used for error analysis in this model.
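Estimators (i) and (ii) can be recomputed directly from the geometry given for this model. The sketch below assumes concentric spheres cut by parallel planes, with section positions measured from one end of the cell along the axis perpendicular to the sections; the variable and function names are hypothetical.

```python
import math
from statistics import mean

CELL_DIAMETERS = [11, 12, 13, 12, 11]
NUCLEUS_DIAMETERS = [7, 8, 9, 8, 7]
FIRST_SECTION = [1.04, 2.23, 2.41, 2.89, 0.94]  # distance from the cell end
N_SECTIONS = 4

def profile_area(radius, offset):
    """Area of the circular profile of a sphere cut by a plane at distance
    `offset` from its centre; 0 if the plane misses the sphere."""
    squared_profile_radius = radius ** 2 - offset ** 2
    return math.pi * squared_profile_radius if squared_profile_radius > 0 else 0.0

def section_areas(cell_d, nucleus_d, start):
    """(nuclear area, cell area) on each of the systematic sections."""
    cell_r, nucleus_r = cell_d / 2, nucleus_d / 2
    spacing = cell_d / N_SECTIONS
    areas = []
    for k in range(N_SECTIONS):
        offset = start + k * spacing - cell_r  # signed distance from the centre
        areas.append((profile_area(nucleus_r, offset), profile_area(cell_r, offset)))
    return areas

true_vv, ratio_of_sums, mean_of_ratios = [], [], []
for cell_d, nucleus_d, start in zip(CELL_DIAMETERS, NUCLEUS_DIAMETERS, FIRST_SECTION):
    areas = section_areas(cell_d, nucleus_d, start)
    true_vv.append((nucleus_d / cell_d) ** 3)
    # (i) total nuclear area divided by total cell area
    ratio_of_sums.append(sum(a for a, _ in areas) / sum(c for _, c in areas))
    # (ii) average of the per-section area fractions
    mean_of_ratios.append(mean(a / c for a, c in areas))

print(f"true mean volume fraction: {mean(true_vv):.2%}")
print(f"estimator (i):             {mean(ratio_of_sums):.2%}")
print(f"estimator (ii):            {mean(mean_of_ratios):.2%}")
```

Estimator (ii) weights small peripheral sections, which often miss the nucleus, as heavily as large central ones, which is why it falls below the true value.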
Actual Study 3 (estimating the numerical density of spermatozoa in the rat epididymis)
An epididymis was removed from each of 6 normal adult male SD rats, and three systematic sections orthogonal to the long axis of each organ were cut. The sections were methacrylate-embedded, 25 μm thick, stained with hematoxylin, and observed on a video screen with a 100× oil lens at a final magnification of 3286. Fields were systematically sampled with a motorized stage, the spacing between fields being 0.75~1.00 mm along the X- or Y-axis. On each field was superimposed a set of 12 regularly spaced counting frames, each with an area of 48 μm². Each section was optically sectioned along the Z-axis, with a distance of 0.25 μm between the focusing planes (optical sections), using a computerized stage. Elongated and curved spermatozoa were counted within 10 μm of section depth according to the optical disector principle (Gundersen et al., 1988b); thus the number of spermatozoa per volume of epididymal fluid filled with spermatozoa was estimated. The upper left corners (i.e. test points) of the counting frames hitting the spermatozoal fluid in the epididymal tubule lumen on the first focusing plane were counted to represent the number of disectors used for spermatozoal counting. The average number of disectors and the average number of spermatozoa counted per animal were 138 and 148, respectively. To analyze inter-disector variation, disectors whose test points did not hit the spermatozoal fluid but in which spermatozoa were nevertheless counted were also included. As a result, an average of 154 data (spermatozoal numbers per "disector") per organ and 7 to 112 data per section were collected.
Error analysis was also performed using equation (7) for the error analysis of the numerical density estimate in the paper by Braendgaard et al. (1990).
Actual Study 4 (estimating the volume fraction of the inter-villus space in placenta)
Five placentas were obtained from 5 full-term Chinese women with pregnancy anemia (maternal venous hemoglobin levels 80~90 g/L). From each placenta, 7-8 (average 7.6) vertical sections (methacrylate-embedded, 5 μm thick, stained with hematoxylin and eosin) orthogonal to the fetal side of the placenta and of similar sizes were cut (Baddeley et al., 1986). Sections were observed on a video screen at a final magnification of 631, and fields were systematically sampled with a motorized stage, the spacing between fields being 800 μm along the X- or Y-axis. A test system with 20 regularly spaced test points was superimposed on each field, and the test points hitting the inter-villus space and those hitting the whole section were counted to estimate the volume fraction of the inter-villus space in placenta. 9~20 fields (average 17.6) were measured per section.
Assuming a binomial distribution of the test points in space (i.e. regarding the test points as independent), the intra-organ error was also evaluated according to equation (3.26) in the book by Weibel (1979) (Eq. 3), where P and Po are the total numbers of test points hitting the inter-villus space and the placental section, respectively.
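For orientation, a binomial point-count error can be sketched as below. The formula used is the textbook binomial standard error of a proportion; it is an assumption that equation (3.26) of Weibel (1979) takes this form, and it should be checked against the book before use. The function name and the point counts are hypothetical.

```python
import math

def binomial_point_fraction_error(hits, total):
    """Point fraction with its binomial SEM and CE, treating every test
    point as an independent trial (textbook binomial form, assumed here)."""
    fraction = hits / total
    sem = math.sqrt(fraction * (1.0 - fraction) / total)
    return fraction, sem, sem / fraction

# Hypothetical counts for one placenta: P points on inter-villus space,
# Po points on the whole section.
P, Po = 900, 2500
fraction, sem, ce = binomial_point_fraction_error(P, Po)
print(f"volume fraction = {fraction:.3f}, SEM = {sem:.4f}, CE = {ce:.2%}")
```

Because the test points within a field and the fields within a section are placed systematically, the independence assumed here does not hold in this study.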
Actual Study 5 (estimating intercept lengths in the placental membrane)
The same material as described above was used in this study. A straight test line 200 μm long was also superimposed on each field. The test line was rotated after each field was measured so that the directions of the test lines were isotropically distributed in the placenta (sine-weighted on the vertical sections according to Baddeley et al., 1986). To estimate the mean thickness of the placental membrane, intercept lengths were measured along the direction of the test line, from the intersection between the test line and the boundary of a capillary vessel in the terminal villus to the nearest boundary of the terminal villus. Intercepts not lying completely inside the placental membrane (e.g. those crossing the other side of the capillary vessel or another vessel) were not measured. 2~65 (average 22) intercepts were measured per section.

RESULTS
The main results of the error analysis are shown in Table 1.
In model study 1, the true (theoretical) inter-ORGAN mean particle size should be 50000 (u) and was estimated to be 47734 (u). The true inter-ORGAN variation should be 0, i.e. the observed total error would be all contributed by the intra-ORGAN sampling. The error contribution of intra-ORGAN sampling as estimated by Method 2 appeared to be more consistent with the true value (100%) than by Method 3 in this model (Table 1).
In model study 2, from the true volume fractions of the 5 cells as calculated from the nuclear and cell (3D) diameters, the mean nuclear volume fraction estimate for the cell population is 28.8% with a CE of 4.85%, which is contributed by inter-cell variation. From the consistent estimates of the 5 cells' volume fractions, the mean volume fraction estimate is 28.88% with a total CE of 6.34%, which is contributed by both inter- and intra-cell variations. The intra-cell variation would therefore account for 41% [(0.0634² - 0.0485²) / 0.0634²] of the total error. When the 4 systematic sections through each cell were regarded as independent (i.e. an estimate was calculated from each section), the mean volume fraction estimate was 21.41% (CE 6.88%).
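The 41% figure for model study 2 follows from subtracting squared CEs, which can be checked with a few lines. The function name is hypothetical, and the calculation assumes, as the text does, that the squared inter- and intra-cell contributions add up to the squared total CE.

```python
def intra_share(total_ce, inter_ce):
    """Fraction of the squared total CE not explained by inter-individual
    variation, i.e. the share attributed to intra-individual sampling."""
    return (total_ce ** 2 - inter_ce ** 2) / total_ce ** 2

share = intra_share(total_ce=0.0634, inter_ce=0.0485)
print(f"intra-cell share of the total error: {share:.0%}")
```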
In actual study 3, the spermatozoal number per unit volume (480 μm³) of spermatozoal fluid in epididymis was estimated to be 1.16 (the total spermatozoal number counted per organ, divided by the total volume of the optical disectors used for the counting), with a total CE of 10.38%. When the number of disectors was specially handled as described for estimating the inter-disector variation, the spermatozoal number per "disector" was 0.97 (CE 6.49%).
Notes to Table 1. a: the true (theoretical) value; b: according to the true and consistently estimated volume fractions (see the results of model study 2); c: according to Matheron's transitive method as described by Gundersen and Jensen (1987) and Cruz-Orive (1989); d: according to Braendgaard et al. (1990); e: according to Eq. 3.
YANG Z ET AL: Caveat on the error analysis

DISCUSSION
It has been well recognized that systematic sampling is usually more efficient than independent sampling, and that the stereological error will be overestimated when a systematic sample is treated as an independent one (Gundersen and Jensen, 1987; Cruz-Orive, 1989; Mattfeldt, 1989). Methods 1 to 3, which are based on the assumption of independent sampling, will therefore tend to overestimate the stereological error when used in a study with a systematic sampling scheme. However, the magnitude of the overestimation by Method 3 would not be as large as that by Methods 1 and 2, as the two sub-samples used in Method 3 are still systematic. Consistent results were obtained in the studies of this paper: (i) the errors estimated by Methods 1 and 2 were about 4 to 200 times larger than that estimated by Method 3 in studies 2~5, and (ii) the bias of the error estimate by Methods 1 and 2 was comparable with that by Method 3 in model study 1, where the intra-individual sampling was indeed independent random (Table 1).
Regarding the classical Method 2 for nested sampling, the sample size at each sampling level should be constant, or the same average sample size should be used in the calculation; otherwise inconsistent results would be obtained: the total intra-individual error would differ considerably depending on which levels of intra-individual sampling are considered. But such a constant sample size at each sampling level cannot be guaranteed in systematic sampling practice (see actual studies 3~5).
As a matter of fact, if the sample size is arbitrarily made constant, e.g. if the same number of fields is always sampled from each systematic section no matter how large the section is, the stereological estimate is no longer unbiased. The importance of systematic random sampling, or uniform random sampling, for obtaining an unbiased estimate can never be overemphasized (Cruz-Orive and Weibel, 1981; Gundersen, 1991).
Treating a systematic sample as independent for error analysis by Methods 1 and 2 may be an awkward procedure as well. In actual study 3, the inter-disector variation was hard to evaluate because not all disectors would lie completely inside the measuring space, i.e. the spermatozoal fluid in the epididymis. It may also bias the stereological estimate itself. In model study 2, for example, calculating an "independent" volume fraction estimate from each section through the cell resulted in an underestimation of ~26%.
In summary, systematic sampling rather than independent sampling is often used in practice, and an unbiased error estimator for systematic sampling is not available. Assuming independence for a systematic sample will overestimate the stereological error. Dividing a systematic sample into two systematic sub-samples and evaluating the error by comparing the two sub-sample means (see Eq. 2) appeared to be a reasonably simple and practical method for error analysis.
A preliminary report of some of the data (Yang et al., 1999) was presented at the Xth International Congress for Stereology, Melbourne, Australia, 1-4 November 1999.

Table 1. Results of error analysis.