Colour Image in 2D and 3D Microscopy for the Automation of Pollen Rate Measurement

Pollen monitoring is of great importance for the prevention of allergy. As this activity is still largely carried out by humans, there is an increasing interest in the automation of pollen monitoring. The goal is to reduce monitoring time in order to plan more efficient treatments. In this context, an original device based on computer vision is being developed. The goal of such a system is to provide accurate measurement of pollen concentration. This information can be used by palynologists and clinicians, or by a forecast system to predict pollen dispersion. The system is composed of two modules: pollen grain extraction and pollen grain recognition. In the first module, the pollen grains are observed in light microscopy and are extracted automatically from a microscopic slide dyed with fuchsin and digitised in 3D. The colour segmentation techniques implemented on a hardware architecture are presented. In the second module, the pollen grains are analysed for recognition. To accomplish recognition, it is necessary to work on 3D images and to use deep palynological knowledge. This knowledge describes the pollen types according to their main visible characteristics and to those which are important for recognition. Some pollen structures are identified, like the pore with annulus in Poaceae, the reticulum in Olea and similar pollen types, or the cytoplasm in Cupressaceae. Preliminary results show correct recognition of some pollen types, like Urticaceae or Poaceae, and of some groups of pollen types, like the reticulate group.


INTRODUCTION
Automatic recognition of pollen grains is a relatively new application in computer vision. There have been studies trying to differentiate aerobiological spores by image analysis (Benyon et al., 1999) or to identify pollen texture by neural networks (Li and Flenley, 1999). Recently, work has been presented on pollen recognition using 2D statistical classification (Jones, 2000) or using 3D gray scale invariants with confocal microscopy (Ronneberger, 2000).
The original aspects of our approach for pollen recognition are the combination of statistic-based and knowledge-based techniques, the use of 3D and colour information, and the use of external information about the origin of the grain (sampling date and location).
The semi-automatic system is composed of two modules: pollen grain extraction and pollen grain recognition.

FIRST MODULE: POLLEN GRAIN EXTRACTION
The first module analyses the pollen slide and extracts the pollen grains without recognition of their types. In this section, both the hardware and the software of the module are described. The isolation of the pollen grains on the slide uses a two-dimensional algorithm (Tomczak, 2000); 3D images are then digitised.
The input samples are microscopic slides which represent daily harvests (Stillman, 1996; Galan Soldevilla, 1997). A workstation for both automatic and manual handling and reading of the slides has been designed (Fig. 1). The hardware of the system includes an optical transmitted light microscope equipped with a 60X lens (ZEISS Axiolab), a mono CCD colour camera (SONY XC711) with a framegrabber card (MATROX Meteor RGB) for image acquisition, and a micro-positioning device (PHYSIK INSTRUMENTE) to shift the slide under the microscope. These components are driven by a PC. A graphic interface enables the technician to operate the system easily. The semi-automatic pollen extraction module (Fig. 2) is implemented on this workstation.
The system needs to extract information about pollen grains from image data. To achieve this, two problems must be solved. First, autonomous image acquisition in microscopy requires sharpness to be adjusted in real time before image data are acquired. Therefore, an automated image focusing algorithm has been designed. It is based on a sharpness criterion computed from image data and on a search strategy for the maximum of this criterion (Tomczak, 1998). It allows the system to compute, in real time, the best focusing position for a given sample from a small number of measuring positions.
Once the image has been focused, the second problem is the detection of pollen grains in the scene. The slides are currently dyed with fuchsin (pink). However, the coloration varies considerably among pollen types, and some other airborne particles are also sensitive to the colorant. For this reason, simple segmentation techniques (for instance, techniques based only on chrominance analysis) are not efficient enough to localise and isolate the pollen grains. To solve this problem, a localisation algorithm based on a split-and-merge scheme with Markovian relaxation has been designed. It consists of three steps: colour coding (Noriega, 1996), segmentation and interpretation (Rouquet, 1998), and detection and extraction of pollen grains (Tomczak, 2000).
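The focusing step described above (a sharpness criterion plus a search for its maximum from few measuring positions) can be illustrated with a minimal sketch. The actual criterion and search strategy of (Tomczak, 1998) are not detailed in this paper, so both functions below are stand-ins: `sharpness` uses the mean squared difference between neighbouring pixels, and `find_focus` is a simple coarse-to-fine search that assumes the criterion has a single peak. The callable `measure(z)` is hypothetical and represents "move the stage to z, grab an image, return its sharpness".

```python
import numpy as np

def sharpness(image):
    """Assumed criterion: mean squared difference between neighbouring pixels."""
    img = np.asarray(image, dtype=float)
    dx = np.diff(img, axis=1)  # horizontal neighbour differences
    dy = np.diff(img, axis=0)  # vertical neighbour differences
    return float((dx ** 2).mean() + (dy ** 2).mean())

def find_focus(measure, z_min, z_max, positions=7, rounds=3):
    """Coarse-to-fine search for the z position that maximises measure(z).

    Assumes a single peak in [z_min, z_max]; uses positions * rounds
    measurements in total.
    """
    lo, hi = z_min, z_max
    best_z = lo
    for _ in range(rounds):
        zs = np.linspace(lo, hi, positions)
        scores = [measure(z) for z in zs]
        best_z = float(zs[int(np.argmax(scores))])
        step = (hi - lo) / (positions - 1)
        # Narrow the interval to one step on each side of the current best.
        lo, hi = max(z_min, best_z - step), min(z_max, best_z + step)
    return best_z
```

With the default settings the search uses 21 measurements, which is in the spirit of the "small number of measuring positions" mentioned above; the real system may use a different strategy.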
Fig. 3 shows an example of the detection and extraction of pollen grains from an RGB image. The localisation rate is estimated to be over 90% of the total pollen grains on the slides. This rate can be increased with more precise dye dosing during slide preparation. It is better than that of the method proposed by (France et al., 97), which succeeded in localising 80% of the pollen grains from grey level images using a neural network.
Once the central image of a pollen grain is detected, the last step is the acquisition of the whole grain in three dimensions. To achieve this, the system automatically digitises the grain into a sequence of 100 colour images showing the grain at different focus levels (with a step of 0.5 microns; see Fig. 4). This sequence of images makes it possible to perform the identification using 3D characteristics.

SECOND MODULE: POLLEN GRAIN RECOGNITION
From a sequence of 100 images representing the pollen grain at different focus levels, the next step is to recognise its type. The identification of the pollen grain type is done using two kinds of information:
- Global measures and statistics computed on the central image of the grain
- Type-specific characteristics searched for on selected images of the sequence
The main difficulties for recognition are due to the particular appearance of pollen grains in the images. The pollen grains are 3D translucent objects, almost spherical, with sizes varying mostly from 20 to 80 microns. They are observed using an optical microscope, as described in the previous section, which can only focus on part of a grain at a time, introducing blur in the digitised images (see Fig. 4). For more details, see (Tomczak, 2000).
The first step of recognition performs a coarse classification by identifying some plausible hypotheses regarding the type of an unknown grain. These hypotheses are used to guide the next processing steps. The grain is segmented from the central image of the sequence using automatic thresholding techniques based on colour histograms (k-means method applied to RGB histograms) and some mathematical morphology operations (opening and closing). Some global measures are then computed on the grain. These measures are classical pattern recognition features: mean colour, size, perimeter, compactness, eccentricity, moments of inertia, convex hull area, concavity, convexity. Such features have already been used in other applications such as fungal spore differentiation (Benyon et al., 1999) or planktic foraminifera identification (Yu et al., 1996).
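As an illustration of the segmentation and measurement step, the sketch below clusters the RGB pixels of the central image into two classes with k-means and derives a few of the listed measures from the resulting mask. It is a simplified stand-in under stated assumptions: it clusters raw pixels rather than histograms, initialises the two centres from the darkest and brightest pixels, omits the morphological opening and closing, and computes only area, centroid and eccentricity. The function names are hypothetical.

```python
import numpy as np

def kmeans_two_classes(pixels, iterations=20):
    """Cluster RGB pixels (N x 3) into two classes; label 0 is the darker one.

    Assumption: centres are initialised from the darkest and brightest pixels.
    """
    pixels = np.asarray(pixels, dtype=float)
    brightness = pixels.sum(axis=1)
    centres = np.stack([pixels[brightness.argmin()], pixels[brightness.argmax()]])
    labels = np.zeros(len(pixels), dtype=int)
    for _ in range(iterations):
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centres[k] = pixels[labels == k].mean(axis=0)
    return labels

def shape_features(mask):
    """Area, centroid and eccentricity of a binary mask (at least two pixels)."""
    ys, xs = np.nonzero(mask)
    cov = np.cov(np.stack([xs, ys]))  # second-order central moments
    minor, major = np.sort(np.linalg.eigvalsh(cov))
    eccentricity = float(np.sqrt(max(0.0, 1.0 - minor / major)))
    return {
        "area": int(ys.size),
        "centroid": (float(ys.mean()), float(xs.mean())),
        "eccentricity": eccentricity,
    }
```

For an image array `image` of shape (height, width, 3), a grain mask could be obtained as `(kmeans_two_classes(image.reshape(-1, 3)) == 0).reshape(image.shape[:2])`, assuming the dyed grain is darker than the background.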
From a database containing 350 reference pollen grains of 30 different types, the system has learnt the covariance matrices representing the different types with respect to their most descriptive measures. The Mahalanobis distance is computed between an unknown grain and the existing types. For example, for an unknown grain, one can obtain the following sorted list of possible types with their respective distances: Cupressaceae (2.23), Coriaria (2.63), Platanus (6.27), Alnus (6.69), Brassicaceae (6.86). This list of possible types is used to select the characteristics that the system searches for to confirm the initial hypotheses.
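The ranking by Mahalanobis distance can be sketched as follows. For a feature vector x and a type with mean m and covariance matrix S, the distance is the square root of (x - m)^T S^-1 (x - m). The function names and the dictionary layout are assumptions made for this example, and the pseudo-inverse is used only to keep the sketch robust when a covariance matrix is close to singular.

```python
import numpy as np

def fit_types(samples_by_type):
    """Learn a mean vector and an inverse covariance matrix for each type.

    samples_by_type maps a type name to an array of shape (n_grains, n_measures).
    """
    models = {}
    for name, samples in samples_by_type.items():
        x = np.asarray(samples, dtype=float)
        cov = np.cov(x, rowvar=False)
        models[name] = (x.mean(axis=0), np.linalg.pinv(cov))
    return models

def rank_types(features, models):
    """Return (type, Mahalanobis distance) pairs sorted by increasing distance."""
    f = np.asarray(features, dtype=float)
    ranked = []
    for name, (mean, inv_cov) in models.items():
        d = f - mean
        ranked.append((name, float(np.sqrt(d @ inv_cov @ d))))
    return sorted(ranked, key=lambda pair: pair[1])
```

The output of `rank_types` has the same form as the sorted list quoted above: the most plausible type comes first, followed by the alternatives with larger distances.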
We have performed the classification on the previous database using the leave-one-out technique (Lachenbruch, 1968). Only the global measures were used in this test, which gave a classification result of 67% of correctly recognised pollen grains. This result is not satisfactory and leads us to include more domain-dependent characteristics to recognise the pollen grains.
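The leave-one-out protocol itself is simple to express in code: each grain is held out once, the classifier is trained on the remaining grains, and the held-out grain is then classified. The sketch below is generic; the nearest-mean classifier is only a placeholder for the covariance-based classifier of the paper, and all function names are assumptions.

```python
import numpy as np

def leave_one_out_accuracy(features, labels, train, predict):
    """Fraction of samples correctly classified when each is held out once."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    hits = 0
    for i in range(len(labels)):
        keep = np.arange(len(labels)) != i  # every sample except the i-th
        model = train(features[keep], labels[keep])
        hits += int(predict(model, features[i]) == labels[i])
    return hits / len(labels)

def train_nearest_mean(x, y):
    """Placeholder classifier: one mean feature vector per class."""
    return {c: x[y == c].mean(axis=0) for c in np.unique(y)}

def predict_nearest_mean(model, sample):
    """Return the class whose mean vector is closest to the sample."""
    return min(model, key=lambda c: np.linalg.norm(sample - model[c]))
```

Calling `leave_one_out_accuracy(X, y, train_nearest_mean, predict_nearest_mean)` returns a value between 0 and 1, comparable in kind to the 67% figure reported above.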
The second step of recognition is to look for specific pollen characteristics in 3D. Different pollen types can have different characteristics. These characteristics are already used by human experts to identify the pollen grains (cytoplasm, pores, reticulum, granules, ...). Such characteristics can be located at different places on the 3D grain and can appear differently depending on the orientation of the grain under the microscope.
Depending on the first hypotheses made about the possible type of an unknown pollen grain, some type-specific characteristics are tested in order to improve the initial estimations.
The general algorithm for testing a given characteristic for a specific type is:
- 2D segmentation of several selected images
- 3D validation combining all segmentation results
The recognition system does not analyse all 100 images of the digitised sequence to find a characteristic. Only 5 to 10 key images are enough to confirm or reject the presence of a characteristic. To find these key images, two methods are possible. First, the sequence can be sampled to extract n images with a given step. Second, the sequence can be analysed globally to find the most meaningful images (in terms of clear, unblurred content). This second method is performed using the Sum Modified Laplacian operator, which provides local measures of the quality of image focus (Nayar and Nakagawa, 1994). Computing this operator for each image of the sequence makes it possible to identify the clearest images, which contain peaks and high-contrast details with strong colour variations. Either method of key image selection can be used, depending on the characteristic that is targeted. On these key images, some regions of interest are computed to facilitate the search for characteristics.
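The second selection method can be sketched as follows. The modified Laplacian at a pixel is |2I(x,y) - I(x-s,y) - I(x+s,y)| + |2I(x,y) - I(x,y-s) - I(x,y+s)|, and the focus measure sums the values that reach a threshold. This sketch makes several assumptions: images are grey-level (a colour image could first be averaged over its channels), the sum is taken over the whole image rather than over local windows, and the threshold defaults to zero. The function names are hypothetical.

```python
import numpy as np

def modified_laplacian(image, step=1):
    """Modified Laplacian of a 2D grey-level image (interior pixels only)."""
    img = np.asarray(image, dtype=float)
    s = step
    centre = img[s:-s, s:-s]
    ml_x = np.abs(2 * centre - img[s:-s, :-2 * s] - img[s:-s, 2 * s:])
    ml_y = np.abs(2 * centre - img[:-2 * s, s:-s] - img[2 * s:, s:-s])
    return ml_x + ml_y

def focus_score(image, threshold=0.0):
    """Sum of the modified Laplacian values that reach the threshold."""
    ml = modified_laplacian(image)
    return float(ml[ml >= threshold].sum())

def select_key_images(stack, n_keys=5):
    """Indices (in sequence order) of the n_keys sharpest images of a stack.

    stack has shape (depth, height, width).
    """
    scores = np.array([focus_score(img) for img in stack])
    sharpest = np.argsort(scores)[::-1][:n_keys]
    return sorted(sharpest.tolist())
```

The first method (regular sampling) needs no focus measure at all: with NumPy it amounts to slicing the stack, for example `stack[::10]`.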
Various segmentation algorithms are used to detect the characteristics (automatic thresholding, Laplacian of Gaussian, ...) (Pal and Pal, 1993). The goal is to obtain a segmentation which is good enough to confirm or reject the presence of the characteristics. To validate the different segmentations, the features already used for the first estimations are computed on them. In addition, other features such as the spatial position of the segmented regions and their overlap (in different 2D images) are computed. Learnt covariances are used for validation (the same covariance model as explained above), so the result is a sorted list of possible types (new hypotheses), which can be combined with the current hypotheses to update them.
Fig. 5 shows an example of the detection of a characteristic, the cytoplasm of the Cupressaceae pollen type (cypress tree). The cytoplasm is more visible for this type than for others. It is located in the centre of the grain, without a precise shape, and appears bright in images above the centre and dark in images below the centre. The detection algorithm therefore uses 5 to 7 images, equally distributed around the central image, and looks for bright or dark regions in the centre, depending on the location of the image (above or below the central image). The resulting regions are compared using several features (shape, colour, size and overlap) for validation.
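The overlap and spatial position features mentioned above can be computed from the binary masks produced by the 2D segmentations. The sketch below is only one possible formulation: it measures overlap as intersection over union between consecutive key images and summarises position by the spread of the region centroids. These definitions and the function names are assumptions; the paper does not specify how the features are defined, and the validation itself relies on the learnt covariances rather than on fixed thresholds.

```python
import numpy as np

def overlap_ratio(mask_a, mask_b):
    """Intersection over union of two binary masks (0.0 if both are empty)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0

def region_centroid(mask):
    """Centroid (row, column) of a non-empty binary mask."""
    ys, xs = np.nonzero(mask)
    return float(ys.mean()), float(xs.mean())

def stack_overlap_features(masks):
    """Overlap and position features for masks from consecutive key images."""
    overlaps = [overlap_ratio(a, b) for a, b in zip(masks, masks[1:])]
    centroids = np.array([region_centroid(m) for m in masks])
    return {
        "mean_overlap": float(np.mean(overlaps)),
        "centroid_spread": float(np.linalg.norm(centroids.std(axis=0))),
    }
```

A region that persists at the same place through the key images gives a mean overlap near 1 and a centroid spread near 0, which is the behaviour expected of a structure such as the cytoplasm.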
Using this algorithm, the resulting hypothesis types are different from the hypothesis types obtained by computing global measures. This is a key point for the success of identification. For example, using global measures, the types most similar to Cupressaceae (see Fig. 5) are Plantago, Platanus or Populus. By detecting the cytoplasm, the similar types are Poaceae, Salix and Parietaria, which are different types (not only by their names, but also in appearance). When the two lists are combined, it can be expected that the Cupressaceae hypothesis will be reinforced. This strategy is used by iterating on several measures and characteristics until no possible confusion remains (or until no other characteristic can be tested).
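The effect of combining the two lists can be illustrated with a small sketch. The paper does not state the combination rule, so the rule used here is purely an assumption: ranks are summed across lists, and a type absent from a list receives that list's worst possible rank. The order of the types inside each example list is also illustrative.

```python
def combine_hypotheses(*ranked_lists):
    """Merge ranked type lists (best first) by summing ranks.

    Assumed rule: a type missing from a list gets rank len(list), that is,
    one position worse than the last type present in that list.
    """
    types = {t for ranked in ranked_lists for t in ranked}

    def total_rank(t):
        return sum(
            ranked.index(t) if t in ranked else len(ranked)
            for ranked in ranked_lists
        )

    # Ties are broken alphabetically to keep the result deterministic.
    return sorted(types, key=lambda t: (total_rank(t), t))


# Illustrative lists built from the example in the text.
from_global_measures = ["Plantago", "Cupressaceae", "Platanus", "Populus"]
from_cytoplasm = ["Cupressaceae", "Poaceae", "Salix", "Parietaria"]
combined = combine_hypotheses(from_global_measures, from_cytoplasm)
```

Because Cupressaceae is the only type present in both lists, it comes first in `combined`, which is the reinforcement described above.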

CONCLUSION
The recognition system is currently being integrated. The preliminary classification results, using 2D global measures and very few 3D type-specific characteristics for some pollen types, show the recognition of 73% of the pollen grains (database of 350 pollen grains of 30 different types), compared to 67% using only global measures. We aim to improve this result by integrating other characteristics into the system. One goal is to include more characteristics to ensure a level of redundancy in the recognition process, in order to cope with possible partial occlusions of the grains by dust or other particles.

Fig. 4. Image digitisation in three dimensions. (a) For each pollen grain, a sequence of 100 colour images is taken, showing the grain at different focus levels (with a step of 0.5 microns). (b-d) Images at different focus levels of an Olea grain, showing different details needed for its identification.

Fig. 5. Example of type-specific characteristic recognition with the Cupressaceae cytoplasm. 2D segmentations of some selected images around the central image are combined to confirm or reject the presence of the cytoplasm.