A human-computer collaborative workflow for the acquisition and analysis of terrestrial insect movement in behavioral field studies
EURASIP Journal on Image and Video Processing, volume 2013, Article number: 48 (2013)
The study of insect behavior from video sequences poses many challenges. Despite the advances in image processing techniques, the current generation of insect tracking tools is only effective in controlled lab environments and under ideal lighting conditions. Very few tools are capable of tracking insects in outdoor environments where the insects normally operate. Furthermore, the majority of tools focus on the first stage of the analysis workflow, namely the acquisition of movement trajectories from video sequences. Far less effort has gone into developing specialized techniques to characterize insect movement patterns once acquired from videos. In this paper, we present a human-computer collaborative workflow for the acquisition and analysis of insect behavior from field-recorded videos. We employ a human-guided video processing method to identify and track insects from noisy videos with dynamic lighting conditions and unpredictable visual scenes, improving tracking precision by 20% to 44% compared to traditional automated methods. The workflow also incorporates a novel visualization tool for the large-scale exploratory analysis of insect trajectories. We also provide a number of quantitative methods for statistical hypothesis testing. Together, the various components of the workflow provide end-to-end quantitative and qualitative methods for the study of insect behavior from field-recorded videos. We demonstrate the effectiveness of the proposed workflow with a field study on the navigational strategies of Kenyan seed harvester ants.
Characterizing and understanding insect movement patterns is a challenging endeavor. Due to the stochastic nature of insect motion, researchers often need to analyze large trajectory datasets that capture their movement under diverse conditions to accurately interpret their behavior. Automated image processing techniques have therefore become very popular among entomologists and behavioral ecologists as a way of quickly acquiring large datasets of insect trajectories from video. Nevertheless, extracting and quantifying the behavior of the focal insects with sufficient accuracy remains difficult due to the limitations of current image processing techniques. Consequently, in the vast majority of studies, researchers perform their experiments in controlled indoor labs and under ideal lighting conditions to reduce noise and improve tracking accuracy. Lab-based experiments, however, may radically alter the landscape and stimuli that insects normally encounter in their native habitat, casting doubts on the ecological validity of such experiments. Furthermore, many environmental variables are extremely difficult to replicate in the lab. For example, studies involving insect navigation often have to be carried out in the field, as the natural landscape plays a crucial role in providing navigational cues to insects. Yet, very few techniques have been proposed to acquire and quantify insect motion patterns in natural settings, with the exception of tracking honeybees in hives. To our knowledge, no robust techniques have been proposed to acquire the movement of terrestrial insects in the field.
In this paper, we present a workflow for acquiring and analyzing the movement patterns of terrestrial insects (e.g., ants) from field-recorded videos. Rather than providing a fully automated solution, we adopt a human-computer collaborative analysis paradigm where a human analyst and a computer work together to accurately complete the task; the computer provides semi-automated processing of video sequences to visually segment and track the insects, while the human analyst provides judgment, interpretation of behavior, and corrective intervention in ambiguous situations to improve tracking precision. We also address the problem of making sense of insect behavior by providing post-acquisition qualitative and quantitative analysis methods. In summary, we contribute three analytical components that are integrated to provide an end-to-end workflow for the study of insect behavior from video sequences:
A novel, human-guided image processing pipeline to extract and track insects in outdoor field environments with high levels of noise
A novel trajectory visualization tool for the exploratory and qualitative analysis of insect behavioral patterns
Quantitative analysis methods for statistical testing of hypotheses and spatiotemporal movement regularities in insect motion trajectories
The flexibility of the proposed workflow makes it uniquely suited for field entomologists and experimental ecologists; unlike existing tools, our image processing pipeline does not presume long uninterrupted observational periods, making it suitable for behavioral assays that require repeated active manipulation of the insects and their surroundings in their natural habitat. In the rest of this paper, we discuss the limitations of existing techniques and show how our workflow addresses them in Section 2. We present the individual components of the workflow and describe how they are integrated in Section 3. In Section 4 we illustrate the effectiveness of the proposed workflow with a real-world use case involving a field study of the navigational strategies of Kenyan seed harvester ants and demonstrate the precision of the proposed human-guided video analysis technique. We discuss the current limitations of the workflow and planned future research in Section 5 and conclude the paper in Section 6.
2. Related work
The study of insect behavior relies largely on behavioral assays. The movement of individual insects can be extremely informative as to the nature of navigational strategies and decision making processes (reviewed in [1–3]). However, many previous studies have been somewhat limited in their scope due to the lack of workflows for collecting, processing, and analyzing trajectories in the field. The observational methods that ecologists and biologists use to collect data have constraints on the resolution of trajectory information that can be collected in field experiments; even recent studies primarily rely only on the measured orientations of moving insects rather than exploiting full trajectories (e.g., [3–5]). It has long been recognized that the distribution of orientations and turning angles making up an insect’s trajectory promises to contain much more information about the behavioral rules governing navigation (e.g., ). Such detailed information has traditionally been collected by hand (e.g., ) or by moving cameras in order to keep the focal insect at the center of the viewfinder and then inferring position from the tilt and azimuth angles (reviewed in ) - both relatively time-consuming methods. The lack of computational techniques that allow accurate acquisition and analysis of insect trajectories in the field has largely impeded the research on many interesting problems in behavioral entomology.
There is a wealth of image analysis methods for tracking insects in videos recorded under highly controlled conditions. For instance, Balch et al. described an algorithm to track ants in special containers with ideal lighting conditions. SwisTrack is another widely used tool for tracking insects and small robots. Its modular architecture allows for configurable image processing pipelines that can be built from basic components (e.g., background subtraction, blob detection, particle tracking). Beetrack is a similar software tool with a more advanced toolset for the analysis of honeybees’ locomotion. While the aforementioned tools provide fully automated insect tracking, they require controlled environments along with a predetermined set of parameters, making them unsuitable for outdoor field studies where the lighting conditions are constantly changing and the visual field is susceptible to frequent intrusion from other insects. To the best of our knowledge, no one has successfully used any of them to track insects in their native habitat.
Statistically inspired approaches have been developed in an attempt to overcome the limitations of traditional image processing pipelines. Khan et al. developed an effective particle tracking system using Markov chain Monte Carlo. Their method is capable of tracking interacting agents and demonstrated good results when used to track ants in the lab. Kimura et al. described a novel technique based on vector quantization to track large numbers of densely packed honeybees in hives.
Despite their attractiveness, automated image processing methods are susceptible to many sources of error which have the potential to significantly degrade the accuracy of the extracted trajectories. Collaborative human-computer approaches have been proposed to improve accuracy in situations that are difficult to resolve by the computer alone. For example, DeCamp and Roy relied on a human operator to annotate preprocessed video segments to track human activities in indoor spaces. Li et al. employed a fully automated image processing algorithm to track migrating cells and later relied on a human operator to correct errors in trajectories. Voss and Zeil described a technique to extract the three-dimensional (3D) motion of flying insects under natural light conditions, requesting human intervention in ambiguous situations. We also employ a human-computer collaboration paradigm: a human operator tags the initial location of the focal insect, the computer performs automatic image processing and tracking where possible, and the operator is asked for input when ambiguities occur.
Once insects are recognized and their motion tracked and recorded in the form of individual trajectories, the next task is to analyze those trajectories to discover and characterize consistent behavioral patterns the insects exhibit. Many automated techniques have been proposed to quantitatively analyze the motion of insects and animals. For instance, a data-driven Markov chain Monte Carlo has been employed to infer temporal patterns in the motion of bees . The k-means clustering of movement-based feature vectors has been used to recognize distinct behavioral patterns exhibited by grasshoppers . Time series analysis was used to recognize distinct behavioral states in leeches . However, to the best of our knowledge, no techniques have been proposed for exploratory, human-guided qualitative analysis of insect motion aside from simple observations with the naked eye. This is important as exploratory analysis has the potential to reveal behavioral patterns that may be difficult to recognize and interpret from statistical data alone . We address this gap in the literature by providing a novel interactive visualization tool to explore and visually analyze large collections of insect trajectories. Once qualitative patterns are detected, they can be quantitatively tested for statistical significance in the final stage of the workflow.
In summary, previous works on acquiring and analyzing insect movement have mostly focused on automatic, passive observations of insect collectives in highly controlled environments and under ideal lighting conditions. Yet, in many cases, the focal behavior is largely dependent on the natural habitat of the insect and thus can only be studied in the field. Moreover, field entomologists often need to actively and repeatedly manipulate the insects and their surrounding environment in order to elicit responses for specific stimuli or situations. This renders the majority of existing tools unsuitable as they often assume long, uninterrupted observational periods. The workflow we propose in this paper addresses these issues and targets studies where researchers need to record, extract, and analyze large collections of insect trajectories under a variety of experimental conditions. Furthermore, we also address the problem of actually analyzing and making sense of those trajectories once they are recorded. By integrating interactive visual exploration and statistical analysis of trajectory features, our workflow supports both qualitative and quantitative analyses.
Field research poses unique challenges that are not normally encountered in the lab. The stringent time and budgetary constraints combined with the remoteness of many field sites place additional emphasis on the quality and value of every experiment. Such experiments typically have to be performed manually - often with one insect at a time - to isolate the relevant variables and to accurately characterize the behavior at the individual level. While efficient data acquisition is desirable, field researchers often place a higher value on the reliability and accuracy of the data, due to the considerably high cost of field studies and the difficulty in replicating them. Yet, compared to lab-based research, field experiments are unpredictable in nature and suffer significantly lower signal-to-noise ratios, making accurate video analysis even more challenging. For example, the lighting conditions are far more dynamic in the field and the visual scene is susceptible to interference and intrusion from grasses, shadows, and even other insects or animals.
The manual, high-cost, narrow-band nature of field experiments combined with the increased level of noise call for workflows that prioritize data accuracy and resolution over throughput. Human-computer collaborative systems provide a good compromise to address these challenges . In this paradigm, a human analyst and a computer work collaboratively to complete the task; the computer performs laborious tasks, such as detecting and tracking insects in image sequences, while the human provides guidance and intervention in difficult and ambiguous situations, such as noisy images that are difficult for the computer to resolve. Furthermore, a human-computer collaborative workflow can potentially facilitate high-level qualitative analysis of the data by leveraging human judgment and interpretation. One could envision an interactive system where a researcher contemplates theories regarding a hypothesized or observed behavior with the computer providing the means to quickly query large collections of trajectories, enabling the researcher to weigh the data against his/her hypotheses in a visual and qualitative manner. When a number of promising hypotheses have been formulated, the computer can test those hypotheses quantitatively by performing computational and statistical tests on various trajectory features.
Although human-computer collaborative workflows do require increased involvement of researchers throughout the analysis, we believe that this active involvement translates to more accurate data acquisition as well as improved understanding of the underlying insect behavior. The key to building effective workflows is designing interactive visual interfaces that allow researchers to ‘see’ the data, supply judgment and interpretation, and intervene to correct errors and artifacts produced by the computer.
Our workflow consists of four main stages, as illustrated in Figure 1. The first stage comprises human-guided video processing to segment insects, track them, and extract their trajectories. It comprises a tool that not only implements common video processing pipelines but also includes interactive features to allow a human to intervene and correct errors, such as misidentification of the target in the beginning of the recording. In the second stage, the extracted trajectories are transformed according to a camera model to cancel perspective effects. In the third stage, the corrected trajectories are explored using an interactive visualization tool for exploratory and qualitative analysis. We employed a number of novel features, including a 3D visualization to enable researchers to discover recurring spatial and temporal patterns, compare trajectories under different conditions, and quickly test hypotheses pertaining to observed behavior. Once a number of promising hypotheses have been formulated, they can be statistically verified in the fourth stage. The last two stages of the workflow comprise a ‘sense making loop’ where plausible theories are first formulated and explored visually in a qualitative manner and then statistically verified in the following stage, which, in turn, may lead to new theories and hypotheses that can be visually explored again. The four stages comprise an end-to-end workflow, providing the analytical tools needed to acquire insect trajectories from field-recorded videos and make sense of these trajectories. We describe each stage in detail in the following subsections.
3.2 Human-guided video processing and insect tracking
Lighting conditions, shadows, moving debris, soil color, and other encroaching insects and animals can adversely affect the quality of data in field studies. For instance, moving grasses may cast shadows, causing false positives and interfering with tracking. To achieve accurate tracking, we employ an automated algorithm that performs the bulk of the image processing and insect tracking coupled with an interactive user interface that enables the human operator to watch the algorithm's output and intervene to rectify errors in the tracking. The program takes control from the automated algorithm and hands it to the operator to take action when needed. Control is handed back to the algorithm when the ambiguity is resolved. The operator may also initiate corrective intervention when errors in the tracking are observed. We first describe the video processing pipeline, discuss the tracking algorithm, and then describe how a human operator can intervene to rectify problems in tracking.
3.2.1 Video processing
The pipeline takes a video feed as an input, subtracts the background, filters each frame for noise and artifacts, and outputs the moving parts as binary blobs of pixels. Figure 2 illustrates this process. The following is a description of the individual steps:
Background elimination. We employ the foreground object detection algorithm described by Li et al. to segment moving insects from the background scene, giving us a binary image indicating the location of insects in the frame. Ideally, this step would completely eliminate the background leaving only moving insects. In most cases, however, the visual scene is simply too noisy, resulting in false-positive blobs from debris, grasses, and shadows.
Masking. The non-interesting parts of the image are removed to ease the task of the tracking algorithm and remove unwanted noise and artifacts. For example, if the experimenter is using a marked experimental arena to conduct the experiments, the surroundings can be removed. The mask has to be manually updated whenever the camera or subject position changes.
Noise filtering. The frame is processed to remove unwanted noise and artifacts. We first apply a median blur filter to remove ‘salt and pepper’ noise. The frame is then smoothed further with a 3 × 3 Gaussian kernel and thresholded to obtain a binary image.
Insect shadow elimination. Insects may cast prominent shadows on the ground, which tend to confuse the tracking algorithm and cause it to jump back and forth between the insect and its shadow, producing artifacts in the recorded trajectory. In some situations, the shadow cast by an insect can be larger than the insect itself. We employ a series of dilation and erosion operations to eliminate the insect’s shadow and/or merge it with its body [24, 25]. Eroding the image results in the elimination of smaller shadows, while dilation causes the insect and its shadow to merge into a single blob. Figure 3 illustrates the effect of dilation and erosion on an insect blob and its shadow. Typically, a series of erosions followed by a series of dilations are applied, or vice versa. Because the size of the shadow often depends on the time of the day and the actual size of the insect, the appropriate sequence of dilation and erosion operations must be determined empirically.
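The effect of erosion and dilation on an insect blob and its shadow can be illustrated with a minimal sketch. It is not the implementation used in the workflow: it assumes a binary frame stored as a list of rows of 0/1 values and a 3 × 3 square structuring element, and the names `erode`, `dilate`, and `apply_sequence` are hypothetical.

```python
def _morph(img, rule):
    """Apply a 3x3 neighborhood rule to a binary image (list of rows of 0/1)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the 3x3 window; pixels outside the frame count as background.
            window = [
                img[j][i] if 0 <= j < h and 0 <= i < w else 0
                for j in (y - 1, y, y + 1)
                for i in (x - 1, x, x + 1)
            ]
            out[y][x] = 1 if rule(window) else 0
    return out


def erode(img):
    """A pixel survives only if its whole 3x3 window is foreground."""
    return _morph(img, all)


def dilate(img):
    """A pixel is set if any pixel in its 3x3 window is foreground."""
    return _morph(img, any)


def apply_sequence(img, ops):
    """Apply an empirically chosen sequence of operations, e.g. [erode, dilate]."""
    for op in ops:
        img = op(img)
    return img
```

With these definitions, `apply_sequence(frame, [erode, dilate])` removes blobs smaller than the structuring element (such as a small shadow) while restoring the size of larger blobs, whereas starting with `dilate` tends to merge an insect with a nearby shadow into a single blob. Which order and how many repetitions work best would still have to be chosen empirically, as noted above.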
3.2.2 Insect tracking
At the beginning of this stage, the binary frame returned from the image processing pipeline is segmented using a simple contour finding algorithm, and the centroid of each detected blob is calculated. We employ a human-computer collaborative tool to accurately track the centroids of the focal insects; an automated algorithm performs basic tracking with a human operator supervising and intervening when there are tracking ambiguities. This is necessary when, for instance, the frame contains a significant level of noise as a result of quick changes in lighting conditions due to moving clouds or because of relatively strong winds, which tend to shift the camera. Figure 4 illustrates the steps involved in tracking. We describe the steps below:
Skipping to the beginning. In behavioral assays where experiments are conducted in rapid succession, a single video file may contain several experiments. The human operator may need to indicate the beginning and end of each independent experiment by fast-forwarding the video to the starting position of the experiment for instance. Additionally, some non-relevant video segments may need to be cropped (equipment or hands appearing in the beginning of the experiment, for instance).
Insect selection. When there are multiple prominent blobs in the first video frame, the focal insects may need to be identified manually in that frame. In this case, the automated algorithm stops, and control is handed to the human operator to identify the focal insects. The operator selects one or more insects by clicking on them, or fast-forwards if no insects are present. After the selection, control is handed back to the automated tracking algorithm. Although user identification of individual insects is not feasible when tracking a large number of insects, manual selection is often necessary in behavioral assays where the focal insects need to be accurately identified and tracked for the data to make any sense. In passive observation of large insect collectives, a different mechanism needs to be implemented to identify the initial position of insects. A simple approach is to assume that all blobs in the binary frame are potential insects. Alternatively, a more sophisticated scheme, such as pattern matching, could be employed.
Insect tracking. A region of interest (ROI) is defined as the circle around the current position of tracked insects, with a radius that is three to four times the size of the insect. The selection of the focal insect in the previous step initializes the position of the ROI. The automated algorithm then steps automatically through the video frames, associating insect blobs with their predecessor in previous frames. The algorithm only considers blobs that are inside the insect’s ROI. For each frame, one of the following three situations may arise:
No blobs are detected inside the ROI. This case happens if the insect stops moving for some time, becoming part of the background model. Since there is no movement, these frames are skipped.
Ideally, one blob would be present inside the ROI. In this case, the centroid of the blob is appended to the insect’s trajectory.
Two or more blobs detected inside the ROI. This is when the human operator needs to intervene. In this situation, using the mouse, the human operator selects the blob that corresponds to the insect. This situation can often be resolved automatically using a nearest neighbor algorithm for instance. Here, we opted to rely on human judgment to maximize the accuracy of tracking. However, in more forgiving situations, one could calculate a confidence level at every frame and stop the automated tracking only when the confidence level drops below a certain threshold (e.g., when the last position of the insect is equidistant to several blobs that are equally probable).
Trajectory correction. Once the trajectory is fully processed (i.e., the insect exits the field of view or the experiment is terminated), the human operator may delete extraneous jumps by clicking on them. The final trajectory is then saved to a file.
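The per-frame decision described in the insect tracking step can be sketched as follows. This is only an illustration under stated assumptions: blob centroids are given as (x, y) tuples, the ROI is a circle of fixed radius around the last recorded position, and `ask_operator` is a hypothetical callback standing in for the mouse-based selection in the user interface.

```python
import math


def blobs_in_roi(center, radius, centroids):
    """Return the blob centroids that fall inside the circular ROI."""
    return [c for c in centroids if math.dist(center, c) <= radius]


def track_step(trajectory, radius, centroids, ask_operator):
    """Process one frame and report how it was handled.

    trajectory   -- non-empty list of (x, y) positions; the last entry centers the ROI
    centroids    -- blob centroids detected in the current frame
    ask_operator -- hypothetical callback that returns the blob chosen by the human
    """
    candidates = blobs_in_roi(trajectory[-1], radius, centroids)
    if not candidates:
        # The insect is probably stationary and absorbed into the background model.
        return "skipped"
    if len(candidates) == 1:
        # Unambiguous case: extend the trajectory automatically.
        trajectory.append(candidates[0])
        return "auto"
    # Two or more blobs in the ROI: hand control to the human operator.
    trajectory.append(ask_operator(candidates))
    return "operator"
```

In a more forgiving setting, the last branch could first try a nearest-neighbor choice and call `ask_operator` only when the candidates are nearly equidistant from the last position.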
3.3 Camera modeling
Once trajectories are extracted, they are first corrected to cancel perspective distortion. In the lab, the experimenter can typically hang the cameras from the top - using a mounting structure - to get top-down, orthogonal shots of the insects, which reduces the amount of perspective distortion. In the field, however, it is difficult to construct such mounts. Therefore, researchers often resort to using regular tripod mounts and positioning the camera on the side of the subjects to avoid disrupting their behavior. This side-top view produces a significant amount of perspective distortion.
To correct perspective distortion, we use a simple grid calibration procedure illustrated in Figure 5. The borders of the experimental arena are marked with regular control points every 10 cm (indicated with arrows). The control points are identified and entered manually by a user whenever the camera position changes (typically once or twice a day). From these control points, a regular grid is constructed from the intersection of line segments defined by the control points. The corner coordinates of each cell are calculated in both world space and pixel space by linear interpolation along the control points. The cells are then used to map trajectory points from pixel space to world space using a bilinear interpolation. While this transformation ignores lens distortions, in practice this distortion was minimal in our case. This simple calibration technique works well in the field, as it does not require any accurate measurements of camera position or the availability of calibration-aiding materials such as regular checkerboards, which are difficult to obtain in remote locations. However, more sophisticated calibration schemes that take into account lens distortions can certainly be used if needed (e.g., Tsai’s method ).
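The mapping of a trajectory point through a single grid cell can be sketched as below. The sketch assumes that each cell is given by its four corners, ordered consistently in pixel space and in world space, and it inverts the bilinear map with a few Newton iterations; that inversion method and the function names are our illustrative choices, not necessarily those of the original implementation.

```python
def bilinear(corners, u, v):
    """Bilinear blend of four corners ordered (0,0), (1,0), (1,1), (0,1)."""
    (x00, y00), (x10, y10), (x11, y11), (x01, y01) = corners
    x = (1 - u) * (1 - v) * x00 + u * (1 - v) * x10 + u * v * x11 + (1 - u) * v * x01
    y = (1 - u) * (1 - v) * y00 + u * (1 - v) * y10 + u * v * y11 + (1 - u) * v * y01
    return x, y


def pixel_to_world(q, pixel_corners, world_corners, iterations=20):
    """Map pixel point q to world space through one calibration cell."""
    (x00, y00), (x10, y10), (x11, y11), (x01, y01) = pixel_corners
    u, v = 0.5, 0.5  # start the search at the cell center
    for _ in range(iterations):
        px, py = bilinear(pixel_corners, u, v)
        ex, ey = px - q[0], py - q[1]
        # Jacobian of the bilinear map with respect to (u, v).
        a = (1 - v) * (x10 - x00) + v * (x11 - x01)  # dx/du
        b = (1 - u) * (x01 - x00) + u * (x11 - x10)  # dx/dv
        c = (1 - v) * (y10 - y00) + v * (y11 - y01)  # dy/du
        d = (1 - u) * (y01 - y00) + u * (y11 - y10)  # dy/dv
        det = a * d - b * c
        if abs(det) < 1e-12:
            break  # degenerate cell; keep the current estimate
        # Newton update: subtract J^-1 times the residual.
        u -= (d * ex - b * ey) / det
        v -= (a * ey - c * ex) / det
    # The same (u, v) is then applied to the world-space corners.
    return bilinear(world_corners, u, v)
```

For example, with a trapezoidal cell in pixel space and a 10 cm square in world space, a pixel at the cell center maps to (5, 5) cm. Selecting the cell that contains a given trajectory point is a separate step that is not shown here.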
3.4 Visual exploration (qualitative analysis)
Once trajectories are extracted and corrected, researchers can begin their attempt to understand and characterize the underlying insect behavior. Often, the first thing researchers want to do is to ‘take a look’ at the collected trajectories to get an overall sense of the behavior and to see if there are any obvious patterns. Although entomologists tend to form their initial hypotheses from field observations, it is beneficial to give them a chance to explore and follow up on a wider range of plausible theories before drawing conclusions. This is particularly important in behavioral entomology where the underlying insect behavior is highly probabilistic and is susceptible to many different interpretations that are often equally plausible. Therefore, at this stage of the workflow, our goal is to give researchers a tool that enables them to ‘think laterally’ and explore different hypotheses with ease before deciding on the most promising ones for further statistical analysis. Supporting this kind of exploratory qualitative analysis in scientific workflows is crucial, yet often overlooked .
To facilitate exploratory analysis, we developed an interactive visualization tool for the exploration of large collections of insect trajectories. The visualization employs a small-multiple view  with multiple trajectories visualized side-by-side in a grid layout, as illustrated in Figure 6A. The trajectories can be grouped according to their associated metadata, such as the location of capture, insect size, colony, etc. Although the movement of terrestrial insects is naturally restricted to the ground (i.e., two-dimensional (2D) motion), we utilized a 3D visual encoding to better illustrate spatiotemporal movement patterns in the data; each trajectory was rendered in stereoscopic 3D display, with the XY plane (the display surface) encoding the insect's movement on the ground, while the Z+ axis (away from display) encoded time. To a viewer looking at a 3D display, the trajectories appear as cylinders sprouting from the display surface and extending out to ‘float’ in front of the display. Figure 6B illustrates this concept. Traditional 2D visualization can only depict insect movement, irrespective of the time it took the insect to make that movement. A stereoscopic 3D visualization, on the other hand, can reveal the temporality and periodicity of trajectories, making it possible for researchers to perceive complex, spatiotemporal behavioral patterns. Although one can certainly encode time in a strictly 2D visualization by using color for instance, we found that stereoscopic depth cues are better at conveying temporal patterns when one is looking at a large number of trajectories simultaneously . Previous studies also demonstrated the value of stereoscopy in allowing one to perceive and operate on larger datasets .
We included two interactive features to let researchers query the data, explore hypotheses, and quickly determine whether the data support those hypotheses. First, a coordinated paintbrush tool allows the user to ‘brush’ the background of one trajectory, causing a color highlight in all other displayed trajectories when the insect moves over a brushed area. Second, a temporal filter enables the user to specify a time period (using a range slider), causing the visualization to display trajectory segments corresponding to insect movement during the specified time period only, such as the beginning of the experiment. Our experiments with the visualization demonstrated that using these two features in tandem, a researcher could test for a hypothesized spatiotemporal behavioral pattern and visually determine whether the data support that behavior .
To see how the visualization can be used for quick qualitative hypothesis testing, let us consider the following example. During the study on the navigational strategies of seed harvester ants (described in Section 4), our field observations suggested that ants were employing celestial cues, such as polarized sunlight, for navigation off the colony's main foraging trail where no reliable pheromone cues are present. To test this hypothesis, the researcher visualized trajectories of ants captured east of the main foraging trail in one group and tried to determine whether those ants exit the experimental arena from the west side in an attempt to get back to the trail. Because of the large number of samples (over 50 in our case), this is not normally an easy task. However, the test can be visually performed with ease using our visualization; the researcher uses the coordinated paintbrush tool to brush the left (west) part of one trajectory from the ‘east’ group with red (top right of Figure 6A) and set the temporal filter to show movement during the last moments of the experiment. One would expect a red highlight in the majority of cells if the ants exit from the left side, which is indeed the case here. While this qualitative assessment does not, by itself, constitute formal verification, it can be used to contemplate and explore a wide range of hypotheses. Once a number of plausible hypotheses have been identified, they can be statistically verified in the following stage of workflow.
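A rough numeric analogue of this visual check is sketched below; the visualization tool itself is interactive, and the code is not part of it. The sketch assumes that each trajectory is a list of (t, x, y) samples in world coordinates and that the brushed area is an axis-aligned rectangle; all names are hypothetical.

```python
def tail(samples, seconds):
    """Samples from the last `seconds` of a trajectory of (t, x, y) tuples."""
    t_end = samples[-1][0]
    return [s for s in samples if s[0] >= t_end - seconds]


def in_brush(x, y, brush):
    """True if (x, y) lies inside the brushed rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = brush
    return xmin <= x <= xmax and ymin <= y <= ymax


def highlighted_fraction(trajectories, brush, seconds):
    """Fraction of trajectories that enter the brushed area in their final moments."""
    hits = sum(
        1
        for samples in trajectories
        if any(in_brush(x, y, brush) for _, x, y in tail(samples, seconds))
    )
    return hits / len(trajectories)
```

A value close to 1 for the ‘east’ group, with the west strip of the arena as the brushed area, would correspond to seeing a highlight in most cells of the small-multiple view.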
3.5 Quantitative analysis
After the data have been explored visually, the next course of action is to quantify any observed behavior and to statistically test the hypotheses that were formulated earlier. The quantitative analysis stage of the workflow comprises two steps:
Quantitative trajectory description. Trajectories are first discretized at regular intervals, in both space and time. Following that, various statistical and geometric measures are calculated from the discretized segments, including distribution of turning angles and mean orientation vectors. These measurements quantitatively characterize insect motion and allow researchers to establish quantitative differences between groups of trajectories captured under different conditions.
Statistical hypothesis testing. This usually entails comparing groups of trajectories based on the measures calculated in the above step. Statistical tests such as the Wilcoxon test and generalized linear models (GLMs) are common here.
While the appropriate statistical test is largely dependent on the question being asked, there are a number of general statistical measures that can be used to quantitatively characterize the movement of terrestrial insects. Moreover, these measures can shed light on the strategies insects employ to process stimuli and navigate the environment around them. Although not meant as an exhaustive list of measures, here we discuss statistical and geometric trajectory measures that are widely used in the analysis of insect movement. We first discuss trajectory discretization and then describe two common statistical measures for trajectories.
3.5.1 Trajectory discretization
Since the motion of insects is highly stochastic in nature, it is convenient to chop their trajectories into regular segments and analyze those segments differentially. This is often necessary to get a statistically representative sample of the insect’s motion and decision-making process. Although trajectories extracted from image sequences are often recorded as a series of discrete points, such initial discretization is usually irregular or fixed at an arbitrary interval (the video’s frame rate). One should therefore resample the trajectories at biologically meaningful distances and intervals. The choice of segment length is subject to a trade-off between incorporating too much noise by using a small segment length and sacrificing resolution by taking too large a segment length. Ultimately, that choice depends on the questions being asked and the phenotypic behavior of the insect. Common values range from a few millimeters to a few centimeters for space discretization and from a few hundred milliseconds to a few seconds for time discretization.
Here we consider two discretization schemes that are common in animal movement studies, namely spatial and temporal discretization (Chapter 7 in ). We note, however, that other criteria such as curvature and sinuosity may also be appropriate. In space discretization, the trajectory is chopped into straight segments of equal length, irrespective of the time the insect took to travel between those points. In time discretization, on the other hand, the end points of each segment represent insect displacement during a predetermined time period, irrespective of the magnitude of that displacement. Figure 7 illustrates the difference between the two techniques by showing the same ant trajectory discretized in both space and time (at 4 cm and 0.5 s, respectively). Algorithms 1 and 2 can be used for space and time discretization, respectively.
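The two schemes can be illustrated with a minimal Python sketch. It is not a reproduction of Algorithms 1 and 2; the function names, the input formats (a list of (x, y) points, or (t, x, y) samples sorted by time), and the use of linear interpolation between recorded points are assumptions made for illustration only.

```python
import math

def discretize_space(points, step):
    """Chop a path [(x, y), ...] into straight segments of equal length `step`."""
    if step <= 0:
        raise ValueError("step must be positive")
    ax, ay = points[0]                      # anchor: the last emitted point
    out = [(ax, ay)]
    for (px, py), (qx, qy) in zip(points, points[1:]):
        dx, dy = qx - px, qy - py
        seg2 = dx * dx + dy * dy
        if seg2 == 0.0:
            continue                        # skip repeated samples
        # Emit points while the end of this raw segment is at least `step` away.
        while math.hypot(qx - ax, qy - ay) >= step:
            # Solve |p + t*(q - p) - anchor| = step; the larger root is the
            # point where the path leaves the circle around the anchor.
            fx, fy = px - ax, py - ay
            b = 2.0 * (fx * dx + fy * dy)
            c = fx * fx + fy * fy - step * step
            disc = max(b * b - 4.0 * seg2 * c, 0.0)
            t = min((-b + math.sqrt(disc)) / (2.0 * seg2), 1.0)
            ax, ay = px + t * dx, py + t * dy
            out.append((ax, ay))
    return out

def discretize_time(samples, dt):
    """Resample [(t, x, y), ...] (sorted by t) every `dt` by linear interpolation."""
    if dt <= 0 or len(samples) < 2:
        raise ValueError("need dt > 0 and at least two samples")
    t_start, t_end = samples[0][0], samples[-1][0]
    out, i, k = [], 0, 0
    while t_start + k * dt <= t_end + 1e-9:
        t = t_start + k * dt
        # Advance to the raw segment that contains time t.
        while i < len(samples) - 2 and samples[i + 1][0] < t:
            i += 1
        (t0, x0, y0), (t1, x1, y1) = samples[i], samples[i + 1]
        w = 0.0 if t1 == t0 else min((t - t0) / (t1 - t0), 1.0)
        out.append((x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
        k += 1
    return out

# Hypothetical usage with coordinates in cm and time in seconds.
track = [(0.0, 0.0, 0.0), (1.0, 10.0, 0.0), (2.0, 20.0, 0.0)]
by_space = discretize_space([(x, y) for _, x, y in track], step=4.0)
by_time = discretize_time(track, dt=0.5)
```

In this sketch, any trailing stretch of path shorter than the chosen step is dropped in space discretization; whether to keep it is a design choice that depends on the analysis.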
3.5.2 Distribution of turning angles
The distribution of turning angles can be obtained by measuring the angle between two consecutive segments along the discretized trajectory. Binning these angles gives us a distribution that shows the tendency of the insect to make turns. Figure 8A illustrates this. Generally speaking, a wider distribution of turning angles indicates a more tortuous path comprising many turns, whereas a narrower distribution usually corresponds to more directed movement.
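As a rough sketch of this measure, the following Python code computes signed turning angles from an already discretized trajectory and bins them. The function names, the sign convention (counterclockwise positive), and the default of 12 bins are illustrative assumptions rather than choices taken from the study.

```python
import math

def turning_angles(points):
    """Signed angle (radians, in (-pi, pi]) between consecutive segments."""
    headings = [math.atan2(y1 - y0, x1 - x0)
                for (x0, y0), (x1, y1) in zip(points, points[1:])]
    turns = []
    for h0, h1 in zip(headings, headings[1:]):
        d = h1 - h0
        # Wrap the difference so that a slight left or right turn stays near 0.
        turns.append(math.atan2(math.sin(d), math.cos(d)))
    return turns

def bin_angles(angles, bins=12):
    """Histogram of angles over [-pi, pi] using `bins` equal-width bins."""
    width = 2.0 * math.pi / bins
    counts = [0] * bins
    for a in angles:
        idx = min(int((a + math.pi) / width), bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical discretized path: straight, then two left turns of 90 degrees.
path = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1)]
histogram = bin_angles(turning_angles(path))
```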
3.5.3 Distribution of orientations
This can be calculated by determining the normalized orientation vector of the insect at every segment throughout the trajectory. Figure 8B illustrates this. Orientation vectors can be averaged to calculate a mean orientation vector, which gives the overall direction of the insect’s motion. The magnitude of the mean orientation vector indicates the directedness of an insect’s path. A magnitude close to 1.0 implies that the insect is traveling in a particular direction, whereas a magnitude close to 0 typically indicates a non-directed motion, such as a loop or spiral.
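The mean orientation vector can be sketched in a few lines of Python. The function name and the return format (magnitude, then direction in radians) are assumptions for illustration; the input is taken to be an already discretized trajectory.

```python
import math

def mean_orientation(points):
    """Return (magnitude, direction) of the mean unit orientation vector."""
    sx = sy = 0.0
    n = 0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue                 # a zero-length segment has no orientation
        sx += dx / length            # accumulate unit vectors
        sy += dy / length
        n += 1
    if n == 0:
        raise ValueError("trajectory has no non-degenerate segments")
    mx, my = sx / n, sy / n
    return math.hypot(mx, my), math.atan2(my, mx)

# A straight path is fully directed; a closed loop is not.
straight = mean_orientation([(0, 0), (1, 0), (2, 0), (3, 0)])
loop = mean_orientation([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
```

Here the straight path gives a magnitude of 1 and the closed square gives a magnitude of 0, matching the interpretation of directed versus non-directed motion.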
4. Case study: context-dependent navigation in social foraging ants
The best way to evaluate scientific workflows is to see how they fare in the hands of scientists when analyzing real data. We put the proposed workflow to the test during a field investigation on the navigational strategies of seed harvester ants (Messor cephalotes). The interdisciplinary research project was carried out in 2012 at the Mpala Research Centre located in the Laikipia district of Kenya. We first give background on the project and then describe how we employed the proposed workflow throughout the various stages of the investigation. We also give quantitative results on the accuracy of our human-computer collaborative video processing approach and compare it against a fully automated solution.
One key role for navigation in social insects is in the orientation of workers between food sources and the nest. Foraging efficiency, often cited as a key factor in the ecological success of social insects, is largely dependent on the accuracy and speed with which individuals can move between these locations. Seed harvester ants live in large colonies of many thousands of individuals and create enormous, persistent networks of trails to guide foragers to food sources up to 40 m from the nest. Many ants leave these trails to search for seeds individually. On finding a seed, foragers return to the trail network and then follow the main trail back to the nest. Thus, a harvester ant’s outward and inward journeys are each split into two segments - an on-trail segment and an off-trail segment. This two-part journey presents an interesting navigational challenge, as foragers do not home directly from their current location after finding food, but retrace their routes back to the point at which they left the trail network, and only then reorient towards the nest. Because the visual and chemical information available to a forager on the main trail will differ considerably from that available to a forager searching some distance from the trail, there is the potential for context-specific selection of navigational strategies.
Our goal was to understand and characterize the difference between these two modes of navigation, namely off-trail versus on-trail navigation, employed by seed harvester ants. As this investigation required access to several fully developed ant colonies with established foraging trails in various directions, the research questions could only be answered by studying the insect’s movement patterns in the field. Furthermore, the subject ants needed to be selected carefully from different locations to understand the effect of an ant’s position on the navigational strategy it chose to employ.
4.2 Experiments and data acquisition
For each trial, a single ant was selected, captured using a small cylindrical plastic container, and transferred to the experimental arena. The trial began when the ant was released in the center of the experimental arena. A full-HD (1,920 × 1,080 × 24 bits at 30 frames per second) video camera was used to film the movement of the ant until it crossed one of the boundaries. The experimental arena consisted of a rectangular (240 × 140 cm) 8-mm-thick plywood board, which was positioned at least 13 m away from the colony’s nest. The arena thus constituted unfamiliar ground, prompting the released ants to attempt to get back to the trail using various navigational cues. Although the plywood board helped make the ants more prominent in the videos, there was still significant noise resulting from the interference of shadows, grasses, and other encroaching insects. All in all, about 400 trials were carried out. Ants were selected to cover a variety of conditions, including different positions relative to the trail network (on/off trail), journey direction (from/to the nest), whether the ant was carrying a food item, and the initial heading direction when the ant was captured. The location of the camera and experimental arena was fixed during a single session; thus, calibration needed to be done only once for each session (usually twice a day).
In addition to the behavioral experiments, we performed long uninterrupted observations of the colony’s main foraging trail by video recording the movement of foragers along a small portion of the trail. This was done to get a sense of the flow of ants at different times of the day and to compare the movement patterns of on-trail and off-trail foragers. Figure 9 shows a screen-grab from a behavioral experiment (top) and a trail video (bottom). The density of ants and their fast movement on the trail, however, made image processing more challenging compared to the behavioral experiments where only a single individual ant was tracked. We therefore report precision results for each group separately in the following section.
4.3 Image processing
The collected videos were processed and analyzed off-site using the interactive image processing tool described in Section 3.2, which was implemented using openFrameworks and OpenCV. Table 1 lists the OpenCV functions that were used along with their respective parameters. To process the behavioral experiments, a human analyst watched the beginning of every experiment and skipped the video to the moment when the captured ant was released (the recording of each experiment started moments before the release so as to avoid losing any movement). The user also clicked on the area of release to center the ROI on it. This centering needed to be done only once per session. From that point, the program automatically tracked the focal ant, stopping only at ambiguous frames (such as when multiple insects were in the ROI) and prompting the user to identify and click on the focal insect. In many cases, however, the only interaction required was to skip the beginning of videos to the moment of release. Occasionally, in difficult videos with significant noise, the user needed to click and correct the program several times. The processing of the trail videos proceeded in a similar manner. However, rather than tracking all ants on the trail, we opted to obtain a few high-quality samples that represent the typical behavior of foragers along the trail. This was done by having a human analyst click on a target ant, with the program performing image processing and tracking and prompting the user for input when ambiguities occurred. We report on the reliability and accuracy of our human-guided video processing technique and compare it against a fully automated solution in the following.
To measure the reliability of the user’s corrective input and to quantify the consistency of the results across multiple users, we had two independent analysts separately perform human-guided image processing on an identical subset of the seed harvester ant data comprising 30 behavioral experiments (approximately 8% of the data). The trajectories obtained by the two analysts were compared for consistency by measuring their degree of overlap. The overlap between two trajectories was calculated by assuming a path width of 3 pixels on either side of both trajectories and measuring the area covered by both paths. The area was then normalized by the length of the longer of the two trajectories to allow for comparison across the dataset. The average level of overlap between the two users was 92.6%, showing a high level of agreement between the two users. This also demonstrates the reliability of our human-guided video processing tool in tracking the focal insects in noisy field-recorded videos.
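One possible reading of the overlap measure is sketched below in Python. The text does not spell out the exact computation, so several details are assumptions: trajectories are lists of (x, y) pixel coordinates with at least two points, each path is widened into a band of pixels within 3 pixels of the polyline, and the shared area is divided by the band area of the longer trajectory so that identical trajectories score 1.0. All function names are hypothetical.

```python
import math

def _dist_to_segment(px, py, ax, ay, bx, by):
    """Distance from point (px, py) to the segment (ax, ay)-(bx, by)."""
    dx, dy = bx - ax, by - ay
    seg2 = dx * dx + dy * dy
    if seg2 == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def band_pixels(points, half_width=3):
    """Set of integer pixels lying within `half_width` of the polyline."""
    pixels = set()
    for (ax, ay), (bx, by) in zip(points, points[1:]):
        x_lo = math.floor(min(ax, bx) - half_width)
        x_hi = math.ceil(max(ax, bx) + half_width)
        y_lo = math.floor(min(ay, by) - half_width)
        y_hi = math.ceil(max(ay, by) + half_width)
        for x in range(x_lo, x_hi + 1):
            for y in range(y_lo, y_hi + 1):
                if _dist_to_segment(x, y, ax, ay, bx, by) <= half_width:
                    pixels.add((x, y))
    return pixels

def path_length(points):
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def overlap(traj_a, traj_b, half_width=3):
    """Shared band area divided by the band area of the longer trajectory."""
    band_a = band_pixels(traj_a, half_width)
    band_b = band_pixels(traj_b, half_width)
    longer = band_a if path_length(traj_a) >= path_length(traj_b) else band_b
    if not longer:
        return 0.0
    return len(band_a & band_b) / len(longer)

# Hypothetical example: the second trajectory stops halfway along the first.
full = [(0, 0), (100, 0)]
truncated = [(0, 0), (50, 0)]
score = overlap(full, truncated)
```

Normalizing by the longer trajectory means that a trajectory cut short scores well below 1.0 even if every tracked point is correct.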
To measure the relative precision of the human-guided video processing method, we compared user-corrected trajectories against the ones extracted by a fully automated algorithm. This was done by running a subset of the videos twice through our image processing tool: once with a user supervising and intervening to correct errors and resolve tracking ambiguities, and a second time without human guidance, relying solely on the algorithm’s best guess. User-corrected trajectories were compared against their automatically tracked counterparts by measuring the degree of overlap between the two using the procedure described above (Section 4.3.1). If the performance of the fully automated algorithm is comparable to the performance of a human-guided analysis process, we would expect to see a high degree of overlap between the two sets of trajectories. If, on the other hand, the automated algorithm performs poorly, we would expect to see little overlap between the two. The level of overlap can be construed as the precision of the auto-tracked trajectories relative to their user-corrected counterparts. This allows us to quantify the gained precision as we move from a fully automated to a human-guided analysis. Importantly, this method also penalizes the automated method when it loses track of the insect and terminates the trajectory early, which naturally reduces the degree of overlap with the human-guided solution (the user has the ability to click on the lost insect allowing the algorithm to relocate it). For the purpose of this comparison, we consider the user-corrected trajectories to represent the ground truth. Although it is possible for the user to make mistakes, the reliability analysis in the previous section suggests that such errors are, in fact, rare. Furthermore, during our experiments, we found that users were able to accurately resolve even the most ambiguous of situations by pausing and/or rewinding the video.
We performed the overlap analysis over a subset of the seed harvester ant videos selected to reflect a wide range of lighting conditions in the field at different times of day. The test dataset comprised 91 trajectories: 43 videos of behavioral experiments with a single ant at a time and 48 video segments of the colony’s foraging trail with a large number of ants moving at relatively high speeds. Overall, the fully automated algorithm achieves an average precision of 80.3% during behavioral experiments and 55.6% in trail videos, relative to the human-guided analysis process. This can be construed as an average gained precision of 19.7% to 44.4% when a user supervises the video analysis and takes action to correct tracking errors. Figure 10 (left) shows a breakdown of trajectory overlap levels between the auto-tracked and user-corrected trajectories. The automated algorithm achieves over 80% precision in roughly 65% of the behavioral experiments, with the precision dropping to 60% to 80% for approximately 14% of the test data. The bottom 21% of the data had a precision of less than 60%. The trail videos, on the other hand, suffered a significant drop in precision in the absence of human guidance, with only 37% of the videos attaining 80% precision, while 55% of the tested videos had a precision of less than 60%.
The gap in precision between the behavioral experiments and the trail videos is somewhat expected, as the trail videos are inherently more difficult to process due to the large number of ants. The widely varying precision between the two groups of videos is likely due to different sources of error. In the behavioral experiments, the user-corrected trajectories are 18% longer, on average, compared to the automatically extracted ones. This suggests that the automated algorithm is often losing track of the focal insect during the experiment, resulting in shorter trajectories and ultimately loss of precision. In the trail observation videos, on the other hand, the automatically extracted trajectories are 29% longer, on average, than their user-corrected counterparts. Here, tracking errors are likely to dominate with the algorithm mistakenly switching to other insects due to the high density of ants on trails and their fast-paced movement. This discrepancy is also reflected by the number of times the user had to intervene, as shown in Figure 10 (right). On average, the user needed to intervene 5.2 times in the behavioral experiments, while the trail videos required an average of 44.4 interventions.
In general, while the fully automated algorithm achieves at least 80% precision in 65% of the behavioral experiments and 37% of the trail videos, such relatively low precision levels are likely to confound the analysis. In some cases, it may be possible to discern unreliable trajectories and exclude them from the analysis by recording information about the confidence of the video processing algorithm and discarding trajectories that do not attain a certain confidence threshold. However, in many other cases, such as when the tracking algorithm loses track of the insect, recognizing information loss may be extremely difficult. Such gaps and errors may potentially dilute regularities to a point where they become hardly perceptible, particularly in field studies where the data suffer from significantly higher noise levels compared to lab-based studies. Human-guided video processing, on the other hand, can improve precision by an average of 20% to 44% depending on the complexity of the video.
4.4 Trajectory analysis
We employed both qualitative (Section 3.4) and quantitative (Section 3.5) analyses to make sense of the collected trajectories and understand the strategies employed by ants during their off-trail and on-trail foraging journeys.
4.4.1 Qualitative analysis
We employed the visualization tool described in Section 3.4. Because of the relatively large number of trials (about 400), we used a large 3D display to juxtapose a large number of trajectories at the same time. Using the visualization, the user can group trajectories into ‘bins’ according to their associated metadata (such as location of capture, journey direction, etc.). The bins are given different background colors to distinguish them. Figure 11 illustrates the visualization environment.
Using the visualization, we were able to visually confirm our initial hypothesis, namely that off-trail ants tend to exhibit a directed motion, exiting the experimental arena in a direction that would have eventually led them back to the colony’s foraging trail. For example, ants captured east of the trail tend to exit the arena from the west side, which was visually confirmed using the coordinated paintbrush tool. On the other hand, ants captured on the trail exhibited tortuous non-directed motion, presumably in an attempt to pick up pheromone cues and locate the trail again. We were also able to discover additional temporal patterns, thanks to the stereoscopic 3D view. For instance, ants that drop their seed during the capture process tend to spend a significant amount of time in the center of the experimental arena when released, presumably searching for their seed. This was evident in the 3D stereoscopic view with trajectories that were semi-perpendicular to the display surface, indicating little insect movement over a few minutes.
4.4.2 Quantitative analysis
At this stage of the investigation, we wanted to quantitatively characterize the difference between off-trail and on-trail ants, and statistically test our main hypothesis, namely off-trail ants orient themselves and exit the experimental arena in a direction that would have led them back to the colony’s trail.
We calculated the statistical measures described in Section 3.5. The distribution of turning angles revealed a marked difference between off-trail and on-trail trajectories. While off-trail ants showed a much narrower distribution of angles often centered on 0 indicating few turns, on-trail ants exhibited a much wider distribution of turning angles, indicating that this group frequently made relatively large turns. Additionally, the mean orientation vector showed significant differences between the two groups (GLM p < 0.001). Off-trail ants exhibited a mean orientation vector with a 0.78 magnitude, on average, often directed toward the trail relative to their capture position. On-trail ants, on the other hand, exhibited a shorter mean orientation vector with an average magnitude of 0.5.
We performed a V test to test our main hypothesis. The test was applied to compare the mean orientation vectors of groups captured east, west, north, and south of the foraging trail to a presumed vector oriented toward the trail relative to the capture point. For example, the east group was compared to a vector oriented at a +90° angle. Results indicate that with the exception of the ‘south’ group, all other off-trail ants do indeed orient themselves toward the foraging trail when released (east: p < 0.001; west: p < 0.001; north: p < 0.001; south: p = 0.025). On-trail ants, on the other hand, did not demonstrate significant directional bias. These results statistically confirm our hypothesis. Furthermore, they point to a global navigational cue employed by off-trail ants, perhaps some sort of sun compass. On-trail ants seemed lost when removed from the trail, suggesting a reliance on pheromone cues. When taken together, these two results strongly suggest a context-dependent navigation employed by seed harvester ants, depending on their position relative to the trail network.
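For readers unfamiliar with the V test, a minimal Python sketch is given below. It tests whether a sample of angles is concentrated around a specified expected direction. The function name and input format are assumptions, and the p-value uses a large-sample normal approximation; small samples would call for tabulated critical values instead. This is not the code used in the study.

```python
import math

def v_test(angles, expected):
    """V test for a sample of angles (radians) against an expected direction.

    Returns (v, u, p): v is the component of the mean vector along the
    expected direction, u is the test statistic, and p is a one-sided
    p-value from a normal approximation (an assumption of this sketch).
    """
    n = len(angles)
    if n == 0:
        raise ValueError("need at least one angle")
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    r = math.hypot(c, s)                      # mean vector magnitude
    mean_angle = math.atan2(s, c)             # mean direction
    v = r * math.cos(mean_angle - expected)
    u = v * math.sqrt(2.0 * n)
    p = 0.5 * math.erfc(u / math.sqrt(2.0))   # upper tail of standard normal
    return v, u, p

# Hypothetical headings clustered around +90 degrees, tested against +90 degrees.
headings = [math.pi / 2 + d for d in (-0.2, -0.1, 0.0, 0.1, 0.2)] * 4
v, u, p = v_test(headings, math.pi / 2)
```

Unlike a test for general non-uniformity, the V test rewards clustering only in the expected direction, so a group heading consistently away from the trail would yield a large p-value.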
The proposed human-computer collaborative workflow proved crucial in the various stages of the field study. Thanks to the increased tracking precision, we were able to acquire ant movement trajectories at a higher precision. This increased precision allowed us to quantify specific behavioral patterns among different groups of ants even though we had a limited number of condition-specific samples (about 30 to 50 samples). Using the 3D stereoscopic visualization, we were able to discover additional spatiotemporal behavioral regularities exhibited by off-trail ants. Finally, we were able to statistically confirm and characterize those behavioral regularities, providing solid evidence for two distinct navigational strategies that seed harvester ants seemed to employ in different contexts.
5. Limitations and future work
There are some technical limitations in the workflow that need to be addressed before it can be adopted on a larger scale. At the moment, the workflow is implemented in separate tools using C++, Java, and Matlab. In the future, we would like to combine all the stages into a single application. The integration between qualitative and quantitative analyses would be particularly helpful. This would allow researchers to explore the dataset and visually formulate hypotheses from the visualization environment with simple interactions, with the computer translating the qualitative hypotheses into statistical tests that are performed automatically for verification.
A second concern is the significant effort a researcher needs to devote to data acquisition, compared to fully automated methods. Although human-guided image processing is well suited for studies relying on behavioral assays where insects are often manipulated and filmed individually, such an approach is less efficient in studies that rely on long passive observation of insect collectives in the field. While we still believe that a human-computer collaborative approach is promising even for observational studies with hours of video recordings, the role of the human analyst needs to be further restricted. In the current implementation, the image processing tool requires a human operator to continuously supervise and take action to resolve ambiguities and/or correct tracking errors before processing continues. To minimize the amount of supervision time, we envision a backend image processing system that is capable of automatically analyzing most of the data offline, stopping only at frames that suffer high noise levels and queuing those up for human intervention at a later time. The operator would use a frontend tool to quickly scan through the accumulated frames to visually resolve them at his/her convenience. The backend system would take the user input and continue offline processing, queuing up any additional frames that are difficult to process for human intervention.
The study of insect behavior from image sequences poses many challenges. Despite the advances in image processing techniques, the current generation of insect tracking tools is only effective in controlled lab environments and under ideal lighting conditions. In this paper, we presented an end-to-end workflow for the acquisition, processing, and analysis of the movement trajectories of terrestrial insects in the field. The workflow employs human-guided video analysis to overcome limitations in automated algorithms when faced with an unpredictable visual scene and highly dynamic lighting conditions. Our technique improves tracking precision by an average of 20% to 44% compared to traditional automated methods. The workflow also incorporates a novel trajectory visualization tool for large-scale exploratory analysis of insect movement patterns, allowing researchers to visually formulate and test hypotheses pertaining to insect behavior. Further, we provide a number of generic statistical analysis methods for the quantitative analysis of insect behavioral patterns. We demonstrated the effectiveness of the proposed techniques with a field case study that investigated the navigational strategies employed by Kenyan seed harvester ants in their native habitat.
An implementation of the workflow along with the source code is available at http://www.evl.uic.edu/kreda/field_entomology/.
Cheng K: How to navigate without maps: the power of taxon-like navigation in ants. Comp. Cogn. & Behav. Rev. 2012, 7: 1-22.
Wolf H: Odometry and insect navigation. J. Exp. Biol. 2011, 214: 1629-1641. 10.1242/jeb.038570
Wystrach A, Graham P: What can we learn from studies of insect navigation? Anim. Behav. 2012, 84(1):13-20. 10.1016/j.anbehav.2012.04.017
Lebhardt F, Koch J, Ronacher B: The polarization compass dominates over idiothetic cues in path integration of desert ants. J. Exp. Biol. 2012, 215(Pt 3):526-535.
Legge E, Spetch M, Cheng K: Not using the obvious: desert ants, Melophorus bagoti, learn local vectors but not beacons in an area. Anim. Cogn. 2010, 13: 849-860. 10.1007/s10071-010-0333-x
Wehner R, Räber F: Visual spatial memory in desert ants, Cataglyphis fortis (Hymenoptera, Formicidae). Experientia 1979, 35: 1569-1571. 10.1007/BF01953197
Wehner R, Srinivasan MV: Searching behaviour of desert ants, genus Cataglyphis (Formicidae, Hymenoptera). J. Comp. Physiol. A 1981, 142: 315-338. 10.1007/BF00605445
Collett T, Collett M: Memory use in insect visual navigation. Nat. Rev. Neurosci. 2002, 3: 542-552. 10.1038/nrn872
Balch T, Khan Z, Veloso M: Automatically tracking and analyzing the behavior of live insect colonies. In International Conference on Autonomous Agents: Proceedings of the Fifth International Conference on Autonomous Agents, vol. 2001. New York: ACM; 2001:521-528.
Lochmatter T, Roduit P, Cianci C, Correll N, Jacot J, Martinoli A: Swistrack - a flexible open source tracking software for multi-agent systems. In IEEE/RSJ 2008 International Conference on Intelligent Robots and Systems (IROS 2008). Piscataway: IEEE; 2008:4004-4010.
Sokolowski MB, Moine M, Naassila M: “Beetrack”: a software for 2D open field locomotion analysis in honey bees. J. Neurosci. Methods 2012, 207(2):211-217. 10.1016/j.jneumeth.2012.03.006
Khan Z, Balch T, Dellaert F: An MCMC-based particle filter for tracking multiple interacting targets. 8th European Conference on Computer Vision, Prague, Czech Republic, May 11–14, 2004. Lecture Notes in Computer Science. In Computer Vision - ECCV 2004. Berlin: Springer; 2004:279-290.
Kimura T, Ohashi M, Okada R, Ikeno H: A new approach for the simultaneous tracking of multiple honeybees for analysis of hive behavior. Apidologie 2011, 42(5):607-617. 10.1007/s13592-011-0060-6
Licklider JCR: Man-computer symbiosis. IRE Transactions on Human Factors in Electronics 1960, 1: 4-11.
DeCamp P, Roy D: A human-machine collaborative approach to tracking human movement in multi-camera video. In Proceedings of the ACM International Conference on Image and Video Retrieval. New York: ACM; 2009:32.
Li K, Miller ED, Weiss LE, Campbell PG, Kanade T: Online tracking of migrating and proliferating cells imaged with phase-contrast microscopy. In 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW ‘06). Piscataway: IEEE; 2006:65.
Voss R, Zeil J: Automatic tracking of complex objects under natural conditions. Biol. Cybern. 1995, 73(5):415-423. 10.1007/BF00201476
Oh SM, Rehg JM, Balch T, Dellaert F: Learning and inferring motion patterns using parametric segmental switching linear dynamic systems. Int. J. Comput. Vis. 2008, 77(1):103-124.
Naeini MM, Dutton G, Rothley K, Mori G: Action recognition of insects using spectral clustering. In IAPR Conference on Machine Vision Applications. Tokyo; 2007.
Mazzoni A, Garcia-Perez E, Zoccolan D, Graziosi S, Torre V: Quantitative characterization and classification of leech behavior. J. Neurophysiol. 2005, 93(1):580-593.
Tukey JW: We need both exploratory and confirmatory. Am. Stat. 1980, 34(1):23-25.
Li L, Huang W, Gu IY, Tian Q: Foreground object detection from videos containing complex background. In Proceedings of the Eleventh ACM International Conference on Multimedia. New York: ACM; 2003:2-10.
Jain R, Kasturi R, Schunck BG: Machine Vision. Volume 5. New York: McGraw-Hill; 1995.
Haralick RM, Sternberg SR, Zhuang X: Image analysis using mathematical morphology. IEEE Trans. Pattern Anal. Mach. Intell. 1987, PAMI-9(4):532-550.
Plaza A, Martinez P, Perez R, Plaza J: Spatial/spectral endmember extraction by multidimensional morphological operations. IEEE Trans. Geosci. Remote Sens. 2002, 40(9):2025-2041. 10.1109/TGRS.2002.802494
Suzuki S: Topological structural analysis of digitized binary images by border following. Comput. Vis. Graph. Image Process. 1985, 30(1):32-46. 10.1016/0734-189X(85)90016-7
Tsai R: A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE J. Robot. Autom. 1987, 3(4):323-344.
Tufte ER, Graves-Morris PR: The Visual Display of Quantitative Information. Volume 31. Cheshire: Graphics Press; 1983.
Reda K, Johnson A, Mateevitsi V, Offord C, Leigh J: Scalable visual queries for data exploration on large, high-resolution 3D displays. In 7th Annual Workshop on Ultrascale Visualization, Salt Lake City, Utah, Nov 12, 2012. Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis. Piscataway: IEEE; 2012.
Ware C, Mitchell P: Visualizing graphs in three dimensions. ACM Trans. Appl. Percept. 2008, 5(1):2.
Couzin I: Collective animal behavior. PhD thesis. University of Bath; 1999.
Buchin M, Driemel A, van Kreveld M, Sacristán A: Segmenting trajectories: a framework and algorithms using spatiotemporal criteria. J. Spat. Inf. Sci. 2011, 3: 33-63.
Hölldobler B: Recruitment behavior, home range orientation and territoriality in harvester ants. Behav. Ecol. Sociobiol. 1976, 1(1):3-44. 10.1007/BF00299951
Offord C, Reda K, Mateevitsi V: Context-dependent navigation in a collectively foraging species of ants, Messor cephalotes. Insectes Sociaux 2013, 60(3):361-368. 10.1007/s00040-013-0301-y
Zar J: Biostatistical Analysis. 4th edition. Upper Saddle River: Prentice Hall; 1998.
We are grateful to Iain Couzin, Daniel Rubenstein, Tanya Berger-Wolf, and Jason Leigh for the helpful discussion in the field; to Andrew Johnson for his feedback on the visualization tool; and to Karl Li for lending us his awesome laptop. This work was part of a project performed in the joint Princeton-UIC Field Computational Ecology course in Spring 2012 (http://www.cs.uic.edu/bin/view/ComputationalEcology), with co-instructors Tanya Berger-Wolf and Jason Leigh (University of Illinois at Chicago), and Daniel Rubenstein and Iain Couzin (Princeton University), who were instrumental in several parts of this research. We thank the staff at Mpala Research Centre (Kenya) and Ol’ Pejeta Nature Conservancy (Kenya), and fellow graduate students at EEB-Princeton University, CS at University of Illinois at Chicago, and University of Nairobi. Funding was provided by the Department of Ecology and Evolutionary Biology of Princeton University: NSF OCI-1152895 ‘EAGER: Field Computational Ecology Course’ (Berger-Wolf and Rubenstein), NSF IIS-0747369 (Berger-Wolf), and NSF OCI-0943559 (Leigh).
The authors declare that they have no competing interests.