Previous work in our lab has demonstrated that efficient visual search with a fixed target has a reaction time by set size function that is best characterized by logarithmic curves. Further, the steepness of these logarithmic curves is determined by the similarity between target and distractor items (

Starting from the retina, early stages of the human visual system are organized in a parallel architecture, so that low-level information is extracted and represented simultaneously for a wide view of the world (

Notably, much of the effort in the visual search literature has been devoted to understanding focused attention, a capacity-limited form of visual attention in which items or subsets of items in the display are processed serially. Much empirical research has thus focused on the dependent variable of search slope (how much longer, on average, it takes the visual system to process an additional item) and has tried to establish relationships between different task settings and the corresponding changes in search slopes. These task setting variables include how many features define the target (

Perhaps partly because of this tradition, cognitive experimental research on visual search has become somewhat uninterested in understanding parallel processing in visual search. The specific reason for this disinterest may be the assumption that parallel processing is synonymous with ‘flat’ search functions. This follows from two observations: first, when the linear regression of RT by set size returns a slope coefficient close to zero (or smaller than 10 ms/item), the search function is typically assumed to be a flat straight line; second, parallel processing with unlimited capacity was assumed to produce no (meaningful) additional time cost as additional items are introduced to a display. Therefore, when the search slope is found to be near zero, the usual inference is that items are processed in parallel and that there is no need for attentional selection. Data producing that pattern were thus considered uninformative for understanding visual attention. Yet, as discussed below, recent findings indicate that neither of these assumptions necessarily holds (

Our knowledge about the parallel processing of visual scenes has largely come from computationally oriented approaches to vision, in which the central goal is to predict the series of loci of attention or eye fixations given a specific scene or image. Given the success of these computational models of attention in predicting human fixations, one can argue that the low-level, parallel computations carried out by these models mimic the parallel processing of human vision. For example, the work by Itti and Koch (

Consequently, important aspects of the influence of parallel processing in visual search remain largely uncharted. There are several reasons why understanding this processing stage is important.

We propose then that developing a better understanding of early parallel processing ought to be very informative to attention research. Empirically, there are various experimental results indicating that the visual system can rapidly access substantial amounts of information without focused attention, such as scene gist (

Recent work in our lab demonstrated an important reaction time signature of the parallel processing stage in fixed-target, efficient visual search (

Key findings demonstrating logarithmic RT by set size functions from Buetti et al. (

Returning to lures: these items are sufficiently different from the target that they can be processed in parallel across the visual scene and, with a high degree of success, ruled out as non-targets. When candidates and lures are both present in a scene, one can dissociate the linear and logarithmic contributions that each brings to overall RT (see Figure

Given these results, we proposed that lure items are processed in the first, parallel stage of vision to the degree that there is sufficient evidence to reject them as possible targets. Candidates, in contrast, pass through this parallel stage unresolved because the limited resolution at this stage of processing cannot differentiate them from the target. Locations where information is not differentiated in that manner are passed on for analysis by the second stage of focused spatial attention. Further, the relationship between lure-target similarity and the slope of the logarithmic function indicates that lure-target similarity determines the efficiency of processing for each

We developed the following set of hypotheses to construct a theoretical model of stage-one visual processing that allows us to understand variability in stage-one processing times:

(1) Consistent with traditional assumptions of early visual processing (e.g.,

(2) During stage-one processing, the visual system is attempting to make a binary decision at each location where there is an item. The question is: is this item sufficiently different from the target? If so, the item is unlikely to be the target and doesn’t require further processing. If it is sufficiently similar to the target, given the resolution limitations of peripheral vision, the item will require further processing and its location will be passed on to stage-two processing. An eye movement or a deployment of focused attention will be required to resolve and inspect the item to determine whether or not it is the target.

(3) The amount of evidence required to reach a decision about an item (the “decision threshold”) is proportional to its similarity to the target. This follows from the idea that the more visually similar an item is to the target, the more information is needed to determine that the item is indeed not the target and will not require further inspection. Given the resolution limitation of peripheral vision, there is a maximum decision threshold. All locations containing items that reach that level (i.e., items too similar to the target, and the target itself) will be passed on to the second stage of processing.

In order to make explicit predictions with this theory, we specified the following assumption to model individual item processing times:

(4) Processing of individual items is modeled by noisy accumulators. The rate of information accumulation at each instant is drawn from a Gaussian distribution with a positive mean value. Processing is complete when accumulated evidence reaches a decision threshold. As proposed above, the decision threshold is proportional to the item’s similarity to the target. This process is thus mathematically equivalent to a Brownian motion with a constant drift rate towards a given threshold. Completion time t of this process follows the Inverse Gaussian distribution (
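The density itself did not survive extraction here; a standard parameterization of the Inverse Gaussian first-passage-time density, reconstructed from the variable definitions that follow (threshold A, drift k, noise σ), would be

```latex
f(t) = \frac{A}{\sigma \sqrt{2\pi t^{3}}}
       \exp\!\left( -\frac{(A - k t)^{2}}{2 \sigma^{2} t} \right), \qquad t > 0,
```

with mean completion time A/k and shape parameter A²/σ². This is the textbook form of the Wald/Inverse Gaussian density; treat it as a reconstruction rather than the paper's exact formula (1).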

where A is the accumulator’s threshold, k is the constant drift rate (or mean accumulation speed), and σ is the standard deviation of accumulated information at each instant.

These assumptions enabled us to numerically simulate different implementations of a parallel, unlimited-capacity processing system and to derive the expected time cost as a function of the number of items to be processed, modulated by the similarity of the items to the target. Specifically, following the pioneering work by Townsend and Ashby (

Given our current theory of stage-one processing in visual search, one intriguing application is to the understanding of performance in search tasks with multiple types of distractors presented simultaneously. Many laboratory experiments on visual search use highly homogeneous displays, i.e. the distractor items are either completely identical, or composed of groups that differ from each other in only one feature dimension. In the real world, however, an arbitrary scene often consists of mostly non-repeating objects. When a specific target is defined, it is also usually the case that most non-target objects are highly dissimilar to the target, so that very few of them need to be actually examined (

Notice that many conclusions drawn from homogeneous search tasks cannot be easily extended to heterogeneous search scenarios. Duncan and Humphreys (

One prominent aspect of our current theory is that it emphasizes the role of visual similarity in the parallel stage of processing, and it makes a more specific formulation of the effect of target-distractor similarity in comparison to Duncan and Humphreys (

More importantly, the degree of similarity between one distractor item and the target item

There are two obstacles that need to be addressed before we can take this approach. The first issue is that an analytical solution for stage-one processing time based on our current model is not readily available, which means that, given observed log slope values, we cannot directly compute the corresponding accumulator thresholds. This is because even though the individual accumulator’s completion time is well understood (formula 1), in the case of heterogeneous displays (where individual completion times are sampled from multiple groups of different Inverse Gaussian distributions), the maximum of all items’ completion times (since our model assumes an exhaustive termination rule) requires an integral that appears to be analytically intractable.

A second issue lies in the fact that our model assumes that individual items are processed independently of one another, and this assumption has not been directly backed by evidence. In Buetti et al. (

First, we began by simulating homogeneous search completion times for three types of lures by using a different accumulation threshold for each lure type. We then

Second, we ran simulations of completion times for heterogeneous search scenes. Each scene was composed of a varying number of each of the three types of lures. Processing of every lure was modeled by

Third, based on different assumptions about how the processing of heterogeneous search scenes might unfold, we developed four different theoretical models of stage-one processing. For each of these models, we derived a hypothetical equation that approximates the completion time as a function of the number of lures of each type present on the display and each lure type’s logarithmic slope coefficient (i.e., the D values extracted from lure-homogeneous simulations). The models and their corresponding equations were pre-registered on the OpenScienceFramework website, in the context of the pre-registration for Experiment 2 (

Fourth, we compared the

Finally, we did an

This predicted-to-observed RT comparison will also be used to investigate whether individual items are processed independently of one another or whether there are inter-item interactions that produce systematic deviations from the model. Because our simulation was conducted under the assumption of between-item independence, the predictions of this equation naturally carry that assumption along. Hence, any systematic deviation of predicted stage-one processing times from observed heterogeneous search data can be interpreted as effects of homogeneity (or heterogeneity, depending on the viewpoint) on processing times. To estimate the deviations from our model, we fit predicted stage-one processing times to observed data via a linear regression. On the one hand, if the model allowed for a perfect prediction, then observed and predicted stage-one time costs should line up precisely along the

In summary, the current study consists of a predictive simulation and two experiments. The simulation provides a best-performing equation for predicting the completion times for stage-one processing in heterogeneous visual search, using parameters obtained from homogeneous visual search displays. Next, we used a set of real-world object images to construct homogeneous search displays in Experiment 1 and heterogeneous search displays in Experiment 2. Experiment 1 served both as an extension of our previous ‘feature search’ results (as in Experiment 1 of

We developed the following set of equations in the hope that some of them may be a good approximation of the exact analytical solution of stage-one processing time cost in heterogeneous lure search. Each equation describes the time cost of stage-one processing as a function of the number of lures of each type, N_{i}, and each lure type’s logarithmic slope coefficient, D_{i}, with D_{3} > D_{2} > D_{1} > 0; i.e., we denote lure no. 3 as having the highest similarity to the target and lure no. 1 the lowest. Note that these equations do not include time costs associated with other processing stages, such as encoding, response selection, and execution, which we assume to be constant in efficient search tasks for a given target.

This equation was a simple extension of the concepts in Buetti et al. ( ), applied to multiple lure types ordered by similarity (D_{3} > D_{2} > D_{1} > 0): all lure items contribute a term with slope D_{1}, the N_{2} and N_{3} items contribute an additional term with slope (D_{2} – D_{1}), and so on.

Equation 2 above assumes that each group of lures is considered and rejected sequentially. That is, different types of lures are processed in a serial and exhaustive fashion, while within each type of lure, individual items are processed in parallel. This model would mean that first all blue lures are processed and discarded in parallel, then the red ones, and last the orange ones. The big difference between Models 1 and 2 is that in Model 2, accumulation for red ones will start once the blues have been discarded, whereas in Model 1, accumulation for

Equation 3 represents a model that has a single decision threshold associated with the single D value in the equation. The model predicts that while all items are being processed in parallel and exhaustively, the amount of information required to complete processing is determined by the lure with the highest similarity to the target. This can be understood as there being a single decision threshold for the entire display: items below it will be discarded at the same moment, while items above the threshold will require focal inspection (i.e., are likely targets). This kind of idea has been proposed in the literature in various papers (e.g., Guided Search,

Equation 4 serves as an alternative to equation 3. Here the log slope is estimated by the mean of the 3 types of lures (instead of the max), while all items are still processed exhaustively in parallel.

We note here that the above four equations vary in different aspects of processing across lure types. Equation 1 is the strongest extension of our theory, since it assumes both parallel processing and independence across lure types. Equation 2 assumes independence but serial processing across lure types, whereas Equations 3 and 4 assume parallel processing but with interactions between different lure types.
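The analytic forms of the four equations are not reproduced in this excerpt; the sketch below implements one plausible reading of their verbal descriptions (a nested-log form for the parallel-independent Model 1, a sum of per-type logarithms for the serial Model 2, and a single max- or mean-slope logarithm for Models 3 and 4). The exact published forms should be checked against the original paper; treat these as assumptions. N is the vector of lure counts per type, D the vector of log slopes sorted ascending (D1 < D2 < D3):

```python
import numpy as np

def eq1(N, D):
    """Parallel, independent across lure types (hypothetical nested-log form).

    All items contribute at the shallowest slope D1; the more target-similar
    types contribute additional increments (D2 - D1), (D3 - D2), ...
    """
    N, D = np.asarray(N, float), np.asarray(D, float)
    Dpad = np.concatenate(([0.0], D))
    # remaining[i] = number of items of type i or of any more similar type
    remaining = np.cumsum(N[::-1])[::-1]
    return float(np.sum((D - Dpad[:-1]) * np.log(remaining + 1)))

def eq2(N, D):
    """Serial across lure types, parallel within a type: per-type logs summed."""
    return float(np.sum(np.asarray(D, float) * np.log(np.asarray(N, float) + 1.0)))

def eq3(N, D):
    """Single display-wide threshold set by the most target-similar lure type."""
    return float(np.max(D) * np.log(np.sum(N) + 1.0))

def eq4(N, D):
    """Like eq3, but the slope is the mean across lure types."""
    return float(np.mean(D) * np.log(np.sum(N) + 1.0))
```

With one lure of each type, eq2 and eq3 happen to coincide, while eq1 predicts a smaller cost because only the most target-similar item is processed all the way to the highest threshold.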

The goal of this simulation is to find out which of the 4 equations above best accounts for simulated time costs of stage-one processing in heterogeneous scenes. The critical parameters are the

We used

Given a specific set of parameters, the simulation procedure and algorithm can be described as follows:

For each type of lure item, we simulated homogeneous search time as a function of set size. At set size N, there were 1 target item and N – 1 lure items. The target item’s processing time was found by randomly sampling from an Inverse Gaussian distribution defined by the target threshold A_{t}; each lure item’s processing time was sampled from an Inverse Gaussian distribution defined by the corresponding lure threshold A_{l}.

For each type of lure, we computed a regression of

We simulated heterogeneous search time with different combinations of the 3 types of lures. For each simulated display condition, each type of lure could appear 1, 3, 7, or 15 times with one or two other lure types, which yielded a total of 111 unique combinations or conditions (see Appendix A for a complete list of these conditions). Processing time costs were then simulated in the same way as in step (1), with 2000 repetitions per condition.

For each of the 4 approximating equations, we computed the predicted completion times for each display condition simulated in (3) using the estimated
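Step (2), the regression that extracts each lure type's logarithmic slope, might look like the sketch below (not the original analysis code; the ln(N + 1) regressor mirrors the logarithmic RT-by-set-size functions described in the text and is an assumption):

```python
import numpy as np

def estimate_log_slope(set_sizes, mean_rts):
    """Fit mean RT = intercept + D * ln(set size + 1); return (D, intercept)."""
    x = np.log(np.asarray(set_sizes, float) + 1.0)
    D, intercept = np.polyfit(x, np.asarray(mean_rts, float), 1)
    return D, intercept

# Synthetic check: data generated from a known logarithmic function
# (values chosen arbitrarily for illustration) are recovered exactly.
sizes = np.array([1, 3, 7, 15, 31])
rts = 650.0 + 28.5 * np.log(sizes + 1.0)
D_hat, b_hat = estimate_log_slope(sizes, rts)
```

The recovered D values then serve as inputs to the four approximating equations in step (4).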

We plotted simulated completion times for heterogeneous scenes against predicted completion times as scatterplots, for each equation and for both simulation runs in Figure

Predicted processing time according to the 4 approximating equations for heterogeneous processing times plotted as a function of simulated processing times. Panels A and B present results from two different simulation runs using different sets of parameters (see text for more details).

Example search displays for Experiment 1 (top row) and Experiment 2 (bottom row).

Model characteristics of linear regressions of simulated processing times as a function of predicted processing times.

| Equation | Sim 1: R^{2} | Sim 1: Log likelihood | Sim 1: Slope (SE) | Sim 1: Intercept (SE) | Sim 2: R^{2} | Sim 2: Log likelihood | Sim 2: Slope (SE) | Sim 2: Intercept (SE) |
|---|---|---|---|---|---|---|---|---|
| Eq 1 | 0.9615 | 114.031 | 1.005 (0.019) | –0.109 (0.127) | 0.9637 | 124.179 | 1.008 (0.018) | –0.152 (0.156) |
| Eq 2 | 0.8045 | 23.816 | 0.463 (0.021) | 3.075 (0.163) | 0.8095 | 32.107 | 0.482 (0.022) | 3.871 (0.203) |
| Eq 3 | 0.7934 | 20.758 | 0.877 (0.043) | 0.570 (0.291) | 0.7709 | 21.878 | 0.861 (0.045) | 0.905 (0.383) |
| Eq 4 | 0.8049 | 23.953 | 1.112 (0.052) | –0.652 (0.338) | 0.7852 | 25.452 | 1.137 (0.057) | –1.058 (0.466) |

From these results, we can conclude that Equation 1 is our best-performing equation for predicting heterogeneous lure search based on performance metrics from homogeneous displays. In the next section, we consider empirical data from human participants in both lure-homogeneous (Experiment 1) and lure-heterogeneous (Experiment 2) search tasks.

Experiment 1 serves two purposes. First, it allowed us to estimate three different lure-target similarity coefficients in homogeneous displays to be used to predict performance in Experiment 2 (heterogeneous displays). In addition, it allowed us to extend the findings from Buetti et al. (

Reaction times for Experiment 1 plotted as a function of set size and lure type. Curves indicate best-fitting logarithmic functions. The legend shows the analytical form of each of these functions as well as corresponding R-squares as a measure of fit. Error bars indicate one standard error of the mean. Images of search stimuli and the corresponding data symbols are presented on the right.

Search objects were chosen from a collection of images studied by Alexander and Zelinsky (

These images of objects were presented with sizes of approximately 1.3 degrees visual angle horizontal and 1.7 degrees visual angle vertical. All images had a small red dot overlaid on the left or right side, with a diameter of 0.2 degrees of visual angle. In each search display, there was always only one target and at most one type of lure item. The items were randomly allocated onto the screen based on an invisible 6-by-6 square grid that spanned 20 degrees of visual angle horizontally and vertically. Each item’s actual location was then randomly jittered within 1 degree horizontally and vertically. On average the minimal distances between two items (i.e. the distance between two adjacent grid points) was 3.5 degrees. The grid was populated with equal (or approximately equal) numbers of items in each of the four quadrants of the screen. A white fixation cross was also presented at the center of the screen, spanning 0.6 degrees vertically and horizontally. All displays had a gray background with a color vector of [121, 121, 121] in RGB color space. Figure
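The display-generation procedure can be sketched as follows. This is a simplification, not the experiment code: grid cells are sampled uniformly rather than balanced across quadrants, "jittered within 1 degree" is interpreted as ±0.5 deg per axis, and the 6-point grid here has a 4-deg spacing (the text reports an average adjacent distance of about 3.5 deg); all of these are assumptions:

```python
import numpy as np

def make_display_locations(n_items, rng=None):
    """Place n_items on an invisible 6x6 grid spanning 20 x 20 deg of
    visual angle centered on fixation, with per-item random jitter.

    Returns an (n_items, 2) array of (x, y) positions in degrees.
    """
    rng = rng or np.random.default_rng()
    grid = np.linspace(-10.0, 10.0, 6)                   # 6 points across 20 deg
    cells = rng.choice(36, size=n_items, replace=False)  # no two items share a cell
    rows, cols = np.divmod(cells, 6)
    xy = np.column_stack((grid[cols], grid[rows]))
    xy += rng.uniform(-0.5, 0.5, size=xy.shape)          # jitter within 1 deg total
    return xy

locs = make_display_locations(17, rng=np.random.default_rng(1))
```

Sampling cells without replacement guarantees a minimum spacing on the order of one grid step between any two items.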

Trials started with a brief presentation of the central fixation cross, with a duration randomly selected from 350 to 550 ms. Then, the search scene was displayed for a maximum duration of 2.5 seconds. The display turned blank as soon as the participant pressed a response key. On error trials, a warning tone (1000Hz sine wave lasting 250 ms) was played. The inter-trial interval was selected randomly between 1.4 to 1.6 seconds. Each experiment session started with a practice block of 32 trials.

We compared regression models based on logarithmic and linear RT by set size relationships, using R square and log likelihood as measures of goodness of fit. The estimated logarithmic slope coefficients were D_{red carrotman} = 66.278, D_{white reindeer} = 28.492, and D_{grey car} = 26.581. In sum, the results show that the RT by set size functions found in this experiment are best characterized by a series of logarithmic functions.

Logarithmic vs. linear regression results of RT by set size functions in Experiment 1.

| Lure type | Log. R square (with target-only) | Log. log likelihood (with target-only) | Lin. R square (with target-only) | Lin. log likelihood (with target-only) | Log. R square (without target-only) | Log. log likelihood (without target-only) | Lin. R square (without target-only) | Lin. log likelihood (without target-only) |
|---|---|---|---|---|---|---|---|---|
| Red carrot man | 0.933 | –27.178 | 0.713 | –31.558 | 0.965 | –18.718 | 0.924 | –20.650 |
| White reindeer | 0.930 | –22.251 | 0.619 | –27.351 | 0.951 | –15.365 | 0.711 | –19.827 |
| Grey model car | 0.852 | –24.354 | 0.595 | –27.380 | 0.938 | –14.636 | 0.919 | –15.300 |

It should be noted that the estimated ‘linear slopes’ were very small and would be categorized as ‘efficient’ or ‘pop-out’ search according to traditions in the literature. When the target-only condition was included, the estimated linear slope coefficients were 6.380 ms/item, 2.559 ms/item, and 2.446 ms/item for carrot man, reindeer, and model car lures, respectively. Without the target-only condition, these changed to 4.531 ms/item, 1.727 ms/item, 1.503 ms/item.

A within-subjects ANOVA using lure type and set size as factors on correct RTs was also conducted. Main effects were significant for both lure type, F(2, 44) = 265.37, p < 0.001, Cohen’s f = 3.47, and set size, F(5, 110) = 217.69, p < 0.001, f = 3.15. More importantly, the interaction between set size and lure type was significant, F(10, 220) = 54.13, p < 0.001, f = 1.57. These results indicate that different levels of lure-target similarity led to different magnitudes of set size effects, i.e., lure-target similarity modulated search processing efficiency. To further understand this difference in search efficiency, we also computed individual subjects’ logarithmic slope estimates and used t-tests to compare the mean log slope for different pairs of lures. Consistent with the visual pattern in Figure

Overall, our results provided evidence that a logarithmic function better captured the relationship between reaction time and set size in efficient searches with real-world stimuli than linear models. Importantly, this conclusion was not contingent upon whether the target-only condition was included in the analysis. Additionally, the steepness of the logarithmic curves depended on the similarity between target and lures: the higher the similarity, the steeper (i.e., more inefficient) the search function. This pattern of results extends our previous findings to real-world stimuli and corroborates the notion that visual similarity modulates early parallel visual processing, regardless of whether search objects differ from each other along a few or many feature dimensions (

We should note that there is some difference between the similarity relationship reflected in our visual search results and Alexander and Zelinsky (

In Experiment 2, we used the same stimuli as in Experiment 1 to construct heterogeneous search displays. We then compared observed RTs with predicted RTs from the four equations described in the Predictive Simulation section. We had two goals. The first goal was to determine which of the four equations best predicted human performance. The second goal was to evaluate what kind of systematic deviations exist between our theory-based RT predictions and human data.

Because of the limited number of conditions we could afford within one experimental session (about 50 minutes), we designed five different subsets of conditions of heterogeneous displays, characterized by different types of lure combinations (see Table

Description of all the conditions tested in Experiment 2, organized by subset. In Subsets 1–3, only two lure types were presented in the display with the target, whereas Subsets 4–5 always contained all three types of lures in addition to the target.

| Subset | # red carrot man | # white reindeer | # grey model car | Description |
|---|---|---|---|---|
| 1 | 0 | 1 | 2 | Comparable numbers of white reindeer and grey cars |
| 1 | 0 | 3 | 4 | |
| 1 | 0 | 7 | 8 | |
| 1 | 0 | 15 | 16 | |
| 2 | 1 | 0 | 2 | Roughly equal numbers of red carrot men and grey cars |
| 2 | 3 | 0 | 4 | |
| 2 | 7 | 0 | 8 | |
| 2 | 15 | 0 | 16 | |
| 3 | 1 | 6 | 0 | Fixed 6 reindeer, varying number of carrot men |
| 3 | 5 | 6 | 0 | |
| 3 | 9 | 6 | 0 | |
| 3 | 21 | 6 | 0 | |
| 4 | 1 | 1 | 1 | Roughly equal numbers of all 3 types of lures |
| 4 | 2 | 2 | 3 | |
| 4 | 5 | 5 | 5 | |
| 4 | 10 | 11 | 10 | |
| 5 | 1 | 4 | 2 | Fixed 4 reindeer, comparable numbers of carrot men and cars |
| 5 | 3 | 4 | 4 | |
| 5 | 7 | 4 | 8 | |
| 5 | 13 | 4 | 14 | |
| – | 0 | 0 | 0 | Baseline (target only) |

Notice that in both Experiments 1 and 2 we had included a target-only condition where the only item in the display is the target. We consider reaction time in this condition to be an important baseline to compare performance across both groups. Mean RT in this condition represents all the RT components that do not depend on set size, e.g. time for visual information to arrive at the cortex, response selection processes, motor response time, etc. In the case of efficient search for a target among lures, the only component depending on set size should be the stage-one processing time, which can be computed by subtracting target-only RT from RT in each of the conditions with mixtures of lures. Notice that this operation is consistent with the property of the logarithmic function, i.e.

The analysis consisted of linear regressions of observed RTs as a function of predicted RTs for each subset of conditions. Because there were four equations to be compared, there were four different sets of predicted RT values. Each set of predicted RT values is based on all 20 non-target-only conditions of the experiment. Thus, four regressions were performed using observed RTs as the dependent variable and each set of predicted RTs as the independent variable. To compare the performance or ‘goodness of fit’ across the four equations, we computed the R-square and the Akaike Information Criterion (
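The goodness-of-fit comparison can be sketched as follows (a minimal illustration, not the original analysis code): regress observed RTs on each equation's predicted RTs, then compute R-square, an AIC based on the Gaussian log-likelihood of the residuals, and RMSE. The k = 3 parameter count (slope, intercept, residual variance) is our assumption:

```python
import numpy as np

def compare_fit(observed, predicted):
    """Linear regression observed ~ predicted; return (R^2, AIC, RMSE)."""
    obs = np.asarray(observed, float)
    pred = np.asarray(predicted, float)
    slope, intercept = np.polyfit(pred, obs, 1)
    resid = obs - (intercept + slope * pred)
    n = obs.size
    rss = float(resid @ resid)
    r2 = 1.0 - rss / float(((obs - obs.mean()) ** 2).sum())
    # Gaussian log-likelihood at the MLE of the residual variance
    loglik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1.0)
    aic = 2 * 3 - 2 * loglik
    rmse = float(np.sqrt(rss / n))
    return r2, aic, rmse
```

Lower AIC and RMSE, and higher R-square, indicate a better-predicting equation.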

In a further analysis focused on the best-performing equation, we analyzed whether the human data showed any systematic deviations from the model. We interpret such deviations as the effect of heterogeneity/homogeneity on stage-one processing. As described before, we were interested in identifying either additive or multiplicative deviations. To estimate them, we performed six regressions on stage-one time costs obtained from the best-performing equation, one for each subset of conditions (5 total), plus one overall regression combining all conditions. In this manner, the estimated intercept coefficients become useful indicators of any systematic lure-to-lure interaction effects in homogeneous search. If this interaction does not cause an additive difference, then the estimated intercept should equal zero, assuming that our best-performing equation provides an accurate prediction. Hence, any substantial difference between the estimated intercept and zero becomes a measure of the magnitude of this additive effect. By the same logic, deviation of the estimated slope coefficient from 1 represents a multiplicative effect of inter-lure interaction in homogeneous search displays.

The log slope for Subset 1 (26.881) was very close to the log slopes of the white reindeer (D_{white reindeer} = 28.492) and of the grey car (D_{grey car} = 26.581) from Experiment 1, in spite of the fact that in Subset 1, there were two types of lures always present in the display. In contrast, the log slopes for Subsets 2 to 5 were all somewhat larger than the log slope of the red carrot man (D_{red carrotman} = 66.278) from Experiment 1. That is, even though in each of the Subsets 2–5, the red carrot man was paired with stimuli that were

Reaction times in Experiment 2 as a function of set size, grouped by the different subsets of conditions. Error bars indicate one standard error of the mean. Curves are best-fitting logarithmic functions, see Table

Logarithmic regression results of search RT for each subset of conditions.

| Subset | Log Slope (D) | Intercept | R square |
|---|---|---|---|
| 1 | 26.881 | 663.97 | 0.9722 |
| 2 | 79.59 | 649.38 | 0.9811 |
| 3 | 75.183 | 650.16 | 0.9307 |
| 4 | 70.196 | 654.85 | 0.9801 |
| 5 | 71.974 | 649.76 | 0.9573 |

Linear regression results of predicted RT to observed RT in Experiment 2.

| | Equation 1 | Equation 2 | Equation 3 | Equation 4 |
|---|---|---|---|---|
| R square | 0.9681 | 0.9178 | 0.9480 | 0.9153 |
| AIC | 174.533 | 194.392 | 184.757 | 195.003 |
| RMSE (ms) | 14.520 | 23.298 | 18.522 | 23.639 |

In sum, Equation 1 represents an architecture that is equally successful at predicting performance in simulations as well as in human experiments.

Regression coefficients of the regression of Equation 1’s predicted stage-one processing times to observed stage-one processing times.

| | Intercept estimate | Std. error | 95% C.I. | Slope estimate | Std. error | 95% C.I. |
|---|---|---|---|---|---|---|
| All conditions | –10.961 | 7.268 | [–26.17; 4.25] | 1.3328 | 0.0555 | [1.22; 1.45] |
| Subset 1 | 0.948 | 5.986 | [–18.11; 19.99] | 0.9554 | 0.0938 | [0.66; 1.25] |
| Subset 2 | –4.392 | 6.585 | [–25.35; 16.56] | 1.3587 | 0.0515 | [1.19; 1.52] |
| Subset 3 | 2.038 | 16.573 | [–50.71; 54.78] | 1.2063 | 0.1179 | [0.83; 1.58] |
| Subset 4 | –1.242 | 10.217 | [–33.76; 31.27] | 1.2905 | 0.0857 | [1.02; 1.56] |
| Subset 5 | –1.242 | 6.277 | [–21.22; 18.74] | 1.2890 | 0.0474 | [1.14; 1.44] |

To evaluate whether there was an additive effect of homogeneity on stage-one processing times, we computed and report the 95% confidence intervals for both coefficients. The regression on all 20 conditions combined had 19 degrees of freedom, whereas the regressions on each subset had 3. All 6 intercepts’ confidence intervals included zero, indicating that there was no meaningful additive deviation when we predict heterogeneous search time using efficiency parameters (D values) from homogeneous search data.

Next, to evaluate whether there was a multiplicative effect of homogeneity on stage-one processing times, we compared the 95% confidence intervals of the slope coefficients to 1. The results indicated that slope coefficients were significantly larger than 1 when all conditions were combined, as well as in 3 out of 5 subsets (specifically, Subsets 2, 4, and 5 had confidence intervals entirely above 1). In other words, our best-predicting equation

Predictions of Equation 1 for each Subset plotted against observed stage-one processing time. Error bars indicate one standard error of the mean. The lines

Equation 1 provided the best predictions of heterogeneous search reaction times using log slope values estimated from homogeneous search data, just as it did for simulated data. Thus, the predictive power of our theory was confirmed by empirical data. Therefore, Equation 1 represents a formula that will allow investigators to predict performance in heterogeneous search scenes. In the present study, Equation 1 accounted for 96.81% of the variance across a total of 20 different experimental conditions. This predictive success is all the more compelling given that predictions were based on parameter estimates from different participants.

Further, since our simulation on both homogeneous and heterogeneous search assumed processing independence between individual items, systematic deviations from Equation 1’s predictions can be used to estimate quantitatively, for the first time, the extent and effect of homogeneity facilitation in efficient search tasks. The results indicated a systematic multiplicative deviation that suggests that in homogeneous displays, identical items do interact in a facilitative fashion and are not truly independently processed.

The slope coefficients for all conditions combined, as well as for Subsets 2–5, all indicated a systematic multiplicative under-prediction by a factor between 1.2 and 1.3, in a fairly consistent pattern. It should be noted, however, that the regression analysis on stage-one processing times for Subset 1 showed a slope coefficient that deviated from the other groups: it was much closer to 1 (estimate = 0.9554, standard error = 0.0938). Recall that Subset 1 also had a cluster of RTs that substantially differed from the other Subsets, as shown in Figure

More data are needed to understand why Subset 1's results differed from those of the other Subsets. There are at least two possible explanations. The first is that, at extremely low levels of lure-target similarity, homogeneity facilitation effects are absent. If so, the D values observed in Experiment 1 are good predictors of performance in heterogeneous displays simply because those D values truly represent the stand-alone processing efficiency of those items. A second possibility is that, for this pair of stimuli, the same lure-to-lure interaction effects are present in both homogeneous and heterogeneous displays. That is, perhaps when two types of lures are

Recent work in our lab has uncovered that there is systematic variability in stage-one processing times and that much can be learned about the architecture of early visual processing by studying this variability (

Using data from homogeneous search tasks with real-world objects (Experiment 1), we were able to predict heterogeneous search RTs (Experiment 2), accounting for as much as 96.8% of the variance with high precision, as indicated by an RMSE of 14.520 ms. This prediction was made across participants: parameters were estimated on one set of participants, and predictions were confirmed on an entirely new set of participants who had never taken part in either search experiment and never saw any homogeneous displays like the ones used to estimate the D parameters. The only condition common to both experiments was the target-only condition. Finally, we used systematic deviations from our model's predictions to estimate quantitatively, for the first time in the literature, the effects of homogeneity facilitation on performance in homogeneous displays, similar to the ones traditionally used to study efficient (a.k.a. pop-out) search (all elements identical but one). We found evidence that in homogeneous displays there is a facilitatory processing effect whereby evidence thresholds are systematically reduced. This improves overall search efficiency (in logarithmic space) in homogeneous scenes, and as a result, D coefficients estimated in homogeneous scenes end up under-predicting performance, by a multiplicative factor, in heterogeneous scenes where lure-to-lure interactions are absent (or much reduced).

The idea that the degree of heterogeneity (or homogeneity) in a scene influences visual search processing efficiency is not new. Duncan and Humphreys (

In contrast, here we conducted a more systematic evaluation of heterogeneity and differences in processing between heterogeneous and homogeneous scenes. We analyzed homogeneous search separately for three different types of lures and designed displays with varying degrees of heterogeneity using those stimuli. Whereas Duncan and Humphreys suggested increasing linear search slopes with increasing degree of heterogeneity, we found that different mixtures (e.g. mixing two or three types of lure items) can be accounted for by a single constant factor (around 1.3).

At the theoretical level, according to Duncan and Humphreys, items are given different attentional weights or different amounts of resources from a limited pool, depending on their similarity to the target template. Items (or ‘structural units’) compete for access to visual short-term memory by their weight. Further, the more items perceptually group with each other, the stronger the weights of those items will covary. This

Our results therefore imply that a mechanism different from Duncan and Humphreys's spreading suppression is at play in homogeneous search. We foresee at least two possible mechanisms. First, it is possible that instead of grouping similar search items, decisions are still made for each individual item, but

Alternatively, the lowering of thresholds for identical items could reflect the presence of an evidence monitoring mechanism. An evidence monitoring mechanism is one that observes (i.e., monitors) how evidence accumulates at all local accumulators and sums (or averages) evidence over all (or large) regions of the scene, much like global motion detectors sum or average local motion signals to extract a global motion direction. Applied to lure processing, as information accumulates, regions containing identical lure items will produce stronger evidence against target presence than regions containing different lure items. Homogeneous regions can thus be discarded sooner as being unlikely to contain the target. Precise location information would not be available for these global accumulators because they represent large regions, but that is not a serious problem: representing lure locations is unnecessary for task completion; what is needed, rather, is a representation of the target location. Rejecting large regions of the display as unlikely to contain the target does help to reduce the uncertainty about the target location. Further, an advantage of such an evidence monitoring mechanism is that it can facilitate the orienting response towards regions that are more likely to contain the target (if one is present). This might happen even
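One way to make this intuition concrete: if a region-level monitor averages noisy evidence across identical accumulators, the pooled signal is less noisy, so rejection times for homogeneous regions are less variable, which is what would allow the system to safely adopt a lower threshold. A minimal Monte-Carlo sketch (all parameter values here are hypothetical, not fitted to the data):

```python
import random
import statistics

def pooled_crossing_time(drift, noise_sd, threshold, n_pooled, rng, dt=1.0):
    """Time for the average of n_pooled identical noisy accumulators to
    reach a rejection threshold. Averaging over identical items shrinks
    the per-step noise by a factor of sqrt(n_pooled)."""
    t, evidence = 0.0, 0.0
    while evidence < threshold:
        t += dt
        increments = [drift * dt + rng.gauss(0.0, noise_sd)
                      for _ in range(n_pooled)]
        evidence += sum(increments) / n_pooled
        if t > 1e5:  # safety cap for the simulation
            break
    return t

rng = random.Random(1)
# Rejection times for a single lure accumulator vs. a monitor pooling
# over a homogeneous region of 4 identical lures.
single = [pooled_crossing_time(1.0, 2.0, 50.0, 1, rng) for _ in range(300)]
region = [pooled_crossing_time(1.0, 2.0, 50.0, 4, rng) for _ in range(300)]
# The pooled region's rejection times are markedly less variable.
```

The reduced variability is the key point: with a more reliable evidence signal, a lower rejection threshold produces fewer premature rejections, consistent with the threshold-lowering account above.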

It is also interesting to consider how our results relate to other accounts of heterogeneity effects in visual search. The Signal Detection Theory model of visual search has been shown to offer a natural explanation of the heterogeneity effect (

In sum, the current data present a challenge to traditional views of distractor homogeneity effects and suggest that further study is needed to understand the mechanisms underlying this search facilitation effect. A potential avenue for further testing this facilitative interaction between homogeneous items is through the use of the capacity coefficient (

It is important to note that the visual search task used in this study was a target discrimination task, where there is always a target present in the display, and the participants had to locate it in order to make a decision about its details (i.e., the relative location of the red dot). This is different from the target detection task also used in the literature, where the presence or absence of a target is to be reported (e.g.

In a target discrimination task with a fixed target, efficient visual search is best characterized as arising from a system that processes all items in a parallel, unlimited capacity, and exhaustive fashion. Under this conceptualization, a lawful relationship between heterogeneous and homogeneous search performance was predicted by simulation and confirmed by experiments with a novel methodology. Results indicated that, rather than being completely independent, individual items

The additional file for this article can be found as follows:

Conditions Used in Simulation. DOI:

An additional underlying assumption is that measures of lure-target similarity generalize across subjects, at least at the group level. Thus, estimates obtained from one set of participants can be used to predict performance in a new set of participants.

With 2 types of items, for example, the essential integral to be solved takes the following form:

E[T] = ∫_0^∞ [ 1 − F_1(t; μ_1, σ_1)^(n_1) · F_2(t; μ_2, σ_2)^(n_2) ] dt

where n_1, n_2 are the numbers of the two types of items, μ_1, σ_1, μ_2, σ_2 are the corresponding parameters of each item type's completion-time distribution F_i, and the integrand is the probability that at least one item has not yet finished at time t, so that the integral gives the expected completion time under exhaustive parallel processing.
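The integral described above can be evaluated numerically. The sketch below assumes, purely for illustration, that individual completion times are normally distributed with the stated parameters (the paper's actual distributional assumptions may differ) and integrates the survival function of the maximum by the trapezoid rule:

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def expected_exhaustive_time(n1, mu1, sigma1, n2, mu2, sigma2,
                             t_max=5000.0, dt=0.5):
    """Numerically integrate 1 - F1(t)^n1 * F2(t)^n2 over t >= 0.

    This is the expected finishing time of the slowest of n1 + n2
    independent items (two item types), i.e., exhaustive parallel
    processing. Normal completion times are an illustrative assumption.
    """
    def survival(t):
        return 1.0 - (normal_cdf(t, mu1, sigma1) ** n1
                      * normal_cdf(t, mu2, sigma2) ** n2)
    total, t, prev = 0.0, 0.0, survival(0.0)
    while t < t_max:
        t += dt
        cur = survival(t)
        total += 0.5 * (prev + cur) * dt  # trapezoid step
        prev = cur
    return total
```

As a sanity check, with a single item (n1 = 1, n2 = 0) the integral reduces to that item's mean completion time, and adding items can only increase the expected maximum.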

The log likelihood is a measure of how likely the regression model is given the observed data; the higher the log likelihood value, the more likely the model. The relative likelihood ratio between two models can be computed as exp(LL_1 − LL_2), where LL_1 and LL_2 are the two models' log likelihood values.

The relative likelihood ratio between two linear models can be computed from AIC values by the formula exp((AIC_1 − AIC_2)/2). Thus the regression model based on Equation 1 was
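These two formulas amount to a pair of one-line helpers. For the AIC version, the helper is written in the standard convention exp((AIC_min − AIC_i)/2), which gives the likelihood of model i relative to the best (lowest-AIC) model:

```python
import math

def rel_likelihood_ll(ll1, ll2):
    """Likelihood of model 1 relative to model 2, from log likelihoods."""
    return math.exp(ll1 - ll2)

def rel_likelihood_aic(aic_min, aic_i):
    """Relative likelihood of model i vs. the best model, from AIC values:
    exp((AIC_min - AIC_i)/2). Values near 0 mean model i is far less
    likely than the best model; equal AICs give 1."""
    return math.exp((aic_min - aic_i) / 2.0)
```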

The authors have no competing interests to declare.