Recently, Wang, Buetti and Lleras (

The literature on visual search has, for the most part, overlooked the variability observed in efficient visual search experiments, that is, under conditions where the target appears to pop out of the display and is overwhelmingly the first item selected by attention. Thus far, “efficient” visual searches have been characterized through the following rule of thumb: whenever the measured linear slope of the reaction time (RT) by set size function is less than 10 ms/item, one can describe the search as having been efficient, whereas slopes larger than 10 ms/item suggest more inefficient processing (e.g.,
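This rule of thumb can be made concrete with a short sketch: fit an ordinary least-squares line to mean RT as a function of set size and compare the slope to the 10 ms/item criterion. The set sizes and mean RTs below are hypothetical, for illustration only.

```python
# Classify a search as "efficient" from the RT-by-set-size slope.
# Set sizes and mean RTs are hypothetical illustration values.
def linear_slope(set_sizes, rts):
    """Ordinary least-squares slope of mean RT (ms) against set size."""
    n = len(set_sizes)
    mx = sum(set_sizes) / n
    my = sum(rts) / n
    num = sum((x - mx) * (y - my) for x, y in zip(set_sizes, rts))
    den = sum((x - mx) ** 2 for x in set_sizes)
    return num / den

set_sizes = [1, 9, 17, 25]
mean_rts = [530, 570, 600, 625]           # ms
slope = linear_slope(set_sizes, mean_rts)
label = "efficient" if slope < 10 else "inefficient"
print(f"{slope:.1f} ms/item -> {label}")  # 3.9 ms/item -> efficient
```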

Recently, Buetti, Cronin, Madison, Wang and Lleras (

Buetti et al. (

The cognitive architecture proposed by Buetti et al. (

As mentioned above, one of the key findings in Buetti et al. (

In their initial model, Buetti et al. assumed

Wang, Buetti, and Lleras (

To formulate this equation, Wang et al. assumed that: (a) all lures are processed in parallel; (b) evidence stops accumulating at a location once the decision threshold for that stimulus has been reached; and (c) evidence continues to accumulate at locations where decision thresholds have not been reached. At the aggregate level, this means that lures with lower lure-target similarity will be rejected sooner than lures with higher lure-target similarity. This reduces the number of “active” accumulators over time. Going back to the example with blue, yellow and orange lures, with a red target, in this model, blue lures would on average be rejected first, then yellow lures, and finally orange lures. In the equation above, there are three different types of lures in the scene, with D_{3} > D_{2} > D_{1} > 0 being the three logarithmic slopes, one for each lure type i. The first term represents the evidence accumulated across all lures up to the point where the most dissimilar lures (type 1) are rejected (dictated by D_{1}). Then, evidence for lures of types 2 and 3 continues to accumulate. However, some evidence about these lures has already been accumulated (dictated by D_{1}). Thus, the second term represents the additional evidence needed to reject lures of type 2 (up to D_{2}), while also continuing to accumulate evidence for lures of type 3 (hence the term D_{2}–D_{1}), and so on. For more details, readers are directed to Wang et al. (

The generalized equation, when N_{T} is the total number of lures, L is the number of lure types, and the D_{j} parameters are organized from smallest (D_{1}) to largest (D_{L}), is:

RT = a + D_{1} ln(N_{T} + 1) + Σ_{j=2}^{L} (D_{j} – D_{j–1}) ln(N_{T} – N_{1} – … – N_{j–1} + 1)    (2)
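This running-sum structure can be sketched in code. The function below is our own illustrative reading of the model: lure types are sorted by their logarithmic slope D, and each successive term applies the incremental slope (D_{j} – D_{j–1}) to the logarithm of the number of lures whose thresholds have not yet been reached. All parameter values shown are hypothetical, not fitted values from the experiments.

```python
import math

def predict_rt(a, D, N):
    """Predicted RT (ms) for a heterogeneous display.

    a: intercept (ms); D: logarithmic slopes sorted from smallest to largest;
    N: lure counts for the corresponding lure types.
    """
    rt = a
    prev_d = 0.0
    remaining = sum(N)                      # all lures start out "active"
    for d, n in zip(D, N):
        rt += (d - prev_d) * math.log(remaining + 1)
        prev_d = d
        remaining -= n                      # lures of this type are now rejected
    return rt

# Hypothetical values: low-, medium-, and high-similarity lures (8, 4, 4 items)
print(round(predict_rt(a=500, D=[5.0, 15.0, 30.0], N=[8, 4, 4]), 1))  # -> 560.3
```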

Importantly, using this equation and real-world objects, Wang et al. were able to account for 96.8 percent of the variance in the heterogeneous scenes experiment based on the log slopes measured in the homogeneous scenes experiment.

An unexpected aspect of the Wang et al. results was that predicted RTs systematically underpredicted observed RTs in heterogeneous scenes by a factor of about 1.3, such that Observed RT ≈ 1.3 × Predicted RT.

The goal of the present study was two-fold. First, we sought to test again the predictive power of Equation 2 using the same simple geometric stimuli used in Buetti et al.’s Experiment 1A. The idea was to take the log slopes for blue circles, yellow triangles and orange diamonds reported in Experiment 1A from Buetti et al. (

Search displays were always heterogeneous, with a mixture of at least two different lure objects, sometimes three. One exception to this was the target only condition where the target was presented with no lure items. The target was a red triangle, which could face to the left or to the right, and was fixed throughout the experiment session. The lure stimuli were orange diamonds, which were two triangles placed side by side and had high similarity to the target (similar hue and shape), yellow triangles, which could face to the right or left and had moderate similarity to the target (dissimilar hue and similar shape), or blue circles, which had low similarity to the target (dissimilar hue and shape). This set of stimuli was the same set used in Buetti et al. (

Example search displays from

Table

Next, we used Equation 2 to predict RTs for each condition in each group. Note that for each group, the value of the intercept a was estimated from that group’s own data. The R^{2} of the prediction was 0.899, demonstrating that, as in Wang et al. (
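For readers who want to reproduce this kind of check, the R^{2} values reported throughout can be computed as the squared Pearson correlation between condition-level predicted and observed mean RTs. The values in the sketch below are hypothetical placeholders, not the experimental data.

```python
# R^2 between model-predicted and observed condition-level mean RTs,
# computed as the squared Pearson correlation. Values are hypothetical.
def r_squared(predicted, observed):
    n = len(predicted)
    mp = sum(predicted) / n
    mo = sum(observed) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_o = sum((o - mo) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)

predicted = [520, 548, 567, 580, 596]   # hypothetical predicted RTs (ms)
observed = [533, 605, 623, 622, 656]    # hypothetical observed RTs (ms)
print(round(r_squared(predicted, observed), 3))  # -> 0.916
```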

Observed Reaction Times across all 45 conditions as a function of the predicted Reaction Time for each condition.

Note that as a comparison, we also plotted in Figure the predictions of a single-threshold model, in which the largest logarithmic slope D_{i} among the lure types present in the display determines the decision threshold for all items in the scene. For example, if lures of types 1 and 2 are present, D_{2} will determine the decision threshold for all items in the scene, whereas if lures of types 2 and 3 are present, D_{3} will determine the decision threshold for all items in the scene. If all lures are present, it will be D_{3}. Finally, if only lures of type i are present, its corresponding D_{i} determines the decision threshold.

The corresponding equation is:

RT = a + D_{max} ln(N_{T} + 1)    (3)

where D_{max} is the largest logarithmic slope among the lure types present in the display.

The corresponding fit was visibly poorer (R^{2} = 0.827). Using the Akaike Information Criterion (AIC) model comparison metric, we found that the multi-threshold model was 4.3 × 10^{16} times more likely than the single-threshold model given the observed data. This result validates the results of Wang et al. (. A third model, in which each lure of type i is evaluated only against its own threshold D_{i}, independently of the other lure types present, produced a poorer fit still (R^{2} = 0.774, figure not shown).
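The “times more likely” figures follow from the standard AIC relative-likelihood formula, exp((AIC_min – AIC_alt)/2). Assuming Gaussian residuals, AIC can be computed from a model’s residual sum of squares as n·ln(RSS/n) + 2k. The residual sums of squares below are hypothetical, chosen only to illustrate the computation.

```python
import math

def aic(rss, n, k):
    """AIC for a least-squares fit with Gaussian residuals."""
    return n * math.log(rss / n) + 2 * k

def relative_likelihood(aic_better, aic_worse):
    """How many times more likely the lower-AIC model is, given the data."""
    return math.exp((aic_worse - aic_better) / 2)

n = 45                                    # number of conditions entering the fit
aic_multi = aic(rss=9000.0, n=n, k=2)     # hypothetical multi-threshold RSS
aic_single = aic(rss=16000.0, n=n, k=2)   # hypothetical single-threshold RSS
print(f"{relative_likelihood(aic_multi, aic_single):.3g}")
```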

Going back to Equation 1, the current results once again replicated the findings of Wang et al. (

Experiments 2A and 2B were designed to test the hypothesis that the multiplicative factor by which we underpredicted RTs in heterogeneous search displays reflects the extent of inter-item interactions present in homogeneous displays. If spatial proximity to identical elements is necessary for these inter-item interactions to be observed, then by spatially intermixing the different lure types as we have done so far, we have (inadvertently) minimized these inter-item interactions in heterogeneous displays, both here and in Wang et al. (

An alternative hypothesis is that the multiplicative factor represents not the inter-item interactions themselves, but rather the presence of heterogeneity in the display (or conversely, the absence of homogeneity). That is, it is possible that when multiple types of lure items are present in a display, the display is processed fundamentally differently than displays where all lure items are identical to one another. Put another way, it is possible that homogeneous displays afford a processing advantage that is absent whenever multiple types of lures are present in a scene. If so, whether lures are presented in a spatially intermixed or spatially segregated manner should matter little: in both cases, multiple types of lures are present in the display (increasing, in some abstract sense, the degree of scene complexity) and therefore, in both cases, Equation 2 should underpredict RTs, perhaps even to identical degrees as in spatially intermixed displays.

Experiments 2A and 2B differed only with regard to the stimuli used. In Experiment 2A, we used the same stimuli as in Experiment 1 (and in

The methods for both experiments were identical except for the stimuli and background used in each experiment.

Sample displays used in Experiments 2A (left) and 2B (right).

Observed Reaction Times across all 9 conditions of Experiment 2A (plus the target only condition) as a function of the predicted Reaction Time for each condition.

As can be seen in Figure , Equation 2 predicted performance well: the average prediction error was small, and the R^{2} for the predictions was close to 0.92. Both of these measures were better than the ones obtained for Equation 3, which had a worse average prediction error (21 ms) and worse R^{2} (0.85). Model comparison based on AIC indicated that the multi-threshold model (Equation 2) was 41.9 times more likely than the single-threshold model (Equation 3). In addition, a paired t test showed that prediction errors for Equation 2 were significantly smaller than those for Equation 3: t(8) = –4.92, p = 0.001. More interestingly, in contrast to Experiment 1, the best-fitting line for Observed RT as a function of Predicted RT now shows a slope close to 1 (0.91, standard error = 0.096) rather than 1.8, much as if this time around the equation had accurately captured the extent of lure-to-lure interactions present in the displays. In other words, given that the predictions were made based on log slope coefficients obtained from homogeneous displays, the current results suggest that spatially segregated displays produce just as strong inter-item interactions as do perfectly homogeneous displays, in spite of the added complexity that arises from having two different sets of lures simultaneously present in the display. Experiment 2B therefore tested this preliminary conclusion using real-world objects.

Observed Reaction Times across all 9 conditions (plus the target only condition) in Experiment 2B using real-world objects as a function of the predicted Reaction Time for each condition.

In this instance, the R^{2} produced by the two models was not meaningfully different (0.977 and 0.980 for Equations 2 and 3, respectively). The relative likelihood of Equation 2 over Equation 3 based on AIC was 1.98. That said, Equation 2 did produce much more accurate predictions overall: average prediction error was 8 ms, compared to 20 ms for Equation 3. A paired t test showed that prediction errors for Equation 2 were significantly smaller than those for Equation 3: t(8) = –2.97, p = 0.018. Thus, in the context of these results as well as those of Experiment 2A and those in Wang et al. (
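The paired t tests on prediction errors can be reproduced with a few lines of stdlib code. The per-condition absolute errors below are hypothetical placeholders, not the actual values from Experiments 2A/2B.

```python
import math

def paired_t(x, y):
    """Paired t test: returns (t statistic, degrees of freedom)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
    return mean / math.sqrt(var / n), n - 1

# Hypothetical per-condition absolute prediction errors (ms), 9 conditions
errors_eq2 = [6, 10, 7, 12, 9, 5, 8, 11, 4]
errors_eq3 = [18, 25, 14, 30, 22, 12, 20, 27, 16]
t, df = paired_t(errors_eq2, errors_eq3)
print(f"t({df}) = {t:.2f}")  # -> t(8) = -10.06
```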

In sum, we take the results of Experiments 2A and 2B as evidence that lure-to-lure interactions reduce RTs above and beyond lure-target similarity effects and that the magnitude of these interactions depends on the spatial configuration of the lures in the display: the lure-to-lure interactions maximally reduce RTs when identical lures are near each other (as in homogeneous and spatially segregated displays), and their impact on RT decreases as they become spatially intermixed with other lure items (spatially intermixed conditions).

The goal of this investigation was two-fold. First, we sought to confirm the findings of Wang, Buetti & Lleras (

There was reason to believe a priori that the independence assumption was in fact too strong. In the cognitive neuroscience literature, there is ample evidence of inter-element interactions. Gilbert and colleagues, studying single-cell responses in monkey V1, have shown that the response function of V1 neurons is modulated by surrounding neurons (

In sum, it is not surprising that we find a discrepancy between parallel search performance in homogeneous displays (where inter-item interactions ought to be maximized) and parallel search performance in heterogeneous displays (where those interactions ought to be weakened by interleaving different types of items in the display). That said, what is remarkable is that search performance measured on lure-homogeneous displays predicted almost all the variability of performance observed on lure-heterogeneous displays (with intermixed lures). This was initially observed in Experiment 2 of Wang et al. (R^{2} = 97%) and again here in Experiment 1 (R^{2} = 90%). This predictive success puts constraints on how we understand lure-to-lure interactions:

From the perspective of the architecture proposed by Buetti et al. (

In contrast, we believe our results are evidence against Duncan and Humphreys’ (

Recent results reported in Utochkin and Yurevich (

It is also worth noting that the proposed nature of inter-item interactions in our experiments is probably not fundamentally different from those local interactions that are believed to be useful in the detection of textures and boundaries (e.g.,

The current results suggest that there are two factors impacting the strength of inter-item interactions. The first, and most obvious, is the spatial arrangement of stimuli: when items are spatially segregated, the inter-item interaction observed between items is identical to the one measured in homogeneous displays. Evidence supporting this conclusion comes from the finding that the multiplicative constant in the function Observed RTs as a function of Predicted RTs is close to 1, when predicting performance in spatially-segregated displays (Experiments 2A and 2B, slope = 0.91 and 0.88, respectively).

Our results also suggest a second factor that might determine the strength of inter-item interactions: stimulus complexity. Though it may be premature to conclude based on tests from just two different stimulus sets (colored geometric shapes and real-world stimuli), there appears to be the intriguing possibility that stimulus complexity affects the extent of inter-item interactions. Indeed, we propose that the slope of the function Observed RT as a function of Predicted RT when testing spatially-intermixed lure displays can be taken as a quantitative index of the strength of these interactions. The rationale is as follows. The predicted RTs are based on logarithmic slope parameters obtained from entirely homogeneous lure displays. In those displays, two factors determine the efficiency (i.e., the log slope) of processing: lure-target similarity and inter-item interactions. Because the displays are homogeneous, those interactions are maximal. When the same lure items are then spatially intermixed in a subsequent experiment, the second factor (inter-item interactions) is minimized, leaving only lure-target similarity as a contributing factor to processing efficiency. As a result, search performance in heterogeneous displays will always be somewhat slower than predicted from the homogeneous-display parameters.

Following this rationale then, we propose that we can (and should) meaningfully interpret the slope of the function Observed RTs as a function of Predicted RTs. Doing so reveals that the slope of this function is substantially larger with simpler stimuli (1.8, Experiment 1) than with real-world stimuli (1.3,

The current results are limited on several fronts. First, it should be noted that our predictions are based on group-aggregated data. From one perspective, this is strong evidence of the reliability and validity of our model and equations because it means that we can make predictions across independent groups of subjects. However, from another perspective, it would be an even stronger demonstration if we had used a psychophysical approach to do this testing: evaluate for each subject their own logarithmic slope for various lure-target combinations and different stimulus types (geometric shapes vs. real-world stimuli) and then use those estimates to predict that same individual’s performance in heterogeneous search displays. This experimental design is complicated by the need for multiple sessions per participant. Indeed, a single one-hour session is not enough to get a stable estimate of, say, three logarithmic slopes (corresponding to three different lures for a given target) at the subject level. After one session, log fits are much noisier at the individual level than at the group level (and linear fits are even noisier). We believe this is the case because of the stochastic nature of processing and the many different processing configurations and factors that impact any one estimate of a log slope. Indeed, we know that the log slope is impacted by factors such as crowding (Madison, Lleras & Buetti, in press), eccentricity (

A second limitation of the present study relates to the findings of Experiment 2B. Although we believe our equation does a better job overall at predicting performance (smaller average RT errors), in terms of variability (R^{2}), the two equations performed relatively similarly. This may be related to the stimulus set used in Experiment 2B: two of the three lures have almost identical log slopes (the car and the reindeer). Thus, it is possible that the lack of variability in the log slope parameters made the differentiation between the two models more difficult. However, we knew that going into this project and we took that risk because we wanted to use log slopes of published homogeneous search data to predict performance in novel scenarios (heterogeneous scenes). Future work could do a more thorough test of our hypothesis by using complex stimuli with more differentiated log slopes.

We validated an equation (Equation 2) used to predict search performance in heterogeneous search scenes based on performance observed in homogeneous search scenes. The equation arises directly from our computational model of parallel visual processing in a search task where the target is known ahead of time and all the distractor stimuli are lures. We were able to predict the variability in performance quite well both when lure stimuli were arranged haphazardly around the display (spatially intermixed condition) and when lures were arranged in a spatially segregated manner. We conclude that the processing required to complete a parallel search task in lure-heterogeneous scenes is fundamentally the same as in lure-homogeneous scenes, even if at first, lure heterogeneous scenes (Figure

The data reported in this paper is publicly available in OpenScienceFramework:

Search display conditions from Experiment 1 for Group A, B, and C are shown by set size of lure object. An asterisk (*) indicates the lure item was held constant throughout the experimental session for that group of participants.

Group A:

| | Orange Diamonds* | Yellow Triangles | Blue Circles | Total Set Size | Mean RT (SEM) | Mean Error Rate |
|---|---|---|---|---|---|---|
| Target only | 0 | 0 | 0 | 1 | 533 (22) | 1.2% |
| Subset 1 | 4 | 0 | 4 | 9 | 605 (22) | 1.2% |
| | 4 | 0 | 8 | 13 | 623 (23) | 1.8% |
| | 4 | 0 | 12 | 17 | 622 (22) | 1.2% |
| | 4 | 0 | 20 | 25 | 656 (26) | 1.2% |
| Subset 2 | 4 | 4 | 0 | 9 | 642 (22) | 1.5% |
| | 4 | 8 | 0 | 13 | 655 (23) | 2.2% |
| | 4 | 12 | 0 | 17 | 680 (21) | 1.3% |
| | 4 | 20 | 0 | 25 | 696 (24) | 2.1% |
| Subset 3 | 4 | 4 | 4 | 13 | 658 (24) | 2% |
| Subset 4 | 4 | 4 | 8 | 17 | 673 (26) | 2.5% |
| | 4 | 4 | 12 | 21 | 681 (27) | 1.9% |
| | 4 | 4 | 20 | 29 | 699 (25) | 1.4% |
| Subset 5 | 4 | 8 | 4 | 17 | 675 (23) | 1.1% |
| | 4 | 12 | 4 | 21 | 697 (24) | 1.7% |
| | 4 | 20 | 4 | 29 | 710 (25) | 1.5% |

Group B:

| | Orange Diamonds | Yellow Triangles* | Blue Circles | Total Set Size | Mean RT (SEM) | Mean Error Rate |
|---|---|---|---|---|---|---|
| Target only | 0 | 0 | 0 | 1 | 490 (15) | 0.6% |
| Subset 1 | 0 | 4 | 4 | 9 | 535 (13) | 1.5% |
| | 0 | 4 | 8 | 13 | 538 (12) | 1.3% |
| | 0 | 4 | 12 | 17 | 543 (13) | 0.9% |
| | 0 | 4 | 20 | 25 | 554 (13) | 1.3% |
| Subset 2 | 4 | 4 | 0 | 9 | 592 (14) | 1.8% |
| | 8 | 4 | 0 | 13 | 622 (18) | 1.5% |
| | 12 | 4 | 0 | 17 | 637 (21) | 1.1% |
| | 20 | 4 | 0 | 25 | 694 (24) | 2% |
| Subset 3 | 4 | 4 | 4 | 13 | 607 (16) | 1.1% |
| Subset 4 | 8 | 4 | 4 | 17 | 633 (19) | 1.2% |
| | 12 | 4 | 4 | 21 | 637 (21) | 1.5% |
| | 20 | 4 | 4 | 29 | 694 (24) | 1.6% |
| Subset 5 | 4 | 4 | 8 | 17 | 601 (15) | 1.2% |
| | 4 | 4 | 12 | 21 | 607 (17) | 0.7% |
| | 4 | 4 | 20 | 29 | 622 (17) | 0.9% |

Group C:

| | Orange Diamonds | Yellow Triangles | Blue Circles* | Total Set Size | Mean RT (SEM) | Mean Error Rate |
|---|---|---|---|---|---|---|
| Target only | 0 | 0 | 0 | 1 | 499 (13) | 0.6% |
| Subset 1 | 4 | 0 | 4 | 9 | 550 (14) | 1.5% |
| | 8 | 0 | 4 | 13 | 584 (15) | 1.7% |
| | 12 | 0 | 4 | 17 | 605 (16) | 1.1% |
| | 20 | 0 | 4 | 25 | 637 (19) | 1.4% |
| Subset 2 | 0 | 4 | 4 | 9 | 537 (15) | 3.1% |
| | 0 | 8 | 4 | 13 | 548 (12) | 3% |
| | 0 | 12 | 4 | 17 | 552 (13) | 2.5% |
| | 0 | 20 | 4 | 25 | 561 (14) | 3.3% |
| Subset 3 | 4 | 4 | 4 | 13 | 592 (15) | 2.6% |
| Subset 4 | 8 | 4 | 4 | 17 | 628 (23) | 1.9% |
| | 12 | 4 | 4 | 21 | 650 (20) | 2.3% |
| | 20 | 4 | 4 | 29 | 705 (25) | 2.1% |
| Subset 5 | 4 | 8 | 4 | 17 | 611 (18) | 2.6% |
| | 4 | 12 | 4 | 21 | 616 (15) | 2.3% |
| | 4 | 20 | 4 | 29 | 638 (17) | 1.9% |

Search display conditions in Experiment 2A considered in regression analysis and corresponding RT and error rate data.

| | Case 1 | Case 2 | Total Set Size | Mean RT (SEM) | Mean Error Rate |
|---|---|---|---|---|---|
| Target only | 1T | 1T | 1 | 455 (10) | 0.9% |
| Orange Diamonds (OD) + Yellow Triangles (YT) | 4OD + 3YT + 1T | 4YT + 3OD + 1T | 8 | 530 (13) | 1.4% |
| | 8OD + 7YT + 1T | 8YT + 7OD + 1T | 16 | 565 (15) | 1.3% |
| | 16OD + 15YT + 1T | 16YT + 15OD + 1T | 32 | 593 (16) | 1.8% |
| Orange Diamonds (OD) + Blue Circles (BC) | 4OD + 3BC + 1T | 4BC + 3OD + 1T | 8 | 512 (12) | 0.5% |
| | 8OD + 7BC + 1T | 8BC + 7OD + 1T | 16 | 530 (13) | 0.9% |
| | 16OD + 15BC + 1T | 16BC + 15OD + 1T | 32 | 547 (13) | 0.8% |
| Yellow Triangles (YT) + Blue Circles (BC) | 4BC + 3YT + 1T | 4YT + 3BC + 1T | 8 | 501 (11) | 1.2% |
| | 8BC + 7YT + 1T | 8YT + 7BC + 1T | 16 | 518 (12) | 1.6% |
| | 16BC + 15YT + 1T | 16YT + 15BC + 1T | 32 | 535 (11) | 1.2% |

Search display conditions in Experiment 2B considered in regression analysis and corresponding RT and error rate data.

| | Case 1 | Case 2 | Total Set Size | Mean RT (SEM) | Mean Error Rate |
|---|---|---|---|---|---|
| Target (T) only | 1T | 1T | 1 | 598 (17) | 1.5% |
| Grey Car (GC) + White Reindeer (WR) | 4GC + 3WR + 1T | 4WR + 3GC + 1T | 8 | 667 (16) | 1.9% |
| | 8GC + 7WR + 1T | 8WR + 7GC + 1T | 16 | 679 (16) | 1.4% |
| | 16GC + 15WR + 1T | 16WR + 15GC + 1T | 32 | 694 (16) | 0.9% |
| Grey Car (GC) + Red Doll (RD) | 4GC + 3RD + 1T | 4RD + 3GC + 1T | 8 | 719 (18) | 1.4% |
| | 8GC + 7RD + 1T | 8RD + 7GC + 1T | 16 | 746 (18) | 1.6% |
| | 16GC + 15RD + 1T | 16RD + 15GC + 1T | 32 | 789 (20) | 1.4% |
| White Reindeer (WR) + Red Doll (RD) | 4RD + 3WR + 1T | 4WR + 3RD + 1T | 8 | 719 (17) | 1.7% |
| | 8RD + 7WR + 1T | 8WR + 7RD + 1T | 16 | 756 (22) | 1.3% |
| | 16RD + 15WR + 1T | 16WR + 15RD + 1T | 32 | 776 (20) | 1.2% |

Doing this allows us to correct for overall RT differences across groups of participants. As a reminder, the values of the D parameters used for the predictions here come from observations published in Buetti et al. (. Note that the R^{2} (a measure of variability) does not depend on the value of the intercept of the linear regression.

The authors have no competing interests to declare.

SB, ZW and AL designed the experiments. ZW and AM programmed the experiments. ZW and AM analyzed the data. AL and SB wrote the paper.