Salud mental 2018;

ISSN: 0185-3325

DOI: 10.17711/SM.0185-3325.2018.039

Received: 13 July 2018 Accepted: 26 November 2018

Out of the office and into the field: Exploring neuropsychological correlates between search behavior and a traditional desktop task in children and adolescents with ADHD

Marcos Francisco Rosetti 1 ,2 , Rosa Elena Ulloa 3 , Lino Palacios-Cruz 2 , Robyn Hudson 1 , Francisco de la Peña 2

1 Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Ciudad de México, México.

2 Unidad de Psicopatología y Desarrollo, Instituto Nacional de Psiquiatría Ramón de la Fuente Muñiz, Ciudad de México, México.

3 Hospital Psiquiátrico Infantil Juan N. Navarro, Ciudad de México, México.

Correspondence: Marcos Francisco Rosetti Unidad de Psicopatología y Desarrollo, Instituto Nacional de Psiquiatría Ramón de la Fuente Muñiz. Calzada México-Xochimilco 101, Col. Huipulco, Del. Tlalpan, C.P. 14370, Ciudad de México, México. Phone: (52) 55 5622 - 3828 Email:

Introduction. Cognitive assessment of patients with attention deficit hyperactivity disorder (ADHD) can help clinicians provide individually tailored treatment and advice, and help researchers identify potential associations between psychopathology and specific cognitive deficits. Assessment instruments, however, have received some criticism regarding their ecological validity, that is, the capacity to extrapolate from performance on such tasks to aspects of everyday functioning. To meet this challenge, we developed the Ball Search Field Task (BSFT), which takes place outdoors and uses large, open areas. In the BSFT, the goal is to search for target objects hidden under opaque containers, with experimenters assessing the efficiency of participants’ strategies for collecting as many of these as possible.
Objective. Here we explore how the measures produced by one of the latest versions of this task (the patchy BSFT) match up with a traditional desktop task often used in clinical environments, the Tower of London (ToLo).
Method. We applied the BSFT and ToLo to children and adolescents with ADHD and compared the metrics using Spearman correlations.
Results. We found significant, moderate correlations between instruments, as exemplified by that of balls collected per cones lifted (BSFT) and number of moves (ToLo) (r = -.44).
Discussion and conclusion. Matching correlates between the BSFT and ToLo suggest these tasks may be tapping into similar cognitive processes. The addition of assessment tools with ecological validity may help provide a more comprehensive evaluation and a better understanding of the day-to-day impact of cognitive afflictions underlying psychiatric disorders such as ADHD.

Key words: Neuropsychological tests, attention deficit hyperactivity disorder, Tower of London test, ecological validity, Ball Search Field Task.

Introduction. Cognitive assessment of patients with attention deficit hyperactivity disorder (ADHD) can help clinical staff personalize treatment and help researchers identify associations between psychopathology and specific cognitive deficits. Assessment instruments have been criticized regarding their ecological validity, that is, the capacity to extrapolate performance on these instruments to situations of daily life. With this challenge in mind, we developed the Ball Search Field Task (BSFT), which takes place in large, open areas. The BSFT consists of searching for objects hidden under opaque containers in order to evaluate the efficiency of a search that aims to find as many objects as possible.
Objective. We explore how one version of this task (the patchy BSFT) compares with a task commonly used in clinical settings, the Tower of London (ToLo).
Method. We applied the BSFT and the ToLo to children and adolescents with ADHD and compared the resulting metrics using a Spearman correlation.
Results. We found significant correlations between these tests, as exemplified by that between the number of cones lifted (BSFT) and the number of correct sequences (ToLo) (r = -.48).
Discussion and conclusion. Matching correlates between the BSFT and the ToLo suggest that these tasks demand similar cognitive processes. Investigating tasks with ecological validity can help us offer a more complete evaluation and better understand the daily impact of the cognitive impairments underlying psychiatric disorders such as ADHD.

Key words: Neuropsychological tests, attention deficit hyperactivity disorder, Tower of London test, ecological validity, Cone and Ball Search Task.


Attention deficit hyperactivity disorder (ADHD) is a neurodevelopmental condition (American Psychiatric Association, 2013) with a large worldwide prevalence of over 5% (Polanczyk, Willcutt, Salum, Kieling, & Rohde, 2014). The impact of ADHD on daily life can lead to reduced performance in social and academic settings (Barry, Lyman, & Klinger, 2002) and, later in life, at the workplace (Kessler, Lane, Stang, & Van Brunt, 2009).

Following an ADHD diagnosis, patients are often profiled using standardized neuropsychological tools. These procedures help clinicians gain deeper insight into cognitive impairments that may accompany the disorder, so that they can judge severity or provide treatment and advice tailored to each case (Brooks, Ploetz, & Kirkwood, 2016). In research, systematically testing clinical populations with such procedures has provided insight into some of the more common cognitive dysfunctions associated with ADHD and, in doing so, has aided efforts to identify potential neural circuits that may be affected (Castles, Kohnen, Nickels, & Brock, 2014). So far, most findings suggest a strong association between ADHD and deficits in executive function (Doyle, 2006; Nigg, Blaskey, Huang-Pollock, & Rappley, 2002; Willcutt, Doyle, Nigg, Faraone, & Pennington, 2005). Executive function (EF) is a broad term that encompasses a set of cognitive processes associated with the capacity to self-regulate behavior (Jurado & Rosselli, 2007), including attention, inhibition, planning, and working memory. These cognitive functions are often interlinked; for instance, an attention deficit may influence working memory by interfering with obtaining and retrieving information.

A test often used to evaluate cognitive impairment in ADHD is the Tower of London (ToLo) (Shallice, 1982), which evaluates spatial problem-solving abilities by having the participant rearrange a set of balls on wooden pegs to match a given pattern. In the ToLo, children with ADHD have been found to underperform, needing more moves to reach a solution, taking longer to complete some of the trials, and often handling the material in ways that are not allowed (Culbertson & Zillmer, 1998). Poor performance on the ToLo is interpreted by Culbertson and Zillmer (1998) as “a failure to plan and problem solve in a timely and rule-governed manner” (p. 221). A review by Seidman et al. (2004) neatly summarizes the results of neuropsychological studies in children, adolescents, and adults with ADHD. While many tasks have found a differential performance in ADHD subjects, not all have done so (Seidman et al., 2004), and differences are not always consistent across the lifespan, i.e., they may be present in preschoolers but not in older children or adolescents (Culbertson & Zillmer, 1998). Even when differences are present, effect sizes are often modest (Frazier, Demaree, & Youngstrom, 2004). These difficulties in evaluating cognitive performance associated with ADHD point to the need for a paradigm shift in terms of evaluation procedures.

Heterogeneity in clinical manifestations of ADHD (e.g., comorbidity, severity, presence of cognitive disabilities or socioeconomic adversity) has made it challenging to draw clear links between neuropsychological scores and processes underlying the disorder (Culbertson & Zillmer, 1998). However, the nature of the assessment tools may also play a role. Criticisms of some of the findings showing ADHD subjects to underperform in neuropsychological tasks may relate to difficulties in separating issues of attention control from those of motivation (Burgess et al., 2006). One reason for this may be that most of the assessment tools are applied in a clinical setting and involve somewhat abstract desktop activities inside an office space. In a complementary attempt to evaluate the behavioral performance of children and adolescents with ADHD, we proposed to move out of the office and into the field by developing the Ball Search Field Task (BSFT) (Rosetti et al., 2016; Rosetti et al., 2018).

The BSFT attempts to address some of the issues often criticized in traditional tests by centering its design on the concept of ecological validity. By ecological validity we understand “the functional and predictive relationship between the patient’s performance on a set of neuropsychological tests and the patient’s behavior in a variety of real-world settings” (Sbordone, 1996, p. 16). The task is designed to simulate commonly faced situations such as looking around the house for a lost set of keys or navigating supermarket aisles in the attempt to locate a favorite food item. In such situations, people face several challenges: for example, avoiding returns to previously visited locations (planning and memory), maintaining focus on the current activity until successful (attention and focus), and accessing recently acquired information that may point to the location of the desired item (working memory). In this sense, the BSFT evaluates cognitive function and behavior in a more naturalistic context as participants try to solve a biologically relevant “foraging” problem.

The BSFT involves an arrangement of opaque covers distributed over a large area, underneath which a target object might be located. These elements allow the use of inexpensive, highly versatile setups. So far, we have tested three different versions of the BSFT by changing the spatial distribution of the targets and the size of the experimental arena (Rosetti et al., 2016; Rosetti, Valdez, & Hudson, 2017; Rosetti et al., 2018). Each version addresses different questions and potentially involves different cognitive processes. The first version consisted of a large arena on which the covers (brightly colored plastic cones) were arranged as a simple grid (Rosetti et al., 2016). Using this version, we showed that children with ADHD could collect the same number of target items as controls, but did so less efficiently. We then increased the difficulty of the task by changing the arrangement of the cones so that more planning was needed to reach an efficient solution (Rosetti et al., 2018), which proved useful for testing adolescents. More recently, we modified the test by reducing the size of the search area and arranging the cones in a patchy distribution. In this last setup, participants need to detect the underlying distribution rule for the targets to improve their searching efficiency.

The present study explores the neuropsychological correlates of searching (planning and problem solving) in children and adolescents with ADHD by comparing measures of performance between the last version of the BSFT and the ToLo.



Method

Participants

We performed a convenience sampling of outpatients from the Child Psychiatric Hospital Juan N. Navarro in Mexico City from August 2017 to June 2018. Inclusion criteria were (1) a concurrent principal diagnosis of ADHD, (2) being free of psychiatric medication for at least the previous six months, (3) showing no visible motor or sensory impairment, and (4) having an intelligence quotient (IQ) of at least 80 points on the Wechsler Intelligence Scale for Children (Wechsler et al., 2003).

Tools and measurements


The Ball Search Field Task (BSFT)

The patchy version of the BSFT consisted of five patches of 30 cones each (150 total), arranged over a flat concrete area of 6 × 9 m (Figure 1). Each patch contained six golf balls individually hidden as a cluster under six neighboring cones. The aim of the task was to collect as many balls as possible, as quickly as possible, into a cloth bag (a maximum of eight minutes was allowed, but participants were not told of this limit). Instructions were a) to leave the cones upright after inspection and b) not to lift two cones simultaneously. Each participant’s sequence of collections was recorded using a small video camera (Hero4, GoPro, California, USA) mounted on a helmet worn by the participant, and later analyzed using the software BORIS (Friard & Gamba, 2016). Performance measures could be interpreted as proxies for planning, problem solving, sustained attention, and working memory. These measures are described in detail in Table 1.




To show efficient performance on the BSFT, the number of patches visited should be five, which would entail neither repeating nor neglecting to visit any patches. Lifting many cones per patch suggests the participant was more focused on depleting the patch than on extracting and applying the underlying rule. The number of balls collected alone is not necessarily indicative of better performance. Rather, it is preferable to use measures such as collection rate (number of balls located per number of cones lifted) and the slope describing how collection rate changes as the participant explores more patches. A positive slope suggests the participant became more efficient as he or she visited successive patches. A smaller number of blank cones lifted after the last collection would suggest that the participant had gained information regarding the number of balls left in the patch. Similarly, a large probability of choosing a close cone after a collection suggests the participant was able to extract information regarding the distribution of balls and thus behaved as if expecting clumped resources.
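The efficiency measures described above can be illustrated with a short sketch. This is not the authors' scoring procedure, which relied on BORIS-coded video and the definitions in Table 1; the event format, the function names, and the use of an ordinary least-squares slope over consecutive patch visits are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical event format (an assumption, not the BORIS export):
# one (patch_id, found_ball) pair per cone lift, in lifting order.
Lift = Tuple[int, bool]


def collection_rate(lifts: List[Lift]) -> float:
    """Balls located per cone lifted."""
    if not lifts:
        return 0.0
    return sum(1 for _, found in lifts if found) / len(lifts)


def rate_per_visit(lifts: List[Lift]) -> List[float]:
    """Collection rate for each consecutive patch visit, in visiting order."""
    visits: List[List[Lift]] = []
    for lift in lifts:
        # A new visit starts whenever the patch id changes.
        if visits and visits[-1][0][0] == lift[0]:
            visits[-1].append(lift)
        else:
            visits.append([lift])
    return [collection_rate(v) for v in visits]


def slope(values: List[float]) -> float:
    """Least-squares slope of the values against visit order 1..n."""
    n = len(values)
    if n < 2:
        return 0.0
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def blanks_after_last_collection(lifts: List[Lift]) -> int:
    """Empty cones lifted after the final ball found in this sequence."""
    count = 0
    for _, found in reversed(lifts):
        if found:
            break
        count += 1
    return count


# Made-up sequence for illustration only (not study data).
example = [(1, False), (1, False), (1, True), (1, False),
           (2, False), (2, True), (2, True),
           (3, True), (3, True), (3, True)]
```

In this made-up sequence the per-visit collection rate rises from one patch to the next, so `slope(rate_per_visit(example))` is positive, which corresponds to a participant who becomes more efficient over successive patches. Applying `blanks_after_last_collection` to the lifts of a single patch visit gives the within-patch count described in the text.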

The Tower of London (ToLo)

This task, designed by Shallice (1982) more than 35 years ago, consists of two sets of boards with three pegs each and three differently colored balls with holes so that they can be mounted on the pegs. The aim is for the participant to try and match the arrangement of balls as shown on the experimenter’s board using as few moves as possible. The task consists of two sample trials followed by 10 scored trials. Performance measures are described in detail in Table 1. Overall performance is calculated by summing these variables across all trials and is interpreted as a proxy for planning and problem solving (Shallice, 1982).

A longer latency to the first move is interpreted as the participant taking time to plan the sequence of moves to execute. A larger number of correct sequences indicates better performance. The presence of rule and time violations suggests poor performance due to the participant failing to pay attention to the task rules or taking too long to solve the task. Finally, shorter execution times often correlate with participants achieving a smaller number of moves (Shallice, 1982).
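As a rough illustration of how per-trial ToLo variables might be summed across the scored trials, consider the sketch below. The field names are hypothetical stand-ins for the measures mentioned in the text (the exact definitions are those of Table 1), and the values are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToLoTrial:
    # Hypothetical per-trial record; names are assumptions for illustration.
    moves: int
    first_move_latency_s: float
    execution_time_s: float
    correct_sequence: bool
    rule_violations: int
    time_violations: int


def summarize(trials: List[ToLoTrial]) -> Dict[str, float]:
    """Sum each per-trial variable across all scored trials."""
    return {
        "total_moves": sum(t.moves for t in trials),
        "total_first_move_latency_s": sum(t.first_move_latency_s for t in trials),
        "total_execution_time_s": sum(t.execution_time_s for t in trials),
        "correct_sequences": sum(1 for t in trials if t.correct_sequence),
        "rule_violations": sum(t.rule_violations for t in trials),
        "time_violations": sum(t.time_violations for t in trials),
    }


# Two invented trials (not study data).
demo_trials = [
    ToLoTrial(4, 3.5, 12.0, True, 0, 0),
    ToLoTrial(9, 1.0, 30.0, False, 1, 1),
]
demo_summary = summarize(demo_trials)
```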


Procedure

The parents or guardians of the participating children were interviewed by an experienced clinical psychiatrist using a Spanish language version of the MINI-KID interview (Sheehan et al., 2010) to confirm diagnosis. Afterwards, participants were evaluated using the WISC-IV and, if the IQ inclusion criterion was met, went on to be tested on the two tasks. The order of presentation of the tasks was balanced, with a 20-minute pause between them.

Statistical analyses

To evaluate correlation values between the BSFT and the ToLo we calculated Spearman’s rs. Using the pcor.test function from the ppcor package [24], we performed partial correlations to control for the effect of age for each comparison. We plotted all comparisons to visually corroborate correlation values. Significance for all tests was set at p < .05. All statistical tests were done using R (R Core Team, 2018).
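The analysis itself was run in R with ppcor. For readers who want to see the arithmetic, the following standard-library Python sketch computes Spearman's rs and a first-order partial rank correlation that controls for a single covariate such as age. It is an illustrative reimplementation under stated assumptions, not the authors' code: the function names and the toy data are invented, and no p values are computed.

```python
import math
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Ranks from 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rs as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x: Sequence[float], y: Sequence[float],
                     z: Sequence[float]) -> float:
    """Rank correlation of x and y controlling for one covariate z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy values for illustration only (not study data).
bsft_measure = [1, 2, 3, 4, 5]
tolo_measure = [2, 1, 4, 3, 5]
age = [1, 3, 2, 5, 4]
```

With one covariate, the partial correlation reduces to the closed-form expression used in `partial_spearman`, which is why no matrix inversion is needed here.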

Ethical considerations

All procedures were evaluated and approved by the internal review boards of the Institutions involved in the project. Written consent of parents or guardians and verbal consent of children were obtained before any evaluation was performed.


Results

We recruited a total of 53 participants. From this sample, we discarded five participants based on their BSFT data: a) three who failed to understand the instructions for the BSFT and stopped searching after finding only one ball in each patch, and b) two who accidentally turned off the camera while searching. Thus, the final sample included 48 participants (mean age = 9 years [SD = 2.91], 85% male, mean IQ = 97.5 [SD = 8.3]).

Regarding the association between the scores of the ToLo and the patchy version of the BSFT, we found several significant correlations (Table 2). Overall, the rs values indicate that participants with a poor performance on the ToLo also produced a less efficient searching performance on the BSFT (e.g., more moves on the ToLo correlated with a lower rate of ball collections and a lower probability of choosing a close cone after a collection). Table 2 also shows that some measures indicative of good performance on the BSFT (e.g., the lower probability of choosing a close cone after an empty cone or the slope of collection rate) were not associated with performance on the ToLo.


Discussion and conclusion

Our main aim was to relate performance on the ToLo and the BSFT in order to start piecing together the neuropsychological correlates suggested by comparing two types of tasks: those taking place over a desktop, common in office space evaluations, and those occurring outside traditional evaluation settings. In this regard, we found significant correlations between several measures of the BSFT and the ToLo, but also measures without a significant association.

The similarities between the BSFT and ToLo are several. Both are spatial problem-solving tasks that involve planning and working memory to solve the problem in an efficient manner. To perform well on either of these tasks, the participant must be able to comprehend, although not necessarily consciously, the underlying “trick” needed to solve the task. In the case of the ToLo, success can be related to the insight that counter-intuitive moves must often be performed (e.g., removing a ball from the correct peg in order to remove the ball from beneath). Similarly, as participants visit more patches in the BSFT, it is advantageous for them to recognize that the balls are arranged as a cluster. From the description of the performance variables, one can readily appreciate how re-visiting patches in the BSFT or making many moves in the attempt to solve the ToLo decreases efficiency. The fact that several performance measures on the BSFT showed significant correlations when compared to performance on the ToLo suggests that both tasks may provide indirect measures of similar cognitive abilities. For instance, to perform well on the BSFT, participants need to work out the spatial array of the targets and thus increase their overall collection rate (attention and problem solving), leave a patch after suspecting most of the balls have been collected (working memory), or search for balls under nearby cones in the expectation of clusters (pattern recognition, planning).

The number of significant correlations between this latest version of the BSFT and the ToLo was larger than that found for previous versions of the BSFT (Rosetti et al., 2018). Previous comparisons mainly highlighted the association between the number of violations on the ToLo and inefficient search patterns on the BSFT such as returning to previously visited cones. The explanation for this similarity appeared to be rooted in a lack of inhibition or attention, as rule violations are mainly encountered in clinical samples of patients with ADHD (Culbertson & Zillmer, 1998). Looking at the current results, we can say that poor performance on either of these tasks could be linked to aspects similar to those highlighted by Culbertson and Zillmer (1998) regarding cognitive impairments associated with lack of planning and poor problem solving. The current sample was composed of children and adolescents diagnosed with ADHD, and thus the observed correlations may be reflecting similar deficits, particularly those relating to the poor performance of a clinical sample. This interpretation could be further strengthened by the inclusion of healthy controls to assess the relative performance of patients with ADHD on the “patchy” BSFT; testing of an equivalent healthy control group is currently in progress. For instance, the absence of a correlation between measures on the ToLo and the slope of the collection rate may be related to the fact that only a few participants showed a large, positive slope (indicative of good information reception and application). This (lack of) association could be different in a healthy population where participants may be quicker to detect the clustered within-patch distribution of targets as they progress across successive patches. The addition of a control group could thus help identify the performance measures that provide the best profiling values of BSFT as an assessment tool.

Here it is important to remark that the purpose behind the continuing development of the BSFT is not to perfectly match it to other forms of assessment but rather to better understand a new and hopefully complementary tool, which is based on underlying principles that differ in many ways from those of previous, mainly table-top, tests. For instance, while the ToLo limits, by design, the way in which participants can best solve the task, BSFT participants can adequately solve (or fail to solve) the task in various ways. For example, thoroughly searching a patch to deplete it versus leaving after inspecting only a few cones, hoping to collect more targets by visiting more patches, may result in a similar number of collected targets although perhaps with different energetic expenditure (Pacheco-Cobos, Rosetti, Cuatianquiz, & Hudson, 2010). Examining the diversity in searching strategies could be useful in detecting clinically distinctive subgroups: participants who lose efficiency by spending much time on the same patch versus participants who lose efficiency by moving quickly from patch to patch, including returning to previously visited patches. Furthermore, the task can be applied to a wide range of ages; simple modifications can be made to adjust the degree of difficulty to the motor and cognitive abilities of even very young children. Finally, the BSFT involves a large degree of sensorimotor feedback: moving (often running) between potential targets, lifting the cones and collecting the balls, and carrying collections around for the duration of the task. This feedback may help maintain motivation and provide a positive feeling of achievement even in the case of poor performance. Imperfect correlations in the present study between scores on the ToLo and BSFT support the need for complementary tests for the evaluation and better understanding of psychiatric disorders such as ADHD. Such differences with more established test methods are to be expected as the effort to provide more ecologically valid measures of cognitive function proceeds.


Funding

This research was supported by grants from the Consejo Nacional de Ciencia y Tecnología (CONACyT, No. 162043), by the Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT, No. IA202617), and by institutional funding from the Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México.

Conflict of interests

The authors declare no conflict of interest.


Acknowledgements

We thank all participants for their enthusiastic collaboration, as well as Dr. Gina Chapa, Dr. Alejandra Hernández, and psychology intern Alejandro Angulo, who assisted with recruiting participants, performing assessment procedures, and coding the obtained results.


References

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (DSM-5). Arlington: American Psychiatric Publishing.

Barry, T. D., Lyman, R. D., & Klinger, L. G. (2002). Academic underachievement and attention-deficit/hyperactivity disorder: The negative impact of symptom severity on school performance. Journal of School Psychology, 40(3), 259-283.

Brooks, B. L., Ploetz, D. M., & Kirkwood, M. W. (2016). A survey of neuropsychologists’ use of validity tests with children and adolescents. Child Neuropsychology, 22(8), 1001-1020.

Burgess, P. W., Alderman, N., Forbes, C., Costello, A., Laure, M. C., Dawson, D. R., … & Channon, S. (2006). The case for the development and use of “ecologically valid” measures of executive function in experimental and clinical neuropsychology. Journal of the International Neuropsychological Society, 12(2), 194-209.

Castles, A., Kohnen, S., Nickels, L., & Brock, J. (2014). Developmental disorders: what can be learned from cognitive neuropsychology? Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1634), 20130407.

Culbertson, W. C. & Zillmer, E. A. (1998). The construct validity of the Tower of London DX as a measure of the executive functioning of ADHD children. Assessment, 5(3), 215-226.

Doyle, A. E. (2006). Executive functions in attention-deficit/hyperactivity disorder. The Journal of Clinical Psychiatry, 67, 21-26.

Frazier, T. W., Demaree, H. A., & Youngstrom, E. A. (2004). Meta-analysis of intellectual and neuropsychological test performance in attention-deficit/hyperactivity disorder. Neuropsychology, 18(3), 543-555.

Friard, O. & Gamba, M. (2016). BORIS: A free, versatile open source event logging software for video/audio coding and live observations. Methods in Ecology and Evolution, 7(11), 1325-1330.

Jurado, M. B. & Rosselli, M. (2007). The elusive nature of executive functions: a review of our current understanding. Neuropsychology Review, 17(3), 213-233.

Kessler, R. C., Lane, M., Stang, P. E., & Van Brunt, D. L. (2009). The prevalence and workplace costs of adult attention deficit hyperactivity disorder in a large manufacturing firm. Psychological Medicine, 39(1), 137-147.

Nigg, J. T., Blaskey, L. G., Huang-Pollock, C. L., & Rappley, M. D. (2002). Neuropsychological executive functions and DSM-IV ADHD subtypes. Journal of the American Academy of Child & Adolescent Psychiatry, 41(1), 59-66.

Pacheco-Cobos, L., Rosetti, M., Cuatianquiz, C., & Hudson, R. (2010). Sex differences in mushroom gathering: men expend more energy to obtain equivalent benefits. Evolution and Human Behavior, 31(4), 289-297.

Polanczyk, G. V., Willcutt, E. G., Salum, G. A., Kieling, C., & Rohde, L. A. (2014). ADHD prevalence estimates across three decades: an updated systematic review and meta-regression analysis. International Journal of Epidemiology, 43(2), 434-442.

R Core Team. (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing: Vienna.

Rosetti, M. F., Ulloa, R. E., Reyes-Zamorano, E., Palacios-Cruz, L., de la Peña, F., & Hudson, R. (2018). A novel experimental paradigm to evaluate children and adolescents diagnosed with attention-deficit/hyperactivity disorder: Comparison with two standard neuropsychological methods. Journal of Clinical and Experimental Neuropsychology, 40(6), 576-585.

Rosetti, M. F., Ulloa, R. E., Vargas-Vargas, I. L., Reyes-Zamorano, E., Palacios-Cruz, L., de la Peña, F., Larralde, H., … & Hudson, R. (2016). Evaluation of children with ADHD on the Ball-Search Field Task. Scientific Reports, 6, 19664.

Rosetti, M. F., Valdez, B., & Hudson, R. (2017). Effect of spatial scale on children’s performance in a searching task. Journal of Environmental Psychology, 49, 86-95.

Sbordone, R. J. (1996). Ecological validity: some critical issues for the neuropsychologist. In R. J. Sbordone & C. J. Long (Eds.), Ecological validity of neuropsychological testing (pp. 15-41). Boca Raton: St Lucie Press.

Seidman, L. J., Doyle, A., Fried, R., Valera, E., Crum, K., & Matthews, L. (2004). Neuropsychological function in adults with attention-deficit hyperactivity disorder. Psychiatric Clinics of North America, 27(2), 261-282.

Shallice, T. (1982). Specific impairments of planning. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences, 298(1089), 199-209.

Sheehan, D. V., Sheehan, K. H., Shytle, R. D., Janavs, J., Bannon, Y., Rogers, J. E., … & Wilkinson, B. (2010). Reliability and validity of the mini international neuropsychiatric interview for children and adolescents (MINI-KID). The Journal of Clinical Psychiatry, 71(3), 313-326.

Wechsler, D., Kaplan, E., Fein, D., Kramer, J., Morris, R., Delis, D., & Maelender, A. (2003). Wechsler intelligence scale for children: Fourth edition (WISC-IV). Mexico: Manual Moderno.

Willcutt, E. G., Doyle, A. E., Nigg, J. T., Faraone, S. V., & Pennington, B. F. (2005). Validity of the executive function theory of attention-deficit/hyperactivity disorder: A meta-analytic review. Biological Psychiatry, 57(11), 1336-1346.