1 Introduction

Artificial intelligence (AI) and machine learning (ML) are becoming increasingly significant areas of research for scholars in science and technology studies (STS) and media studies. Scholars are exploring aspects ranging from labour considerations (Tubaro et al. 2020) and the politics of algorithmic decision-making (Sánchez-Monedero and Dencik 2022) to the materiality of computation (Rella 2023), the role of training datasets (Thylstrup 2022), and the economic underpinnings of AI ethics (Steinhoff 2023). In this article, we contribute to this growing literature by studying the organisation of ML machine vision ‘challenges’ used to foster technological innovation. This is particularly the case for emerging ML-dependent AI systems, such as autonomous vehicles.

Our objective is to examine how challenges shape applied AI research and development (R&D), using the case of Waymo, Google/Alphabet’s autonomous vehicle project.Footnote 1 Waymo’s recurring annual Open Dataset Challenges (2020–23) represent one example of open competitions organised for the global ML and data science community.Footnote 2 Our investigation into these challenges adopts a material approach bridging digital STS (Vertesi and Ribes 2019), platform studies (Helmond et al. 2019), and work on the political economy of AI (Luitse and Denkena 2021; Srnicek 2022; Van der Vlist et al. 2024), furthering insight into the phenomenon of ‘platform automobility’ (Hind et al. 2022; cf. Forelle 2022; Hind and Gekker 2022; Steinberg 2022) and autonomous driving (Hind 2019; Iapaolo 2023; Marres 2020; Sprenger 2022; Stilgoe 2017; Tennant and Stilgoe 2021).

Building upon a workshop conducted by the authors at the University of Siegen (Siegen, Germany), focussed on the Waymo Open Dataset, we adopt a ‘technographic’ approach (Bucher 2018; Van der Vlist et al. 2024) to explore how challenges play a crucial role in the development and political economy of AI and autonomous vehicles. Through a ‘scavenging-style’ ethnography (Seaver 2017), we examine their significance in ‘convening’ third-party developers (Egliston and Carter 2022 p. 10), considering how platform features, technical documentation, and other materials figure in the incremental advancement of AI systems and technologies.

Waymo has been a leader in the autonomous vehicle industry ever since it started as the Google Self-Driving Car project in 2009 (Markoff 2010). It continues to compete with car manufacturers like Tesla (through its mis-sold ‘Autopilot’ feature), Ford (former backer of Argo AI), and China’s Baidu; other Big Tech-funded projects like Zoox (a subsidiary of Amazon), dedicated autonomous vehicle passenger service (AVPS) operators like Cruise, and chip manufacturers like NVIDIA and Mobileye. Together, they shape an industry that has entered a new, mature phase, as key players have variously consolidated their self-driving vehicle operations (Mobileye), written off related assets (Ford), or pivoted to other autonomous vehicle domains (Aurora, self-driving trucks). Cruise’s travails in San Francisco have only reiterated the difficult crossroads the industry has now reached (Biddle 2023; Hawkins 2023).

ML and data science challenges and competition-hosting platforms are numerous. Google subsidiary Kaggle, described by CEO D. Sculley as the ‘rainforest of machine learning’ (Pan and Fields 2022), provides a platform for users to discover and publish datasets, explore and construct models, and participate in various data science challenges to enhance their skills, earn ranking points, and win prizes.Footnote 3 The Grand Challenge platform serves as another instance of an open web-based environment for challenges, focussing specifically on the end-to-end development of ML solutions in biomedical imaging.Footnote 4 Such competitions can encompass diverse topics or algorithmic techniques since the respective datasets and evaluation criteria are typically provided by the competition hosts. Waymo is also not the only technology company to conduct competitions centred around its own datasets and rules: Netflix previously ran the ‘Netflix Prize’ inviting ways to improve its algorithmic film recommendation system, Cinematch (Bennett and Lanning 2007). The competition was exclusively open to external contestants, excluding individuals affiliated with Netflix, highlighting the ‘boundary work’ that digital platforms undertake, as they establish and manage the parameters of such competitions (Van der Vlist 2022, p. 102; cf. Helmond et al. 2019).

This article argues that challenges serve as touchpoints or interfaces between companies like Waymo, deeply involved in the application domain of self-driving technology, and the applied AI/ML community, including academia and machine vision subfields. This interface is a novel development in the automotive industry, encompassing open datasets, leaderboards, and related resources, shaping the processes and trajectory of AI technology from development to its eventual deployment. We will briefly discuss these elements of training data, computing power, and (expert) labour next.

Srnicek (2022) argues that these three components are crucial for AI production, particularly in terms of the monopoly power of Big Tech companies like Google/Alphabet in shaping AI platforms and services, or what he refers to as ‘AI centralization’ (Srnicek 2022). He suggests that the collection of ML training data no longer offers a competitive advantage due to the prevalence of the platform business model and the ‘explosive growth of open datasets’ (p. 258). Waymo's Open Dataset, with its ‘nearly 17 h of video, with labelling for 22 million 2D objects and 25 million 3D objects’, is highlighted as an example (p. 258). Consequently, the challenge of starting without data has become less of a concern for actors in various domains.

Yet the presence of the challenge format suggests that doing something with the data remains significant. As we contend here, well-annotated and voluminous training data are not always readily available, with only a few initiatives within each AI domain maintaining useful and usable datasets. Building and maintaining such (open) datasets has been critical to Waymo's autonomous driving vision. These datasets play a vital role in attracting participants to the challenge format and aligning them with internal development timelines.

Furthermore, Srnicek contends that cloud computing power is ‘increasingly where AI monopolies and moats are being built’ (Srnicek 2022, p. 249). He argues that this is because of the ‘concentrated ownership of immense computing resources (compute) and the systems and lures built for attracting the small supply of high-skill workers’ (p. 249–250, emphasis added). Only the largest and most well-capitalised firms can afford to develop cloud computing systems capable of training models on data-rich scenarios. As Luitse and Denkena (2021, p. 3) ask, ‘who can further scale up their compute capacity’? The ability to run numerous experiments quickly and efficiently is crucial in the empirical nature of AI research, involving tasks such as ‘tuning hyperparameters, testing on data from outside the training dataset, debugging any problems, and so on’ (Srnicek 2022, p. 251).

It is through the optimisation of hardware such as Graphics Processing Units (GPUs) and Google’s own Tensor Processing Units (TPUs) that advances in AI are being carved out (Rella 2023). Waymo and other challenge organisers believe that external competition is the most effective means of conducting large chunks of this optimisation work. Providing computational capacity allows more iterations to be performed by a larger number of teams, thereby accelerating progress.

In addition, neither data nor compute holds much value without skilled labour to leverage them. Hugely sought-after by Big Tech and AI firms, computer scientists and related graduates command substantial salaries. Open-source initiatives serve as mechanisms for channelling graduate talent into the right areas, with frameworks offering ‘premade tools, libraries, and interfaces…often based on the same ones used internally by companies’ (Srnicek 2022, p. 252). Challenges, therefore, serve as a primary avenue for funnelling ‘new AI talent’ (Luchs et al. 2023, p. 9), equipping graduates with the necessary skills to work with these pre-existing tools and interfaces and engaging them in pre-defined problems chosen by the toolmakers (Steinhoff 2022; Luchs et al. 2023). If these frameworks ‘become feeder networks for the emerging generations of talent’ (Srnicek 2022, p. 253), challenges can be seen as talent contests, pitting the best new talent against their peers. Piecemeal ‘micro-work’, routinely used to prepare training data for ML work (Tubaro et al. 2020), assumes an even more distant role, hidden behind the ‘expert input’ (Rieder and Skop 2021, p. 5) of challenge participants.

In sum, we argue that challenges are one of the most significant ‘systems and lures’ for effectively bringing together key AI assets. They serve as platforms where machine vision training datasets are employed for object detection, image segmentation, and motion prediction tasks, vital for autonomous vehicle development. By distributing and externalising specific AI tasks, challenges reduce the associated labour costs to nominal levels. Participating researchers are provided with the opportunity to tackle cutting-edge problems, gaining access to costly AI hardware otherwise out of reach (Luitse and Denkena 2021).

3 Taking up the challenge: a technographic approach

To investigate the role of the challenge as a structuring device in AI R&D, we adopt a material, ‘technographic’ approach (Bucher 2018; Van der Vlist et al. 2024). This aligns with the practices of digital STS (Vertesi and Ribes 2019) and involves gathering, analysing, and interpreting available information and materials from diverse sources to understand how applied R&D are organised and structured around Waymo.

Reflecting on ethnographic tactics for studying algorithmic systems, Seaver (2017, p. 6–7) emphasises the importance of ‘glean[ing] information from diverse sources, even when … objects of study appear publicly accessible’. Ethnographers, like ‘scavengers’, piece together heterogeneous clues to gain partial insights into the complexities of the world. Adrian Mackenzie’s (2017) ethnography of machine learners involved piecing together different aspects of the field of ML, from textbooks to statistical software packages such as R.

Steve Woolgar (1985), writing during the rise of ‘expert systems’, viewed AI work as an ongoing collaboration between human and machine actors, recognising the importance of studying ‘the relationship between the pronouncements of spokesmen on behalf of AI and the practical day-to-day activities of AI researchers’ (Woolgar 1985, p. 567, emphasis added). Overall, this perspective offers a fruitful avenue for investigating the role of challenges in the development of AI technologies: the ‘many moments where explicit and implicit forms of human judgement come together with technical methods and artifacts’ (Rieder and Skop 2021, p. 10).

Within the research literature on digital platforms, the diverse materials and documentation generated during such AI work are often referred to as ‘boundary resources’, serving the crucial function of facilitating and regulating the material aspects of participation for external third parties, extending beyond the platform itself (Van der Vlist 2022, p. 33). Critical scholars in media studies have explored how certain technical and informational resources, such as application programming interfaces (APIs) and reference documentation, shape power dynamics in various sectors of society, including digital marketing and advertising, mobile app development, and cultural production (Egliston and Carter 2022; Helmond et al. 2019; Helmond and Van der Vlist 2019; Ritala 2023). Drawing from these studies, our approach focuses on Waymo's pivotal role as a core platform company that provisionally brings together an autonomous vehicle technology ecosystem.

Waymo’s Open Dataset Challenges are part of a larger collection of boundary resources that serve to ‘convene’ third-party developers and businesses, cultivating this ecosystem around Waymo. This ‘convening’ process, as described by Egliston and Carter, involves ‘“calling out to others, attracting their attention”, requiring an “active response”’ in the form of usage or participation (Egliston and Carter 2022, p. 10). In studying these various interactions and resources, we gain insights into the dynamics of collaboration and knowledge exchange that underpin Waymo’s work.

This convening process undertaken by Waymo ties directly into our central argument: the challenge as an organising principle for AI. The challenge format is the crucial mechanism through which Waymo brings together diverse stakeholders, including researchers, to collectively tackle cutting-edge autonomous vehicle problems. By convening participants through challenges, Waymo creates a platform for collaboration, competition, and knowledge sharing, directing the collective efforts of participants towards specific AI tasks and objectives. The format serves as a focal point for mobilising expert labour, leveraging well-annotated training data, and harnessing the computational power necessary for advancing AI models and techniques. In this way, it not only facilitates the exploration of innovative solutions but also fosters the development of a vibrant ecosystem around Waymo’s autonomous driving vision.

4 The setting: Waymo’s Open Dataset and Challenges, 2019–2022

We embarked on our own ‘scavenging-style’ ethnography during a 3-day workshop held at the University of Siegen in late 2021. The primary objective of this workshop was to examine the Waymo Open Dataset, which served as our entry point into the study. Throughout the workshop, we immersed ourselves in the dataset by accessing the huge open dataset files provided by Waymo. These files, each around 25 GB in size, encompassed various data types and structures, visual imagery, 3D models, and accompanying data attributes and image labels. In the process, we discovered a range of associated materials, documentation, and infrastructure dependencies linked to the Open Dataset, including the existence of annual Challenges.

We adapted Python scripts provided on Google Colaboratory (Colab) to facilitate the rendering of lidar images,Footnote 6 but as we delved deeper into the dataset, we realised that we lacked the necessary ML skills to effectively work with it as intended. Consequently, we shifted from undertaking an exploratory (empirical) data project to an STS-oriented approach. At this stage, our engagement with the materials differed from that of a regular user or an empirical analyst. Instead, we assumed the role of ‘scavengers’, extracting insights from diverse sources and piecing together the available information to gain an understanding of the subject matter.
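
Those rendering scripts rest on a simple spherical projection: each lidar return (x, y, z) maps to an azimuth angle, an inclination angle, and a range, which together index a pixel in a 2D ‘range image’. A stdlib-only sketch of the general technique (our illustration, with hypothetical image dimensions and field-of-view values, not Waymo’s tutorial code):

```python
import math

def project_to_range_image(x, y, z, width=2650, height=64,
                           fov_up=0.04, fov_down=-0.31):
    """Map one lidar return (x, y, z) to range-image coordinates.

    `width`, `height`, and the vertical field of view (in radians)
    are illustrative values for a 64-beam spinning lidar.
    """
    rng = math.sqrt(x * x + y * y + z * z)  # distance from the sensor
    azimuth = math.atan2(y, x)              # horizontal angle, -pi..pi
    inclination = math.asin(z / rng)        # vertical angle
    # Wrap the azimuth onto image columns, the inclination onto rows.
    col = int((0.5 - azimuth / (2.0 * math.pi)) * width) % width
    frac = (fov_up - inclination) / (fov_up - fov_down)
    row = min(max(int(frac * height), 0), height - 1)
    return col, row, rng
```

Filling each (col, row) pixel with the corresponding range (or return intensity) yields the panoramic lidar imagery such scripts render.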

In the summer of 2022, we conducted a second workshop, entitled ‘Taking up the Challenge’. During this workshop, our hands-on examination focussed on Waymo’s Challenges and their connection to the broader ‘research community’ as defined by Waymo. As in the first workshop, we collected and interacted with diverse materials and documentation available online, which provided valuable insights into these challenges. These materials included participant instructions, competition requirements, evaluative metrics, technical reports of ML models and methods, model output scores, challenge leaderboards, participant names, affiliated organisations, as well as research on previous challenge winners.

4.1 Workshop I: open dataset

In August 2019, Waymo introduced their Open Dataset initiative, announcing that they were ‘sharing [their] self-driving data for research’, and ‘inviting the research community to join [them] with the release of the Waymo Open Dataset, a high-quality multimodal sensor dataset for autonomous driving’ (Waymo 2019). It was described at the time as ‘one of the largest, richest, and most diverse self-driving datasets ever released for research’ (Waymo 2019).

The initial release of the Waymo Open Dataset consisted of data from 1000 ‘segments’, with each segment capturing 20 s of continuous driving by Waymo autonomous vehicles. The primary focus was to provide ‘researchers the opportunity to develop models to track and predict the behaviour of other road users’ (Waymo 2019). The dataset encompassed data collected from various locations, including Phoenix (AZ), Kirkland (WA), Mountain View (CA), and San Francisco (CA) in the United States, capturing diverse environmental conditions such as ‘day and night, dawn and dusk, sun and rain’ (Waymo 2019). Each 20-s segment contained sensor data derived from five on-board lidar devices and five front-and-side-facing cameras. Notably, the dataset was extensively annotated, featuring 12 million 3D labels and 1.2 million 2D labels, playing a crucial role in training ML models for tracking and predicting the movement of vehicles in a driving environment.
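
Schematically, this structure can be rendered as nested records: a segment holds a sequence of frames (the data are sampled at 10 Hz, so a 20-s segment comprises roughly 200 frames), and each frame bundles lidar returns, camera images, and annotated object labels. The field names below are hypothetical simplifications of our own; the actual dataset ships as serialised protocol-buffer records:

```python
from dataclasses import dataclass, field

@dataclass
class Label3D:
    """One annotated object in a frame (hypothetical, simplified schema)."""
    object_type: str   # e.g. 'TYPE_VEHICLE', 'TYPE_PEDESTRIAN', 'TYPE_CYCLIST'
    centre: tuple      # (x, y, z) position in vehicle coordinates
    dimensions: tuple  # (length, width, height)
    heading: float     # yaw angle, in radians

@dataclass
class Frame:
    """One synchronised sweep across the vehicle's sensors."""
    lidar_points: dict = field(default_factory=dict)   # lidar name -> [(x, y, z), ...]
    camera_images: dict = field(default_factory=dict)  # camera name -> encoded image
    labels: list = field(default_factory=list)         # [Label3D, ...]

@dataclass
class Segment:
    """20 s of continuous driving; roughly 200 frames at 10 Hz."""
    segment_id: str
    frames: list = field(default_factory=list)
```

On this sketch, the 12 million 3D labels mentioned above correspond to `Label3D` instances accumulated across all frames of all segments.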

In addition, the Open Dataset was available via Know Your Data (KYD),Footnote 7 a data exploration platform developed by Google that ‘helps researchers, engineers, product teams, and decision makers understand datasets with the goal of improving data quality, and helping mitigate fairness and bias issues’ (Know Your Data 2023). By utilising KYD, users were able to navigate the contents of the dataset (of nearly a million items) and explore the relationships between various items. Furthermore, the images in the dataset were labelled with Google Cloud Vision tags, providing additional information about road users (‘TYPE_CYCLIST’, ‘TYPE_VEHICLE’, ‘TYPE_PEDESTRIAN’, as well as ‘has_faces’, ‘num_faces’, etc.), various roadside objects (‘Tree’, ‘Traffic light’, ‘Building’, etc.), and other labels that could be utilised.

Waymo’s open datasets, however, were not the first such datasets within the autonomous driving community. Waymo acknowledges the existence of the KITTI Vision Benchmark Suite, which was publicly released in March 2012, 7 years prior to Waymo’s Open Datasets. The KITTI dataset is widely regarded as the benchmark for vision datasets in the field of autonomous driving and machine vision research.Footnote 8 Over the past decade, it has received updates, introduced novel benchmarks, and added newly annotated data. Given the popularity of existing benchmarks such as KITTI within the autonomous driving and machine vision communities, Waymo’s decision to launch its open dataset naturally piqued our interest: what might they stand to gain from its release?

4.2 Workshop II: open dataset challenges

In March 2020, just as the COVID-19 pandemic was starting to impact Europe and the US, Waymo introduced their first Open Dataset Virtual Challenge. Waymo’s principal scientist Drago Anguelov wrote that the newly-launched competition constituted ‘the next phase of our program’, with Waymo ‘committed to fostering an environment of innovation and learning’ (Anguelov 2020). The challenge comprised five specific machine vision challenges: 2D detection, 2D tracking, 3D detection, 3D tracking, and domain adaptation. Each challenge specified a task that participants were expected to perform with elements of the dataset, for example: ‘given a set of camera images, produce a set of 2D boxes for the objects in the scene’ or ‘given a temporal sequence of lidar and camera data, produce a set of 3D upright boxes and the correspondences between boxes across frames’ (Anguelov 2020).
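
Scoring such detection tasks requires matching each predicted box against the ground-truth annotations, conventionally via an intersection-over-union (IoU) overlap threshold. A minimal sketch of the 2D case (our illustration of the standard criterion, not Waymo’s evaluation code):

```python
def iou_2d(box_a, box_b):
    """Intersection-over-union of two axis-aligned 2D boxes.

    Boxes are (x_min, y_min, x_max, y_max) tuples, as is conventional
    in 2D object detection benchmarks.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (zero-sized if the boxes are disjoint).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - inter)
    return inter / union if union > 0 else 0.0
```

A prediction typically counts as a true positive only when its IoU with a ground-truth box exceeds a class-specific threshold; 0.7 for vehicles is a common convention in autonomous-driving benchmarks.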

Winners of each challenge were eligible for cash prizes, with $15,000 awarded to the first-place winners, $5,000 for second place, and $2,000 for third place. The competition opened on the same day as the announcement and ran until May 31, 2020. The leaderboard would be public and ‘remain open for future submissions’ (Anguelov 2020). Winners were also invited to present their winning methods at a workshop during the CVPR conference in Seattle, USA (Anguelov 2020). Subsequent editions of the Open Dataset Challenge were announced in 2021 (Anguelov 2021) and 2022 (Waymo 2022a).Footnote 9

For the 2021 edition, Waymo released a motion dataset for the first time, considered to be ‘the largest interactive dataset yet released for research into behaviour prediction and motion forecasting for autonomous driving’ (Anguelov 2021). The release included a comprehensive description of the datasets and a technical paper explaining the data annotation techniques used for the perception datasets (Qi et al. 2021).

The 2022 edition followed a similar pattern, with the announcement in March, a submission deadline in May, and eligible winners presenting at the CVPR workshop in June. Waymo augmented the Open Dataset by adding additional labels to expand the range of tasks researchers could explore. These labels included ‘key point labels’ (capturing ‘important small nuances’), ‘3D segmentation labels’ (used to detect image pixels), and ‘2D-to-3D bounding box correspondence labels’ (‘to further enable research on sensor fusion of object detection and understanding’) (Waymo 2022a). The challenges for 2022 included: motion prediction, occupancy and flow prediction, 3D semantic segmentation, and 3D camera-only detection.

During this second workshop, we decided to focus primarily on the detection challenges, offering a comparison of tasks, metrics, and methods across all three iterations (2020, 2021, 2022). The 2020 2D and 3D detection challenges evolved into real-time 2D/3D detection challenges in 2021 and further transformed into a 3D camera-only detection challenge in 2022. Participants had the opportunity to submit methods to previous challenges, enabling a temporal analysis of the original challenges (2D/3D detection) that laid the groundwork for these variations. To summarise, the 2021 edition introduced motion planning data for the first time, and in 2022, Waymo added additional labels to assist researchers in utilising the Open Dataset.

The materials and documentation encountered in relation to the Open Dataset Challenges originated from diverse sites and sources. These included the open dataset itself, the challenge guidelines, cloud computing tools and infrastructure, and associated technical papers describing the submitted ML methods in greater detail. However, all resources were clearly related to the Open Dataset Challenges and served to convene the field of applied AI/ML research, engaging the research community in a manner that aligns with Waymo’s business goals and strategy. Throughout, Waymo’s parent company Google/Alphabet assumed a prominent role as the provider of cloud platform infrastructure (including computing resources and image labels from Google's Vision API), the host of the online code-sharing and notebook platform (Colab, linked to Google Drive), and the developer of the data exploration platform (KYD). Continuous discussions surrounding these materials and documentation took place during the workshops, forming the basis for further reflections in the article.

5 Shaping AI ecosystems through challenges: insights from Waymo’s incremental approach

In the following, we detail six specific themes drawn from our study of Waymo’s challenges: challenges as multifaceted interfaces, dynamics of incrementalism, the evolving significance of metrics and benchmarks, the vernacular of AI, the allure of applied domains, and the pursuit of competitive advantages. Collectively, these thematic insights provide a deeper understanding of challenges as central structuring devices that drive the advancement of autonomous vehicles and the broader realm of AI/ML. Within this context, challenges not only break down the intricate task of automating driving into feasible interim objectives but also serve as a manifestation of AI’s operationalisation within specific domains or contexts. This operationalisation fuels inventive and exploratory endeavours evident in challenge submissions, where novel methods are trialled, traditional approaches serve as the foundation for innovation, and original combinations of data, algorithms, models, and workflows are tested, offering diverse pathways towards realising challenge goals.

5.1 Theme I: challenges as multifaceted interfaces

To begin with, Waymo’s Open Dataset Challenges serve as conduits, or multifaceted interfaces, for a diverse array of components, including training datasets, annotations, and leaderboards, with Waymo shaping and configuring the terms and conditions of these competitions on an annual basis, aligning them intricately with the internal developmental trajectories of AI firms.

5.2 Theme II: dynamics of incrementalism

The Waymo challenges also exemplify a distinctive form of incrementalism, strategically designed to yield incremental improvements in object recognition and motion planning. A telling instance of this approach is found in the 2022 3D detection challenge, where a mere 0.0018 difference separated the top-ranked method (0.7914 AP) from second place (0.7896 AP).Footnote 10 Over the course of the three years, the winning method in the same category rose from 0.7711 AP in 2020 to 0.7764 in 2021 to 0.7914 in 2022. Only in 2022 did any method post an AP score of over 0.79.Footnote 11 The significance of these seemingly marginal percentage gains becomes pronounced in the context of autonomous driving, as Srnicek (2022) contends. Such minute increments could indeed signify the distinction between a pedestrian or cyclist being struck, grazed, or entirely evaded by a vehicle. This trend might even be perceived as an extreme form of incrementalism, considering the quantitative subtlety (though qualitative importance) by which each successive winning method surpasses its predecessor. The specific definition of progress based on AP, encompassing all object categories, is of notable consequence, progressively elevating the performance threshold from year to year, persisting beyond the official challenge period. As Everingham et al. (2015, p. 133) observed regarding a previous object recognition challenge spanning 2005–2012, participants’ optimal approach was to iteratively enhance the preceding year's winning method.

This ethos of incrementalism further manifested in the evolution of challenges, entailing refinements in task stipulations and parameters. Commencing in 2020, the 3D detection challenge solicited participants to generate a set of 3D upright boxes for scene objects (Waymo 2020), excluding any temporal component. In the subsequent year, Waymo introduced the real-time 3D detection challenge, retaining the original task specifications whilst introducing a temporal constraint (Waymo 2021b). The year 2022 witnessed the launch of a camera-only iteration of the challenge, restricting participants from incorporating lidar data into their methods (Waymo 2022c). With each iteration, challenge participants benefited from overarching enhancements and expansions to the foundational training dataset, encompassing a greater number of segments and an extended breadth of annotations.

Whilst all scientific and technical endeavours inherently encompass incremental progress, the broader concern emerges over whether these (highly) incremental gains are deemed sufficient by Big Tech firms financing the research and hinging their future growth on AI breakthroughs, particularly in areas like autonomous driving. This pertains equally to the broader public, who, in line with Reddy’s proposition (1988), necessitate assurance that AI is delivering on its promises. Thinking critically, it is conceivable that these incremental advances might indeed reflect the sluggish, or potentially thwarted, efforts to realise automated driving witnessed in recent times (e.g. Korosec 2022). As argued by Everingham et al., the extreme incrementalism characteristic of such challenges poses the risk of ‘reduc[ing] the diversity of methods within the community’ as ‘new methods that have the potential to give substantial improvements may be discarded before they have a chance to mature, because they do not yet beat existing mature methods’ (Everingham et al. 2015, p. 133). In essence, the competitive structure fosters (extreme) incrementalism, as participants vie to surpass existing methods, thereby inhibiting the pursuit of what Everingham et al. (2015, p. 133) call methodological ‘novelty’. Consequently, organised challenges crystallise a guiding ethos or value in the development of ML models, where prioritisation is accorded to ‘a specific, quantitative, improvement over past work, according to some metric on a new or established dataset’ (Birhane et al. 2022, p. 178).

5.3 Theme III: metrics and their evolving significance

Central to the orchestration of AI work within the Waymo challenges is the pivotal role of metrics and benchmarks, particularly Average Precision (AP), the preeminent standard for ML-based object recognition. The AP score plays an important role in these challenges, acting as the decisive arbiter for method validation. Any submission failing to attain a commendable AP score is categorically dismissed and invalidated. Consequently, the teams responsible for method design find themselves at a crossroads, necessitating a return to the proverbial drawing board, either to substantially refine and adapt their existing approach or devise an entirely new stratagem. Crucially, any such iteration must ultimately achieve a respectable AP score to merit consideration.
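
The construction of AP itself is simple enough to sketch: submitted detections are ranked by confidence, each is marked as a true or false positive, and the resulting precision–recall curve is summarised by its interpolated area. The following is an illustrative stdlib version in the spirit of the PASCAL VOC ‘all-points’ interpolation, not Waymo’s own implementation (which further aggregates over object categories and difficulty levels):

```python
def average_precision(detections, num_ground_truth):
    """Average precision from scored detections.

    `detections` is a list of (confidence, is_true_positive) pairs;
    `num_ground_truth` is the number of annotated objects.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Precision envelope: make precision monotonically non-increasing,
    # as popularised by the PASCAL VOC evaluation.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum precision over each step up in recall.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

A perfect ranking yields 1.0; on this construction, leaderboard scores in the 0.79 range correspond to the area under a precision–recall curve aggregated across the challenge’s object categories.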

Nonetheless, the AP metric is neither arbitrary nor static; rather, it possesses a historical trajectory closely aligned with the timeline of the Waymo challenges themselves. The metrics employed by Waymo mirror the conventions established by the PASCAL Visual Object Classes (VOC) Challenge, conducted over 8 years (PASCAL VOC 2014; Everingham et al. 2015). In a pivotal decision, the organisers replaced the ‘area under curve’ (AUC) metric with AP to enhance interpretability, amongst other rationales (Everingham et al. 2010, p. 313). Notably, the introduction of the 3D camera-only detection challenge in 2022 led Waymo to introduce a modified version of AP termed LET-3D-APL, designed to accommodate ‘depth estimation errors’ (Hung et al. 2022).

5.4 Theme IV: vernacular of AI work

Within the realm of the Waymo challenges lies a nuanced and intricate vernacular of AI work, characterised by two distinct dimensions. Firstly, a pronounced element of playfulness permeates the nomenclature, resonating with colloquial phrases that infuse the machine vision landscape. This tendency is evident in the technical papers and GitHub repositories, where terms like ‘bells and whistles’ are used to denote methods without additional embellishments or complex features (Bergman et al. 2022).

5.5 Theme V: allure of applied domains

This policy, with a specific focus on the semiconductor and AI sectors, has introduced an aura of apprehension akin to historical instances like Japan’s Fifth Generation program in the late 1980s (Roland and Shiman 2002, p. 2). In short, the current landscape of burgeoning competition and collaboration in AI, as convened by Waymo’s challenges, might represent a pivotal moment where US stakeholders mirror historical concerns about emerging technological rivals, or at the very least, it indicates the geopolitical and political–economic stakes involved in shaping AI development.

5.6 Theme VI: securing competitive advantage

The Waymo challenges serve as a strategic embodiment of a well-established Big Tech R&D playbook, aiming to shape and ‘lock-in’ (Urry 2004) a thriving developer community within their prescribed timelines, developmental trajectories, and technical frameworks. This concerted effort brings young researchers into Google’s orbit, offering tools and services like Google Colab and Tensorflow in exchange for labour, extending how Google uses its online Machine Learning Crash Course (MLCC) programme to hook users in the first place (Luchs et al. 2023). The challenge itself stands as a relatively tried-and-tested format for achieving this goal, enabling both organisers and entrants ‘to commit time and funds to the competition’ (Kreiner 2020, p. 51) in an efficient, compact manner.

Cost-effectiveness underpins this strategic approach, as running a challenge for external participants, coupled with a modest cash prize fund ($15,000 for winners), proves economically advantageous compared to hiring full-time engineers at market rates.Footnote 13 Whilst the expenses associated with constructing training datasets and scaling computational resources are substantial, they are spread across the broader operations of Google/Alphabet, as well as being specifically valuable to Waymo’s own internal initiatives. The competitive spirit engendered by the challenges, coupled with the incentive of the prize fund, acts as a powerful catalyst propelling participants to invest substantial time and energy into the intricate tasks of method development, rigorous testing, meticulous verification, comprehensive documentation, and final submission. Vertesi et al. (2021) aptly term this phenomenon the ‘pre-automation’ phase of AI, characterised by companies' rapid AI product scaling through the internalisation of highly skilled technical endeavours.

In contrast to the veiled realm of ‘temporary, vendor, and contractor’ (TVCs) workers, used by Big Tech firms to plug gaps in short- and mid-term product development (Brophy and Grayer 2021), the temporary labour demonstrated by challenge participants is openly documented and is even celebrated as a rite of passage. This holds especially true for numerous aspiring young computer scientists who eagerly embrace the opportunity to apply their newfound knowledge to cutting-edge challenges. An indicator of this recognition and pride can be found in the frequent referencing of podium achievements on the GitHub pages of participating teams, where these accolades are displayed as prestigious badges of honour (e.g. BEVFormer 2023). Here, if Google’s MLCC programme allows them to recruit ‘new AI talent’ (Luchs et al. 2023, p. 9) at one end of the AI talent ‘pipeline’, Waymo’s challenges offer the opportunity to channel and celebrate that talent at the other.

In the 2023 edition, only challenge winners receive a prize, capped at $10,000 in Google Cloud credits (Anguelov 2023). According to Google’s own Cloud Pricing Calculator (Google 2023b), $10,000 would offer a team roughly three months’ access to Google’s second-generation (v2) Cloud TPU service, useful for training ML models remotely.Footnote 14 Footnote 15 Steinhoff (2023), building on Rikap (2021), characterises this phenomenon as a ‘subordinated innovation network’, further devaluing the work of those competing in such ML challenges (Steinhoff 2022). The shift from hard cash to credits further entrenches this subordination, intensifying participant dependence, whilst hardening the resultant innovation network.
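The back-of-the-envelope arithmetic behind the three-month figure can be sketched as follows. The $4.50/hour on-demand rate for a v2-8 Cloud TPU in a US region is an assumption drawn from Google's published pricing around 2023, not a figure stated in the article; actual rates vary by region and over time.

```python
# Rough estimate of how long $10,000 in Google Cloud credits lasts
# when training continuously on a second-generation (v2) Cloud TPU.
# ASSUMED_V2_TPU_RATE_USD_PER_HOUR is an assumed on-demand price
# for a v2-8 TPU in a US region circa 2023.

CREDITS_USD = 10_000
ASSUMED_V2_TPU_RATE_USD_PER_HOUR = 4.50

hours = CREDITS_USD / ASSUMED_V2_TPU_RATE_USD_PER_HOUR
months = hours / (24 * 30)  # approximate 30-day months

print(f"{hours:.0f} hours, roughly {months:.1f} months of continuous use")
```

Under these assumptions the credits cover a little over three months of round-the-clock training on a single TPU, consistent with the estimate above; intermittent use would stretch the credits further, whilst larger TPU configurations would exhaust them sooner.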

Learning lessons from the PASCAL VOC challenges, Everingham et al. (2015) suggested that the open format tended to reduce the diversity of methods within the wider research community. If participants wanted to win, they stood the best chance by making ‘an incremental improvement on the previous year’s winning method’ (Everingham et al. 2015, p. 133), rather than developing new methods from scratch. In the context of the Waymo challenges, this predilection locks participants into Google products for model training, strengthening the gravitational pull towards the Google/Alphabet ecosystem. Everingham et al. (2015, p. 132) also remarked that having software able to ‘run everything “out of the box”’, from training to validation, was crucial. In this respect, Waymo goes a step further by being both software developer and challenge host organiser, a unique position that asserts their monopoly power. Paradoxically, such a situation could hamper broader progress, following Everingham et al. (2015), diverting attention away from maturing methods and fostering an environment of heightened incrementalism.

Waymo is not the only firm to run an AI challenge within the autonomous vehicle domain. However, the decision by Ford to shutter Argo AI (Korosec 2022) has arguably diluted the prominence and impact of their rival Argoverse initiative,Footnote 16 now lacking the envisaged pipeline from challenge participation to commercial deployment. Essentially, Waymo has solidified a monopoly position within AI challenges, bolstering their competitive advantage through their sustained presence.

6 Conclusion: conceptualising challenges as an organising principle in AI innovation

Throughout this article, we have explored the fundamental role of challenges in shaping AI development. By juxtaposing the era of Grand Challenges with Waymo's strategy of incremental challenges within the realm of autonomous driving, we have unveiled a prevailing approach that characterises both Waymo and the broader contemporary AI landscape. Our investigation of the specific objects and practices of researchers in this field contributes to the critical literature in STS and media studies, shedding light on the history of AI research in the self-driving industry and the history of challenges within this area. In addition, it highlights the ongoing significance of the infrastructures supporting this research, as open datasets and challenges emerge as crucial instruments shaping research funding and the political economy of AI. Despite the existence of alternative paths and resistance to incrementalism, Waymo’s challenge initiative has effectively provided a platform for scaling R&D efforts whilst engaging external participants who share their interests. Through the provision of quality training data and computational resources, Waymo has cultivated a global research community united by common objectives, set by the company itself.

Despite its scope, our exploration has merely grazed the surface of Waymo's multifaceted efforts in shaping the contours of AI R&D, and there are several avenues that warrant further inquiry. These might be categorised according to the scale of investigation: challenges as practices, challenges as economic phenomena, and challenges as instances of the infrastructuralisation of AI/ML. In the first instance, examining the distribution of machine vision labour, including the division of tasks within challenge teams, could yield profound insights. Exploring team methodologies, organisational structures, workflow plans, and the strategic leveraging of prior work are pivotal for understanding the nuanced costs and benefits encountered by potential challenge participants. A more sustained focus on the role of ML and machine vision metrics—how they are devised, who designs them, and what they replace—might also shed some light on the contingencies and power dynamics of ML practices writ large.

Likewise, comparing different challenges, challenge formats, and challenge platforms would offer an insight into cross-domain, cross-format, and cross-platform themes. Luchs et al.’s (2023) comparison between online ML courses run by Google and IBM, for example, suggests divergent approaches to offering practical ML experience to computer/data scientists. Steinhoff (2022; 2023) and Rikap’s (2021) work also point towards the possibility of evaluating how ML and ‘data science work’ (Steinhoff 2022, p. 193) conducted for such challenges is being shaped by automation, evidencing how AI/ML firms seek to reduce the huge financial costs for building ML models, products, and platforms. In other words, Waymo’s own challenges are not necessarily unique, but provide evidence of a certain challenge ‘playbook’ to be found across different AI/ML domains.

As the focus shifts from autonomous driving to the hype around large-language models (LLMs), the importance of machine vision challenges, including those hosted by Waymo and its competitors, may undergo changes for aspiring computer scientists. It remains to be seen when Waymo might reassess its developmental roadmap and evaluate the sustainability and usefulness of organising external challenges. The transition from cash prizes to Google Cloud credits indicates a potential shift in priorities, aiming to consolidate and optimise investments in autonomous driving. The recent suspension of ‘24/7’ Cruise passenger services in San Francisco suggests the autonomous vehicle battle has already entered another stage of development altogether (Hawkins 2023), despite Waymo expanding operations to Los Angeles (Davis 2024).

Beyond the specific trajectory of Waymo and the autonomous driving applied domain, further research is necessary to explore the broader infrastructuralisation and industrialisation of AI (Van der Vlist et al. 2024). Central to this exploration is understanding the commodification of LLMs and the widespread proliferation of third-party services that pivot on models like ChatGPT. Scrutinising these intricate relationships, evolving licensing models, and the emergence across diverse sectors, such as higher education, of counter-LLM platforms designed to monitor LLM-generated content presents compelling questions that stretch far beyond this article's scope but necessitate concerted attention in forthcoming research.

In conclusion, this article underscores the broader historical and critical importance of challenges as a pivotal organising principle shaping AI development, with Waymo's incremental approach serving as a prominent example in the field today. By investigating the dynamics and characteristics of these challenges, and their materiality and infrastructures, scholars can gain valuable insights into the trajectory of AI development and its driving forces in specific industry sectors like self-driving technology. However, the analysis also extends beyond this to the wider AI/ML landscape, offering a nuanced understanding of how challenges shape the contours of technological progress and influence broader socio-economic trends. As such, scholars can leverage challenges as an entry point into AI research, using them as a lens to critically examine the interplay between technological innovation, industry dynamics, and societal change.