Type II mechanoreceptors and cuneate spiking neuronal network enable touch localization on a large-area e-skin


Sensitive FBG-based e-skin

The e-skin used in the experiments (Extended Data Fig. 5) had size and shape similar to those of the human forearm. To mimic biomechanical compliance, a soft silicone polymer (Dragon Skin 10 medium, Smooth-On) was used to realize a flexible patch embedding an optical fibre with 21 FBG sensors. It was fabricated by a three-step process11: (1) a first silicone pouring in a custom mould to obtain a bottom layer with a groove; (2) the embedding of the fibre along the grooved path; (3) a second silicone pouring to cover the fibre itself and provide robustness to the patch. The e-skin was then attached to a custom three-dimensionally printed plastic support to guarantee stability. The optically sensitive elements were FBGs (length, 10 mm) distributed along a designed path running throughout the e-skin to resemble the sensitivity features of the human forearm, that is, with an increasing density from the proximal to distal parts. These gratings are interference patterns inscribed within the core of a single optical fibre. When broadband spectrum light, sent by means of an optoelectronic interrogator, passes through the fibre cable and hits an FBG, it is partly back-reflected to the input point as a spectral peak centred at a characteristic wavelength, λB:

$${\lambda }_{\rm{B}}=2 {n}_{{{\rm{eff}}}}\varLambda,$$

(1)

where neff is the effective refractive index of the optical fibre and Λ is the pitch of the specific grating80. The λB values of the e-skin FBGs were hence defined at fabrication and ranged between 1,520 nm and 1,580 nm with a step of 3 nm. Wavelength shifts occur when a deformation is applied to the sensor11, thereby allowing the encoding of tactile stimuli. Considering this feature, and to reduce the effects of potential prestrains that could have impacted the residual deformability of the sensors in response to external tactile stimuli, the FBGs were arranged on straight fibre segments.
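As a concrete illustration of equation (1) and of how tactile stimuli are encoded as wavelength shifts, the following minimal Python sketch computes a Bragg peak and extracts per-sensor Δλ values from interrogator readings; the numerical values and array names are illustrative, not taken from the paper.

```python
import numpy as np

# Equation (1): Bragg wavelength from the effective refractive index n_eff
# and the grating pitch Lambda (values below are illustrative only).
def bragg_wavelength_nm(n_eff: float, pitch_nm: float) -> float:
    return 2.0 * n_eff * pitch_nm

lambda_b = bragg_wavelength_nm(1.447, 525.2)  # ~1,519.9 nm, near the band edge

# Tactile encoding: shifts of each peak relative to an unloaded baseline.
# `readings` stands for a (T, 21) array of peak wavelengths (nm) streamed
# by the interrogator; here it is simulated noise around 1,550 nm.
rng = np.random.default_rng(0)
readings = 1550.0 + 1e-4 * rng.standard_normal((1000, 21))
baseline = readings[:100].mean(axis=0)   # rest condition, per sensor
delta_lambda = readings - baseline       # deformation-induced shifts (nm)
```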

Indentation data collection

The training and validation of the proposed model required the collection of a dataset of indentations applied over the e-skin surface. For this purpose, an automated indentation protocol was executed, with 1,846 target points to touch, computed as the centroids of the triangles of a random mesh of the skin surface. A bimanual robotic platform, consisting of two anthropomorphic arms (Racer-5-0.80, Comau), was programmed to cooperate during the indentation protocol (Extended Data Fig. 5): one robot performed the tactile tasks, whereas the other held and oriented the e-skin. The indenting robot reached each target site perpendicularly, touching the skin with a spring-like hemispherical probe attached to the end-effector. Force feedback, measured by means of a load cell (Nano 43, ATI Industrial Automation) mounted at the base of the indenter, was used to command the robot to release the contact when a threshold (2.5 N) was exceeded and then fly to a new target point. The control code consisted of dedicated routines (LabVIEW, National Instruments) running on both an industrial controller (IC-3173, National Instruments), to send commands to the robots, and a PC, for managing and checking the experiments. During the indentation sessions, force and FBG signals, the latter read by means of an optical interrogator (FBG Scan 904, FBGS Technologies), were collected at a 100-Hz rate and saved into text files together with the Cartesian coordinates of the contact sites.
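The force-threshold release logic of the protocol can be sketched as a simple control loop. Everything below is a stub under stated assumptions: `read_force_n` and the robot motion command are hypothetical placeholders for the real load-cell and controller drivers, and the target coordinates are made up.

```python
import numpy as np

FORCE_THRESHOLD_N = 2.5   # contact released above this load (protocol value)
rng = np.random.default_rng(1)

def read_force_n() -> float:
    """Stub for the Nano 43 load cell; replace with the real driver call."""
    return rng.uniform(0.0, 3.0)

def indent_target(target_xyz, log):
    """Press one target perpendicularly until the force threshold is hit."""
    # A move_to(target_xyz) command to the indenting robot would go here.
    while True:
        force = read_force_n()
        log.append((target_xyz, force))        # force + FBG logged at 100 Hz
        if force > FORCE_THRESHOLD_N:
            return                             # retract and fly to next site

log = []
for centroid in [(10.0, 20.0, 0.0), (63.5, 140.2, 0.0)]:  # example targets
    indent_target(centroid, log)
```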

Assessment of receptive fields of e-skin FBG sensors

The receptive fields of the FBG sensors populating the skin were represented as two-dimensional contour maps depicting the spatial distribution of sensitivity to contact force. Hotspot regions in dark blue indicate the maximum sensitivity (Fig. 1b and Extended Data Fig. 1). To obtain the FBG receptive fields, the wavelength variation signals (Δλ) from each FBG sensor were averaged over 500 ms captured on the highest force plateau, that is, the second force level, and divided by its average value. The obtained results were then normalized with respect to the highest FBG sensitivity value, to have sensitivity maps ranging from 0 to 1. This procedure was performed using all the available indentation samples.
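A minimal sketch of this computation follows, reading "divided by its average value" as division by the average force over the same 500-ms plateau window (our interpretation, since the result is a force sensitivity); the array shapes and the `t_plateau` index are assumptions of this sketch.

```python
import numpy as np

def sensitivity_maps(dl, force, t_plateau, fs=100, win_s=0.5):
    """Per-sensor contact-force sensitivity at each indentation site.

    dl: (n_indent, T, 21) wavelength shifts; force: (n_indent, T) loads;
    t_plateau: sample index where the highest force plateau begins.
    Returns (n_indent, 21) sensitivities normalized to the [0, 1] range.
    """
    w = slice(t_plateau, t_plateau + int(win_s * fs))   # 500-ms window
    mean_dl = np.abs(dl[:, w, :]).mean(axis=1)          # (n_indent, 21)
    mean_f = force[:, w].mean(axis=1, keepdims=True)    # (n_indent, 1)
    sens = mean_dl / mean_f                             # nm per newton
    return sens / sens.max()                            # global 0-1 scaling
```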

First-order neurons: type II PA model

The 21 FBG signals were the input of the PA models. Each sensor signal was resampled at a frequency of 1 kHz to establish regular firing dynamics consistent with the discretization of the Izhikevich neuronal model74. The FBG signal derivatives were computed to obtain the dynamic components of the indentations, that is, the loading and unloading phases of the varying force levels applied to the e-skin. The absolute values of both raw and differentiated signals were calculated to feed the Izhikevich neurons as input current I, to mimic the adaptation dynamics and firing response of type II mechanoreceptors, that is, Ruffini endings (SAII) and Pacini corpuscles (FAII)38. To simulate the physiology of type II PAs in terms of the ratio between slowly and fast-adapting units, that is, two SAIIs to one FAII36, their diverse sensitivity and their different-sized overlapping receptive fields, the raw absolute FBG signals and their derivatives were multiplied by four and two distinct gains, respectively. Hence, six mechanoreceptor models were considered for each FBG sensor (Fig. 2), resulting in 126 input currents for the first-order neuron models. Extended Data Table 1 shows the gains used for the SAII and FAII models. The SAII gains were selected through pilot tests and empirical analyses, in which we assessed whether the increase in gain generated an increase in the firing rate within the physiological range documented in the literature43. Specifically, the average ISIs (that is, the inverse of the average firing rates) corresponding to each force range were computed, and their logarithmic values were fitted to the logarithm of the stimulus intensity percentage (force values normalized by the maximum of 4 N; Fig. 2h). The results were then compared with background neurophysiological findings43, particularly the ISI characterization of SAII units. The FAII gains were selected following the same empirical approach, that is, by evaluating the spiking activity at load transients. It is worth noting that a general procedure for setting the gains of the input currents of the Izhikevich model for artificial mechanoreceptor simulation has not yet been established in the literature. Thus, several studies, such as ref. 58, reported this empirical method of gain selection according to the specific application.
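The assembly of the 126 input currents (21 sensors × 4 SAII gains on the static component + 2 FAII gains on the dynamic component) can be sketched as below; function and variable names are ours, and the gain values would come from Extended Data Table 1.

```python
import numpy as np

FS_HZ = 1000  # resampling rate matching the Izhikevich discretization

def build_input_currents(dl_1khz, saii_gains, faii_gains):
    """Assemble the 126 PA input currents from the 21 FBG signals.

    dl_1khz: (T, 21) wavelength shifts resampled at 1 kHz.
    saii_gains: the 4 gains applied to |signal| (slowly adapting units).
    faii_gains: the 2 gains applied to |d signal / dt| (fast-adapting units).
    """
    static = np.abs(dl_1khz)                                     # SAII drive
    dynamic = np.abs(np.gradient(dl_1khz, 1.0 / FS_HZ, axis=0))  # FAII drive
    currents = []
    for s in range(dl_1khz.shape[1]):                         # 21 sensors
        currents += [g * static[:, s] for g in saii_gains]    # 4 SAII units
        currents += [g * dynamic[:, s] for g in faii_gains]   # 2 FAII units
    return np.stack(currents, axis=1)                         # (T, 126)
```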

The Izhikevich model consists of a system of differential equations that can be solved via the Euler method74. Equation (2) describes the dynamics of the membrane potential v of the neuron; equation (3) represents the recovery variable u, responsible for the repolarization of the neural membrane; equation (4) describes the membrane potential restoration and the recovery variable update when v reaches the 30-mV firing threshold:

$$\frac{{{\rm{d}}v}}{{{\rm{d}}t}}=A{v}^{2}+{Bv}+C-u+{{{\rm{Gain}}}}_{n}\frac{I\left(t\right)}{{C}_{{\rm{m}}}},$$

(2)

$$\frac{{{\rm{d}}u}}{{{\rm{d}}t}}=a\left({bv}-u\right),$$

(3)

$${\rm{If}}\,v\ge 30\,{{\rm{mV}}},{\rm{then}}\left\{\begin{array}{c}v\leftarrow c\\ u\leftarrow u+d\end{array}\right.,$$

(4)

where A, B, and C are standard variables of the Izhikevich model; t represents time; a is the timescale of u; b is the sensitivity of u to the membrane potential; c is the membrane resting potential value; and d modulates the dynamics of the after-spike reset of the recovery variable u. For the implemented mechanoreceptors, we chose the parameters Gainn (Extended Data Table 1) that reproduced a regular firing behaviour.
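A compact Euler integration of equations (2)-(4) is sketched below. The parameter values shown are the classic regular-spiking defaults of the Izhikevich model, used here for illustration only; the paper's own gains are in Extended Data Table 1.

```python
import numpy as np

def izhikevich_spikes(I, gain, a=0.02, b=0.2, c=-65.0, d=8.0,
                      A=0.04, B=5.0, C=140.0, Cm=1.0, dt=1.0):
    """Euler integration of equations (2)-(4), dt in ms (1-kHz input)."""
    v, u = c, b * c
    spikes = np.zeros(len(I), dtype=bool)
    for t, i_t in enumerate(I):
        dv = A * v * v + B * v + C - u + gain * i_t / Cm   # equation (2)
        du = a * (b * v - u)                               # equation (3)
        v, u = v + dt * dv, u + dt * du
        if v >= 30.0:                                      # equation (4)
            spikes[t] = True
            v, u = c, u + d                                # reset v and u
    return spikes
```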

Encoding of contact intensity by SNN PAs

The activity of type II mechanoreceptors was analysed to investigate relationships with the intensity of the applied load. For this purpose, the perpendicular force profile was extracted for all the 1,846 indentations. For each indentation, SAII and FAII units were considered separately. First, their corresponding firing rates were calculated on moving time windows of 100 ms, with an overlap of 50 ms. For each window, the firing rate was extracted as follows:

$${{{\rm{FR}}}}_{X}=\frac{{N}_{{{\rm{spikesX}}}}}{{w}_{{{{\rm{t}}}}} {N}_{{{\rm{neuronXactive}}}}},$$

where the subscript X indicates the type of mechanoreceptor (SAII or FAII), wt is the duration of the time window, NspikesX is the number of spikes of the X units in wt and NneuronXactive is the number of active neurons of each type in the selected window. The resulting firing rates were compared with the corresponding force profile (Fig. 2a–f). Then, the SAII firing rates were grouped by force level, ranging from 0 to 4 N, with a step of 0.5 N. The corresponding distributions were analysed and the relevant statistics computed (median values and first and third quartiles; Fig. 2g and Supplementary Table 1). For each load-intensity level, the mean SAII firing rate was extracted to further derive the relationship with the applied loads. To this end, the linear Pearson correlation coefficient ρ was calculated. Then, the first-order polynomial that fitted the logarithm of the SAII average firing rates to the logarithm of the percentage of the applied load, normalized by 4 N, was extrapolated. The R2 coefficient was computed to assess the goodness of the linear fit (Fig. 2h).
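A sketch of the windowed population rate and of the log-log fit follows; the load levels and rates in the fit are illustrative placeholders, not measured values.

```python
import numpy as np

def windowed_rate(spike_times_s, n_active, t_end_s, w=0.1, step=0.05):
    """Population rate per window: FR_X = N_spikes,X / (w_t * N_active,X)."""
    starts = np.arange(0.0, max(t_end_s - w, step), step)
    counts = [np.sum((spike_times_s >= t0) & (spike_times_s < t0 + w))
              for t0 in starts]
    return starts, np.asarray(counts) / (w * max(n_active, 1))

# Log-log fit of mean SAII rate against load percentage (of the 4-N maximum).
load_pct = np.array([12.5, 25.0, 50.0, 75.0, 100.0])     # illustrative
mean_rate_hz = np.array([8.0, 12.0, 18.0, 24.0, 30.0])   # illustrative
slope, intercept = np.polyfit(np.log10(load_pct), np.log10(mean_rate_hz), 1)
r_squared = np.corrcoef(np.log10(load_pct), np.log10(mean_rate_hz))[0, 1] ** 2
```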

Second-order neurons: CN model

The second layer of the SNN was composed of 1,036 CNs, modelled with the mathematical implementation based on the exponential integrate-and-fire approach29. This model reproduces the complete dynamics of the membrane potential of CNs, as described in equation (5), along with a detailed modelling of the activity of low-threshold voltage-gated calcium channels and calcium-activated potassium channels:

$${C}_{{\rm{m}}}\frac{{{{\rm{d}}V}}_{{\rm{m}}}}{{{\rm{d}}t}}={I}_{{\rm{L}}}+{I}_{{{\rm{spike}}}}+{I}_{{{\rm{ion}}}}+{I}_{{{\rm{ext}}}}+{I}_{{{\rm{syn}}}}.$$

(5)

In equation (5), Cm is the capacitance of the neural membrane; Vm is the CN membrane potential; IL is the leak current of the neuron (equation (6)); Ispike is the spike current that recreates the action potential onset and the fast neuron depolarization (equation (7)); Iion is the ionic current resulting from the summation of the currents of voltage-gated calcium channels (ICa) and calcium-activated potassium channels (IK; equation (8)); Iext is the external current that can be injected into the neuron (in this study, it is equal to 0); and Isyn is the synaptic current (equation (9)), where each synapse (i) is activated by a PA.

$${I}_{{\rm{L}}}={-\bar{g}}_{{\rm{L}}}\left({V}_{{\rm{m}}}-{E}_{{\rm{L}}}\right),$$

(6)

$${I}_{{{\rm{spike}}}}={\bar{g}}_{{\rm{L}}}{\Delta }_{{\rm{t}}}\exp \left(\frac{{V}_{{\rm{m}}}-{V}_{{\rm{t}}}}{{\Delta }_{{\rm{t}}}}\right),$$

(7)

$${I}_{{{\rm{ion}}}}={I}_{{{\rm{Ca}}}}+{I}_{{\rm{K}}},$$

(8)

$$\begin{array}{l}{I}_{{{\rm{syn}}}}={g}_{\max }\sum _{i}{w}_{{{\rm{exc}}},i}\exp \left(-\tau \left(t-{t}^{* }\right)\right)\left({E}_{{{\rm{rev}}},{{\rm{exc}}}}-{V}_{{\rm{m}}}\right)\\\qquad\;\;+\,{g}_{\max }{w}_{{{\rm{inh}}}}\sum _{i}\exp \left(-\tau \left(t-{t}^{* }\right)\right)\left({E}_{{{\rm{rev}}},{{\rm{inh}}}}-{V}_{{\rm{m}}}\right),\end{array}$$

(9)

where wexc,i is the excitatory synaptic weight; t* is the time at which a spike occurs; and winh is the inhibitory synaptic weight. Isyn can hence be expressed as the sum of the excitatory and inhibitory synaptic currents of all the synapses of the individual PAs. The definitions of the other variables and their respective values are reported in Supplementary Table 6.
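The structure of equation (9) can be sketched as follows, with one exponential kernel per synapse triggered at the last PA spike time t*; the time constant and reversal potentials below are placeholders standing in for the values of Supplementary Table 6.

```python
import numpy as np

def synaptic_current(t, Vm, t_last_spike, w_exc, w_inh, g_max=1.0,
                     tau=0.5, E_exc=0.0, E_inh=-85.0):
    """Equation (9): excitatory plus inhibitory synaptic currents of a CN.

    t_last_spike: (n_syn,) most recent PA spike time t* per synapse.
    """
    kernel = np.exp(-tau * (t - t_last_spike))      # decays after each t*
    i_exc = g_max * np.sum(w_exc * kernel) * (E_exc - Vm)
    i_inh = g_max * w_inh * np.sum(kernel) * (E_inh - Vm)
    return i_exc + i_inh
```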

The current ICa is described by equation (10) and the current IK, by equation (11):

$${I}_{{{\rm{Ca}}}}=-{\bar{g}}_{{{\rm{Ca}}}}{x}_{{{\rm{Ca}}},{a}}^{3}{x}_{{{\rm{Ca}}},i}\left({V}_{{\rm{m}}}-{E}_{{{\rm{Ca}}}}\right),$$

(10)

$${I}_{{\rm{K}}}=-{\bar{g}}_{{\rm{K}}}{x}_{{{\rm{K}}}_{{{\rm{Ca}}}}}^{4}{x}_{{{\rm{K}}}_{{{\rm{V}}}_{{\rm{m}}}}}^{4}\left({V}_{{\rm{m}}}-{E}_{{\rm{K}}}\right),$$

(11)

where \({\bar{g}}_{{{\rm{Ca}}}}\) and \({\bar{g}}_{{\rm{K}}}\) are the maximum conductances; ECa and EK are the reversal potentials of the calcium and potassium channels, respectively; and xCa,a, xCa,i, \({x}_{{{\rm{K}}}_{{{\rm{Ca}}}}}\) and \({x}_{{{\rm{K}}}_{{{\rm{V}}_{\rm{m}}}}}\) are the activity states of the channels. These values are provided in Supplementary Table 6.

In this model, both calcium channel states (xCa,a and xCa,i) are considered as sources of the membrane calcium concentration of the CN ([Ca2+]). Thus, we proposed that the dynamics of the total neuron calcium concentration follow equation (12).

$$\begin{array}{l}\frac{{\rm{d}}\left[{{{\rm{Ca}}}}^{2+}\right]}{{{\rm{d}}t}}={B}_{{{\rm{Ca}}}}\,{\bar{g}}_{{{\rm{Ca}}}}\,{x}_{{{\rm{Ca}},a}}^{3}\,{x}_{{{\rm{Ca}},i}}\left({V}_{{\rm{m}}}-{E}_{{{\rm{Ca}}}}\right)\\\qquad\qquad\;+\,\left(\left[{{{\rm{Ca}}}}^{2+}\right]_{{{\rm{rest}}}}-\left[{{{\rm{Ca}}}}^{2+}\right]\right)/{\tau }_{\left[{{{\rm{Ca}}}}^{2+}\right]}.\end{array}$$

(12)

The relevant parameters are reported in Supplementary Table 2.

Functional organization of SNN second-order neurons to model a somatotopic map of the e-skin

To functionally mimic the organization of the cuneate nucleus, we generated an e-skin grid mesh with 29 × 38 edges, that is, 28 × 37 subregions. In this mesh, each subregion represented a CN, and its centroid represented the centre of its receptive field. In total, 1,036 CNs were created with circular overlapping receptive fields of 41.67-mm radius each. The criterion for the definition of the size of the CN receptive field was to include at least two FBG sensors in the corresponding sensitive area of the e-skin and, thus, 12 PAs, consistent with the number of dominant PAs that project into a CN found in neurophysiological studies on mammals and simulations of human tactile perception24. This approach allowed us to determine a CN somatotopic map of the e-skin.
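The construction of the somatotopic map can be sketched as below; the skin dimensions passed to `cn_centroids` are hypothetical arguments, and sensor positions would come from the fabrication layout.

```python
import numpy as np

RF_RADIUS_MM = 41.67  # covers >= 2 FBG sensors, hence 12 PAs, per CN

def cn_centroids(skin_w_mm, skin_h_mm, nx=28, ny=37):
    """Centres of the 28 x 37 = 1,036 CN subregions of the e-skin mesh."""
    xs = (np.arange(nx) + 0.5) * skin_w_mm / nx
    ys = (np.arange(ny) + 0.5) * skin_h_mm / ny
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])      # (1036, 2)

def receptive_field_mask(centroids, fbg_xy):
    """Boolean (1036, 21) map of which sensors fall in each CN field."""
    d = np.linalg.norm(centroids[:, None, :] - fbg_xy[None, :, :], axis=2)
    return d <= RF_RADIUS_MM
```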

Connectivity of SNN and initialization of synaptic weights

In our model, each of the 1,036 CNs was fully connected to the 126 PAs, with excitatory weights ranging from 0 to 1, and to a single inhibitory synapse (Fig. 1a, red triangle), driven by an inhibitory neuron (IN) that grouped the responses of all the 126 PA projections (Fig. 1a, coloured empty triangles). This connectivity enabled the potentiation or depression of all the synaptic weights during the learning process. In this model, the magnitude of the postsynaptic potential projected into a CN for a given sensory input depends on both excitatory (wexc,i) and inhibitory (winh) weights, as described in equation (9)18,29. The presynaptic excitatory weights were initialized as the inverse of the Euclidean distances between the positions of the 21 FBG sensors in the skin and the 1,036 centroids of the CNs. Then, the obtained values were rescaled between 0.2 and 1. The initial excitatory weights (wexc) of the PAs corresponding to the sensors outside the area of the CN receptive fields were instead set to 0 (Fig. 1a, coloured dots). In this way, only the mechanoreceptors inside the CN receptive field region had initial weights greater than 0 (Fig. 1a, coloured filled triangles). All the inhibitory synaptic weights (winh) were initialized to 0.125.
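A sketch of this initialization follows; the assumption that all 6 PAs of a sensor inherit that sensor's distance-based weight is ours, since the text defines the distances per sensor.

```python
import numpy as np

def init_weights(centroids, fbg_xy, rf_mask, w_inh0=0.125):
    """Initial weights: inverse sensor-centroid distance, rescaled to
    [0.2, 1] and zeroed outside the receptive field."""
    d = np.linalg.norm(centroids[:, None, :] - fbg_xy[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, 1e-9)                        # inverse distance
    w = 0.2 + 0.8 * (w - w.min()) / (w.max() - w.min())  # rescale to [0.2, 1]
    w[~rf_mask] = 0.0                                    # outside RF: zero
    return np.repeat(w, 6, axis=1), w_inh0               # (1036, 126), w_inh
```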

Synaptic learning protocol of SNN second-order neurons

Inspired by calcium-dependent synaptic plasticity, we implemented a synaptic learning process29, where the total inhibitory weights (winh) and presynaptic excitatory weights (wexc,i) were updated at each stimulus presentation. According to our model, the excitatory weight potentiation of the presynaptic neurons (PAs) occurred when the total calcium activity (\({A}_{{{\rm{Tot}}}}^{{{{\rm{Ca}}}}^{2+}}\)) of the CN, as in equation (13), was strongly correlated with the local calcium activity (\({A}_{{{\rm{Loc}}}_{i}}^{{{{\rm{Ca}}}}^{2+}}\)) of a synapse (PA with the secondary neuron; equation (14)); otherwise, this synapse was depressed:

$${A}_{\rm{Tot}}^{{\rm{Ca}}^{2+}}={k}_{{{\rm{act}}}} \times [{{\rm{Ca}}}^{2+}],$$

(13)

$${A}_{\rm{Lo}{c}_{i}}^{{\rm{Ca}}^{2+}}=\frac{{\tau }_{1}}{{\tau }_{{\rm{d}}}-{\tau }_{{\rm{r}}}}\left[\exp \left(-\frac{t-{\tau }_{{\rm{l}}}-{t}^{* }}{{\tau }_{{\rm{d}}}}\right)-\exp \left(-\frac{t-{\tau }_{{\rm{l}}}-{t}^{* }}{{\tau }_{{\rm{r}}}}\right)\right],$$

(14)

where \({A}_{{Tot}}^{{{Ca}}^{2+}}\) is the total calcium activity of the CN; kact = 1 is an arbitrary constant; \({A}_{\rm{Loc}_{i}}^{{\rm{Ca}}^{2+}}\) is the local calcium activity due to a synapse i; τr = 4 ms is the rise time; τd = 12.5 ms is the decay time; τl = 0 ms is the latency time; τ1 = 21 ms is a constant used to calculate the ratio; and t* is the time at which a PA spike occurs. To reach supralinearity in the local calcium activity, we used an approximate approach of subtracting an offset, that is, 75% of the single-pulse activation peak activity, from the local calcium signal; the resulting value was used as the local calcium activity (\({A}_{{{\rm{Loc}}}}^{{{{\rm{Ca}}}}^{2+}}\)), as proposed in ref. 29.

In the model, the individual excitatory weights were updated during the presentation of each stimulus in the learning phase; the update is described as the integral of the correlation between the local calcium activity (\({A}_{\rm{Loc}}^{{\rm{Ca}}^{2+}}\)) and the total calcium activity (\({A}_{\rm{Tot}}^{{\rm{Ca}}^{2+}}\)), following equation (15):

$${\Delta w}_{{{\rm{exc}}},i}={\int}_{{\!t}_{0}}^{{t}_{\max }}\left\{\left({A}_{{Tot}}^{{\rm{Ca}}^{2+}}\left(t\right)-\left({{{\rm{Avg}}}}_{{A}_{{{\rm{tot}}}}^{{{{\rm{Ca}}}}^{2+}}}\times{{{\rm{Syn}}}}_{{{\rm{EQ}}}}\right)\right)\times{A}_{\rm{Loc}}^{{\rm{Ca}}^{2+}}\left(t\right)\right\}\times {\rm{K}}\times {{\rm{d}}t},$$

(15)

where \({{{\rm{Avg}}}}_{{A}_{{\mathrm{tot}}}^{{{\mathrm{Ca}}}^{2+}}}\) is the average of the last three values of the total calcium activity; SynEQ is the synaptic equilibrium, defined as a linear function of the total excitatory synaptic weight with a dual slope crossing zero at 10 (decay = 0.04 if ∑wexc < SynEQ; decay = 0.12 if ∑wexc > SynEQ); and K is a constant gain factor defined by the sigmoid function in equation (16), with a gain step on the slope of 0.005.

$$S\left(t\right)=\frac{1}{1+{{\rm{e}}}^{-t}}$$

(16)

To avoid instabilities in synaptic learning, it was necessary to average the calcium activity; thus, the learning threshold was obtained by multiplying the average of the total calcium activity \({{\rm{Avg}}}_{{A}_{{\mathrm{tot}}}^{{{\mathrm{Ca}}}^{2+}}}\) by the synaptic equilibrium SynEQ.

The inhibitory synaptic learning is instead based on the firing rate of the CN calcium channels: winh, initially set to 0.125, is responsible for regulating the total activity of the calcium channels in the CN. Thus, low activity of the calcium channels was counteracted with a decrease in winh, and vice versa. For the model implemented in this work, winh was used to maintain the firing frequency of the calcium channels at a predefined set point of 20 Hz, and the update of winh was given by a dual-slope function zeroing at this set point. The synaptic learning protocol is summarized as pseudo-code in the Supplementary Text.
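A loose sketch of one learning step is given below, under several stated simplifications: the learning threshold is taken as the running average of the total calcium activity times the dual-slope decay times the equilibrium point, and the inhibitory update is reduced to a proportional set-point controller with a made-up rate `eta_inh`; the exact dual-slope functions are in the Supplementary Text pseudo-code.

```python
import numpy as np

def learning_step(w_exc, w_inh, A_tot, A_loc, ca_rate_hz,
                  dt=1e-3, k=0.005, syn_eq=10.0, target_hz=20.0,
                  eta_inh=1e-3):
    """One stimulus presentation of the plasticity rule (loose sketch).

    A_tot: (T,) total CN calcium activity; A_loc: (T, n_syn) local
    calcium activity per synapse, as in equations (13)-(15).
    """
    avg_tot = A_tot[-3:].mean()                 # average of last 3 values
    decay = 0.04 if w_exc.sum() < syn_eq else 0.12
    threshold = avg_tot * decay * syn_eq        # simplified learning threshold
    dw = k * ((A_tot[:, None] - threshold) * A_loc).sum(axis=0) * dt
    w_exc = np.clip(w_exc + dw, 0.0, 1.0)       # potentiate or depress
    w_inh = max(w_inh + eta_inh * (ca_rate_hz - target_hz), 0.0)
    return w_exc, w_inh
```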

The synaptic learning process was evaluated through a fourfold validation, such that for each training fold, 1,385 indentation samples were shuffled and presented to the model, thereby totalling 1,385 training cycles. In this step, only 1 s of data was used, including the beginning of the second indentation step (that is, the highest force level for each indentation). These data were the spikes (binary vector) of the first neuronal layer of mechanoreceptors (PAs). All the synaptic learning processes were separately performed for each of the 1,036 CNs.

Data post-processing and SNN performance evaluation

After the synaptic learning, the spike responses of 1,036 CNs were calculated for the indentations of each validation fold and processed to derive the stimulus intensity and location information.

Decoding of stimulus intensity through spiking activity of SNN second-order neurons

To demonstrate the increase in CN spiking activity when progressively higher loads were applied, the cumulative spike numbers inside the receptive field of the stimulated CN were computed for different loading conditions (Extended Data Fig. 2). To this end, we evaluated the CN spikes in a 1-s window at the beginning of each of the two plateaus of the indentation force profile. In addition, the spatial activations of the CNs for the different force levels were explored. In this regard, the number of active CNs was calculated for the two load conditions and 1,655 of the applied indentations (Fig. 4d). A Mann–Whitney U-test with a 0.05 significance level was performed to investigate whether the spatially distributed responses of the CNs could encode information on different intensities of the applied tactile stimuli.
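A short sketch of this comparison, assuming the per-indentation counts of active CNs have already been extracted; the Poisson draws are placeholders for the real counts from the spike trains.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Number of active CNs per indentation at the two force plateaus.
rng = np.random.default_rng(2)
active_low = rng.poisson(30, size=1655)    # placeholder, lower plateau
active_high = rng.poisson(45, size=1655)   # placeholder, higher plateau
stat, p_value = mannwhitneyu(active_low, active_high, alternative="two-sided")
encodes_intensity = p_value < 0.05   # spatial recruitment tracks load level
```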

SNN performance and temporal resolution for localization of stimuli on the e-skin

Regarding the prediction of the contact position, the spikes of the CNs were first summed over the indentation duration window. The spike count of each neuron (Nspki) was then used for the weighted location estimation, as described in equation (17).

$${{{\rm{WL}}}}_{x,y}=\frac{{\sum}_{i=1}^{1,036}{{N}_{i}{{\rm{Loc}}}}_{x,y}\times{{{\rm{Nspk}}}}_{i}}{\mathop{\sum}\nolimits_{i=1}^{1,036}{{{\rm{Nspk}}}}_{i}},$$

(17)

where WLx,y is the location estimation in the skin xy plane (considering the two-dimensional projection of the e-skin surface) weighted by the neural activation of the CNs (weighted position); Nspki is the number of spikes of a neuron in a time window; i is the CN ID (from 1 to 1,036); and NiLocx,y is the position of the centroid of a second-order neuron in the xy plane of the modelled functionally organized cuneate nucleus. We estimated the weighted location on a 1-s sliding window with an overlap of 100 ms. To evaluate the location prediction error, we calculated the Euclidean distance between the real position of the indentation on the e-skin and the estimated weighted location (Fig. 5).
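Equation (17) is a spike-count-weighted centroid, as in the following minimal sketch (function names are ours):

```python
import numpy as np

def weighted_location(n_spk, centroids):
    """Equation (17): spike-count-weighted centroid over the 1,036 CNs.

    n_spk: (1036,) spikes per CN in the window; centroids: (1036, 2)
    positions on the two-dimensional projection of the e-skin surface.
    """
    return (centroids * n_spk[:, None]).sum(axis=0) / n_spk.sum()

def localization_error_mm(estimated_xy, true_xy):
    """Euclidean distance between the estimate and the real contact site."""
    return float(np.linalg.norm(np.asarray(estimated_xy) - np.asarray(true_xy)))
```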

In addition, we compared the localization performance of the SNN with that achieved through the weighted average of the 21 FBG sensor wavelengths (FBG WLx,y), calculated as follows:

$${{\rm{FBG}}}{{{\rm{WL}}}}_{x,y}=\frac{{\sum}_{i=1}^{21}|{\Delta \lambda }_{i}|\times{({x}_{{{\rm{FBG}}}},{y}_{{{\rm{FBG}}}})}_{i}}{\mathop{\sum}\nolimits_{i=1}^{21}|{\Delta \lambda }_{i}|},$$

where Δλi and (xFBG, yFBG)i are the wavelength variation and two-dimensional coordinates of the ith FBG sensor, respectively. The FBG-based weighted average was computed for each timestamp with non-zero force values. Then, the average location and Euclidean distance were calculated to obtain a single estimation and a single error value per indentation of the test sets of the four folds (Extended Data Fig. 3). The performance of both the neuronal network and the FBG weighted average in localizing the test stimuli was also assessed considering ROIs of different sizes. Starting from the centre of the skin, circular areas of increasing radius (from 20 mm to 55 mm) delimited the portions of the e-skin surface used to compute the contact estimation error, considering only the test indentations that fell inside the circular region. Median and IQR error values were calculated for each region size, grouping the results of the four cross-validation folds (Fig. 6c and Supplementary Table 4).
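The baseline estimator mirrors equation (17) with |Δλ| values as the weights, as in this one-timestamp sketch:

```python
import numpy as np

def fbg_weighted_location(delta_lambda, fbg_xy):
    """Baseline estimate: |wavelength-shift|-weighted mean sensor position.

    delta_lambda: (21,) shifts at one timestamp; fbg_xy: (21, 2) positions.
    """
    w = np.abs(delta_lambda)
    return (fbg_xy * w[:, None]).sum(axis=0) / w.sum()
```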

We then evaluated the temporal resolution of our system to localize the applied indentations. Specifically, considering that the sensors of the e-skin tracked the dynamics of the applied loading and unloading transients (Fig. 2a–c), we estimated the minimum amount of time needed by the trained SNN to reach its best localization performance. This approach aimed at investigating the stimulus conduction time from the simulated forearm mechanoreceptors, that is, the FBG sensors, to the modelled CNs, designated to localize contacts onto the e-skin. We computed the model localization error (median and IQR) on the second step of the force profile on time windows of different lengths, ranging from 10 ms to 2 s, the latter being the duration of the complete second level of the force profile. The two extreme conditions were statistically compared with a level of significance (α) of 0.05.

Weber two-point discrimination test and analysis

To apply two-point stimuli onto the e-skin for the two-point discrimination test (Supplementary Video 3), a set of custom probes was designed (Fusion 360, Autodesk) and three-dimensionally printed in polylactic acid (Ultimaker S5). The probes featured one or two hemispherical tips (radius, 1.5 mm) with distances ranging from 20 mm to 60 mm, with a step of 5 mm. Each probe was attached to a handle stick by means of a conical spring fixed to a sliding structure, which provided the tool with a translational perpendicular degree of freedom. This structure presented a blocking mechanism so that the tool could be pushed until bottoming out, allowing the applied force to be controlled.

Ten experimenters participated in the trials, which consisted of manually indenting the e-skin with each of the above-mentioned probes in seven random locations. The single-probe trials were repeated before each of those with two-tip probes. At each indentation, the experimenter had to gently land on the e-skin with the probe, stabilize the posture, push the tool until the descent was blocked and hold it for 3 s. The protocol instructions and the timing of the different indentation phases were displayed via a custom graphical user interface developed in LabVIEW 2019 (National Instruments). The same software routine allowed us to store the FBG sensor signals, collected by means of an optical interrogator (FBG Scan 904, FBGS Technologies), the corresponding timestamps and the trial information into text files for further analysis.

The recordings of the FBG wavelengths were first segmented to focus on the period when the probe was pressed against the e-skin. A Δλ threshold of 0.003 nm was considered to identify the onset of contact, that is, when at least one sensor signal exceeded it for 200 ms; this corresponded to the beginning of the landing phase. Then, to identify the onset of indentation, a second threshold, that is, twice the average absolute wavelength variation of all the sensors during the 1 s after the e-skin was touched, was considered. The start of the stimulus corresponded to the moment when the mean value of all the FBG Δλ values exceeded this threshold for at least 500 ms. The resulting indentation portions were zero padded with 500-ms periods to simulate the transition from non-contact to contact states. Given the variability of the indentations performed manually by the experimenters, some trials were discarded (Supplementary Table 5).

The FBG signals of the selected indentations were fed into the pretrained neuronal model to obtain the spiking response of the 1,036 CNs. Then, the neuronal response latency was determined within a 200-ms temporal window starting from the first CN spike. These latencies were organized topographically (Fig. 6d,e), and a 3 × 3 Gaussian spatial filter (σ = 0.5) was applied. The lowest minimum in the resulting latency map was identified as the first contact point. Then, the occurrence of a second minimum was checked outside a circular region corresponding to the size of the CN receptive field44. If two distinct minima were found following this criterion, it was determined that two contact points were applied to the e-skin. The two-point detection rates were then estimated for all the trials, and a psychometric curve was fitted to them. The resulting piecewise logistic function had the following expression:

$$F\left(x\right)=\frac{a}{1+{{\rm{e}}}^{-b\left(x-c\right)}}+d.$$

(18)

Finally, the 0.75 probability, that is, when F(x) was 0.75, was considered to determine the e-skin two-point discrimination threshold (Fig. 6f).
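A minimal sketch of the latency-map analysis described above, assuming the 1,036 CN latencies have already been arranged on the 37 × 28 somatotopic grid; the pixel-unit receptive-field radius and the handling of silent neurons (NaN entries) are assumptions of this sketch.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, minimum_filter

def detect_contacts(latency_map, rf_radius_px, sigma=0.5):
    """One or two contact points from the topographic latency map.

    latency_map: (37, 28) first-spike latencies, NaN where a CN is silent.
    """
    filled = np.nan_to_num(latency_map, nan=np.nanmax(latency_map) + 1.0)
    m = gaussian_filter(filled, sigma=sigma, truncate=1.0)  # ~3 x 3 support
    p1 = np.unravel_index(np.argmin(m), m.shape)            # first contact
    yy, xx = np.indices(m.shape)
    outside = np.hypot(yy - p1[0], xx - p1[1]) > rf_radius_px
    if not outside.any():
        return [p1]
    masked = np.where(outside, m, np.inf)                   # exclude the RF
    p2 = np.unravel_index(np.argmin(masked), masked.shape)
    if m[p2] == minimum_filter(m, size=3)[p2]:              # local minimum?
        return [p1, p2]                                     # two contacts
    return [p1]
```

And a sketch of the psychometric fit of equation (18) with the 0.75-probability threshold extraction; the detection rates below are illustrative placeholders, not the measured data of Fig. 6f.

```python
import numpy as np
from scipy.optimize import brentq, curve_fit

def logistic(x, a, b, c, d):
    """Equation (18): F(x) = a / (1 + exp(-b (x - c))) + d."""
    return a / (1.0 + np.exp(-b * (x - c))) + d

probe_mm = np.arange(20, 65, 5)                         # tip separations
detect_rate = np.array([0.10, 0.15, 0.30, 0.50, 0.70,   # illustrative
                        0.85, 0.90, 0.95, 0.97])        # detection rates
popt, _ = curve_fit(logistic, probe_mm, detect_rate, p0=[1.0, 0.2, 40.0, 0.0])

# Two-point discrimination threshold: the separation at which F(x) = 0.75.
threshold_mm = brentq(lambda x: logistic(x, *popt) - 0.75, 20.0, 60.0)
```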

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.



Artificial Intelligence at Bayer – Emerj Artificial Intelligence Research


Bayer is a global life sciences company operating across Pharmaceuticals, Consumer Health, and Crop Science. In fiscal 2024, the group reported €46.6 billion in sales and 94,081 employees, a scale that makes internal AI deployments consequential for workflow change and ROI.

The company invests heavily in research, with more than €6 billion allocated to R&D in 2024, and its leadership frames AI as an enabler for both sustainable agriculture and patient-centric medicine. Bayer’s own materials highlight AI’s role in planning and analyzing clinical trials as well as accelerating crop protection discovery pipelines.

This article examines two mature, internally used applications that convey the central role AI plays in Bayer’s core business goals:

  • Herbicide discovery in crop science: Applying AI to narrow down molecular candidates and identify new modes of action.
  • Clinical trial analytics in pharmaceuticals: Ingesting heterogeneous trial and device data to accelerate compliant analysis.

AI-Assisted Herbicide Discovery

Weed resistance is a mounting global challenge. Farmers in the US and Brazil are facing species resistant to multiple herbicide classes, driving up costs and threatening crop yields. Traditional herbicide discovery is slow — often 12 to 15 years from concept to market — and expensive, with high attrition during early screening.

Bayer’s Crop Science division has turned to AI to help shorten these timelines. Independent reporting notes Bayer’s pipeline includes Icafolin, its first new herbicide mode of action in decades, expected to launch in Brazil in 2028, with AI used upstream to accelerate the discovery of new modes of action.

Reuters reports that Bayer’s approach uses AI to match weed protein structures with candidate molecules, compressing the early discovery funnel by triaging millions of possibilities against pre-determined criteria. Bayer’s CropKey overview describes a profile-driven approach, where candidate molecules are designed to meet safety, efficacy, and environmental requirements from the start.

The company claims that CropKey has already identified more than 30 potential molecular targets and validated over 10 as entirely new modes of action. These figures, while promising, remain claims until independent verification.

For Bayer’s discovery scientists, AI-guided triage changes workflows by:

  • Reducing early-stage wet-lab cycles by focusing on higher-probability matches between proteins and molecules.
  • Integrating safety and environmental criteria into the digital screen, filtering out compounds unlikely to meet regulatory thresholds.
  • Advancing promising molecules sooner, enabling earlier testing and potentially compressing development timelines from 15 years to 10.

Coverage by both Reuters and the Wall Street Journal notes this strategy is expected to reduce attrition and accelerate discovery-to-commercialization timelines.

The CropKey program has been covered by multiple independent outlets, a signal of maturity beyond a single press release. Reuters reports Bayer’s assertion that AI has tripled the number of new modes of action identified in early research compared to a decade ago.

The upcoming Icafolin herbicide, expected for commercial release in 2028, demonstrates that CropKey outputs are making their way into the regulatory pipeline. The presence of both media scrutiny and near-term launch candidates suggests CropKey is among Bayer’s most advanced AI deployments.

Video explaining Bayer’s CropKey process in crop protection discovery. (Source: Bayer)

By focusing AI on high-ROI bottlenecks in research and development, Bayer demonstrates how machine learning can trim low-value screening cycles, advancing only the most promising candidates into experimental trials. At the same time, acceleration figures reported by the company should be treated as claims until they are corroborated across multiple seasons, geographies, and independent trials.

Clinical Trial Analytics Platform (ALYCE)

Pharmaceutical development increasingly relies on complex data streams: electronic health records (EHR), site-based case report forms, patient-reported outcomes, and telemetry from wearables in decentralized trials. Managing this data volume and variety strains traditional data warehouses and slows regulatory reporting.

Bayer developed ALYCE (Advanced Analytics Platform for the Clinical Data Environment) to handle this complexity. In a PHUSE conference presentation, Bayer engineers describe the platform as a way to ingest diverse data, ensure governance, and deliver analytics more quickly while maintaining compliance.

The presentation describes ALYCE’s architecture as using a layered “Bronze/Silver/Gold” data lake approach. An example trial payload included approximately 300,000 files (1.6 TB) for 80 patients, requiring timezone harmonization, device ID mapping, and error handling before data could be standardized to SDTM (Study Data Tabulation Model) formats. Automated pipelines provide lineage, quarantine checks, and notifications. These technical details were presented publicly to peers, reinforcing their credibility beyond internal marketing.

For statisticians and clinical programmers, ALYCE claims to:

  • Standardize ingestion across structured (CRFs), semi-structured (EHR extracts), and unstructured (device telemetry) sources.
  • Automate quality checks through pipelines that reduce manual intervention and free staff up to focus on analysis.
  • Enable earlier insights by preparing analysis-ready datasets faster, shortening the lag between data collection and review.

These objectives are consistent with Bayer’s broader statement that AI is being used to plan and analyze clinical trials safely and efficiently.

PHUSE is a respected industry forum where sponsors share methods with peers, and Bayer’s willingness to disclose technical details indicates ALYCE is in production. While Bayer has not released precise cycle-time savings, its emphasis on elastic storage, regulatory readiness, and speed suggests measurable efficiency gains.

Given the specificity of the presentation — real-world payloads, architecture diagrams, and validation processes — ALYCE appears to be a mature platform actively supporting Bayer’s clinical trial programs.

Screenshot from Bayer’s PHUSE presentation illustrating ALYCE’s automated ELTL pipeline.
(Source: PHUSE)

Bayer’s commitment to ALYCE reflects its broader effort to modernize and scale clinical development. By consolidating varied data streams into a single, automated environment, the company positions itself to shorten study timelines, reduce operational overhead, and accelerate the movement of promising therapies from discovery to patients. This infrastructure also prepares Bayer to expand AI-driven analytics across additional therapeutic areas, supporting long-term competitiveness in a highly regulated industry.

While Bayer has not published specific cycle-time reductions or quantified cost savings tied directly to ALYCE, the company’s willingness to present detailed payload volumes and pipeline architecture at PHUSE indicates that the platform is actively deployed and has undergone peer-level scrutiny. Based on those disclosures and parallels with other pharma AI implementations, reasonable expectations include faster data review cycles, earlier anomaly detection, and improved compliance readiness. These outcomes—though not yet publicly validated—suggest ALYCE is reshaping Bayer’s trial workflows in ways that could yield significant long-term returns.



The STARD-AI reporting guideline for diagnostic accuracy studies using artificial intelligence


     



Tracking AI’s role in the US and global economy


Travel planning in Hawaii, scientific research in Massachusetts, and building web applications in India. On the face of it, these three activities have very little in common. But it turns out that they’re the particular uses of Claude that are some of the most overrepresented in each of these places.

That doesn’t mean these are the most popular tasks: software engineering is still by far in the lead in almost every state and country in the world. Instead, it means that people in Massachusetts have been more likely to ask Claude for help with scientific research than people elsewhere – or, for instance, that Claude users in Brazil appear to be particularly enthusiastic about languages: they use Claude for translation and language-learning about six times more than the global average.

These are statistics we found in our third Anthropic Economic Index report. In this latest installment, we’ve expanded our efforts to document the early patterns of AI adoption that are beginning to reshape work and the economy. We measure how Claude is being used differently…

  • …within the US: we provide the first-ever detailed assessment of how AI use differs between US states. We find that the composition of states’ economies informs which states use Claude the most per capita – and, surprisingly, that the very highest-use states aren’t the ones where coding dominates.
  • …across different countries: our new analysis finds that countries’ use of Claude is strongly correlated with income, and that people in lower-use countries use Claude to automate work more frequently than those in higher-use ones.
  • …over time: we compare our latest data with December 2024-January 2025 and February–March 2025. We find that the proportion of ‘directively’ automated tasks increased sharply from 27% to 39%, suggesting a rapid increase in AI’s responsibility (and in users’ trust).
  • …and by business users: we now include anonymized data from Anthropic’s first-party API customers (in addition to users of Claude.ai), allowing us to analyze businesses’ interactions for the first time. We find that API users are significantly more likely to automate tasks with Claude than consumers are, which suggests that major labor market implications could be on the horizon.

We summarize the report below. In addition, we’ve designed an interactive website where you can explore our data yourself. For the first time, you can search for trends and results in Claude.ai use across every US state and all occupations we track, to see how AI is used where you live or by people in similar jobs. Finally, if you’d like to build on our analysis, we’ve made our dataset openly available, alongside the data from our previous Economic Index reports.

Geography

We’ve expanded the Anthropic Economic Index to include geographic data. Below we cover what we’ve learned about how Claude is used across countries and US states.

Across countries

The US uses Claude far more than any other nation. India is in second place, followed by Brazil, Japan, and South Korea, each with similar shares.

Leading countries in terms of global Claude.ai use share.

However, there is huge variation in population size across these countries. To account for this, we adjust each country’s share of Claude.ai use by its share of the world’s working population. This gives us our Anthropic AI Usage Index, or AUI. Countries with an AUI greater than 1 use Claude more often than we’d expect based on their working-age population alone, and vice-versa.

The twenty countries that score highest on our Anthropic AI Usage Index: Israel, Singapore, Australia, New Zealand, and South Korea are the top five.

From the AUI data, we can see that some small, technologically advanced countries (like Israel and Singapore) lead in Claude adoption relative to their working-age populations. This may be explained to a large degree by income: we found a strong correlation between GDP per capita and the Anthropic AI Usage Index (a 1% higher GDP per capita was associated with a 0.7% higher AUI). This makes sense: the countries that use Claude most often generally also have robust internet connectivity, as well as economies oriented around knowledge work rather than manufacturing. But it does raise a question of economic divergence: previous general-purpose technologies, like electrification or the combustion engine, led to both vast economic growth and a great divergence in living standards around the world. If the effects of AI prove to be largest in richer countries, this general-purpose technology might have similar economic implications.

Claude use per capita is positively correlated with income per capita across countries. (Axes are on a log scale.)

Patterns within the United States

The link between per capita GDP and per capita use of Claude also holds when comparing between US states. In fact, use rises more quickly with income here than across countries: a 1% higher per capita GDP inside the US is associated with a 1.8% higher population-adjusted use of Claude. That said, income actually has less explanatory power within the US than across countries, as there’s much higher variance within the overall trend. That is: other factors, beyond income, must explain more of the variation in population-adjusted use.

What else could explain this adoption gap? Our best guess is that it’s differences in the composition of states’ economies. The highest AUI in the US is the District of Columbia (3.82), where the most disproportionately frequent uses of Claude are editing documents and searching for information, among other tasks associated with knowledge work in DC. Similarly, coding-related tasks are especially common in California (the state with the third-highest AUI overall), and finance-related tasks are especially common in New York (which comes in fourth).1 Even among states with lower population-adjusted use of Claude, like Hawaii, use is closely correlated to the structure of the economy: Hawaiians request Claude’s assistance for tourism-related tasks at twice the rate of the rest of America. Our interactive website contains plenty of other statistics like these.

US states’ Claude adoption relative to their working-age populations, with Utah and DC in the lead.

Trends in Claude use

We’ve been tracking how people use Claude since December 2024. We use a privacy-preserving classification method that categorizes anonymized conversation transcripts into task groups defined by O*NET, a US government database that classifies jobs and the tasks associated with them.2 By doing this, we can analyze both how the tasks that people give Claude have changed since last year, and how the ways people choose to collaborate—how much oversight and input into Claude’s work they choose to have—have changed too.

Tasks

Since December 2024, computer and mathematical uses of Claude have predominated among our categories, representing around 37-40% of conversations.

But a lot has changed. Over the past nine months, we’ve seen consistent growth in “knowledge-intensive” fields. For example, educational instruction tasks have risen by more than 40 percent (from 9% to 13% of all conversations), and the share of tasks associated with the physical and social sciences has increased by a third (from 6% to 8%). In the meantime, the relative frequency of traditional business tasks has declined: management-related tasks have fallen from 5% of all conversations to 3%, and the share of tasks related to business and financial operations has halved, from 6% to 3%. (In absolute terms, of course, the number of conversations in each category has still risen significantly.)

Changes in Claude use over time, showing increases in use for scientific and educational tasks, and decreases for arts, business, and architecture uses.

The overall trend is noisy, but generally, as the GDP per capita of a country increases, the use of Claude shifts away from tasks in the Computer and Mathematical occupation group, and towards a diverse range of other activities, like education, art and design; office and administrative support; and the physical and social sciences. Compare the trend line in the first graph below to the remaining three:

    Occupation group shares vs. the Anthropic AI usage index, for computer and mathematical, educational instruction, arts, and office and administrative tasks: as we move from lower- to higher-adoption countries, Claude use appears to shift to a more diverse mix of tasks, although the overall pattern is noisy.

    All that said, software development remains the most common use in every single country we track. The picture looks similar in the US, although our sample size limits our ability to explore in more detail how the task mix varies with adoption rates.

    Patterns of interaction

    As we’ve discussed previously, we generally distinguish between tasks that involve automation (in which AI directly produces work with minimal user input) and augmentation (in which the user and AI collaborate to get things done). We further break automation down into directive interactions, which involve the minimum of human input, and feedback loops, in which humans relay real-world outcomes back to the model. We also break augmentation down into learning (asking for information or explanations), task iteration (working with Claude collaboratively), and validation (asking for feedback).
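
    For reference, this taxonomy can be written down as a simple lookup structure; the sketch below is just a restatement of the categories above.

```python
# The interaction taxonomy described above, as a simple lookup table.
INTERACTION_MODES = {
    "automation": {
        "directive": "AI produces the work with minimal human input",
        "feedback loop": "humans relay real-world outcomes back to the model",
    },
    "augmentation": {
        "learning": "asking for information or explanations",
        "task iteration": "working with Claude collaboratively",
        "validation": "asking for feedback",
    },
}

def top_level(mode: str) -> str:
    """Map a fine-grained interaction mode to automation or augmentation."""
    for parent, children in INTERACTION_MODES.items():
        if mode in children:
            return parent
    raise ValueError(f"unknown mode: {mode}")

print(top_level("task iteration"))  # -> augmentation
```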

    Since December 2024, we’ve found that the share of directive conversations has risen sharply, from 27% to 39%. The shares of other interaction patterns (particularly learning, task iteration, and feedback loops) have fallen slightly as a result. This means that for the first time, automation (49.1%) has become more common than augmentation (47%) overall. One potential explanation for this is that AI is rapidly winning users’ confidence, and becoming increasingly responsible for completing sophisticated work.

    This could be the result of improved model capabilities. (In December 2024, when we first collected data for the Economic Index, the latest version of Claude was Sonnet 3.6.) As models get better at anticipating what users want and at producing high-quality work, users are likely more willing to trust the model’s outputs at the first attempt.

    Automation appears to be increasing over time, overtaking augmentation between our first and third Index reports.

    Perhaps surprisingly, in countries with higher Claude use per capita, Claude’s uses tend towards augmentation, whereas people in lower-use countries are much more likely to prefer automation. Controlling for the mix of tasks in question, a 1% increase in population-adjusted use of Claude is correlated with a roughly 3% reduction in automation; the chart below shows the same pattern, with higher population-adjusted use associated with a shift away from automation, not towards it.
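
    As a back-of-envelope reading of that figure (treating the reported 3% figure as an elasticity, which is our interpretation rather than a precise claim from the data), larger differences in use would compound as follows:

```python
# Back-of-envelope: if a 1% rise in population-adjusted use goes with a
# roughly 3% fall in the automation share, larger changes compound as
# (1 + growth) ** -3. Illustrative arithmetic only.
elasticity = -3.0
for growth in (0.01, 0.05, 0.10):
    factor = (1 + growth) ** elasticity
    print(f"{growth:.0%} more use -> automation share scaled by {factor:.3f}")
```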

    We’re not yet sure why this is. It could be because early adopters in each country feel more comfortable allowing Claude to automate tasks, or it could be down to other cultural and economic factors.

    Countries with higher Claude use per capita tend to have a lower share of automated tasks, using Claude in a more collaborative manner.

    Businesses

    Using the same privacy-preserving methodology we use for conversations on Claude.ai, we have begun sampling interactions from a subset of Anthropic’s first-party API customers, in a first-of-its-kind analysis.3 API customers, who tend to be businesses and developers, use Claude very differently from those who access it through Claude.ai: they pay per token rather than a fixed monthly subscription, and can make requests through their own programs.

    These customers’ use of Claude is especially concentrated in coding and administrative tasks: 44% of the API traffic in our sample maps to computer or mathematical tasks, compared to 36% of tasks on Claude.ai. (As it happens, around 5% of all API traffic focuses specifically on developing and evaluating AI systems.) This is offset by a smaller proportion of conversations related to educational occupations (4% in the API relative to 12% on Claude.ai), and arts and entertainment (5% relative to 8%).

    We also find that our API customers use Claude for task automation much more often than Claude.ai users. 77% of our API conversations show automation patterns, of which the vast majority are directive, while just 12% show augmentation. On Claude.ai, the split is almost even. This could have significant economic implications: in the past, the automation of tasks has been associated with large economic transitions, as well as major productivity gains.

    Augmentation and automation with Claude on Claude.ai vs. the API: augmentative uses make up a much higher share on Claude.ai, and vice versa for automation.

    Finally, given how API use is paid for, we can also explore whether differences in the cost of tasks (caused by differences in the number of tokens they consume) affect which tasks businesses choose to “buy”. Here, we find a positive correlation between price and use: higher-cost task categories tend to see more frequent use, as in the graph below. This suggests to us that fundamental model capabilities, and the economic value the models generate, matter more to businesses than the cost of completing the task itself.
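
    A minimal sketch of this comparison (with invented numbers, not our dataset) is simply a correlation between each category’s average cost and its usage share:

```python
# Sketch: correlate task categories' average API cost with their usage
# share. All numbers are invented for illustration.
import numpy as np

avg_cost_usd = np.array([0.8, 1.2, 2.5, 3.1, 4.0])      # mean cost per task
usage_share = np.array([0.05, 0.08, 0.15, 0.25, 0.44])  # share of traffic

r = np.corrcoef(avg_cost_usd, usage_share)[0, 1]
print(f"cost-use correlation: {r:.2f}")  # positive, as in the report
```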

    Occupational categories’ usage share vs. average API cost: cost per task plotted against the task category’s share of total conversations.

    Conclusion

    The Economic Index is designed to provide an early, empirical assessment of how AI is affecting people’s jobs and the economy. What have we found so far?

    Across each of the measures we cover in this report, the adoption of AI appears remarkably uneven. People in higher-income countries are more likely to use Claude, more likely to seek collaboration rather than automation, and more likely to pursue a breadth of uses beyond coding. Within the US, AI use seems to be strongly influenced by the dominant industries in local economies, from technology to tourism. And businesses are more likely to entrust Claude with agency and autonomy than consumers are.

    Beyond the fact of unevenness, it’s especially notable to us that directive automation has become much more common in conversations on Claude.ai over the past nine months. The nature of people’s use of Claude is evidently still being defined: we’re still collectively deciding how much confidence we have in AI tools, and how much responsibility we should give them. So far, though, it looks like we’re becoming increasingly comfortable with AI, and willing to let it work on our behalf. We’re looking forward to revisiting this analysis over time, to see where—or, indeed, if—users’ choices settle as AI models improve.

    If you’d like to explore our data yourself, you can do so on our dedicated Anthropic Economic Index website, which contains interactive visualizations of our country, state, and occupational data. We’ll update this website with more data in future, so you can continue to track the evolution of AI’s effects on jobs and the economy in the ways that interest you.

    Our full report is available here. We hope it helps policymakers, economists and others more effectively prepare for the economic opportunities and risks that AI presents.

    Open data

    As with our past reports, we’re releasing a comprehensive dataset alongside this one, including geographic data, task-level use patterns, automation/augmentation breakdowns by task, and an overview of API use. Data are available for download at the Anthropic Economic Index website.

    Work with us

    If you’re interested in working at Anthropic to help build the systems powering this research, we encourage you to apply for our Research Engineer role.


