AI Research
Multimodal AI to forecast arrhythmic death in hypertrophic cardiomyopathy
This study complies with all relevant ethical regulations and has been approved by the institutional review boards of Johns Hopkins Medicine and Atrium Health.
Patient population and datasets
JHH-HCM registry (internal)
A retrospective analysis was performed on patient data from the JHH-HCM registry spanning 2005–2015. Enrollment in the registry was based on the first visit to the Johns Hopkins HCM Center of Excellence, where patients meeting the diagnostic criteria for HCM were included. These criteria focused on the presence of unexplained left ventricular hypertrophy (maximal wall thickness ≥15 mm) without evidence of uncontrolled hypertension, valvular heart disease or HCM phenocopies, such as amyloidosis and storage disorders. Patients were followed for a mean duration of 2.86 years (median 1.92 years; 25th–75th percentile = 0.94–4.28 years). The current study focused on a subset of patients with HCM who were enrolled between 2005 and 2015 and had adequate LGE-CMR images, totaling 553 patients (Extended Data Fig. 3).
SHVI-HCM registry (external)
A retrospective analysis was performed on patient data from the Atrium Health SHVI-HCM registry spanning 2015–2023. This registry includes patients who presented to the SHVI HCM Center of Excellence with a preexisting HCM diagnosis or were subsequently diagnosed based on cardiac imaging, personal and family history, and/or genetic testing in accordance with current guideline definitions. Patients within this longitudinal database are still being followed, as the endpoint for registry inclusion is the transfer of care to an outside facility or death. For the purposes of this study, the SHVI-HCM registry was interrogated for patients who had undergone CMR imaging and ICD placement, and enrollment was delineated by the patient’s first visit with the SHVI.
Data collection and primary endpoint
Clinical data, including demographics, symptoms, comorbidities, medical history and stress test results, were ascertained during the initial clinic visit and at each follow-up visit. Rest and stress echocardiography and CMR imaging were performed as routine components of clinical evaluation for all patients referred to the HCM centers. For the internal JHH-HCM registry, echocardiography and CMR imaging were conducted before the first clinic visit, with typically 3 months between the imaging assessment and the first clinic visit. For the SHVI-HCM registry, patients typically underwent echocardiography and CMR imaging after the first clinic visit. The full list of covariates used in MAARS can be found in Extended Data Tables 1 and 2. The data were extracted through a manual search of patients’ EHRs. EchoPAC software (GE Healthcare) was used to quantitatively analyze the echocardiogram and compute related covariates. Of note, the internal and external cohorts have distinct patient populations with different demographic characteristics and different levels of risk factors (Table 1).
The CMR images in the JHH-HCM registry were acquired using 1.5-T magnetic resonance imaging (MRI) devices (Aera, Siemens; Avanto, Siemens; Signa, GE; Intera, Philips). In the SHVI-HCM registry, most CMR images were acquired using 1.5-T MRI devices (Aera, Siemens; Sola, Siemens), and a small proportion were acquired using 3-T MRI devices (Vida, Siemens). LGE images were obtained 10–15 min after intravenous administration of 0.2 mmol kg⁻¹ gadopentetate dimeglumine. An inversion scout sequence was used to select the optimal inversion time for nulling the normal myocardial signal. All images used were 2D parallel short-axis left ventricular stacks. Typical spatial resolutions were in the range of 1.4–2.9 × 1.4–2.9 × 7–8 mm, with 1.6- to 2-mm gaps.
The primary endpoint for the JHH-HCM registry was SCDA defined as sustained ventricular tachycardia (ventricular rate ≥130 beats per min lasting for ≥30 s) or ventricular fibrillation resulting in defibrillator shocks or antitachycardia pacing. Arrhythmic events were ascertained by reviewing electrocardiogram, Holter monitor and ICD interrogation data. The primary endpoint for the SHVI-HCM registry was SCDA defined as device shock, appropriate interventions or out-of-hospital cardiac arrest.
More details regarding patient inclusion, assessment, follow-up, echocardiography and CMR acquisition can be found in previous work23,51.
Data preparation
The multimodal inputs to MAARS included LGE-CMR scans and clinical covariates from EHRs and CIRs (Extended Data Tables 1 and 2). The labels were the outcomes (SCDA or non-SCDA). The preprocessing steps for LGE-CMR scans (described below) aimed to exclude nonrelevant background information and to standardize the CMR image volume for consistent analysis across all patients. We first obtained the left ventricular region of interest using our previously developed and validated deep learning algorithm52. Once each patient’s LGE-CMR 2D slices were processed using this algorithm, all pixels outside the left ventricle were zeroed out, and the pixels within the left ventricle were normalized by the median blood pool pixel intensity in each slice. Finally, the processed slices were stacked and interpolated to a regular 96 × 96 × 20 grid with voxel dimensions of 4.0 × 4.0 × 6.3 mm.
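For illustration, the sketch below outlines this per-slice normalization and resampling step, assuming the left ventricular and blood-pool masks have already been produced by the segmentation algorithm; the function and variable names are ours and not part of any released code.

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_lge_stack(slices, lv_masks, blood_pool_masks, out_shape=(96, 96, 20)):
    """Zero out pixels outside the left ventricle, normalize each slice by its
    median blood-pool intensity, then resample the stack to a fixed grid.
    All three inputs are lists of matching 2D arrays (one entry per slice)."""
    processed = []
    for img, lv, bp in zip(slices, lv_masks, blood_pool_masks):
        img = img.astype(np.float32)
        median_bp = np.median(img[bp > 0])             # per-slice blood-pool reference
        processed.append(img * (lv > 0) / max(median_bp, 1e-6))
    volume = np.stack(processed, axis=-1)              # height x width x n_slices
    factors = [t / s for t, s in zip(out_shape, volume.shape)]
    return zoom(volume, factors, order=1)              # linear resampling to 96 x 96 x 20
```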
The EHR and CIR data were structured as tabular data. The input features included in the analysis were ensured to have <40% missing values originally; missing values were imputed using multivariate imputation by chained equations (MICE)53. MICE is a fully conditional specification approach that models each input feature with missing values as a function of all other features iteratively. To address the feature mismatch issue between the internal and external cohorts, we used a MICE imputer based on the internal dataset to impute the missing values in both datasets. After the imputation, the EHR and CIR data were standardized using the z-score method, which involves subtracting the mean and dividing by the s.d. of each feature.
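As a rough illustration, the imputation-then-standardization step could look like the following sketch, which uses scikit-learn's IterativeImputer (a MICE-style fully conditional imputer) fitted on the internal cohort only; the imputer settings shown are placeholders, not the study's exact configuration.

```python
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import StandardScaler

def impute_and_standardize(X_internal, X_external):
    """Fit a MICE-style iterative imputer and a z-score scaler on the internal
    (JHH-HCM) feature matrix only, then apply both to the internal and the
    external (SHVI-HCM) matrices. Inputs are numeric arrays with NaNs marking
    missing values and identical column order."""
    imputer = IterativeImputer(max_iter=10, random_state=0)
    scaler = StandardScaler()
    X_int = scaler.fit_transform(imputer.fit_transform(X_internal))
    X_ext = scaler.transform(imputer.transform(X_external))
    return X_int, X_ext
```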
Transformer-based multimodal neural network
Modality-specific branch networks
Three unimodal branch networks are included in MAARS, each learning from a specific input modality: a 3D-ViT29 for LGE-CMR images, an FNN for EHR data and an FNN for CIR data.
In the LGE-CMR branch, the image vector embeddings $\zeta_{\mathrm{CMR}}^{0}$ are obtained by dividing the original 3D image $X$ into $n$ flattened nonoverlapping 3D image patches $x_i$ and applying the operation

$$\zeta_{\mathrm{CMR}}^{0}=\left[z_{\mathrm{cls}},\,Ex_{1},\,Ex_{2},\,\ldots,\,Ex_{n}\right]+p$$

(1)

where $E$ is a linear projection, $z_{\mathrm{cls}}$ is a classification token (CLS-token) and $p$ is a learnable positional embedding that retains positional information.
The image vector embeddings $\zeta_{\mathrm{CMR}}^{0}$ are then processed by a sequence of $L_{\mathrm{ViT}}$ transformer encoder blocks. Each transformer encoder block, $\zeta_{\mathrm{CMR}}^{l+1}=\mathrm{Transformer}\left(\zeta_{\mathrm{CMR}}^{l};\theta_{\mathrm{ViT}}^{l}\right)$, consists of two submodules: (1) a multihead self-attention (MSA) module and (2) a two-layer fully connected FNN.
$$\nu^{l}=\mathrm{MSA}\left(\mathrm{LN}\left(\zeta_{\mathrm{CMR}}^{l}\right)\right)+\zeta_{\mathrm{CMR}}^{l}$$

(2)

$$\zeta_{\mathrm{CMR}}^{l+1}=\mathrm{FNN}\left(\mathrm{LN}\left(\nu^{l}\right)\right)+\nu^{l}$$

(3)
where LN is the layer normalization operation. In the final transformer encoder block, the encoded CMR knowledge, ξCMR, is defined as
$$\zeta_{\mathrm{CMR}}^{L_{\mathrm{ViT}}}=\left[z_{\mathrm{cls}}^{L_{\mathrm{ViT}}},\,z_{1}^{L_{\mathrm{ViT}}},\,z_{2}^{L_{\mathrm{ViT}}},\,\ldots,\,z_{n}^{L_{\mathrm{ViT}}}\right]=\mathrm{Transformer}\left(\zeta_{\mathrm{CMR}}^{L_{\mathrm{ViT}}-1};\theta_{\mathrm{ViT}}^{L_{\mathrm{ViT}}-1}\right)$$

(4)

$$\xi_{\mathrm{CMR}}=\mathrm{LN}\left(z_{\mathrm{cls}}^{L_{\mathrm{ViT}}}\cdot W\right)$$

(5)
where W is a learnable matrix.
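A condensed PyTorch sketch of this branch is given below for orientation. It follows equations (1)–(5) and the dimensions reported later in the training details (8 blocks, 8 heads, d = 512, 32-dimensional output), but the patch size and other implementation details are assumptions rather than the authors' exact code.

```python
import torch
import torch.nn as nn

class ViTBranch(nn.Module):
    """Sketch of the LGE-CMR branch: 3D patch embedding, CLS token, learnable
    positional embedding, pre-norm transformer encoder blocks (equations (1)-(4))
    and projection of the final CLS token (equation (5))."""
    def __init__(self, d=512, depth=8, heads=8, out_dim=32,
                 vol=(96, 96, 20), patch=(16, 16, 4)):
        super().__init__()
        n_patches = (vol[0] // patch[0]) * (vol[1] // patch[1]) * (vol[2] // patch[2])
        self.patch = patch
        self.embed = nn.Linear(patch[0] * patch[1] * patch[2], d)   # E in equation (1)
        self.cls = nn.Parameter(torch.zeros(1, 1, d))               # z_cls
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, d))   # p
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=heads, dim_feedforward=4 * d,
                                           batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)  # equations (2)-(3)
        self.proj = nn.Linear(d, out_dim)                            # W in equation (5)
        self.norm = nn.LayerNorm(out_dim)

    def forward(self, x):  # x: (batch, 96, 96, 20) preprocessed LGE-CMR volume
        b = x.shape[0]
        p0, p1, p2 = self.patch
        patches = (x.unfold(1, p0, p0).unfold(2, p1, p1).unfold(3, p2, p2)
                     .reshape(b, -1, p0 * p1 * p2))                  # nonoverlapping 3D patches
        tokens = torch.cat([self.cls.expand(b, -1, -1), self.embed(patches)], dim=1)
        z = self.encoder(tokens + self.pos)
        return self.norm(self.proj(z[:, 0]))                         # xi_CMR from the CLS token
```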
In the EHR and CIR branches, the processed EHR and CIR data are converted to vectors ζEHR and ζCIR and fed into two FNNs, whose outputs ξEHR and ξCIR represent the encoded EHR and CIR knowledge.
$$\xi_{\mathrm{EHR}}=\mathrm{FNN}\left(\zeta_{\mathrm{EHR}};\theta_{\mathrm{EHR}}\right)$$

(6)

$$\xi_{\mathrm{CIR}}=\mathrm{FNN}\left(\zeta_{\mathrm{CIR}};\theta_{\mathrm{CIR}}\right)$$

(7)
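These branches are plain feed-forward stacks; a minimal sketch, using the layer counts and 16-dimensional latent size reported later in the training details, is shown below (the activation function and the input dimensions are assumptions).

```python
import torch.nn as nn

def fnn_branch(in_dim, n_hidden, hidden_dim=16, out_dim=16):
    """Feed-forward branch for tabular EHR/CIR features (equations (6)-(7)).
    The ReLU activation and the input dimensions used below are assumptions."""
    layers, d = [], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(d, hidden_dim), nn.ReLU()]
        d = hidden_dim
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

ehr_branch = fnn_branch(in_dim=40, n_hidden=2)  # two hidden layers, latent dim 16 (in_dim is a placeholder)
cir_branch = fnn_branch(in_dim=20, n_hidden=1)  # one hidden layer, latent dim 16 (in_dim is a placeholder)
```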
Multimodal fusion
Following knowledge encoding from the LGE-CMR, EHR and CIR subnetworks, we used an MBT consisting of multiple blocks to fuse the knowledge across modalities. MBT has demonstrated state-of-the-art performance in multimodal fusion tasks and has a light computational cost30. In each MBT block, the unimodal knowledge vectors concatenated with a shared fusion vector, ξfsn, are fed into modality-specific transformers:
$$\left[\xi_{*}^{l+1},\,\hat{\xi}_{\mathrm{fsn},*}^{l+1}\right]=\mathrm{Transformer}\left(\left[\xi_{*}^{l},\,\xi_{\mathrm{fsn}}^{l}\right];\theta_{\mathrm{MBT},*}^{l}\right)$$

(8)
where the subscript ∗ denotes the modality (CMR, EHR or CIR). The fusion vector in layer l + 1 is updated as follows:
$$\xi_{\mathrm{fsn}}^{l+1}=\mathrm{Avg}\left(\hat{\xi}_{\mathrm{fsn},*}^{l+1}\right)$$

(9)
The last MBT block outputs a predicted SCDA risk score p using the following equation:
$$p=\mathrm{sigmoid}\left(\left[\xi_{\mathrm{CMR}}^{L_{\mathrm{MBT}}},\,\xi_{\mathrm{EHR}}^{L_{\mathrm{MBT}}},\,\xi_{\mathrm{CIR}}^{L_{\mathrm{MBT}}}\right]\cdot W+b\right)$$

(10)
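The sketch below illustrates this bottleneck-fusion scheme (equations (8)–(10)) in PyTorch: each modality has its own transformer layer, a shared fusion token is appended to each modality's sequence, and the updated fusion tokens are averaged after every block. The projection of each modality vector to a common token dimension and the attention-head count are implementation assumptions, not details given in the text.

```python
import torch
import torch.nn as nn

class BottleneckFusion(nn.Module):
    """Sketch of the MBT fusion module: modality-specific transformer layers
    share a bottleneck fusion token that is averaged across modalities."""
    def __init__(self, dims=(32, 16, 16), n_blocks=3, fusion_dim=8, heads=2):
        super().__init__()
        self.proj_in = nn.ModuleList(nn.Linear(d, fusion_dim) for d in dims)  # assumption: map to common dim
        self.blocks = nn.ModuleList(
            nn.ModuleList(
                nn.TransformerEncoderLayer(d_model=fusion_dim, nhead=heads,
                                           dim_feedforward=4 * fusion_dim,
                                           batch_first=True, norm_first=True)
                for _ in dims)
            for _ in range(n_blocks))
        self.head = nn.Linear(fusion_dim * len(dims), 1)      # W, b in equation (10)

    def forward(self, xi_list):                               # [xi_CMR, xi_EHR, xi_CIR]
        b = xi_list[0].shape[0]
        tokens = [proj(x).unsqueeze(1) for proj, x in zip(self.proj_in, xi_list)]
        fsn = torch.zeros(b, 1, tokens[0].shape[-1], device=tokens[0].device)  # xi_fsn
        for block in self.blocks:
            updated, fsn_hats = [], []
            for layer, tok in zip(block, tokens):
                out = layer(torch.cat([tok, fsn], dim=1))     # equation (8)
                updated.append(out[:, :1])
                fsn_hats.append(out[:, 1:])
            tokens = updated
            fsn = torch.stack(fsn_hats).mean(dim=0)           # equation (9): average over modalities
        fused = torch.cat([t.squeeze(1) for t in tokens], dim=-1)
        return torch.sigmoid(self.head(fused))                # equation (10): predicted SCDA risk
```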
Model training and implementation details
For patient i, their SCDA outcome yi is 1 if they experienced an SCDA event during the follow-up, and 0 otherwise. We adopted the balanced focal loss as the loss function54:
$$L=-\sum_{i}\alpha_{i}\left(y_{i}-p_{i}\right)^{\gamma}\log p_{i}$$

(11)
where αi is a class-dependent scaling factor and γ is the focusing parameter that controls how strongly the loss down-weights easy examples and concentrates learning on hard ones; γ was set to 2 in this study.
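For reference, a sketch of a balanced focal loss in its conventional form (expressed through the true-class probability, as in the cited focal-loss work54) is shown below; the α values are placeholders, for example inverse class frequencies.

```python
import torch

def balanced_focal_loss(p, y, alpha_pos=0.97, alpha_neg=0.03, gamma=2.0, eps=1e-7):
    """Balanced focal loss sketch. p: predicted risk scores in (0, 1);
    y: binary outcome labels. The modulating factor uses the probability
    assigned to the true class; the alpha weights are placeholders."""
    p = p.clamp(eps, 1 - eps)
    p_t = torch.where(y == 1, p, 1 - p)                      # probability of the true class
    alpha = torch.where(y == 1, torch.full_like(p, alpha_pos),
                        torch.full_like(p, alpha_neg))
    return -(alpha * (1 - p_t) ** gamma * torch.log(p_t)).sum()
```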
The LGE-CMR, EHR and CIR branch networks were first trained independently, and then MAARS was trained end-to-end with all the branch networks and the multimodal fusion module. All models were trained with a batch size of 64 for a maximum of 150 epochs, with early stopping based on the loss. The Adam optimizer was used with β₁ = 0.9 and β₂ = 0.999; the learning rate was initially set to 1 × 10⁻³ for the LGE-CMR branch network, 1 × 10⁻² for the EHR and CIR branch networks and 3 × 10⁻² for the multimodal fusion module, and was adaptively adjusted during training. For the LGE-CMR branch network, the ViT has LViT = 8 transformer encoder blocks, eight heads per attention module and dimension d = 512. The EHR branch network used an FNN with two hidden layers and a latent dimension of 16. The CIR branch network used an FNN with one hidden layer and a latent dimension of 16. The encoded unimodal knowledge vectors have dimensions ξCMR ∈ ℝ³², ξEHR ∈ ℝ¹⁶ and ξCIR ∈ ℝ¹⁶. We set LMBT = 3 and the bottleneck fusion vector dimension to 8.
Assessing model performance and clinical validation
Performance metrics
The values of metrics derived from the confusion matrix (BA, sensitivity and specificity) were computed at optimal probability decision thresholds selected to maximize Youden’s J statistic. When comparing the AI model’s performance to that of the clinical tools, we also adjusted the decision threshold by matching the sensitivities of the clinical tools to evaluate their specificities. All metrics were in the range of 0 to 1, with the baseline levels obtained by random chance being as follows: AUROC = 0.5, BA = 0.5, AUPRC = 0.03 and Bs = 0.25.
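Threshold selection by Youden's J can be sketched as follows, using the ROC curve to find the operating point that maximizes sensitivity + specificity − 1 (variable names are illustrative).

```python
import numpy as np
from sklearn.metrics import roc_curve

def youden_threshold(y_true, y_score):
    """Return the probability threshold that maximizes Youden's J statistic
    (sensitivity + specificity - 1) along the ROC curve."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    return thresholds[np.argmax(tpr - fpr)]

# Predictions are then binarized at this threshold before computing
# balanced accuracy, sensitivity and specificity.
```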
Internal and external validation
The internal model performance was assessed with fivefold cross-validation of the JHH-HCM cohort, stratified by outcome at the patient level. The training and test sets were split at the patient level; that is, all LGE-CMR scans from a given patient appeared in either the training set or the validation set, never in both. After the five training folds, the model's performance metrics were calculated on the aggregated predictions from all validation folds.
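A sketch of such patient-level, outcome-stratified splitting is shown below; it assumes one outcome label per patient and is only meant to illustrate the constraint that all scans from a patient stay in the same fold.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def patient_level_folds(patient_ids, outcomes, n_splits=5, seed=0):
    """Yield (train_idx, val_idx) index arrays such that all rows (scans) from a
    patient fall in the same fold and folds are stratified by outcome.
    Assumes the outcome label is repeated across each patient's rows."""
    patient_ids = np.asarray(patient_ids)
    outcomes = np.asarray(outcomes)
    unique_ids, first_idx = np.unique(patient_ids, return_index=True)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_p, _ in skf.split(unique_ids.reshape(-1, 1), outcomes[first_idx]):
        train_mask = np.isin(patient_ids, unique_ids[train_p])
        yield np.where(train_mask)[0], np.where(~train_mask)[0]
```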
For the external performance evaluation, we trained the model using the entire JHH-HCM dataset (with 90% as the training set and 10% as the development set) and tested the model’s performance on the SHVI-HCM cohort. Of note, the model for external validation inherited the same hyperparameters as the internal model.
Model interpretability
We interpreted the MAARS network weights and predictions using attribution- and attention-based methods.
Shapley value
The EHR and CIR branch networks were interpreted using Shapley values, which quantify the incremental attribution of each input feature to the final prediction. The Shapley value32 is grounded in cooperative game theory and explains a prediction as a coalitional game played by the feature values. It satisfies a collection of desirable properties, including efficiency, symmetry, dummy and additivity. In this study, Shapley values were estimated using the permutation formulation implemented in SHAP55.
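A minimal sketch of this permutation-based estimation with the SHAP package is shown below; the prediction function and background data are placeholders for a trained EHR or CIR branch and its training features.

```python
import shap

def explain_tabular_branch(predict_fn, X_background, X_explain):
    """Estimate Shapley values for a tabular branch with SHAP's permutation
    explainer. `predict_fn` maps a feature matrix to predicted risk scores;
    `X_background` supplies the feature distribution used for masking."""
    masker = shap.maskers.Independent(X_background)
    explainer = shap.explainers.Permutation(predict_fn, masker)
    return explainer(X_explain)  # .values has shape (n_samples, n_features)
```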
Attention rollout
For the LGE-CMR branch network, we used a technique called attention rollout to quantify how attention flows from the input to the output of the ViT. Formally, at transformer encoder block l, $A^{l}$ denotes the average of the attention matrices over all attention heads. The residual connection at each block is modeled by adding the identity matrix I to the attention matrix. The attention rollout is therefore computed recursively as
$$A_{\mathrm{Rollout}}^{l}=\left(A^{l}+I\right)\cdot A_{\mathrm{Rollout}}^{l-1}$$

(12)
We explained the predictions of the LGE-CMR branch network using the attention rollout at the end of the ViT after flowing through $L_{\mathrm{ViT}}$ transformer blocks, $A_{\mathrm{Rollout}}^{L_{\mathrm{ViT}}}$.
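A small sketch of the rollout computation is given below; the row re-normalization after adding the identity follows the convention of the original attention-rollout formulation and is an assumption beyond equation (12).

```python
import torch

def attention_rollout(attn_per_block):
    """Compute attention rollout (equation (12)). `attn_per_block` is a list of
    attention tensors, one per transformer encoder block, each of shape
    (n_heads, n_tokens, n_tokens)."""
    rollout = None
    for attn in attn_per_block:
        a = attn.mean(dim=0)                                 # head-averaged A^l
        a = a + torch.eye(a.shape[-1], device=a.device)      # identity models the residual path
        a = a / a.sum(dim=-1, keepdim=True)                  # row re-normalization (convention)
        rollout = a if rollout is None else a @ rollout      # (A^l + I) . rollout^{l-1}
    return rollout  # row 0 (CLS token) gives the attention over image patches
```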
Statistical analysis
The P values of clinical covariates between the internal and external cohorts were based on a two-sample Welch’s t-test for numerical variables and the Mann–Whitney U test for categorical variables before data imputation. Kolmogorov–Smirnov tests for the risk score distributions were based on the aggregated predictions on all internal validation folds. The means and CIs of model performance metrics in the internal fivefold cross-validation were estimated using 200 bootstrapping samples of the aggregated predictions on all validation folds. The performance metrics in the external validation were calculated using model predictions on 200 bootstrapping resampled datasets of the SHVI-HCM cohort. The computations were based on the bias-corrected and accelerated bootstrap method. Pearson’s r for clinical covariates in the network interpretations was based on aggregated interpretations from all internal validation folds.
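As an illustration of the BCa bootstrap procedure, the sketch below computes a 200-resample confidence interval for the AUROC with SciPy; the choice of AUROC as the example metric and the variable names are ours.

```python
import numpy as np
from scipy.stats import bootstrap
from sklearn.metrics import roc_auc_score

def bootstrap_auroc_ci(y_true, y_score, n_resamples=200, seed=0):
    """BCa bootstrap confidence interval for the AUROC, resampling
    (label, score) pairs together."""
    data = (np.asarray(y_true), np.asarray(y_score))
    res = bootstrap(data, roc_auc_score, n_resamples=n_resamples, paired=True,
                    vectorized=False, method='BCa', random_state=seed)
    return res.confidence_interval.low, res.confidence_interval.high
```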
Computational hardware and software
MAARS was built in Python 3.9 using packages including PyTorch 2.0, NumPy 1.23.5, Pandas 1.5.3, SciPy 1.10, scikit-learn 1.2.0, scikit-image 0.19.3, pydicom 2.3, SimpleITK 2.2.1 and SHAP 0.41. Data preprocessing, model training and result analysis were performed on a machine with an AMD Ryzen Threadripper 1920X 12-core CPU and NVIDIA TITAN RTX GPUs, and on the Rockfish cluster at Johns Hopkins University using NVIDIA A100 GPU nodes, with NVIDIA software CUDA 11.7 and cuDNN 8.5. For a reference of the computational requirements of MAARS inference, on a machine with an AMD Ryzen 2700X 8-core CPU and an NVIDIA GeForce RTX 2060 GPU, the average processing time for inference is 0.034 s per patient using GPU or 0.086 s per patient using solely CPU.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
AI Research
Researchers Use Hidden AI Prompts to Influence Peer Reviews: A Bold New Era or Ethical Quandary?
AI Secrets in Peer Reviews Uncovered
Edited by Mackenzie Ferguson, AI Tools Researcher & Implementation Consultant
In a controversial yet intriguing move, researchers have begun using hidden AI prompts to potentially sway the outcomes of peer reviews. This cutting-edge approach aims to enhance review processes, but it raises ethical concerns. Join us as we delve into the implications of AI-assisted peer review tactics and how they might shape the future of academic research.
Introduction to AI in Peer Review
Artificial Intelligence (AI) is rapidly transforming various facets of academia, and one of the most intriguing applications is its integration into the peer review process. At the heart of this evolution is the potential for AI to streamline the evaluation of scholarly articles, which traditionally relies heavily on human expertise and can be subject to biases. Researchers are actively exploring ways to harness AI not just to automate mundane tasks but to provide deep, insightful evaluations that complement human judgment.
The adoption of AI in peer review promises to revolutionize the speed and efficiency with which academic papers are vetted and published. This technological shift is driven by the need to handle an ever-increasing volume of submissions while maintaining high standards of quality. Notably, hidden AI prompts, as discussed in recent studies, can subtly influence reviewers’ decisions, potentially standardizing and enhancing the objectivity of reviews (source).
Incorporating AI into peer review isn’t without challenges. Ethical concerns about transparency, bias, and accountability arise when machines play an integral role in shaping academic discourse. Nonetheless, the potential benefits appear to outweigh the risks, with AI offering tools that can uncover hidden biases and provide more balanced reviews. As described in TechCrunch’s exploration of this topic, there’s an ongoing dialogue about the best practices for integrating AI into these critical processes (source).
Influence of AI in Academic Publishing
The advent of artificial intelligence (AI) is reshaping various sectors, with academic publishing being no exception. The integration of AI tools in academic publishing has significantly streamlined the peer review process, making it more efficient and less biased. According to an article from TechCrunch, researchers are actively exploring ways to integrate AI prompts within the peer review process to subtly guide reviewers' evaluations without overt influence. These AI systems analyze vast amounts of data to provide insightful suggestions, thus enhancing the quality of published research.
Moreover, AI applications in academic publishing extend beyond peer review management. AI algorithms can analyze and summarize large datasets, providing researchers with new insights and enabling faster discoveries. As TechCrunch suggests, these technologies are becoming integral to helping researchers manage the ever-increasing volume of scientific literature. The future of academic publishing might see AI serving as co-authors, providing accurate data analysis and generating hypotheses based on trends across studies.
Public reactions to the influence of AI in academic publishing are mixed. Some view it as a revolutionary tool that democratizes knowledge production by reducing human errors and biases. Others, however, raise concerns over ethical implications, fearing that AI could introduce new biases or be manipulated to favor particular agendas. As TechCrunch highlights, the key challenge will be to implement transparent AI systems that can be held accountable and ensure ethical standards in academic publishing.
Looking ahead, the influence of AI in academic publishing is poised to grow, potentially transforming various aspects of research dissemination. AI-powered platforms could revolutionize the accessibility and dissemination of knowledge by automating the proofreading and formatting processes, making academic work more readily available and understandable globally. However, as TechCrunch notes, the future implications of such developments require careful consideration to balance innovation with ethical integrity, especially in how AI technologies are governed.
Challenges and Concerns in AI Implementation
Implementing AI technologies across various sectors presents numerous challenges and concerns, particularly regarding transparency, ethics, and reliability. As researchers strive to integrate AI into processes like peer review, hidden AI prompts can sometimes influence decisions subtly. According to TechCrunch's article about researchers influencing peer review processes with hidden AI prompts, such practices raise questions about the integrity of AI systems. Ensuring AI operates within ethical boundaries becomes crucial, as we must balance innovation with maintaining trust in automated systems.
Furthermore, the opacity of AI algorithms often leads to public and expert concerns about accountability. When AI systems make decisions without clear explanations, it can diminish users' trust. In exploring the future implications of AI in peer review settings, it becomes apparent that refinements are needed to enhance transparency and ethical considerations. As noted in the TechCrunch article, there is an ongoing debate about the extent to which AI should be allowed to influence decisions that have traditionally been human-centric. This calls for a framework that sets clear standards and guidelines for AI implementation, ensuring its role supplements rather than overrides human judgment.
In addition to transparency and ethics, reliability is another significant concern when implementing AI. The technological robustness of AI systems is continuously tested by real-world applications. Errors or biases in AI can lead to unintended consequences that may affect public perception and acceptance of AI-driven tools. As industries increasingly rely on AI, aligning these systems with societal values and ensuring they are error-free is paramount to gaining widespread acceptance. The TechCrunch article also highlights these reliability issues, suggesting that developers need to focus more on creating accurate, unbiased algorithms.
Experts Weigh in on AI-driven Peer Review
In recent years, the academic community has seen a growing interest in integrating artificial intelligence into the peer review process. Experts believe that AI can significantly enhance this critical phase of academic publishing by bringing in efficiency, consistency, and unbiased evaluation. According to a report on TechCrunch, researchers are exploring ways to subtly incorporate AI prompts into the peer review mechanism to improve the quality of feedback provided to authors (TechCrunch).
The inclusion of AI in peer review is not without its challenges, though. Experts caution that the deployment of AI-driven tools must be done with significant oversight to prevent any undue influence or bias that may occur from automated processes. They emphasize the importance of transparency in how AI algorithms are used and the nature of data fed into these systems to maintain the integrity of peer review (TechCrunch).
While some scholars welcome AI as a potential ally that can alleviate the workload of human reviewers and provide them with analytical insights, others remain skeptical about its impact on the traditional rigor and human judgment in peer evaluations. The debate continues, with public reactions reflecting a mixture of excitement and cautious optimism about the future potential of AI in scholarly communication (TechCrunch).
Public Reactions to AI Interventions
The public’s reaction to AI interventions, especially in fields such as scientific research and peer review, has been a mix of curiosity and skepticism. On one hand, many appreciate the potential of AI to accelerate advancements and improve efficiencies within the scientific community. However, concerns remain over the transparency and ethics of deploying hidden AI prompts to influence processes that traditionally rely on human expertise and judgment. For instance, a recent article on TechCrunch highlighted researchers’ attempts to integrate these AI-driven techniques in peer review, sparking discussions about the potential biases and ethical implications of such interventions.
Further complicating the public’s perception is the potential for AI to disrupt traditional roles and job functions within these industries. Many individuals within the academic and research sectors fear that an over-reliance on AI could undermine professional expertise and lead to job displacement. Despite these concerns, proponents argue that AI, when used effectively, can provide invaluable support to researchers by handling mundane tasks, thereby allowing humans to focus on more complex problem-solving activities, as noted in the TechCrunch article.
Moreover, the ethical ramifications of using AI in peer review processes have prompted a call for stringent regulations and clearer guidelines. The potential for AI to subtly shape research outcomes without the overt consent or awareness of the human peers involved raises significant ethical questions. Discussions in media outlets like TechCrunch indicate a need for balanced discussions that weigh the benefits of AI-enhancements against the necessity to maintain integrity and trust in academic research.
Future of Peer Review with AI
The future of peer review is poised for transformation as AI technologies continue to advance. Researchers are now exploring how AI can be integrated into the peer review process to enhance efficiency and accuracy. Some suggest that AI could assist in identifying potential conflicts of interest, evaluating the robustness of methodologies, or even suggesting suitable reviewers based on their expertise. For instance, a detailed exploration of this endeavor can be found at TechCrunch, where researchers are making significant strides toward innovative uses of AI in peer review.
The integration of AI in peer review does not come without its challenges and ethical considerations. Concerns have been raised regarding potential biases that AI systems might introduce, the transparency of AI decision-making, and how reliance on AI might impact the peer review landscape. As discussed in recent events, stakeholders are debating the need for guidelines and frameworks to manage these issues effectively.
One potential impact of AI on peer review is the democratization of the process, opening doors for a more diverse range of reviewers who may have been overlooked previously due to geographical or institutional biases. This could result in more diverse viewpoints and a richer peer review process. Additionally, as AI becomes more intertwined with peer review, expert opinions highlight the necessity for continuous monitoring and adjustment of AI tools to ensure they meet the ethical standards of academic publishing. This evolution in the peer review process invites us to envision a future where AI and human expertise work collaboratively, enhancing the quality and credibility of academic publications.
Public reactions to the integration of AI in peer review are mixed. Some welcome it as a necessary evolution that could address long-standing inefficiencies in the system, while others worry about the potential loss of human oversight and judgment. Future implications suggest a field where AI-driven processes could eventually lead to a more streamlined and transparent peer review system, provided that ethical guidelines are strictly adhered to and biases are meticulously managed.
AI Research
Xbox producer tells staff to use AI to ease job loss pain

An Xbox producer has faced a backlash after suggesting laid-off employees should use artificial intelligence to deal with emotions in a now deleted LinkedIn post.
Matt Turnbull, an executive producer at Xbox Game Studios Publishing, wrote the post after Microsoft confirmed it would lay off up to 9,000 workers, in a wave of job cuts this year.
The post, which was captured in a screenshot by tech news site Aftermath, shows Mr Turnbull suggesting tools like ChatGPT or Copilot to “help reduce the emotional and cognitive load that comes with job loss.”
One X user called it “plain disgusting” while another said it left them “speechless”. The BBC has contacted Microsoft, which owns Xbox, for comment.
Microsoft previously said several of its divisions would be affected, without specifying which ones, but reports suggest that its Xbox video gaming unit will be hit.
Microsoft has set out plans to invest heavily in artificial intelligence (AI), and is spending $80bn (£68.6bn) on huge data centres to train AI models.
Mr Turnbull acknowledged the difficulty of job cuts in his post and said “if you’re navigating a layoff or even quietly preparing for one, you’re not alone and you don’t have to go it alone”.
He wrote that he was aware AI tools can cause “strong feelings in people” but wanted to try and offer the “best advice” under the circumstances.
The Xbox producer said he had been "experimenting with ways to use LLM AI tools" and suggested some prompts to enter into AI software.
These included career planning prompts, resume and LinkedIn help, and questions to ask for advice on emotional clarity and confidence.
“If this helps, feel free to share with others in your network,” he wrote.
The cuts would equate to about 4% of Microsoft's 228,000-strong global workforce.
Some video game projects have reportedly been affected by the cuts.
AI Research
Multilingualism is a blind spot in AI systems
For internationally operating companies, it is attractive to use a single AI solution across all markets. Such a centralized approach offers economies of scale and appears to ensure uniformity. Yet research from CWI reveals that this assumption is on shaky ground: the language in which an AI is addressed influences the answers the system provides, and quite significantly.
Language steers outcomes
The problem goes beyond small differences in nuance. Researcher Davide Ceolin, a tenured researcher in the Human-Centered Data Analytics group at CWI, and his international research team discovered that identical large language models (LLMs) can adopt different political standpoints depending on the language used. They delivered more economically progressive responses in Dutch and more centre-conservative ones in English. For organizations applying AI in HR, customer service or strategic decision-making, this has direct consequences for business processes and reputation.
These differences are not incidental. Statistical analysis shows that the language of the prompt used has a stronger influence on the AI response than other factors, such as assigned nationality. “We assumed that the output of an AI model would remain consistent, regardless of the language. But that turns out not to be the case,” says Ceolin.
For businesses, this means more than academic curiosity. Ceolin emphasizes: “When a system responds differently to users with different languages or cultural backgrounds, this can be advantageous – think of personalization – but also detrimental, such as with prejudices. When the owners of these systems are unaware of this bias, they may experience harmful consequences.”
Prejudices with consequences
The implications of these findings extend beyond political standpoints alone. Every domain in which AI is deployed – from HR and customer service to risk assessment – runs the risk of skewed outcomes as a result of language-specific prejudices. An AI assistant that assesses job applicants differently depending on the language of their CV, or a chatbot that gives inconsistent answers to customers in different languages: these are realistic scenarios, no longer hypothetical.
According to Ceolin, such deviations are not random outliers, but patterns with a systematic character. “That is extra concerning. Especially when organizations are unaware of this.”
For Dutch multinationals, this is a real risk. They often operate in multiple languages but utilize a single central AI system. “I suspect this problem already occurs within organizations, but it’s unclear to what extent people are aware of it,” says Ceolin. The research also suggests that smaller models are, on average, more consistent than the larger, more advanced variants, which appear to be more sensitive to cultural and linguistic nuances.
What can organizations do?
The good news is that the problem can be detected and limited. Ceolin advises testing AI systems regularly using persona-based prompting, which involves testing different scenarios where the language, nationality, or culture of the user varies. “This way you can analyze whether specific characteristics lead to unexpected or unwanted behaviour.”
Additionally, it's essential to have a clear understanding of who works with the system and in which language. Only then can you assess whether the system operates consistently and fairly in practice. Ceolin advocates for clear governance frameworks that account for language-sensitive bias, just as currently happens with security or ethics.
Structural approach required
According to the researchers, multilingual AI bias is not a temporary phenomenon that will disappear on its own. “Compare it to the early years of internet security,” says Ceolin. “What was then seen as a side issue turned out to be of strategic importance later.” CWI is now collaborating with the French partner institute INRIA to unravel the mechanisms behind this problem further.
The conclusion is clear: companies that deploy AI in multilingual contexts would do well to consciously address this risk not only for technical reasons, but also to prevent reputational damage, legal complications and unfair treatment of customers or employees.
“AI is being deployed increasingly often, but insight into how language influences the system is in its infancy,” concludes Ceolin. “There’s still much work to be done there.”
Author: Kim Loohuis