AI Research
Selective identification of polyploid hepatocellular carcinomas with poor prognosis by artificial intelligence-based pathological image recognition
Development of an AI-based image recognition model to estimate HCC ploidy
First, we constructed a model to evaluate the ploidy status of HCC using deep learning and CNN-based image classification. A total of 44 cases whose ploidy status had been determined by chromosome FISH in our previous study [6] were used as the training data. The training set included 27 diploid and 17 polyploid HCC cases. After obtaining a whole-slide image of the HE-stained slide for each tumor, we selected three or more ROIs showing the representative pathological appearance of the tumor (Fig. 1a). Each ROI was divided into 2048 × 2048-pixel tiles, and the tiles were subdivided into 256 × 256-pixel patches for input into the deep-learning algorithm (see Materials and Methods). Deep learning for tumor ploidy classification was performed by training on 42,240 small-patch images. The models calculated the probability of tumor polyploidization in each 2048 × 2048-pixel tile, and the average value across all tiles for each tumor was defined as the polyploidy score for the tumor (Fig. 1b, Supplementary Fig. 4).
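The tiling and scoring arithmetic in this paragraph can be sketched in a few lines of Python. This is only an illustration, assuming non-overlapping patches and a plain arithmetic mean over tiles; the names `patch_origins` and `polyploidy_score` and the example probabilities are hypothetical and are not taken from the study's code.

```python
from statistics import fmean

def patch_origins(tile_size=2048, patch_size=256):
    """Top-left (row, col) offsets of non-overlapping patches inside one tile."""
    steps = range(0, tile_size, patch_size)
    return [(r, c) for r in steps for c in steps]

def polyploidy_score(tile_probabilities):
    """Tumor-level score: mean of the per-tile polyploidy probabilities."""
    return fmean(tile_probabilities)

# An 8 x 8 grid of 256-pixel patches fits in one 2048-pixel tile.
patches_per_tile = len(patch_origins())

# Made-up per-tile probabilities for a single tumor.
example_score = polyploidy_score([0.2, 0.4, 0.9])
```

Under these assumptions each tile yields 64 patches, so 42,240 patches would correspond to 660 tiles; the text itself does not state the tile count.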
a Scheme for the construction of AI-based image recognition models for determining HCC ploidy. b Representative HE-stained images of ROIs in diploid and polyploid HCC. The probabilities of HCC polyploidization in the corresponding 2048 × 2048-pixel tiles are shown in a color map. Scale bar, 200 μm. c ROC curves and AUC values of representative AI models in cross-validation. The data for the other models are shown in Supplementary Fig. 5. d Evaluation and comparison of the constructed AI models.
We first used three CNN-based image classification models: DenseNet121 [15], ResNet50d [16], and EfficientNet_B0 [17]. The validity of the models was assessed by analyzing their receiver operating characteristic (ROC) curves and areas under the curve (AUC) [24] (Fig. 1c, Supplementary Fig. 5). Five-fold cross-validation revealed that all three models achieved high AUC values (0.998−1.0, Fig. 1d). For example, with the optimized cutoff value (0.457) of polyploidy score determined based on the ROC curve, the EfficientNet_B0-based model exhibited high accuracy, sensitivity, and specificity (0.977, 1.0, and 0.963, respectively, Fig. 1d, Supplementary Fig. 6). These findings indicate that CNN-based models can be used to evaluate the HCC ploidy status using pathological HE images.
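The metrics quoted here follow from simple counts, as the Python sketch below illustrates. The confusion-matrix counts are an assumption: they are one split of the 17 polyploid and 27 diploid training cases that reproduces the reported values, and the text does not state them explicitly. The helper names are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

def auc(positive_scores, negative_scores):
    """AUC as the chance that a positive case outscores a negative one (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Assumed counts: all 17 polyploid cases detected, 26 of 27 diploid cases correct.
acc, sens, spec = classification_metrics(tp=17, fn=0, tn=26, fp=1)
reported = (round(acc, 3), round(sens, 3), round(spec, 3))
```

With these assumed counts, accuracy is 43/44 and specificity is 26/27, which round to the 0.977 and 0.963 given in the text.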
AI-based image recognition successfully assessed HCC ploidy at a low calculation cost
The coloration of HE staining is known to vary due to factors such as fixation conditions and staining protocols, potentially affecting AI model performance [25,26]. To address this, we constructed a model using EfficientNet_B0 on grayscale-converted images to minimize the impact of such variability. The constructed EfficientNet_B0_gray model showed a high AUC value (0.998), comparable to that of the original CNN-based models, suggesting that the cellular morphological information obtained from grayscale images was sufficient to evaluate HCC ploidy (Fig. 1c, d, Supplementary Fig. 6).
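As a rough illustration of grayscale conversion, the Python sketch below collapses each RGB pixel to a single luma value. The ITU-R BT.601 weights are an assumed choice, since the text does not say which conversion was used, and the pixel values are made up.

```python
def rgb_to_gray(pixel):
    """Luma of one (R, G, B) pixel using ITU-R BT.601 weights (an assumed choice)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def to_grayscale(image):
    """Convert rows of RGB tuples to rows of gray levels."""
    return [[rgb_to_gray(px) for px in row] for row in image]

# Two hypothetical pixels with different hues.
pink = (230, 150, 180)
purple = (120, 80, 160)
gray_row = to_grayscale([[pink, purple]])[0]
```

After conversion only intensity remains, so differences in stain hue between laboratories no longer reach the model, while nuclear and cellular outlines are preserved as contrast.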
We also developed models using ViT-based architectures, which incur lower calculation costs than CNN-based image recognition. Two models were constructed using HIPT [20] and one using DINO [19]; both approaches enable ViT to scale to large images via self-supervised learning (see Materials and Methods). These encoders were trained on TCGA or liver pathology images obtained at our institution. By freezing parts of the model during training, overfitting can be moderated, even when the labeled data are insufficient. In particular, because HIPT allows for easy replacement of the first stage with other publicly available models trained on pathology images using self-supervised learning, model construction using HIPT requires a shorter learning time than CNN-based models. All three models exhibited high accuracy and AUC values that were comparable to those of the CNN models (Fig. 1d, Supplementary Fig. 6).
AI-based ploidy assessment identified polyploid HCC cases with poor prognosis within a large cohort
We examined whether our constructed AI models could properly assess HCC ploidy using a separate dataset. Tumor ploidy was determined using chromosome FISH in 38 new HCCs (Dataset 2) that were not included in the first dataset (Dataset 1). Their polyploidy scores were then calculated by analyzing their HE images using AI models. The sensitivity, specificity, and proportion of polyploid HCC were calculated using the cutoff values established in the analysis of Dataset 1 (Fig. 1d, Supplementary Fig. 6). Among the models examined, some, including the two HIPT-based models, exhibited relatively high AUC values over 0.8 (Fig. 2a, b). The decrease in accuracy observed in Dataset 2 compared to Dataset 1 may be attributed to the fact that cases with typical histology of diploid and polyploid cancers were used for training in Dataset 1, while cases in Dataset 2 were selected in an unbiased manner.
a, b Performance of AI models in the validation assessments. The ploidy statuses of 38 HCCs (determined by chromosome FISH) were compared with the ploidy statuses, as assessed by AI models. ROC curves of the representative AI models are shown in (a). c Prognostic stratification based on ploidy assessments by the AI models. A total of 169 HCCs were analyzed. d Kaplan–Meier curves of overall survival. Statistical difference was determined by log-rank test. The three AI models that identified a significant difference in prognosis between diploid and polyploid HCCs in (c) are shown.
To further evaluate the utility of AI-based polyploid HCC identification, a large cohort of 169 HCC cases (Dataset 3) was examined using AI models (Fig. 2c). In particular, the EfficientNet_B0-based and HIPT_unfrozen2 models classified a proportion of cases as polyploid that was consistent with the prevalence shown in previous reports (36–38% [3,6]). By identifying polyploidy in HCC, the EfficientNet_B0-based and HIPT_unfrozen2 models discriminated HCC patients with significantly worse overall survival after surgery (Fig. 2c, d, Supplementary Fig. 7). These findings indicate that AI models, especially the HIPT_unfrozen2 model, are useful for identifying polyploid HCC and predicting poor prognosis.
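The survival comparisons in this section rely on the log-rank test. The Python sketch below shows the standard two-group form of that test under stated assumptions: the function name and the follow-up times are hypothetical, and it is not the study's analysis code.

```python
from math import erfc, sqrt

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times_*  : lists of follow-up times
    events_* : 1 if death was observed, 0 if the case was censored
    """
    all_times = times_a + times_b
    all_events = events_a + events_b
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

# Made-up follow-up times (months); every case reaches the event.
poor = [1, 2, 3, 4, 5]
good = [6, 7, 8, 9, 10]
chi2, p_value = logrank_test(poor, [1] * 5, good, [1] * 5)
```

The statistic compares observed deaths in one group with the number expected if both groups shared the same hazard, summed over all event times.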
Analysis of a large cohort revealed the characteristics of polyploid HCC
The HIPT_unfrozen2 model, which exhibited the most favorable features for ploidy determination among the constructed models, was used to investigate the characteristics of polyploid HCC by analyzing a large cohort. In Dataset 3, consisting of 169 cases, 113 and 56 cases were diagnosed as diploid and polyploid HCC, respectively, using the HIPT_unfrozen2 model. As observed in other datasets, where no associations were found between tumor ploidy and age, sex, or body mass index (Supplementary Table 1), the two groups showed no significant differences in these variables (Table 1, Supplementary Fig. 8). Consistent with our previous results, serum alpha-fetoprotein (AFP) levels were significantly higher in polyploid HCC than in diploid HCC, whereas tumor size and stage were comparable between the two groups (Table 1, Fig. 3a). Polyploid HCC was also significantly associated with a high prevalence of poor differentiation and exhibited MTM or scirrhous structures (Table 1, Fig. 3b, c). Polyploid giant cancer cells (PGCCs), which exhibit a distinct appearance with prominently large nuclei or profound multinucleation, were frequently observed in polyploid HCC (Table 1). Furthermore, the expression of UBE2C, which we previously reported as a marker suggestive of polyploid HCC, was significantly elevated in polyploid HCC relative to levels in diploid HCC (Fig. 3d). These findings confirm the characteristics of polyploid HCC demonstrated in our previous study and suggest accurate ploidy evaluation by our HIPT_unfrozen2 model. In addition, most polyploid HCCs diagnosed using the AI model did not exhibit well-defined pathological features characteristic of polyploid HCC (Fig. 3e), indicating that the AI model comprehensively assessed ploidy in HCC, considering a complex array of histological information beyond mere tumor structures and differentiation status.
a Serum AFP levels. Error bars indicate mean ± SD. b, c Pathological classification of HCC differentiation and structure. d Immunostaining of UBE2C. Scale bar, 50 μm. e Heatmap indicating ploidy scores assessed using the HIPT_unfrozen2 model and clinicopathological features. f t-SNE plots of tile images. Probabilities of polyploidy assessed using the HIPT_unfrozen2 model and clinicopathological features of the tumors are shown. A total of 169 HCCs were analyzed. SC scirrhous, MacroT macro-trabecular, MicroT micro-trabecular, C compact, PG pseudo-glandular, UC unclassified, PIVKA protein induced by vitamin K absence or antagonist II, HBV hepatitis B virus, HCV hepatitis C virus, MASLD metabolic dysfunction-associated steatotic liver disease, PBC primary biliary cholangitis.
To further explore the characteristics of polyploid HCC, we visualized case-by-case correlations between the polyploidy scores and clinicopathological features (Fig. 3e). In addition, data derived from all 2048 × 2048-pixel tile images of the 169 HCCs were compressed into two dimensions and visualized using t-SNE plots (Fig. 3f). These plots validated that high serum AFP levels were correlated with high polyploidy probability values calculated using our AI models. Interestingly, HCCs with high polyploidy scores were predominantly positive for PGCCs, highlighting their importance in inferring HCC polyploidy (Fig. 3e). In contrast, hepatitis etiology seemed to exert little influence on HCC ploidy, and HCCs with high polyploidy scores developed in livers with viral hepatitis and steatotic liver diseases (Fig. 3e, f). Our investigation of poorly understood features of polyploid HCC in a large cohort, utilizing the high-throughput analysis capabilities of AI models, verified recently revealed characteristics and provided additional insights.
The AI model robustly identified polyploid HCC in a public dataset and predicted a poor prognosis
To further verify the utility of the AI-based ploidy discrimination models, we analyzed the HE images of 350 HCC cases in the public TCGA dataset using our representative models, EfficientNet_B0, EfficientNet_B0_gray, and HIPT_unfrozen2. Ploidy assessments obtained by these AI models were compared with a prior determination of genome duplication (GD) by SNP array analysis of tumor genomes [4,5]. Assessment using the HIPT_unfrozen2 model showed a strong correlation with the GD status determined by genomic analysis (Fig. 4a). The other two models did not demonstrate a significant correlation. Using the GD status based on genomic analysis as a reference, the sensitivity and specificity of the HIPT_unfrozen2 model were 0.77 and 0.41, respectively. Similar to Dataset 3, the polyploid HCC in the TCGA dataset identified by the HIPT_unfrozen2 model showed a high prevalence of PGCC and elevated serum AFP levels, supporting the idea that the AI model can robustly evaluate HCC ploidy status from pathological images obtained under heterogeneous conditions at various facilities (Table 2).
a Conformity between GD detected by genomic analysis and the ploidy status assessed using our AI models. b, c Kaplan–Meier curves displaying overall survival. Statistical difference was determined by log-rank test. d Aneuploidy score. A total of 350 HCC cases in TCGA dataset were divided by their GD status detected by genomic analysis and their ploidy status assessed using the HIPT_unfrozen2 model. Error bars indicate mean ± SD.
We further examined whether the HIPT_unfrozen2 model was helpful in identifying a subset of HCC with poor prognosis. As expected, GD-positive HCC evaluated by genomic analysis showed a trend toward poor prognosis compared to GD-negative HCC, although the difference was small and not statistically significant (Fig. 4b). In notable contrast, polyploid HCC identified by the HIPT_unfrozen2 model exhibited markedly poorer prognosis than their diploid counterparts (Fig. 4b). Among the 350 HCCs, the images of 188 cases were designated suboptimal for diagnosis because a substantial proportion of their ROIs were affected by necrosis, severe fibrosis, and contamination with nontumor components. Importantly, however, the HIPT_unfrozen2 model similarly distinguished prognostic differences depending on ploidy status, regardless of the inclusion of these 188 suboptimal cases, highlighting the robust diagnostic capacity of the AI model (Supplementary Fig. 9).
To explore the reasons for the differences in ploidy-related prognostic prediction capability between the HIPT_unfrozen2 model and genomic analysis, TCGA cases were categorized into four groups based on the AI (diploid or polyploid) and genomic results (GD-positive or GD-negative). As expected, GD-positive polyploid HCC had a significantly poorer prognosis than GD-negative diploid HCC (Fig. 4c). Interestingly, polyploid but GD-negative HCC exhibited a poor prognosis, comparable to that of GD-positive polyploid HCC. In addition, diploid but GD-positive HCC showed a good prognosis, similar to that of GD-negative diploid HCC. The HIPT_unfrozen2 model consistently identified HCC with a significantly poorer prognosis regardless of the SNP array results, leading to its superior prognostic prediction over genomic analysis (Fig. 4c). Moreover, among the GD-negative HCC identified using the SNP array, AI-diagnosed polyploid HCC had significantly more chromosomal aberrations than their diploid counterparts (Fig. 4d), suggesting that the AI model distinguished HCC with a poor prognosis by detecting chromosomal instability and polyploidy from pathological images. These findings indicate that our AI model interpreting HCC ploidy status from pathological images can robustly identify HCC with poor prognosis across diverse conditions in multiple facilities.
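The four-way grouping used in this comparison amounts to cross-tabulating two binary calls per case. A minimal Python sketch follows, with hypothetical case data and a hypothetical helper name:

```python
from collections import Counter

def concordance_group(ai_polyploid, gd_positive):
    """Label a case by its AI ploidy call and its genomic GD status."""
    ai = "polyploid" if ai_polyploid else "diploid"
    gd = "GD-positive" if gd_positive else "GD-negative"
    return f"{gd} {ai}"

# Hypothetical cases: (AI calls polyploid?, SNP array calls GD?)
cases = [(True, True), (True, False), (False, True), (False, False), (True, False)]
group_sizes = Counter(concordance_group(ai, gd) for ai, gd in cases)
```

Survival would then be compared between these groups, for example with a log-rank test on each pair of interest.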
The HIPT_unfrozen2 model outperforms conventional methods for estimating HCC ploidy from pathological images
Finally, we compared HIPT_unfrozen2 with existing methods for estimating HCC ploidy from pathological images, evaluating their performance in ploidy classification and prognosis prediction. In our previous study, we proposed a scoring system (PUB score) that combines PGCC detection in HE-stained sections with immunostaining for UBE2C to infer polyploidization in HCC [6]. When tumors exhibiting both PGCC presence and UBE2C overexpression were classified as polyploid, the PUB classification achieved an accuracy of 0.76 (sensitivity: 0.91, specificity: 0.70) in Dataset 2 (Fig. 5a), which is comparable to that of the AI models. Among the 118 cases in Dataset 3 with available UBE2C immunostaining, the PUB classification identified a group with a poor prognosis, although the difference was not statistically significant (p = 0.063, Fig. 5b). In contrast, HIPT_unfrozen2 distinguished the poor prognosis group more clearly and significantly in the same cases, suggesting that while the combination of PGCC and UBE2C is a useful marker, AI-based ploidy assessment is more effective for predicting prognosis through tumor ploidy classification (p = 0.017, Fig. 5c).
a Performance of PUB classification for assessing HCC ploidy. Tumors exhibiting both PGCC presence and UBE2C overexpression were classified as PUB-positive. b, c Kaplan–Meier curves for overall survival. A subset of Dataset 3 (n = 118) with available UBE2C immunostaining was analyzed according to PUB classification and HIPT_unfrozen2 assessment. Correlation between nuclear morphology features extracted by HEIP and the polyploidy score calculated by HIPT_unfrozen2. Median nuclear area (d) and median nuclear major axis (e) were derived from 169 cases in Dataset 3. ROC curves and AUC values for assessing HCC ploidy using median nuclear area (f) or median nuclear major axis (g) extracted by HEIP. Dataset 2 was used for analysis. Kaplan–Meier curves for overall survival analyzed based on the median nuclear area (h) or median nuclear major axis (i). Cases in Dataset 3 were stratified using cutoff values determined by ROC curves in f, g based on the Youden method. In b, c, h, i, statistical significance was assessed using the log-rank test.
We also compared HIPT_unfrozen2 with another published AI-based tool that assesses tumor ploidy by evaluating nuclear morphology, the HE Image Processing pipeline (HEIP) [27], using the same HE-stained images analyzed in our study. After segmenting cell nuclei, we identified tumor nuclei using the HEIP algorithm and assessed tumor ploidy based on two morphological features: nuclear area, which is known to correlate with ploidy [28], and the nuclear major axis, which was reported as the most strongly correlated feature in the original study [27]. As expected, both the median tumor nuclear area and the median nuclear major axis extracted by HEIP showed a highly significant correlation with the polyploidy score calculated by HIPT_unfrozen2, suggesting that HEIP accurately captured tumor nuclear morphology (Fig. 5d, e). Using Dataset 2, where tumor ploidy was confirmed by chromosome FISH, we assessed the performance of HEIP in tumor ploidy classification through ROC analysis, yielding AUC values comparable to that of HIPT_unfrozen2 (0.761 for nuclear area and 0.828 for the nuclear major axis, Fig. 5f, g). We further examined the prognostic utility of HEIP-based tumor ploidy assessment in Dataset 3. When tumors were stratified by the nuclear area, no significant difference in prognosis was observed between the high (n = 35) and low (n = 134) groups (log-rank, p = 0.25, Fig. 5h). Stratification using the nuclear major axis showed better separation of prognostic groups, but the difference remained statistically insignificant (log-rank, p = 0.093, Fig. 5i).
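The cutoffs used to stratify cases by nuclear morphology were chosen from ROC curves by the Youden method, according to the figure legend. The Python sketch below shows how such a cutoff can be picked: it scans candidate thresholds and keeps the one that maximizes sensitivity + specificity − 1. The function name and the nuclear-area values are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its score is >= the cutoff.
    labels: 1 for polyploid (reference standard), 0 for diploid.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Made-up median nuclear areas (arbitrary units) with reference ploidy labels.
areas = [40, 45, 50, 52, 60, 65, 70]
fish = [0, 0, 0, 1, 0, 1, 1]
cutoff, j = youden_cutoff(areas, fish)
```

In this toy data the chosen cutoff keeps every polyploid case above the threshold while misclassifying one diploid case, which is the trade-off the Youden index is designed to balance.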
Taken together, these findings indicate that HIPT_unfrozen2 outperforms conventional methods in classifying tumor ploidy and stratifying prognosis based on pathological images of HCC.
A Real-Time Look at How AI Is Reshaping Work : Information Sciences Institute
Artificial intelligence may take over some tasks and transform others, but one thing is certain: it’s reshaping the job market. Researchers at USC’s Information Sciences Institute (ISI) analyzed LinkedIn job postings and AI-related patent filings to measure which jobs are most exposed, and where those changes are happening first.
The project was led by ISI research assistant Eun Cheol Choi, working with students in a graduate-level USC Annenberg data science course taught by USC Viterbi Research Assistant Professor Luca Luceri. The team developed an “AI exposure” score to measure how closely each role is tied to current AI technologies. A high score suggests the job may be affected by automation, new tools, or shifts in how the work is done.
Which Industries Are Most Exposed to AI?
To understand how exposure shifted with new waves of innovation, the researchers compared patent data from before and after a major turning point. “We split the patent dataset into two parts, pre- and post-ChatGPT release, to see how job exposure scores changed in relation to fresh innovations,” Choi said. Released in late 2022, ChatGPT triggered a surge in generative AI development, investment, and patent filings.
Jobs in wholesale trade, transportation and warehousing, information, and manufacturing topped the list in both periods. Retail also showed high exposure early on, while healthcare and social assistance rose sharply after ChatGPT, likely due to new AI tools aimed at diagnostics, medical records, and clinical decision-making.
In contrast, education and real estate consistently showed low exposure, suggesting they are, at least for now, less likely to be reshaped by current AI technologies.
AI’s Reach Depends on the Role
AI exposure doesn’t just vary by industry; it also depends on the specific type of work. Jobs like software engineer and data scientist scored highest, since they involve building or deploying AI systems. Roles in manufacturing and repair, such as maintenance technician, also showed elevated exposure due to increased use of AI in automation and diagnostics.
At the other end of the spectrum, jobs like tax accountant, HR coordinator, and paralegal showed low exposure. They center on work that’s harder for AI to automate: nuanced reasoning, domain expertise, or dealing with people.
AI Exposure and Salary Don’t Always Move Together
The study also examined how AI exposure relates to pay. In general, jobs with higher exposure to current AI technologies were associated with higher salaries, likely reflecting the demand for new AI skills. That trend was strongest in the information sector, where software and data-related roles were both highly exposed and well compensated.
But in sectors like wholesale trade and transportation and warehousing, the opposite was true. Jobs with higher exposure in these industries tended to offer lower salaries, especially at the highest exposure levels. The researchers suggest this may signal the early effects of automation, where AI is starting to replace workers instead of augmenting them.
“In some industries, there may be synergy between workers and AI,” said Choi. “In others, it may point to competition or replacement.”
From Class Project to Ongoing Research
The contrast between industries where AI complements workers and those where it may replace them is something the team plans to investigate further. They hope to build on their framework by distinguishing between different types of impact — automation versus augmentation — and by tracking the emergence of new job categories driven by AI. “This kind of framework is exciting,” said Choi, “because it lets us capture those signals in real time.”
Luceri emphasized the value of hands-on research in the classroom: “It’s important to give students the chance to work on relevant and impactful problems where they can apply the theoretical tools they’ve learned to real-world data and questions,” he said. The paper, Mapping Labor Market Vulnerability in the Age of AI: Evidence from Job Postings and Patent Data, was co-authored by students Qingyu Cao, Qi Guan, Shengzhu Peng, and Po-Yuan Chen, and was presented at the 2025 International AAAI Conference on Web and Social Media (ICWSM), held June 23-26 in Copenhagen, Denmark.
Published on July 7th, 2025
Last updated on July 7th, 2025
SERAM collaborates on AI-driven clinical decision project
The Spanish Society of Medical Radiology (SERAM) has collaborated with six other scientific societies to develop an AI-supported urology clinical decision-making project called Uro-Oncogu(IA)s.
The initiative produced an algorithm that will “reduce time and clinical variability” in the management of urological patients, the society said. SERAM’s collaborators include the Spanish Urology Association (AEU), the Foundation for Research in Urology (FIU), the Spanish Society of Pathological Anatomy (SEAP), the Spanish Society of Hospital Pharmacy (SEFH), the Spanish Society of Nuclear Medicine and Molecular Imaging (SEMNIM), and the Spanish Society of Radiation Oncology (SEOR).
SERAM Secretary General Dr. MaríLuz Parra launched the project in Madrid on 3 July with AEU President Dr. Carmen González.
On behalf of SERAM, the following doctors participated in this initiative:
- Prostate cancer guide: Dr. Joan Carles Vilanova, PhD, of the University of Girona,
- Upper urinary tract guide: Dr. Richard Mast of University Hospital Vall d’Hebron in Barcelona,
- Muscle-invasive bladder cancer guide: Dr. Eloy Vivas of the University of Malaga,
- Non-muscle invasive bladder cancer guide: Dr. Paula Pelechano of the Valencian Institute of Oncology in Valencia,
- Kidney cancer guide: Dr. Nicolau Molina of the University of Barcelona.