
AI Research

PixelGuard: Advancing healthcare data privacy through AI-driven de-identification system for medical imaging research


Medical images play a crucial role in medical research by providing valuable insights that help advance our understanding of human health, disease management, and treatment efficacy. Researchers use medical images to study the structure and function of organs, tissues, and cells in healthy and diseased states. These images are also used to create educational resources and workshops and to train healthcare professionals to understand and interpret imaging data.

Although medical images play a crucial role in research and medical education, protecting privacy in medical images is of paramount importance to ensure patient confidentiality, trust, and compliance with ethical and legal standards, such as HIPAA and HITRUST. Medical images contain highly sensitive information about a patient’s health, conditions, and prognosis. Maintaining the privacy of Personally Identifiable Information (PII) and Protected Health Information (PHI) ensures that this information is never disclosed without the patient’s explicit consent, while also building patient trust in healthcare.

Furthermore, laws and frameworks such as the Health Insurance Portability and Accountability Act (HIPAA) and the Health Information Trust Alliance Common Security Framework (HITRUST CSF) in the US, and the General Data Protection Regulation (GDPR) in the European Union (EU), set strict guidelines on how medical data such as images should be handled, stored, anonymized, and shared. Violating these regulations can result in severe penalties and can damage the reputation of healthcare providers and their institutions. Moreover, healthcare professionals have a moral and ethical obligation to respect patient privacy.

Digital Imaging and Communications in Medicine (DICOM)

DICOM is the standard format used for storing, transmitting, and sharing medical images and related data. DICOM files hold information structured into image data and other metadata. Each DICOM field describes and categorizes an aspect of the medical image, such as patient information, study details, or imaging parameters. For example, DICOM fields include patient name, birth date, gender, study date, referring physician, pixel data, series information, equipment manufacturer, model name, software version, and study and diagnosis descriptions, among others.

DICOM files contain Personally Identifiable Information (PII), so they cannot be used in medical research or training until the sensitive information has been redacted. However, it is crucial to make sure that redacting sensitive information does not compromise the quality of the information that is not individually attributable. It is also important to minimize the size of the de-identified medical images to reduce storage and processing costs, while providing the flexibility to save the resulting anonymized files as DICOM or JPEG. AWS HealthImaging, the scalable, high-performance, cloud-based DICOM store, provides sub-second image retrieval from anywhere. The total cost of ownership (TCO) of image storage and data transfer can be substantially reduced using the industry-standard High Throughput JPEG 2000 (HTJ2K) image encoding.

Solution walkthrough

PixelGuard, built on Amazon Web Services (AWS) and created by Dr. Adrienne Kline, Assistant Professor at Northwestern University and Founder of Xtasis, LLC, is an advanced software solution that de-identifies medical images while preserving clinical relevance and efficacy. It uses over 75 state-of-the-art AI-driven models capable of detecting and redacting multilingual, multi-orientation text across all major formats (DICOM, JPEG, PNG, NIfTI, etc.), alongside configurable metadata anonymization. With an intuitive UI, enterprise SSO, and in-tenant deployment (no data egress), PixelGuard delivers secure, compliant, and high-throughput image de-identification. PixelGuard is available on AWS Marketplace. ScaleCapacity, an AWS Partner, was instrumental in the development of the UI, cloud infrastructure, and deployment to AWS Marketplace.

Ingestion layer

The ingestion layer provides a user interface and an API to submit a medical image for de-identification. The API can also be used by a third-party medical device manufacturer to provide the redaction ability. Prior to the ingestion, the web UI and API can be secured with enterprise identity provider based authorization. Along with ingesting the image needing de-identification, this layer must also capture the specific de-identification configuration, which defines which fields need to be redacted from the image. For common redaction scenarios such as redaction to comply with HIPAA regulations, predefined field sets are defined and recorded.
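As a rough illustration of the de-identification configuration captured at ingestion, the sketch below models a submission that names a predefined field set and optionally adds extra fields. The class name, profile names, and field groupings are assumptions made for this example and are not PixelGuard's actual API; the field names are standard DICOM attribute keywords, and the sets shown are deliberately incomplete.

```python
from dataclasses import dataclass, field

# Hypothetical predefined field sets. The keywords are standard DICOM
# attribute keywords, but these groupings are illustrative and incomplete.
PREDEFINED_PROFILES = {
    "hipaa_basic": frozenset({
        "PatientName", "PatientID", "PatientBirthDate",
        "ReferringPhysicianName", "InstitutionName", "AccessionNumber",
    }),
    "minimal": frozenset({"PatientName", "PatientID"}),
}


@dataclass(frozen=True)
class DeidentificationRequest:
    """One submission to the ingestion layer (illustrative shape only)."""
    image_uri: str
    profile: str = "hipaa_basic"
    extra_fields: frozenset = field(default_factory=frozenset)
    redact_pixels: bool = True

    def fields_to_redact(self) -> frozenset:
        """Union of the predefined profile and any caller-supplied fields."""
        if self.profile not in PREDEFINED_PROFILES:
            raise ValueError(f"unknown profile: {self.profile!r}")
        return PREDEFINED_PROFILES[self.profile] | self.extra_fields
```

Keeping the predefined sets separate from per-request additions lets common scenarios stay one-line submissions while still allowing a study to redact more than the default.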

Pre de-identification layer

The storage layer is used to store the image prior to the de-identification. Furthermore, the storage layer may also store a compressed file containing several images needing de-identification. Moreover, this layer also stores metadata pertaining to the de-identification job such as the job ID, submission time, specific fields being redacted, format of the file provided, etc.

Processing layer

The processing layer identifies the format of each image that needs de-identification and, based on the specific de-identification configuration, de-identifies both the image metadata defined by the DICOM tags and the pixel data. Furthermore, the image can be compressed for storage optimization, and a crosswalk file referencing a unique ID is created. The crosswalk file makes it possible to do a reverse lookup of the original file if it is ever needed. Care should be exercised to secure the crosswalk file and store it separately from the de-identified file.
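The crosswalk idea described above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not PixelGuard's implementation: the function names, the CSV layout, and the `MRN-…` identifiers are hypothetical. Random identifiers are used so that the mapping cannot be recomputed without the crosswalk itself.

```python
import csv
import io
import uuid


def assign_pseudonyms(original_ids):
    """Map each original identifier to a random, UUID4-based identifier."""
    crosswalk = {}
    for original_id in original_ids:
        if original_id not in crosswalk:
            crosswalk[original_id] = uuid.uuid4().hex
    return crosswalk


def write_crosswalk(crosswalk, stream):
    """Write the mapping as CSV; the stream should target protected storage."""
    writer = csv.writer(stream)
    writer.writerow(["deidentified_id", "original_id"])
    for original_id, deidentified_id in crosswalk.items():
        writer.writerow([deidentified_id, original_id])


def reverse_lookup(crosswalk, deidentified_id):
    """Return the original identifier for a de-identified one, or None."""
    for original_id, candidate in crosswalk.items():
        if candidate == deidentified_id:
            return original_id
    return None


# Example: two images from the same patient share a single pseudonym.
crosswalk = assign_pseudonyms(["MRN-001", "MRN-002", "MRN-001"])
buffer = io.StringIO()  # stands in for a separately stored, access-controlled file
write_crosswalk(crosswalk, buffer)
```

In a real deployment the stream would point at storage that is kept apart from the de-identified images and restricted to authorized users.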

De-identified storage layer

The de-identified storage layer stores the processed de-identified medical image, which can be used by medical researchers as part of a study. Furthermore, the image can be generated in various formats consistent with the needs of the research study.

De-identified output layer

The de-identified images can be created in conjunction with a crosswalk file that can maintain a mapping between an identifier in the de-identified image and a relevant identifier in the original image. The crosswalk file is maintained separately and encrypted to allow only authorized individuals to trace the de-identified image back to its original source if needed.

Auditing, monitoring, and observability

The auditing, monitoring, and observability layer stores access logs so that there is a record of who accessed what, when, and how. Furthermore, detailed error logs can be stored to enable troubleshooting when certain medical images cannot be de-identified.
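One way to capture "who accessed what, when, and how" is a structured log line per event. The sketch below uses only Python's standard `logging` and `json` modules; the logger name, field names, and outcome values are assumptions chosen for illustration.

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("deid.audit")  # hypothetical logger name


def audit_record(actor, action, resource, outcome, detail=None):
    """Build one JSON audit line: who did what, to which object, and when."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "resource": resource,
        "outcome": outcome,
    }
    if detail is not None:
        record["detail"] = detail
    return json.dumps(record, sort_keys=True)


def log_access(actor, action, resource, outcome, detail=None):
    """Emit the record; failures go out at ERROR level to aid troubleshooting."""
    level = logging.ERROR if outcome == "failure" else logging.INFO
    line = audit_record(actor, action, resource, outcome, detail)
    audit_logger.log(level, line)
    return line
```

Emitting one JSON object per line keeps the records easy to filter by actor, resource, or outcome in whichever log store is used.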

Figure 1. PixelGuard logical solution architecture

The de-identification of medical images necessitates metadata and pixel data scrubbing. In some situations, the metadata may need to be anonymized rather than removed entirely. When the metadata is anonymized rather than removed, the anonymization process must not in any way dilute the research value of the image. For example, a medical image of a 20-year-old cannot be anonymized to indicate that the image belongs to an 80-year-old, because that may dilute the research value.
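Two common ways to anonymize metadata without distorting its research value are consistent date shifting and age generalization. The sketch below is an assumption-laden illustration rather than PixelGuard's method: the function names, the HMAC-derived offset, and the 30-day window are choices made for this example. It relies on the DICOM conventions that dates are written as `YYYYMMDD` and ages as strings such as `020Y`, and the cap of 90 follows the HIPAA Safe Harbor practice of grouping ages over 89.

```python
import hashlib
import hmac
from datetime import datetime, timedelta


def shift_dicom_date(value, patient_id, secret, max_days=30):
    """Shift a DICOM date (YYYYMMDD) back by a per-patient offset.

    The offset is derived from an HMAC of the patient ID, so every date for
    one patient moves by the same number of days and the intervals between
    that patient's studies are preserved.
    """
    digest = hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256).digest()
    offset = 1 + int.from_bytes(digest[:4], "big") % max_days  # 1..max_days
    shifted = datetime.strptime(value, "%Y%m%d") - timedelta(days=offset)
    return shifted.strftime("%Y%m%d")


def generalize_dicom_age(value, cap_years=90):
    """Cap a DICOM age string such as '095Y' at cap_years; leave others as-is."""
    number, unit = int(value[:3]), value[3]
    if unit == "Y" and number > cap_years:
        return f"{cap_years:03d}Y"
    return value
```

With these rules a 20-year-old stays `020Y`, and two scans taken ten days apart remain ten days apart after shifting, so longitudinal analyses are unaffected.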

To scrub pixel data, optical character recognition (OCR) can be used to detect burned-in text. OCR must be used in conjunction with machine learning (ML) models that locate and blur areas containing identifying information. ML models can also generate a confidence score for each blurred region. Images whose confidence score falls below a predefined threshold can be routed to a human review queue for investigation and possible approval.
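The pixel-scrubbing flow above can be sketched as two small steps: obscure each detected region, then decide whether the image needs human review. This example assumes the detections have already been produced by some OCR or ML detector; the `TextDetection` shape, the 0.9 threshold, and the function names are hypothetical. For simplicity it overwrites regions with a solid fill rather than blurring them, and it treats the image as a plain 2D list of intensities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextDetection:
    """A burned-in text region reported by a detector (hypothetical shape)."""
    top: int
    left: int
    bottom: int  # exclusive
    right: int   # exclusive
    confidence: float


def redact_regions(pixels, detections, fill=0):
    """Return a copy of a 2D pixel grid with each detected box overwritten."""
    redacted = [row[:] for row in pixels]
    height = len(redacted)
    width = len(redacted[0]) if redacted else 0
    for det in detections:
        # Clamp each box to the image bounds before filling it.
        for y in range(max(det.top, 0), min(det.bottom, height)):
            for x in range(max(det.left, 0), min(det.right, width)):
                redacted[y][x] = fill
    return redacted


def needs_human_review(detections, threshold=0.9):
    """Flag the image when any detection falls below the confidence threshold."""
    return any(det.confidence < threshold for det in detections)
```

Returning a copy instead of editing in place keeps the original pixels available for a side-by-side preview until the job is approved.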

Tools and frameworks

Ingestion: The ingestion layer can be handled with a web UI and an API exposed through Amazon API Gateway, which can perform request validation, authentication, and authorization along with security rule enforcement. API Gateway integrates with Amazon Cognito to enforce fine-grained authentication and authorization.

Pre-de-identification storage: This layer can hold predefined image de-identification profiles, such as those for HIPAA, and can store them in an Amazon Simple Storage Service (Amazon S3) bucket or in a combination of an S3 bucket and Amazon DynamoDB.

De-identification: De-identification is a multi-step process that can use DICOM parser libraries such as pydicom and scrubbing tools such as dicom-anonymizer. Amazon Textract can also be used to detect text embedded within images, and Amazon Comprehend Medical can be used to identify protected health information in the extracted text.
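The metadata step of this process amounts to applying a per-field action (remove or replace) to the parsed header. The sketch below uses a plain dictionary keyed by DICOM keyword as a stand-in for a parsed dataset so that it runs with the standard library alone; in practice a parser such as pydicom would supply the real header. The `ACTIONS` table, the `scrub_header` name, and the sample values are assumptions for illustration.

```python
import copy

# Hypothetical per-field actions; a real profile would be far more complete.
ACTIONS = {
    "PatientName": ("replace", "ANONYMOUS"),
    "PatientID": ("replace", None),  # None means: use the supplied pseudonym
    "PatientBirthDate": ("remove", None),
    "ReferringPhysicianName": ("remove", None),
}


def scrub_header(header, pseudonym, actions=ACTIONS):
    """Return a scrubbed copy of a header dict keyed by DICOM keyword."""
    scrubbed = copy.deepcopy(header)
    for keyword, (action, value) in actions.items():
        if keyword not in scrubbed:
            continue
        if action == "remove":
            del scrubbed[keyword]
        elif action == "replace":
            scrubbed[keyword] = pseudonym if value is None else value
        else:
            raise ValueError(f"unsupported action: {action!r}")
    return scrubbed
```

Fields that are not listed, such as the modality, pass through unchanged, which is what preserves the non-identifying information researchers rely on.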

Output: After de-identification, medical images can be compressed, converted to a different format, and stored with identifiers that can be mapped back to the original source if needed. Storage can be handled with S3 buckets or file systems, with or without compression.

Other supporting services: Other supporting services play a critical role in enabling security, scalability, manageability, and infrastructure automation. Services such as Amazon API Gateway make it possible to enforce AuthN/AuthZ policies consistent with organizational needs. AWS Key Management Service (AWS KMS) provides a secure way to manage cryptographic keys and protect your data. AWS Fargate can be used to run thousands of containers without the need to manage servers or clusters of Amazon Elastic Compute Cloud (Amazon EC2) instances.

Furthermore, services such as Amazon Macie can add a secondary layer of protection by identifying sensitive data so that it can be protected in conjunction with other remediation services. Moreover, AWS CloudFormation acts as an infrastructure as code (IaC) tool: AWS infrastructure resources are defined declaratively in JSON or YAML templates, and their creation, configuration, and management can be automated.

The following images provide an overview of the image redaction experience using PixelGuard.

Figure 2. Configuring fields for redaction in medical images

Figure 3. Redaction job completion report

Figure 4. Previewing unredacted and redacted images together

Conclusion

De-identifying medical images is a crucial need in medical research and training. It ensures that the de-identified image cannot be linked back to an individual. A sound technical architecture for de-identifying medical images is crucial for the safe and compliant use of medical images for research and medical discovery. This architecture can make sure that PII is reliably removed or obscured, protecting patient privacy without compromising the utility of the data.

PixelGuard, as a solution built on AWS, enables flexible de-identification, which supports compliance with privacy laws, enhances security, and reduces risk. At the same time, it facilitates data sharing and collaboration, ultimately accelerating medical imaging AI research, public health work, and medical advancements, all while maintaining ethical and legal responsibilities.


AI Research

Hong Kong start-up IntelliGen AI aims to challenge Google DeepMind in drug discovery


IntelliGen AI, an artificial intelligence (AI) start-up founded in Hong Kong, is positioning itself as a competitor to Google DeepMind in the field of drug discovery, as the city increasingly seeks to bolster its AI capabilities.

In an interview with the Post, founder and president Ronald Sun expressed confidence that IntelliGen AI could soon compete globally with Isomorphic Labs, a spin-off of DeepMind, in leveraging AI for drug screening and design.

“For generative science, new breakthroughs and application opportunities are global in nature,” Sun said. “Within 12 to 18 months, we aim to land major, high-value clients on a par with Isomorphic.”

The term “generative science”, although not widely recognised yet, refers to the use of AI to model the natural world and facilitate scientific discovery.

Ronald Sun, founder and president of IntelliGen AI. Photo: Handout

The company’s ambitious plan follows the launch of its IntFold foundational model, which is designed to predict the three-dimensional structures of biomolecules, including proteins. The model’s accuracy levels were comparable to DeepMind’s AlphaFold 3, according to IntelliGen AI.




AI Research

July: Bristol AI partnership with France | News and features


A new and unique supercomputing collaboration between the UK and France was announced at the UK-France Summit today (10 July).

As two of the most advanced countries in the development and use of AI for science, industry and public services, this partnership will significantly strengthen both countries’ national AI ecosystems and the wider European AI ecosystem.

The Bristol Centre for Supercomputing (BriCS) based at the University of Bristol and the Grand équipement national de calcul intensif (GENCI) will work on building and establishing a collaboration on supercomputing for the benefit of their respective communities and the broader European research ecosystem.

This joint initiative will foster bilateral scientific collaborations in the field of AI-specialisation across materials science, life sciences and medical, cybersecurity, AI security and safety, energy, and engineering, and more globally in AI for science.

The collaboration will ensure sharing of best practice on industrial involvement as well as establishing joint training and education tracks, exchange of students and researchers, hackathons, and the organisation of joint scientific seminars.

Both parties will collaborate in assessing new scientific and technical approaches including federated/distributed learning, agentic and frugal (cost efficient) AI, as well as jointly gathering and analysing information about advancements and trends in AI hardware and software.

Professor Simon McIntosh-Smith, Director of the Bristol Centre for Supercomputing, said: “We are delighted to work alongside GENCI to deliver an innovative and productive European supercomputing ecosystem. Our AI supercomputer, Isambard-AI, is the 11th fastest and 4th greenest supercomputer in the world and having delivered on this project successfully and at pace, the BriCS team is perfectly positioned to co-lead this with the GENCI team.”

Professor Evelyn Welch, Vice-Chancellor and President of the University of Bristol, said: “The University of Bristol is proud to have forged this pioneering new European partnership that will enable unique collaboration with our French colleagues. We will continue to develop and grow the UK’s AI and supercomputing strategy alongside the Government, accelerating critical research and supporting industry innovations at home and internationally.”

A Letter of intention between BriCS and GENCI (PDF, 309kB) provides further information on the ambitions of the partnership.

Further information

About Isambard-AI

Isambard-AI is set to become the UK’s fastest and most powerful supercomputer, purpose-built for AI research following build completion in Summer 2025. Designed to provide open-source intelligence, it will transform research and drive AI-led breakthroughs in critical areas like automated drug discovery and climate research. 


