Mar. 2024
(Strategic Proposals)
Health & Medical Real World Data Infrastructure - Catalyst of Generative AI Development in Japan -/CRDS-FY2023-SP-04
Executive Summary

This strategic proposal outlines:

  • 1) Building a data infrastructure in Japan that accelerates the collection of health & medical real world data (RWD), distributes the collected data efficiently to stakeholders and end users while maintaining patient privacy, and puts RWD to use in creating and applying medical & clinical intelligence;
  • 2) Developing a medical version of a "large language model (LLM)" / "foundation model" and medical treatment support tools based on it.

Japan has the unique distinction of being the world's longest-lived country, with a correspondingly high standard of health among its people, sustained by its medical and healthcare system, including universal health insurance. On the other hand, that system has begun showing signs of stress and stagnation that became apparent during the COVID-19 pandemic, such as exhaustion at clinical sites, inflexible allocation of medical resources, delays in R&D and drug discovery for personalized medicine, and inefficiency in the medical information industry. National medical expenses exceeded 45 trillion yen in FY2021 and are increasing year by year. The sustainability of Japan's medical and healthcare system is declining, which will have a negative impact on the health standards of the people in the future. One factor that has led to this situation is the ineffective utilization of health and medical RWD due to the lack of a solid data infrastructure in Japan.

Some of the major problems related to the utilization of health and medical RWD in Japan are (1) inadequate systems for collecting health and medical RWD, (2) inadequate systems for distributing it, and (3) delay in creating intelligence through analysis of health and medical RWD and in its social implementation. Regarding (3), the development and application of generative Artificial Intelligence (AI) such as LLMs are progressing rapidly worldwide. As domestic medical services come to rely on LLMs / foundation models developed in a non-Japanese ecosystem, risks to economic security and of leakage of personal medical information are added. As a result, the development of a medical LLM / foundation model in Japan is an urgent issue. In order to accumulate the large-scale health and medical RWD required for its development, it is essential to first solve problems (1) and (2).

Toward this end, we propose that the following three subjects be tackled in Japan.

(1) Investment in technology to accelerate health and medical RWD collection.

To simultaneously realize the acceleration of medical care data collection and reduction of the burden on clinical sites, [Subject 1] proposes:

  • Develop a medical LLM and, based on it, tools for medical-recording support (drafting medical treatment records, unifying medical terminology, automatic coding, etc.), in parallel with the conventional approach of developing templates for electronic medical record input.
  • Study optimal domain adaptation of a general-purpose LLM, for example the incorporation of medical ontology (correspondence between medical concepts and terms), through collaboration between a working group initiated by the National Institute of Informatics (NII) and the medical community.
  • Develop HL7 FHIR-compliant medical data exchange standards, an electronic medical record tool conforming to them, and codes for names of injuries / diseases / drugs / clinical tests etc., which are preconditions for developing the medical-recording support tools.
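As a rough illustration of how terminology unification, automatic coding, and an HL7 FHIR-style record could fit together, the following minimal sketch maps a free-text diagnosis to a code and wraps it in a dictionary shaped like a FHIR Condition resource. It is not part of the proposal: the three-entry lexicon, the function names, and the patient identifier are hypothetical, and a dictionary lookup stands in for the LLM-based coding the proposal envisions. The ICD-10 code system URI and the code I10 are used only as familiar examples.

```python
import json

# Code system URIs as published by HL7 for use in FHIR resources.
ICD10_SYSTEM = "http://hl7.org/fhir/sid/icd-10"
CLINICAL_STATUS_SYSTEM = "http://terminology.hl7.org/CodeSystem/condition-clinical"

# Hypothetical toy lexicon: several surface forms map to one coded concept.
# A real system would draw on a maintained terminology or an LLM-based coder.
TERM_TO_CODE = {
    "hypertension": ("I10", "Essential (primary) hypertension"),
    "high blood pressure": ("I10", "Essential (primary) hypertension"),
    "htn": ("I10", "Essential (primary) hypertension"),
}


def normalize_term(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def to_fhir_condition(free_text, patient_id):
    """Return a dict shaped like a FHIR Condition, or None if the term is unknown."""
    entry = TERM_TO_CODE.get(normalize_term(free_text))
    if entry is None:
        return None  # left for manual coding in this sketch
    code, display = entry
    return {
        "resourceType": "Condition",
        "clinicalStatus": {
            "coding": [{"system": CLINICAL_STATUS_SYSTEM, "code": "active"}]
        },
        "code": {
            "coding": [{"system": ICD10_SYSTEM, "code": code, "display": display}],
            "text": free_text,  # keep the clinician's original wording
        },
        "subject": {"reference": f"Patient/{patient_id}"},
    }


if __name__ == "__main__":
    # "example-1" is a placeholder patient identifier.
    condition = to_fhir_condition("High  Blood Pressure", "example-1")
    print(json.dumps(condition, indent=2))
```

Keeping the original wording in `code.text` alongside the standardized coding is one way to reduce the burden on clinical sites while still producing data that can be exchanged and aggregated.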

(2) Development of health and medical RWD distribution system for its primary use and secondary use.

From the perspective of primary use, in which individual patients' data are directly utilized for their own medical care and healthcare, data sharing among medical institutions and processes for providing data to PHR (Personal Health Record) are required. From the perspective of secondary use, in which health and medical RWD is utilized for R&D such as the development of new diagnosis/treatment technology and macro analysis of medical expenses, RWD of a quality and scale commensurate with the research goal is required. For smooth distribution of health and medical RWD collected through the system established in [Subject 1], [Subject 2] proposes:

  • Establish a health and medical RWD distribution system that meets the requirements of both primary and secondary use based on the receipt information / specific medical examination information database (NDB: National Data Base) and the Next Generation Medical Infrastructure Law.
  • To promote utilization of intelligence created by secondary use, build consent and review processes on the premise that health and medical RWD will be utilized for commercialization as well as academic discovery.

(3) Development of a medical foundation model and the laws and regulations for its implementation.

By making use of the health and medical RWD infrastructure built in [Subject 1] and [Subject 2], [Subject 3] proposes:

  • Build a data warehouse to accumulate health and medical RWD on a large scale, and develop a medical version of a multimodal (language, image) foundation model as groundwork for medical/clinical intelligence.
  • Study appropriate regulation of medical AI based on the medical LLM / foundation model in parallel with its development and implementation.
  • Discuss concrete measures to prevent discrimination based on genetic and genomic information, one of the most serious consequences of misuse/abuse of medical big data and AI (for example, identifying individuals from anonymously processed information or from AI models), so that concerns about such misuse/abuse do not hinder the utilization of health and medical RWD.

Milestones for the overall project will be set based on the degree of difficulty in developing and implementing a medical LLM / foundation model, and the concrete sub-subjects and timeline for each subject will be designed to achieve these milestones. For each of [Subject 1] to [Subject 3], academia, related ministries and agencies, and companies will work together to promote the project, in collaboration with related existing projects such as the Cabinet Office SIP "Establishment of an Integrated Health Care System" and "Medical DX". In order to achieve social acceptance and collaboration with the broader public, the effects to be achieved, together with the problems to be solved through [Subject 1] to [Subject 3], will be communicated to the public. This is expected to foster understanding among individuals and citizens, the primary providers and the greatest beneficiaries of health and medical RWD, as well as medical professionals, the contact points for the collection and provision of RWD.

By building the health and medical RWD infrastructure envisioned in this strategic proposal, the development of medical-related industries, improvement in medical treatment outcomes, and acceleration of medical research driven by human data would be realized. In particular, a medical foundation model for language and images could further strengthen Japan's presence in diagnostic imaging equipment. Through these effects, it is expected that the sustainability of Japan's medical and healthcare system would be enhanced and consequently the health of the people would be maintained and further improved.
