Even before that can happen, the country has to overcome language barriers …
To enhance its generative AI (GenAI) capabilities, Japan’s New Energy and Industrial Technology Development Organization (NEDO), under the Generative AI Accelerator Challenge (GENIAC) initiative of the Ministry of Economy, Trade and Industry (METI), has kicked off what is called the “Research and Development Project of the Enhanced Infrastructures for Post-5G Information and Communication Systems/Development of post-5G information and communication systems.”
Despite the complicated terminology, the project simply involves research and development (R&D) on technology “combining knowledge graphs with large language models (LLMs) that emulate logical reasoning.”
Part of the R&D has been slated to focus on the problem posed by unexpected “hallucinations” in LLMs: a phenomenon in which GenAI produces plausible but incorrect or unrelated output. This phenomenon could keep GenAI from being used in operations that require high levels of compliance and explainability, such as in the legal, accounting, finance, and medical fields.
To address this, one approach has been to control LLM output using knowledge graphs. The firm involved in the GENIAC project, Fujitsu, will develop two specialized LLMs for this purpose:
1. An LLM for knowledge graph generation, which converts natural language documents into knowledge graphs to structure the knowledge they contain.
2. An LLM for knowledge graph inference, which searches knowledge graphs for information relevant to a given question, aggregates it logically, and then generates the proper answer, with a particular focus on the Japanese language. The firm recently announced its own LLM that possesses enhanced capabilities in the Japanese language.
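To make the two roles concrete, here is a minimal toy sketch in Python. It is not Fujitsu's implementation: simple rule-based functions stand in for both LLMs, and every name in it (`generate_graph`, `build_index`, `answer`, the `RELATIONS` phrases, and the sample sentences) is hypothetical and chosen only for illustration. The point is the division of labor: one step turns text into (subject, relation, object) triples, and the other answers only by following triples that exist in the graph.

```python
import re
from collections import defaultdict

# Hypothetical relation phrases the toy extractor recognizes.
RELATIONS = ("is part of", "is regulated by")


def generate_graph(text):
    """Stand-in for the generation LLM: turn simple sentences into triples."""
    triples = []
    for sentence in re.split(r"\.\s*", text):
        for rel in RELATIONS:
            marker = f" {rel} "
            if marker in sentence:
                subj, obj = sentence.split(marker, 1)
                triples.append((subj.strip(), rel, obj.strip()))
                break
    return triples


def build_index(triples):
    """Index triples by subject so edges can be looked up quickly."""
    index = defaultdict(list)
    for subj, rel, obj in triples:
        index[subj].append((rel, obj))
    return index


def answer(index, entity, max_hops=3):
    """Stand-in for the inference LLM: follow edges from an entity and
    return the chain of supporting triples (empty if nothing is known)."""
    path = []
    current = entity
    for _ in range(max_hops):
        edges = index.get(current)
        if not edges:
            break
        rel, obj = edges[0]  # toy choice: always follow the first edge
        path.append((current, rel, obj))
        current = obj
    return path


# Hypothetical sample document.
text = "Audit team is part of Finance. Finance is regulated by Regulator X."
index = build_index(generate_graph(text))
print(answer(index, "Audit team"))
print(answer(index, "Marketing"))  # unknown entity: empty list, no invented answer
```

Because every step of the answer is a triple taken from the graph, the output doubles as an explanation of how it was reached, and a question about an entity missing from the graph yields an empty result rather than a plausible guess. That is the property that makes graph-constrained output attractive for compliance-heavy fields.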
To develop two specialized LLMs efficiently within a limited development period, Fujitsu will first develop a pre-trained LLM that is common to both. The advantage of this approach is that adding a bilingual corpus (English and Japanese) to the pre-training data enables the common model to handle both natural language documents and knowledge graphs, which facilitates the development of the specialized LLMs for knowledge graph generation and inference.
The firm plans to offer the new technology to the Japanese market at the end of fiscal year 2024, and then to share other knowledge and know-how acquired during the project with various AI communities such as Hugging Face, GitHub, and GENIAC-affiliated users.