PLLuM – Polish Open Large Language Model as a Tool for Knowledge Base Exploration

PLLuM is a family of AI models that can serve multiple purposes, such as summarizing extensive reports, answering questions in natural language, generating official documents, or even automatically tagging archival collections. Crucially, PLLuM models can also be fine-tuned for specific industries, including medicine, banking, and public administration. This provides researchers and enterprises with a cost-effective yet flexible tool that understands the cultural context, idioms, and colloquialisms unique to the Polish language.
[Video: recording from the seminar]
An open, specialized language model lets research teams implement advanced natural language processing capabilities (such as fact extraction, content classification, and semantic search) without the expense of training their own language models from scratch. Making such models openly available can significantly strengthen projects that analyze Polish-language texts. For instance, within the OpenFact project conducted by the Department of Information Systems, large language models can support verifying the credibility of internet sources and detecting fake news.
[Video: Prof. Maciej Piasecki, briefly about the PLLuM project]
Prof. Maciej Piasecki specializes in natural language processing, computational linguistics, lexicography, and digital humanities. He is also the coordinator and co-founder of CLARIN-PL, the Polish part of the European research infrastructure CLARIN ERIC, which supports language technologies for the humanities and social sciences. CLARIN-PL aids researchers by providing language resources, tools, research applications, and computational infrastructure, while promoting open science.
The seminar was held on May 9, 2025, in a hybrid format. More information about PLLuM can be found at pllum.org.pl.