NLP @ IDSIA
Introduction
This website is an entry point for NLP research at IDSIA. The NLP group at IDSIA was established in 2019. Together with our host institute (IDSIA), we share a joint affiliation with the University of Applied Sciences and Arts of Southern Switzerland (SUPSI) and the Università della Svizzera Italiana (USI).
The dual nature of IDSIA (basic research and technology transfer) allows us to perform state-of-the-art research and, at the same time, requires us to collaborate with local and national companies to bring these technologies into practical use.
Follow us on Twitter/X: @idsianlp
Our Research
We combine an understanding of the nature of natural language (human language) with expertise in the most recent techniques in the field of Natural Language Processing (NLP), in particular transformer-based architectures (including Large Language Models).
We apply our expertise to basic research and to applied projects in collaboration with industry, in many cases funded by the Swiss Innovation Agency (InnoSuisse). Selected examples of recent projects are listed below.
A specific area of research interest is biomedical text processing for different textual domains, such as the scientific literature, clinical reports, and social media. We are also working on applications of NLP deep learning models (LLMs) in the financial domain, in collaboration with the Swiss banking industry.
During the COVID-19 pandemic we performed several biomedical text mining activities in support of COVID-19 research, in particular:
- Processing biomedical literature about COVID-19.
- Monitoring Twitter conversations about COVID-19.
- Collaborating on a repository of COVID-19 literature, with classification into clinically relevant categories and translations into Spanish.
Recent publications
Follow this link for the full list of publications. Below you can find a few selected publications.
- Joseph Cornelius, Oscar Lithgow-Serrano, Sandra Mitrovic, Ljiljana Dolamic, and Fabio Rinaldi. 2024. BUST: Benchmark for the evaluation of detectors of LLM-Generated Text. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 8029–8057, Mexico City, Mexico. Association for Computational Linguistics.
doi: 10.18653/v1/2024.naacl-long.444 https://github.com/IDSIA-NLP/BUST
- Lorenzo Ruinelli, Amos Colombo, Mathilde Rochat, Sotirios Georgios Popeskou, Andrea Franchini, Sandra Mitrović, Oscar William Lithgow, Joseph Cornelius, and Fabio Rinaldi. 2024. Experiments in Automated Generation of Discharge Summaries in Italian. In Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024, pages 137–144, Torino, Italia. ELRA and ICCL.
https://aclanthology.org/2024.cl4health-1.17/
- Nico Colic, Jin-Dong Kim, and Fabio Rinaldi. 2024. Pre-Gamus: Reducing Complexity of Scientific Literature as a Support against Misinformation. In Proceedings of the Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context @ LREC-COLING 2024, pages 196–201, Torino, Italia. ELRA and ICCL.
https://aclanthology.org/2024.determit-1.18
- Anastassia Shaitarova, Jamil Zaghir, Alberto Lavelli, Michael Krauthammer, Fabio Rinaldi. Exploring the Latest Highlights in Medical Natural Language Processing across Multiple Languages: A Survey. IMIA Yearbook of Medical Informatics 32(01):230-243, December 2023. doi: 10.1055/s-0043-1768726
- Vani Kanjirangat, Tanja Samardžić, Ljiljana Dolamic, Fabio Rinaldi (2023). Optimizing the Size of Subword Vocabularies in Dialect Classification. doi: 10.18653/v1/2023.vardial-1.2
- Lithgow-Serrano, O., Cornelius, J., Rinaldi, F., Dolamic, L. (2022). mattica@SMM4H’22: Leveraging sentiment for stance & premise joint learning. Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop and Shared Task, 75–77. https://aclanthology.org/2022.smm4h-1.22
- Kanjirangat, V., Samardzic, T., Rinaldi, F., Dolamic, L. (2022). Early Guessing for Dialect Identification. To appear in Findings of The 2022 Conference on Empirical Methods in Natural Language Processing.
- Kanjirangat, V., Samardzic, T., Dolamic, L., Rinaldi, F. (2022). NLPDI at NADI Shared Task Subtask-1: Sub-word Level Convolutional Neural Models and Pre-trained Binary Classifiers for Dialect Identification. Proceedings of the NADI Shared Task, The Seventh Arabic Natural Language Processing Workshop (WANLP) at The 2022 Conference on Empirical Methods in Natural Language Processing.
- Lenz Furrer, Joseph Cornelius, Fabio Rinaldi. Parallel sequence tagging for concept recognition. BMC Bioinformatics volume 22, Article number: 623 (2021). doi: 10.1186/s12859-021-04511-y
- Roberto Zanoli, Alberto Lavelli, Theresa Löffler, Nicolas Andres Perez Gonzalez, Fabio Rinaldi. An annotated dataset for extracting gene-melanoma relations from scientific literature. Journal of Biomedical Semantics, volume 13, Article number: 2 (2022). doi: 10.1186/s13326-021-00251-3
Team Members
Researchers
Associated group members
Former members and temporary visitors
Group news
2024
- Our SNF project proposal (M2P2) has been approved! We are looking forward to a collaboration with Prof. Michael Krauthammer (UZH) and Prof. J.L. Raisaro (CHUV) to increase the impact of modern NLP technologies in the Swiss health sector!
- Invited presentation at the "Giornata della democrazia", Locarno.
- Our article on evaluating detectors of LLM-generated text has been accepted at NAACL 2024: "BUST: Benchmark for the evaluation of detectors of LLM-Generated Text", by Joseph Cornelius, Oscar Lithgow-Serrano, Sandra Mitrovic, Ljiljana Dolamic, and Fabio Rinaldi.
- During 2024 our participation in the Swiss AI initiative will be a core focus of our activity. The Swiss AI initiative is a Swiss-wide consortium to develop innovative AI applications using the new powerful infrastructure Alps, provided by the Swiss National Supercomputing Centre.
Past
See here.
Our Projects
We execute several technology transfer projects in collaboration with Swiss companies, with the aim of bringing the benefits of advanced NLP technologies into an industrial context.
We also have a few pure research projects, exploratory in nature. Our main research interest is NLP applications in the health area. Check in particular the SNF-funded projects QUADRATIC and M2P2.
Below you can find some representative examples of the projects we are involved in. This is not an exhaustive list, partly because, for contractual reasons, we are not allowed to mention some projects.
Active (as of Oct 2024)
SNF/M2P2
Medical, Multilingual and Privacy-Preserving Natural Language Processing in the clinical domain (M2P2)
SNF/QUADRATIC (2024)
NLP in support of Pharmacovigilance: QUality Adverse Drug Reaction AcTIve Control (QUADRATIC)
Project in collaboration with EOC.
Swiss AI initiative (2024)
Coordination of IDSIA activities in relation to the Swiss AI Initiative
The Swiss National Supercomputing Centre (CSCS) is performing a major upgrade of its infrastructure. The new Alps infrastructure, which will be capable of supporting the development of innovative AI applications such as Large Language Models, will become available early next year, and the Swiss academic community is organizing itself to make use of it. Working groups are being formed across Switzerland to address different potential applications (from the development of a foundational model to specific applications in science, education, medicine, etc.). The purpose of this project is to coordinate IDSIA's participation in the Swiss AI initiative.
Mini-MUSE (2023-2024)
WRSD (2022-2024)
Risk identification and prevention of stress-related disorders in the workplace (Identificazione del Rischio e Prevenzione di Disordini dovuti allo Stress nell’ambiente lavorativo).
Brisk.AI (2023-2024)
This is a small project in collaboration with Dr. Yalbi Itzel Balderas-Martínez of the National Institute of Respiratory Diseases (INER) in Mexico. It aims to use AI techniques to produce translated and simplified versions of scientific literature for educational purposes.
Previous projects
A list of all current and previous projects can be found here.
How to find us
We are based at the Dalle Molle Institute for Artificial Intelligence (IDSIA), in Lugano, Switzerland.
Address
Click here to find our location on a map
Dalle Molle Institute for Artificial Intelligence Research / Istituto Dalle Molle di studi sull’intelligenza artificiale (IDSIA)
IDSIA USI-SUPSI
Polo universitario Lugano - Campus Est
Via la Santa 1
CH-6962 Lugano - Viganello
Contact
Dr. Fabio Rinaldi
E-Mail: fabio AT idsia.ch
Tel: +41 (0)79 300 67 71
Skype: fabio.rinaldi.uzh