Jesper Findahl is a Senior R&D Software Engineer at CodeLounge, a center for software research and development (R&D) at the Università della Svizzera italiana (USI) in Switzerland. He joined CodeLounge in 2018 after earning a Master's degree in Informatics at USI.
At CodeLounge, Jesper's role has been diverse, spanning frontend and backend engineering, data visualization, continuous integration and delivery (CI/CD), software analysis, and, more recently, data analytics and machine learning.
Jesper is deeply passionate about software design and productivity and is always eager to explore new technologies and further his knowledge in the field.
As developers, we manipulate data every day. But one kind of data is particularly difficult to process and structure: human language. This is even harder in a country like Switzerland, where there are four official languages! Fortunately, computer science has devoted a decades-long effort to exactly this problem: Natural Language Processing (NLP).
In this talk, we cover many of the available components of a modern NLP pipeline, from basic tasks like tokenization and lemmatization to the most interesting techniques, like Named Entity Recognition (NER), coreference resolution, and dependency parsing. Furthermore, we show where and how LLMs, like GPT, can be plugged in to (possibly) enhance a pipeline.
To provide a real-world (and Swiss) context, our target dataset will be the Swiss Commercial Registry. This complex, multilingual public database is central to an expansive interdisciplinary research project in economics and political science, where we are building the software engineering backbone using cutting-edge NLP technology.