Introducing … Rosa Filgueira
What’s your name?
What’s your background?
I’m a computer scientist with a background in high-performance and data-intensive computing. I have always worked in academia, collaborating closely with different domains such as geosciences, biomedicine and, most recently, digital humanities. Over the last few years I’ve developed several optimization algorithms for improving the performance of parallel applications, built scientific gateways that open up opportunities to share data and scientific methods, and contributed to the design of a new data-streaming framework, among other work. Throughout, I have kept the end users of my code in mind, so they can express their computational activities without being distracted by technology and middleware details.
In one sentence, what is your role on the project?
I’m a Research Software Engineer who enables complex analysis of digital datasets at scale.
What excites you about the project?
This project gives me the opportunity to provide the means to search across large-scale datasets and to return results for further analysis and interpretation by historians and computational linguists. For me it is very important to build solutions that let researchers analyse large datasets as easily as smaller ones, so they can work more effectively as they create, refine and use their text-mining queries and language models in a scalable way.
What challenges do you see ahead?
Although most digitisation of historical text produces structured files derived from Optical Character Recognition (OCR) software, the schemas, structure and physical representation of the datasets are heterogeneous, and they are often difficult to link and cross-query. Automatically extracting the commonalities among them will allow researchers to work with digital collections in a transparent and homogeneous way. The quality of the OCR can also vary a lot between datasets, so improving the quality of the scanned text will be another challenge that we face in the project.
What’s the last (non work) book you read, exhibition or performance you saw?
Tetsuya Ishida, “Portrait of the other”, Madrid (Palacio Velazquez).