The Hellenic OCR Team announces the completion of the software development project Tool for Extracting Quantitative Text Profiles. The code was developed by Panagiotis Papantonakis for GSoC 2019. Project management was undertaken by Hellenic OCR Team members acting as mentors.

Project Description

In this project, I developed a user-friendly desktop GUI that extracts various linguistic features from texts using existing NLP packages. The application was built with Electron, ReactJS, and the MaterialUI CSS framework, with MongoDB for the database. Its main target groups are students and scientists of computational linguistics who lack programming skills and need an easy-to-use tool to perform their analysis. Within the application, users can import texts, select the indices they want to calculate, and export the generated results. The application is also flexible and modular: users can add custom scripts to be executed on the selected texts. For more information about the project, the technologies used, and instructions on how to install and operate the tool, visit the project wiki.
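To make the idea of a custom script more concrete, here is a minimal Python sketch of the kind of quantitative index such a script might compute for one text. It is not taken from the project. The function names `tokenize` and `text_profile`, the tokenization rule, and the shape of the returned dictionary are all assumptions made for illustration, and the interface the tool actually expects from a custom script may differ.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Hypothetical rule: a token is any run of Unicode word characters,
    so Greek and other non-ASCII text is handled as well.
    """
    return re.findall(r"\w+", text.lower())


def text_profile(text):
    """Return a few simple quantitative indices for a single text."""
    tokens = tokenize(text)
    if not tokens:
        # Avoid division by zero for empty input.
        return {
            "tokens": 0,
            "types": 0,
            "type_token_ratio": 0.0,
            "mean_word_length": 0.0,
        }
    types = Counter(tokens)  # distinct word forms and their counts
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat. The mat was flat."
    print(text_profile(sample))
```

For the sample sentence, the sketch counts 10 tokens and 7 distinct types, giving a type-token ratio of 0.7. Indices of this sort (token counts, vocabulary size, average word length) are typical examples of a quantitative text profile.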

Repository

All of my work can be found at the project’s repository, along with the code of the tool. My commits are here.

Project Progress

Since this project was developed under the GSoC 2019 program, I kept weekly reports on my progress, which can be found at the relevant wiki page.

Future Work

The current version of the tool can be considered alpha. It is functional but contains many bugs and has many areas for improvement. Planned and suggested future work can be found here.

Written by Panagiotis Papantonakis; copied from his GitHub gist, 2019.

GSoC Mentors:

  • Mentor (front-end): Sotiris Leventis (sotirisleventis)
  • Mentor (back-end): George Mikros (gmikros)
  • Mentor (project management): Fotis Fitsilis (fitsilisf)

Copy for the Hellenic OCR Team, 29 August 2019
