Statistical analysis of textual data
A large part of the data on news sites is in the form of text. Unlike numbers, text can’t be added, subtracted or multiplied. Even so, statistical analysis of large volumes of text is necessary for finding trends, identifying contexts, classifying documents, and so on. STACC, in collaboration with its spin-off company TEXTA, offers solutions for analyzing, classifying and visualizing text.
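As a rough illustration of how text becomes something that can be counted, the sketch below turns a handful of headlines into word frequencies, the simplest building block behind trend detection and document classification. It is a generic example, not a description of the STACC or TEXTA tooling; the function name `term_frequencies` and the sample headlines are invented for illustration.

```python
import re
from collections import Counter


def term_frequencies(text: str) -> Counter:
    """Count how often each lowercase word occurs in a text."""
    # \w+ matches runs of letters, digits and underscores (Unicode-aware in Python 3).
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


# Hypothetical headlines, used only to demonstrate the counting step.
headlines = [
    "Election results announced",
    "Parliament debates election law",
    "Weather warning for the weekend",
]

totals = Counter()
for headline in headlines:
    totals.update(term_frequencies(headline))

# Show the most frequent word across all headlines.
print(totals.most_common(1))
```

Real pipelines would typically add steps such as stop-word removal and lemmatization, which matter a great deal for a heavily inflected language like Estonian.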
Identifying documents containing personal data
News sites handle personal data on a daily basis, but this data exists in very different formats (SQL databases, Word, Excel, PDF, etc.). In the context of the revised General Data Protection Regulation, organizations must have a clear overview of which documents contain personal information. Over the years, STACC has developed an anonymization solution for texts in Estonian, which identifies whether a text contains characteristics referring to a particular person (name, personal identification code, address, etc.).
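To give a sense of what detecting one such characteristic can look like, here is a minimal sketch that searches plain text for Estonian personal identification codes. It is not STACC's solution: it assumes the commonly documented format (11 digits, the first in the range 1 to 8, the last a check digit computed with a two-stage modulo-11 scheme), it does not check that the embedded birth date is a real date, and the names `CANDIDATE`, `has_valid_checksum` and `find_personal_codes` are invented for illustration.

```python
import re

# An 11-digit run starting with 1-8 that is not part of a longer number.
CANDIDATE = re.compile(r"(?<!\d)[1-8]\d{10}(?!\d)")

# Assumed weights for the first and second checksum stages.
WEIGHTS = (
    (1, 2, 3, 4, 5, 6, 7, 8, 9, 1),
    (3, 4, 5, 6, 7, 8, 9, 1, 2, 3),
)


def has_valid_checksum(code: str) -> bool:
    """Check the last digit against the two-stage modulo-11 checksum."""
    digits = [int(ch) for ch in code]
    for weights in WEIGHTS:
        # zip stops after the ten weights, so the check digit is excluded.
        remainder = sum(d * w for d, w in zip(digits, weights)) % 11
        if remainder < 10:
            return remainder == digits[10]
    # Both stages gave remainder 10: the check digit is defined as 0.
    return digits[10] == 0


def find_personal_codes(text: str) -> list[str]:
    """Return all substrings that look like valid personal codes."""
    return [m.group() for m in CANDIDATE.finditer(text)
            if has_valid_checksum(m.group())]


# Sample value chosen because it satisfies the checksum; the second
# number differs only in its last digit and is therefore rejected.
print(find_personal_codes("ID 37605030299, phone 37605030298"))
```

Names and addresses are much harder to catch with patterns alone, which is where language-specific models for Estonian become necessary.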