Entities are keywords found in the text that have special meaning. These entities are grouped into types to identify meaningful clusters.
Concepts are broader and do not have to be mentioned in the text directly. As such, they capture the implicit meaning of a piece of text.
For entity recognition we use spaCy, an advanced NLP library. It also detects parts of speech and the relations between them, which allows us to process the detected entities in their specific context.
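As a rough illustration of grouping entities by type with spaCy, here is a minimal sketch. It is not our production pipeline: to stay runnable without downloading a trained model, it assumes a blank English pipeline with spaCy's rule-based entity ruler, and the patterns, labels and the `entities_by_type` helper are hypothetical. A trained pipeline would add statistical entity recognition, part-of-speech tags and the dependency parse.

```python
import spacy

# Assumption: a blank pipeline plus rule-based patterns, so that no trained
# model has to be downloaded. The patterns below are illustrative only.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Opentopic"},
    {"label": "PRODUCT", "pattern": "spaCy"},
    {"label": "LANGUAGE", "pattern": "Python"},
])


def entities_by_type(text):
    """Group the detected entity strings by their type label."""
    grouped = {}
    for ent in nlp(text).ents:
        grouped.setdefault(ent.label_, []).append(ent.text)
    return grouped


result = entities_by_type("Opentopic uses spaCy and Python for text analysis.")
print(result)
```

Each entry in `doc.ents` carries the matched text and a type label, which is what makes it possible to cluster entities by type.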
Concepts are extracted using a model trained on data from Wikipedia, which is why we can find a correlation between a text referring to a concept and its description.
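The idea of correlating a text with a concept's description can be sketched in a few lines. This is a deliberately simplified stand-in, not the trained model described here: it assumes a plain bag-of-words cosine similarity, and the two concept descriptions, the stopword list and all function names are hypothetical placeholders for Wikipedia-derived data.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for Wikipedia-style concept descriptions.
CONCEPT_DESCRIPTIONS = {
    "Machine learning": "algorithms that learn patterns from data "
                        "to make predictions with a trained model",
    "Skydiving": "sport of jumping from an aircraft and descending "
                 "with a parachute",
}

# Small illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "from", "with",
             "that", "on", "in", "is", "was"}


def bag_of_words(text):
    """Lowercase the text, split it into words and count non-stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_concepts(text):
    """Return (concept, score) pairs, highest similarity first."""
    vec = bag_of_words(text)
    scores = {name: cosine(vec, bag_of_words(desc))
              for name, desc in CONCEPT_DESCRIPTIONS.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_concepts(
    "The model was trained on labelled data and now predicts prices."
)
print(ranking)
```

The sample sentence never contains the words "machine learning", yet that concept ranks first because the sentence shares vocabulary with its description. A trained model generalizes the same idea beyond exact word overlap.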
The technology behind it
We developed concept recognition using popular machine learning libraries for the Python programming language. With it, we can extract information that is not explicitly mentioned in the text.
This allows us to classify content precisely, gain deeper insights, or provide highly relevant content.
We either use popular, robust libraries or develop our own solutions using general machine learning and deep learning frameworks.
Lead Engineer Special Projects
Lead Developer Behind this AI Application:
Jan Waś received his PhD in 2008 at Karazin Kharkiv National University. While working there as an Associate Professor, he dealt with various heterogeneous data from scientific measurements and simulations.
Since 2015 he has applied his expertise in Python programming and machine learning at TJ.Software.
He now leads the Data Science direction at Opentopic and lectures at Bialystok Technical University.
He is passionate about activities that extend the limits of what is possible, either for himself or for mankind as a whole: from skydiving and climbing to state-of-the-art AI algorithms.