System of Automated Text Messages Clustering by Semantic Proximity Based on NLP and Machine Learning Methods

Table 1: Examples of data tokenization in the BBC tweets dataset

| Original documents from the corpus | Tokenized documents |
|---|---|
| Ambulance progress not fast enough | ambulance progress not fast enough |
| Guinea declares Ebola emergency | guinea declare ebola emergency |
| VIDEO: Sitting down poses health risk | video sit down pose health risk |
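
The rows in Table 1 are consistent with three preprocessing steps: lowercasing, stripping punctuation, and reducing each word to its lemma. This page does not say which tools the authors used, so the following is only a minimal sketch under that assumption. The `tokenize` function and the `TOY_LEMMAS` dictionary are hypothetical names introduced for illustration; the dictionary covers just the words in the table and stands in for a real lemmatizer from an NLP library.

```python
import re

# Hypothetical toy lemma map covering only the inflected words in Table 1.
# A real pipeline would call a proper lemmatizer here instead.
TOY_LEMMAS = {
    "declares": "declare",
    "sitting": "sit",
    "poses": "pose",
}


def tokenize(text):
    """Lowercase the text, keep alphabetic runs only, and map each token to its lemma."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [TOY_LEMMAS.get(token, token) for token in tokens]


if __name__ == "__main__":
    documents = [
        "Ambulance progress not fast enough",
        "Guinea declares Ebola emergency",
        "VIDEO: Sitting down poses health risk",
    ]
    for document in documents:
        print(" ".join(tokenize(document)))
```

Words such as "not" and "down" are kept in the table's tokenized column, so this sketch applies no stop-word filtering.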