The meeting discussed various methods and models for representing and analyzing text data, including one-hot encoding, n-grams, word co-occurrence metrics, the skip-gram and continuous bag-of-words (CBOW) architectures, the GloVe model, and FastText.
One-hot encoding represents each word as a binary vector in which a single position is 1 and all others are 0.
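A minimal sketch of one-hot encoding, using an assumed helper `one_hot_encode` over a toy token list (names are illustrative, not from the session):

```python
def one_hot_encode(tokens):
    """Map each unique token to a binary vector with a single 1."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    return {word: [1 if i == index[word] else 0 for i in range(len(vocab))]
            for word in vocab}

vectors = one_hot_encode(["cat", "dog", "cat", "fish"])
# vocab is sorted: cat, dog, fish → "dog" becomes [0, 1, 0]
```

Note that vector length grows with vocabulary size and the vectors carry no notion of similarity, which motivates the distributed methods below.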
N-grams capture contiguous sequences of words in sentences; frequently occurring n-grams can also be merged into single phrase tokens.
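A short sketch of n-gram extraction and frequency counting (the `ngrams` helper is an assumed illustration, not code from the session):

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the cat sat on the mat".split()
bigram_counts = Counter(ngrams(tokens, 2))
# e.g. ("the", "cat") occurs once; 5 bigrams total for 6 tokens
```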
Word co-occurrence metrics, such as PMI (pointwise mutual information) and PPMI (positive PMI), can be used to measure the association between words.
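A minimal PPMI sketch over toy co-occurrence counts, assuming the standard definitions PMI(w, c) = log(p(w, c) / (p(w)·p(c))) and PPMI = max(0, PMI); the counts and the `ppmi` helper are illustrative:

```python
import math
from collections import Counter

def ppmi(pair_counts):
    """Compute PPMI scores from raw (word, context) co-occurrence counts."""
    total = sum(pair_counts.values())
    w_counts, c_counts = Counter(), Counter()
    for (w, c), n in pair_counts.items():
        w_counts[w] += n
        c_counts[c] += n
    scores = {}
    for (w, c), n in pair_counts.items():
        pmi = math.log((n / total) / ((w_counts[w] / total) * (c_counts[c] / total)))
        scores[(w, c)] = max(0.0, pmi)  # clip negative PMI to zero
    return scores

counts = {("ice", "cold"): 8, ("ice", "hot"): 1,
          ("steam", "hot"): 8, ("steam", "cold"): 1}
scores = ppmi(counts)
# ("ice", "cold") gets a positive score; ("ice", "hot") is clipped to 0
```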
Skip-gram and continuous bag of words (CBOW) are two different architectures for learning word representations: skip-gram predicts context words from a target word, while CBOW predicts the target word from its context.
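A sketch of how the two architectures frame their training examples, assuming a symmetric context window (the helpers `skipgram_pairs` and `cbow_pairs` are illustrative, not from the session):

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram: one (target, context_word) pair per neighbor in the window."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=2):
    """CBOW: one (context_words, target) example per position."""
    pairs = []
    for i, target in enumerate(tokens):
        context = [tokens[j] for j in range(max(0, i - window),
                                            min(len(tokens), i + window + 1))
                   if j != i]
        pairs.append((context, target))
    return pairs

tokens = "the quick brown fox".split()
# skip-gram yields pairs like ("the", "quick"); CBOW yields
# (["the", "brown", "fox"], "quick") for the second position
```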
The GloVe model uses co-occurrence probability ratios to determine the relationship between words.
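A toy sketch of the probability-ratio idea behind GloVe: the ratio P(probe | w1) / P(probe | w2) is large for probe words related to w1, small for those related to w2, and near 1 for neutral words. The counts below are made-up illustrative numbers, not real corpus statistics:

```python
def cooccurrence_ratio(counts, w1, w2, probe):
    """Ratio P(probe | w1) / P(probe | w2) from raw co-occurrence counts."""
    total_w1 = sum(n for (w, c), n in counts.items() if w == w1)
    total_w2 = sum(n for (w, c), n in counts.items() if w == w2)
    p1 = counts.get((w1, probe), 0) / total_w1
    p2 = counts.get((w2, probe), 0) / total_w2
    return p1 / p2

counts = {("ice", "solid"): 8, ("ice", "gas"): 1, ("ice", "water"): 5,
          ("steam", "solid"): 1, ("steam", "gas"): 8, ("steam", "water"): 5}
# "solid" relates to ice (ratio >> 1); "water" is neutral (ratio ≈ 1)
```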
FastText is an open-source library for learning word embeddings and performing text classification; it represents each word as a bag of character n-grams, so it can build vectors even for out-of-vocabulary words.
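A sketch of FastText's subword decomposition, assuming its standard `<` and `>` word-boundary markers; the `char_ngrams` helper is illustrative and uses trigrams only for brevity:

```python
def char_ngrams(word, n_min=3, n_max=3):
    """Character n-grams with boundary markers, as in FastText's subword model."""
    marked = f"<{word}>"
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.add(marked[i:i + n])
    grams.add(marked)  # FastText also keeps the whole word as one unit
    return grams

# "where" → {"<wh", "whe", "her", "ere", "re>", "<where>"}
```

Because an unseen word still shares many of these subword units with known words, its vector can be composed from their embeddings.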
20240919 - TA - Text Representation - Distributed-based methods