We introduce feature hashing, a technique for reducing the dimensionality of sparse feature vectors. It is particularly useful with bag-of-words and tf-idf representations of documents. Such dimensionality reduction is needed in many applications, for instance with neural networks, which cannot easily exploit sparse input vectors. We also discuss its similarities to the Johnson-Lindenstrauss random projection.
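As a rough illustration of the idea, here is a minimal sketch of feature hashing applied to a bag of words. It is not taken from the video: the function names `hash_features` and `_stable_hash`, the bucket count, and the choice of MD5 as the hash function are all assumptions made for this example. The sketch uses the signed variant of the trick, in which a second hash picks a sign of +1 or -1 so that colliding tokens tend to cancel rather than accumulate; that sign also makes the mapping resemble a sparse random projection.

```python
import hashlib
from collections import Counter


def _stable_hash(token: str, salt: str) -> int:
    """Deterministic 64-bit hash (hypothetical helper).

    Python's built-in hash() is randomized per process for strings,
    so a digest from hashlib is used instead.
    """
    digest = hashlib.md5((salt + token).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def hash_features(tokens, n_buckets=16):
    """Map a list of tokens to a dense vector of length n_buckets.

    Each token is hashed to a bucket index, and a second hash chooses
    a sign, so no vocabulary needs to be stored.
    """
    vec = [0.0] * n_buckets
    for token, count in Counter(tokens).items():
        index = _stable_hash(token, "index:") % n_buckets
        sign = 1.0 if _stable_hash(token, "sign:") % 2 == 0 else -1.0
        vec[index] += sign * count
    return vec


if __name__ == "__main__":
    doc = "the cat sat on the mat".split()
    print(hash_features(doc, n_buckets=8))
```

The output length is fixed by `n_buckets` regardless of vocabulary size, which is the point of the technique. The price is that distinct tokens can collide in the same bucket, and fewer buckets means more collisions.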
Machine Learning 50: Feature Hashing