Description: | This package contains the pure Python implementation of gensim.
If you don't need the highly optimized version of word2vec, it is
sufficient to install this package. Otherwise, installing the
"python-gensim-addons" package is strongly recommended.
Gensim is a Python library for topic modelling, document indexing
and similarity retrieval with large corpora. The target audience is
the natural language processing (NLP) and information retrieval
(IR) communities.
Features:
* All algorithms are memory-independent w.r.t. the corpus size
(can process input larger than RAM).
* Intuitive interfaces
- easy to plug in your own input corpus/datastream (trivial
streaming API)
- easy to extend with other Vector Space algorithms (trivial
transformation API)
* Efficient implementations of popular algorithms, such as online
Latent Semantic Analysis (LSA/LSI), Latent Dirichlet Allocation (LDA),
Random Projections (RP), Hierarchical Dirichlet Process (HDP) and
word2vec deep learning.
* Distributed computing: can run Latent Semantic Analysis and Latent
Dirichlet Allocation on a cluster of computers, and word2vec on
multiple cores.
* Extensive HTML documentation and tutorials. |