Set Up Gensim for Learning Word2Vec

Sagiruddin Mondal
Sep 6, 2018 · 1 min read

Hi! This is very straightforward: a note on my gensim setup for training word2vec on a small dataset.

I use Anaconda to manage the platform, so here is the setup.

Installations:

  1. Install Anaconda: https://docs.anaconda.com/anaconda/install/mac-os#macos-graphical-install
  2. conda update conda
  3. conda update anaconda
  4. conda install -c anaconda gensim

I am going to create a dedicated virtual environment for this work so the dependencies stay isolated and easy to manage.

Virtual Conda Environment

  1. Reference: https://conda.io/docs/user-guide/tasks/manage-environments.html
  2. conda create -n myenv python=3.6
  3. conda install -n myenv gensim

External packages I need:

  1. pandas
  2. nltk
  3. gensim
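
Before running the script, it can help to confirm that these three packages are importable from the active environment. Here is a minimal sketch that uses only the standard library; the helper name `missing_packages` is my own and not part of any of these libraries.

```python
import importlib.util

# Packages the script below relies on.
REQUIRED = ["pandas", "nltk", "gensim"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

Anything reported as missing can be installed into the environment the same way as gensim above, for example with conda install -n myenv followed by the package name.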

And here is the code:

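import os

import pandas as pd
import nltk
import gensim

# nltk.word_tokenize needs NLTK's Punkt tokenizer data. If it is missing,
# download it once with nltk.download('punkt') (newer NLTK releases call it 'punkt_tab').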

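# Load the jokes dataset; the path is specific to my machine.
os.chdir("/Users/sagir/Documents/data/ai/deeplearning")
df = pd.read_csv('jokes.csv')

# Questions and answers together form the training corpus.
# Missing cells are dropped so the tokenizer only ever sees strings.
x = df['Question'].dropna().astype(str).tolist()
y = df['Answer'].dropna().astype(str).tolist()

corpus = x + y

print(corpus)

# Split every sentence into a list of word tokens.
token_corpus = [nltk.word_tokenize(data) for data in corpus]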

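#-----------------------------------------------
# For training and creating the model
#-----------------------------------------------
# min_count=5 ignores words seen fewer than 5 times; size=32 is the vector dimensionality.
# Note: gensim 4.0+ renamed `size` to `vector_size`.
model = gensim.models.Word2Vec(token_corpus, min_count=5, size=32)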

#-----------------------------------------------
#For Saving the model
#-----------------------------------------------
model.save('jokeModelSaved')

model = gensim.models.Word2Vec.load('jokeModelSaved')

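# Query through model.wv: calling most_similar on the model itself is deprecated
# and was removed in gensim 4.0. 'Hi' must occur at least min_count times in the corpus.
print(model.wv.most_similar('Hi'))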
#model.most_similar([0.8904996514320374])
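
One thing to watch with min_count=5: Word2Vec ignores every token that occurs fewer than five times in the whole corpus, so on a small dataset a word such as 'Hi' may not make it into the vocabulary, and the similarity lookup then fails with a KeyError. The sketch below approximates that filtering step in plain Python (no gensim needed) so you can check which tokens would survive. The function name `surviving_vocab` and the toy sentences are made up for illustration, and gensim's real vocabulary building does more than this simple count.

```python
from collections import Counter


def surviving_vocab(token_corpus, min_count=5):
    """Approximate which tokens remain after a min_count frequency cutoff."""
    # Count every token across all tokenized sentences.
    counts = Counter(token for sentence in token_corpus for token in sentence)
    # Keep only tokens whose total frequency reaches the threshold.
    return {token for token, n in counts.items() if n >= min_count}


# Hypothetical toy data, standing in for the tokenized jokes corpus.
toy_corpus = [
    ["Hi", "there"],
    ["Hi", "again"],
    ["why", "did", "the", "chicken", "cross", "the", "road"],
]

vocab = surviving_vocab(toy_corpus, min_count=2)
print(sorted(vocab))   # tokens that appear at least twice
print("Hi" in vocab)   # check a word before querying the model for it
```

If the word you want to query does not survive, lowering min_count or adding more data are the usual options.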

For the full implementation and code, please visit:

https://github.com/beingsagir/basic-ai-practices/blob/master/gensim-basics/main.py
