Bag of words in 5 lines:
#bag of words
from collections import Counter
temp = Counter()
for xs in df['tags']:
    temp += Counter(xs)
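A self-contained sketch of the same count, assuming each entry of df['tags'] is a list of strings. A plain list of lists (`tag_lists`, made-up data) stands in for the column so it runs without pandas, and `Counter.update` is swapped in for `+=` to avoid building a throwaway Counter per row:

```python
from collections import Counter

# Hypothetical stand-in for df['tags']: one list of tags per row.
tag_lists = [
    ["python", "pandas"],
    ["python", "nltk"],
    ["pandas", "python"],
]

counts = Counter()
for tags in tag_lists:
    counts.update(tags)  # adds each tag's occurrences in place

# The two most frequent tags with their counts.
top = counts.most_common(2)
print(top)
```

`len(counts)` then gives the number of unique tags, which is handy for checking what a stemmer does to the vocabulary.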
Quick stemmer, though not great for my purposes:
#this stemmer leads to _more_ unique words, not fewer
from nltk.stem.porter import PorterStemmer
porter_stemmer = PorterStemmer()
df['tags'] = df['tags'].apply(lambda row: [porter_stemmer.stem(word) for word in row])
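One way to check whether a stemmer shrinks or grows the vocabulary is to compare the number of distinct tags before and after. A rough sketch: `vocab_size` and `toy_stem` are hypothetical names, the data is made up, and `toy_stem` only stands in for `porter_stemmer.stem` so the snippet runs without nltk:

```python
def vocab_size(tag_lists):
    """Number of distinct tags across all rows."""
    return len({tag for tags in tag_lists for tag in tags})

def toy_stem(word):
    # Hypothetical stand-in for porter_stemmer.stem: strips a trailing "s".
    return word[:-1] if word.endswith("s") else word

# Hypothetical stand-in for df['tags'].
tag_lists = [["plots", "plot"], ["tables", "plot"]]
stemmed = [[toy_stem(w) for w in tags] for tags in tag_lists]

before = vocab_size(tag_lists)
after = vocab_size(stemmed)
print(before, after)
```

Passing the real stemmer's `stem` method in place of `toy_stem` gives the same before/after comparison on the actual tags.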
https://github.com/plotly/dash-sample-apps/blob/master/apps/dash-manufacture-spc-dashboard/app.py
See also: