Wrote a word frequency analyzer script for Russian text
Hello everyone! I have been looking for a way to build frequency lists that don't count different inflected forms of the same word (its different grammatical cases, for example) as separate words.
I couldn't find anything that did what I wanted, so I made it. It's a Python script that reads in a text file, counts how often each base word (lemma) occurs, and either prints the results to the screen in descending order of frequency or writes them in CSV format to an output file. It uses the Yandex Mystem 3 morphological analyzer to reduce each word to its base form.
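For anyone curious about the general idea, here is a minimal sketch of lemma-based counting. It is not the code from the repo: the tiny `TOY_LEMMAS` table and the `toy_lemmatize`, `lemma_frequencies`, and `to_csv` names are made up for illustration, and the toy table stands in for Mystem so the example runs without any extra installs. The simple regex also splits hyphenated words, which a real analyzer would probably handle better.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical toy lemma table standing in for a real morphological
# analyzer such as Mystem; it only knows the few forms used below.
TOY_LEMMAS = {
    "книги": "книга",
    "книгу": "книга",
    "читал": "читать",
    "читаю": "читать",
}


def toy_lemmatize(word):
    """Return the base form of a word, or the word itself if unknown."""
    return TOY_LEMMAS.get(word, word)


def lemma_frequencies(text, lemmatize=toy_lemmatize):
    """Count base forms in the text, most frequent first."""
    # Lowercase, then pull out runs of Cyrillic letters as words.
    words = re.findall(r"[а-яё]+", text.lower())
    counts = Counter(lemmatize(w) for w in words)
    # Sort by descending count; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def to_csv(rows):
    """Render (lemma, count) rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["lemma", "count"])
    writer.writerows(rows)
    return buf.getvalue()


sample = "Я читал книги. Я читаю книгу. Книги!"
rows = lemma_frequencies(sample)
print(to_csv(rows))
```

Here "книги" and "книгу" are both counted under "книга", and "читал" and "читаю" under "читать", which is the whole point of counting base forms instead of surface forms. Any function that maps a word to its base form could be passed in as `lemmatize` in place of the toy one.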
It only supports Russian, unfortunately, but here it is: https://github.com/branover/word_frequencifizer
I just wrote it yesterday, so it might be buggy. If you find any problems with it or would like me to add new features, let me know! Enjoy :)