Uncovering The Hottest Topics in Machine Learning
- Explored the NIPS Papers data to determine what type of data it is and how it is structured
- Prepared the data by removing all the columns that do not contain useful text information
- Visualized the number of publications per year to understand the extent of the machine learning ‘revolution’
- Cleaned the text data by applying a regular expression that removes punctuation from the titles, making them more amenable to analysis
- Created a word cloud of the research paper titles using Andreas Mueller’s word cloud library
- Employed a latent Dirichlet allocation (LDA) model to determine the topics, after converting each document into a simple vector representation
- Analyzed the research titles using LDA by tweaking the ‘number of topics’ and ‘number of words’ parameters of the LDA algorithm
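
The steps above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the `papers` list is invented toy data standing in for the NIPS Papers table, the helper names (`clean_title`, `fit_lda`, `top_words`) are hypothetical, and the LDA step is a small collapsed Gibbs sampler written with the standard library rather than whichever library implementation the project used. The word cloud step is left out because it depends on an external package.

```python
import random
import re
from collections import Counter

# Invented toy data standing in for the NIPS Papers table: (year, title).
papers = [
    (1987, "Learning in Neural Networks: A Toy Example"),
    (1999, "Kernel Methods for Sparse Regression"),
    (1999, "Bayesian Inference with Gaussian Processes"),
    (2016, "Deep Neural Networks for Image Recognition"),
    (2016, "Variational Bayesian Inference, Revisited"),
    (2017, "Sparse Kernel Regression in High Dimensions"),
]


def clean_title(title):
    """Strip punctuation with a regular expression and lowercase the title."""
    return re.sub(r"[^\w\s]", "", title).lower()


# Number of publications per year (the quantity the post visualizes).
papers_per_year = Counter(year for year, _ in papers)

# Tokenized titles and a simple bag-of-words (count vector) representation.
docs = [clean_title(title).split() for _, title in papers]
vocab = sorted({word for doc in docs for word in doc})
word_id = {word: i for i, word in enumerate(vocab)}
count_vectors = [[doc.count(word) for word in vocab] for doc in docs]
doc_ids = [[word_id[word] for word in doc] for doc in docs]


def fit_lda(doc_ids, n_topics, vocab_size, n_iter=200,
            alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns topic-word counts."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in doc_ids]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(doc_ids):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(doc_ids):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word


def top_words(topic_word, vocab, n_words):
    """Return the n_words highest-count words for each topic."""
    return [
        [vocab[i] for i in
         sorted(range(len(vocab)), key=lambda i: -row[i])[:n_words]]
        for row in topic_word
    ]


# The two parameters the post describes tweaking.
N_TOPICS = 2
N_WORDS = 3

topic_word = fit_lda(doc_ids, N_TOPICS, len(vocab))
topics = top_words(topic_word, vocab, N_WORDS)
for k, words in enumerate(topics):
    print(f"Topic {k}: {' '.join(words)}")
```

Changing `N_TOPICS` and `N_WORDS` changes how many topics are fitted and how many words are printed per topic. On a corpus this small the resulting topics are not meaningful; the sketch only shows how the pieces fit together.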