Activity 6: Clustering Utah Legislative Documents (Python, sklearn)
For this activity, I thought I would share how I used K-Means clustering to organize documents and to recommend similar documents (bills). I used sklearn, which provided useful tools at every step of the process. This was part of a class project to build a website, so you can see how K-Means is integrated into a working system on the page we created.
The data came from a site that has bills and voting information for the Utah legislature. I scraped and cleaned the data, then clustered it. You can see my Python code here and see how the clustering was done.
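For anyone who wants the general idea before opening the full code, here is a minimal sketch of this kind of pipeline. It is not the project's actual code: the bill summaries are invented, the helper names `cluster_bills` and `recommend` are hypothetical, and the choice of TF-IDF features with cosine similarity inside a cluster is an assumption about one reasonable way to do it with sklearn.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical bill summaries, invented for illustration only.
BILLS = [
    "Public school funding and teacher salary amendments",
    "Charter school funding and student enrollment",
    "Teacher licensing and school classroom standards",
    "Highway construction and road maintenance funding",
    "Road safety and highway speed limit amendments",
    "Public transit and highway road expansion",
]


def cluster_bills(texts, n_clusters=2, seed=0):
    """Turn texts into TF-IDF vectors and assign each one a K-Means cluster."""
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(texts)  # sparse matrix, one row per text
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(tfidf)
    return tfidf, labels


def recommend(index, tfidf, labels, top_n=2):
    """Return up to top_n other documents from the same cluster as `index`,
    ordered by cosine similarity to it (most similar first)."""
    same = [i for i in range(len(labels))
            if labels[i] == labels[index] and i != index]
    if not same:
        return []
    sims = cosine_similarity(tfidf[index], tfidf[same]).ravel()
    ranked = sorted(zip(same, sims), key=lambda pair: pair[1], reverse=True)
    return [i for i, _ in ranked[:top_n]]


if __name__ == "__main__":
    tfidf, labels = cluster_bills(BILLS)
    for i in recommend(0, tfidf, labels):
        print(BILLS[i])
```

Restricting recommendations to the same cluster keeps the lookup cheap on a large collection, since similarity is only computed against cluster members. The number of clusters is a guess here and would need tuning on real data.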


About Becoming A Data Scientist is a blog created by Renee Teate to track her path from "SQL Data Analyst pursuing an Engineering Master's Degree" to "Data Scientist". She created this club so participants can work together and help one another learn data science. See her other site DataSciGuide for more learning resources.
