Data Visualization Project

May 10, 2015

You may have seen me tweeting about some research I did on “Data Visualization for Exploratory Data Analysis” for my Cognitive Systems Engineering course. My presentation went really well! I’m less satisfied with the paper since it was done in a hurry to complete the project deliverables, but I’m including it because it explains some things that aren’t obvious from the PowerPoint without my commentary.

Principles of Data Visualization for Exploratory Data Analysis [presentation – pdf]
Principles of Data Visualization for Exploratory Data Analysis [paper – pdf]

Check out the references in both documents for some good resources. I’ll include some links in the post below, too. I had a lot more material from my research that I wanted to include and just didn’t have time for in a 15-minute presentation! The professor was happy about the topic I picked because she’s teaching a class on Data Visualization next semester, so I think that worked out in my favor :)

These two books by Stephen Few covered the very basics of visualization for human perception:

Show Me the Numbers: Designing Tables and Graphs to Enlighten
Now You See It: Simple Visualization Techniques for Quantitative Analysis

Blog posts about related topics:

Six Revisions: Gestalt Laws
eagereyes: Illustration vs Visualization
Detailed visualization of NBA shot selection

Publications and articles:

IEEE Transactions on Visualization and Computer Graphics
Toward a Perceptual Science of Multidimensional Data Visualization: Bertin and Beyond by Marc Green, Ph.D.
Scagnostics by Dang and Wilkinson
Generalized Plot Matrix (GPLOM) by Im, McGuffin, Leung
UpSet: Visualization of Intersecting Sets by Lex, Gehlenborg, Strobelt, et al.

…and there are more resources in the paper and presentation files! (And if you’re REALLY interested in this topic, post a comment and I will add even more links I have bookmarked.)

I also did a project using some data from my day job related to university fundraising and major gift prospects, but unfortunately I can’t share that study here because I don’t have permission to do so. It included some cool visuals like bubble charts, and also an interesting analysis of movement through the prospect pipeline using Markov Chains. I learned a lot doing that one!

It was nice to end my final semester of grad school with two data-related projects! (Yes, I’m finally graduating! Masters of Systems Engineering! woo...
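
For readers wondering what a Markov Chain view of pipeline movement can look like, here is a minimal sketch. It is not the analysis from the study (which can't be shared): the stage names, the one-step time unit, and every probability below are invented purely for illustration.

```python
# Hypothetical prospect-pipeline stages. The names and probabilities are
# made up for illustration and are NOT taken from the actual study.
STAGES = ["identified", "cultivating", "solicited", "gift_closed"]

# TRANSITIONS[i][j] = assumed probability of moving from stage i to stage j
# in one time step (say, one quarter). Each row sums to 1.
TRANSITIONS = [
    [0.6, 0.4, 0.0, 0.0],
    [0.1, 0.6, 0.3, 0.0],
    [0.0, 0.2, 0.5, 0.3],
    [0.0, 0.0, 0.0, 1.0],  # absorbing state: a closed gift stays closed
]


def step(distribution, transitions):
    """Advance a probability distribution over stages by one time step."""
    n = len(transitions)
    return [
        sum(distribution[i] * transitions[i][j] for i in range(n))
        for j in range(n)
    ]


def distribution_after(start, transitions, steps):
    """Return the stage distribution after the given number of steps."""
    dist = list(start)
    for _ in range(steps):
        dist = step(dist, transitions)
    return dist


if __name__ == "__main__":
    # Start with every prospect in the first stage and watch them spread out.
    start = [1.0, 0.0, 0.0, 0.0]
    for quarters in (1, 4, 8):
        dist = distribution_after(start, TRANSITIONS, quarters)
        print(quarters, dict(zip(STAGES, (round(p, 3) for p in dist))))
```

In a real analysis the transition probabilities would be estimated from observed stage changes rather than typed in by hand.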


What I’m up to

October 26, 2014

I haven’t written here in a while because I haven’t “finished” anything I have been wanting to write about, but why wait until I’m completely done, right? So, here’s a bit about what I’ve been up to data-science-wise:

I’m in a grad class called Stochastic Models and we’re learning about Markov Chains right now. Fascinating stuff! Here’s a cool site that visually shows some Markov Chain concepts.

My other grad class is Intro to Systems Engineering. (Yeah, because of the courses offered online, I’m taking the intro class in my second-to-last semester!) We just did a neat project in that class that involved coming up with a strategy and participating in a baseball draft, so I’ll come back later and write about that in more detail.

I’m almost at the end of Udacity’s Cloudera Hadoop course. I am really enjoying learning about MapReduce, and will definitely write up a review of the class when I’m done. The biggest frustrations I’ve had so far haven’t come from the Hadoop concepts but from using the VM they provide! All I have left on that is the final project, so soon I’ll be able to cross that one off my goals list.

Soon, I need to come up with a final project for my Systems Engineering Master’s degree. I’m definitely doing something data science related, and will update when that plan is finalized.

I’ve been telling more people about my data science plans, and have had more people asking me about data science and the learning process. I may be giving a talk to my alma mater’s IEEE Computer Society club meeting soon about “What is data science?”, so that will be fun!

Are you “becoming a data scientist”, too? What projects are you in the middle of right...
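
For anyone new to the MapReduce idea mentioned above, here is a rough single-process sketch of its three phases using the classic word-count example. It is plain Python rather than Hadoop code and is not taken from the Udacity course; the function names are just illustrative.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: combine all the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["data science is fun", "Data is everywhere"]))
```

On a real cluster the mappers and reducers run in parallel on different machines and the framework handles the shuffle; the sketch only shows how the data flows between the phases.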


Goal #1 Reached!

May 13, 2014

My first “Becoming A Data Scientist” goal was to get an “A” in my Machine Learning class this semester, and I did! Now I can cross that one off the list: Updated Goals


ML Project 4 Results

May 11, 2014

I am happy to report that I got 100% on the final project I did in the last 2 weeks for my Machine Learning grad class (which is especially great because that was 30% of my grade for the semester!) and I got some good feedback from the professor:

Very good analysis and you showed great potential to become a good researcher! Comments:
1. when you code your categories features, 1 of k coding is a good choice. Did you apply this method to all categories features?
2. Some time, normalize features will make a huge difference. One way to do this is to compute the z-score for features before you train a model on the data.
3. In terms of machine learning application, your analysis is good. If you try to find a social study expert to collaborate with you, I believe your findings can be published on high impacting journals.
4. In order to publish your work, you will need to do some research to find what have been done in this field.

This is especially encouraging since I want to become a data scientist, so hearing positive feedback like this, even encouraging me to publish after having only taken one semester of Machine Learning, feels great!

So, I will take time this summer to do more research and learning and expand on this project (it was a rush to complete enough to turn in on time for this class, and there’s a lot more I want to do with it), and I will collaborate with some people at the university where I work to further distill the results and see if we can apply them to segment out some potential first-time donors for next fiscal year. This is...
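
The two techniques named in the professor's comments, 1-of-k coding and z-score normalization, can be sketched in a few lines. This is a minimal illustration on made-up values, not the project's code; the example column contents are hypothetical, and the z-score here uses the population standard deviation, which is one of a couple of reasonable choices.

```python
from statistics import mean, pstdev


def one_of_k(values):
    """Encode a categorical column as 1-of-k (one-hot) vectors.

    Returns the sorted category list and one 0/1 vector per input value,
    with a single 1 in the position of that value's category.
    """
    categories = sorted(set(values))
    encoded = [[1 if value == cat else 0 for cat in categories] for value in values]
    return categories, encoded


def z_scores(values):
    """Rescale a numeric column to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # A constant column carries no information; map it to all zeros.
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]


if __name__ == "__main__":
    # Hypothetical example columns, invented for illustration.
    print(one_of_k(["alum", "parent", "alum", "friend"]))
    print(z_scores([25, 40, 55, 70]))
```

Normalizing matters most for models that compare features by magnitude or distance; tree-based methods are generally less sensitive to feature scale.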


Machine Learning Project 4

May 11, 2014

Immediately after I turned in Project 3, I started on Project 4, the final project in my Machine Learning grad class. The professor gave us a few options, but we could also propose our own. One of the options was learning how to implement Random Forest (an ensemble learning method using many decision trees) and analyzing a given data set, so I proposed using Random Forest on University Advancement (Development/Fundraising) data I got from my “day job”. The professor approved it, so I started learning about Random Forest Classification.
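
To make the "ensemble of many decision trees" idea concrete, here is a deliberately tiny sketch. It is not the project's implementation: each "tree" is only a one-split stump, the data is a made-up toy set, and all function names are illustrative. It does show the two sources of randomness Random Forest relies on (bootstrap samples of the rows and a random subset of features) plus majority voting.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def majority(labels):
    """Most common label in the list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature_indices):
    """Fit a depth-1 tree: the best single split among the given features."""
    best = None  # (impurity, feature, threshold, left_label, right_label)
    for f in feature_indices:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold, majority(left), majority(right))
    if best is None:
        # No usable split in this sample: always predict the majority class.
        label = majority(labels)
        return (0, float("inf"), label, label)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train stumps on bootstrap samples, each seeing a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        features = rng.sample(range(len(rows[0])), n_features)
        forest.append(
            fit_stump([rows[i] for i in sample], [labels[i] for i in sample], features)
        )
    return forest


def predict_forest(forest, row):
    """Classify a row by majority vote over all stumps."""
    return majority([predict_stump(stump, row) for stump in forest])


if __name__ == "__main__":
    # Toy data invented for illustration: two numeric features, two classes.
    rows = [[1, 1], [2, 1], [1, 2], [2, 2], [8, 9], [9, 8], [8, 8], [9, 9]]
    labels = ["no"] * 4 + ["yes"] * 4
    forest = fit_forest(rows, labels)
    print(predict_forest(forest, [1, 1]), predict_forest(forest, [9, 9]))
```

A full Random Forest grows much deeper trees and re-draws the random feature subset at every split; with one-split stumps those two things coincide, which keeps the sketch short.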


ML Projects 2 & 3 Results

April 29, 2014

I was in such a rush to finish Project 3 by Sunday night that I didn’t post about the rest of my results, and now, before I’ve had a chance to write about them, the professor has already graded it! I got 100% on the one I just turned in, and also just found out I got 100% on Project 2!! This makes me feel so good, especially since I didn’t do so well on the midterm, and it confirms that I can do this!
