projects – Page 2 – Becoming A Data Scientist

My “Secret” Side Project, Revealed

OK, so I was actually hoping to show this to you all long ago, but I kept coming up with more and more ideas for it, so it’s not going to be “ready” to reveal for a while. I figured I’d go ahead and show it to you anyway. My main motivation is that…

API and Market Basket Analysis

I was considering waiting until I’m done before posting about this project, but instead I thought I’d post my progress and plans while I think about the next steps. I posted earlier about using the UsesThis API to retrieve data about which other software people who use X also use. I thought I was…
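As a rough illustration of the market basket idea (which tools tend to show up together), here is a minimal sketch using only the Python standard library. The `baskets` data, the tool names, and the `support` and `confidence` helpers are hypothetical and are not taken from the UsesThis API or from the original project.

```python
from collections import Counter
from itertools import combinations

# Hypothetical "baskets": the set of tools each person reports using.
baskets = [
    {"vim", "git", "python"},
    {"vim", "git"},
    {"git", "python", "excel"},
    {"excel", "python"},
    {"vim", "git", "excel"},
]

item_counts = Counter()  # how many baskets contain each tool
pair_counts = Counter()  # how many baskets contain each pair of tools
for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(frozenset(pair) for pair in combinations(sorted(basket), 2))


def support(a, b):
    """Fraction of all baskets that contain both a and b."""
    return pair_counts[frozenset((a, b))] / len(baskets)


def confidence(a, b):
    """Of the baskets that contain a, the fraction that also contain b."""
    return pair_counts[frozenset((a, b))] / item_counts[a]


print("support(vim, git):", support("vim", "git"))
print("confidence(vim -> git):", confidence("vim", "git"))
print("confidence(git -> vim):", confidence("git", "vim"))
```

Confidence is directional, which is why the last two lines can differ: it answers "of the people who use X, how many also use Y?", which is the question the excerpt describes.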

IPython, Requests, lxml, and the NPR API

Last week, I decided to learn how to use Python to get data from an API. I started with the Codecademy “Introduction to APIs in Python” course, which got me oriented to how requests work, and in the subsequent NPR API lesson, specifically how the NPR stories API works. Certain parts of the course assumed…

Relative Year SQL

I wrote this SQL code recently and wanted to share it here (in a modified, simplified form). This isn’t a “typical” SQL SELECT statement, because each row checks the rest of the table relative to its own fiscal year value.
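The original query isn’t reproduced in this excerpt, so what follows is only a sketch of one common way to get that "each row looks at the rest of the table" behavior: a correlated subquery. It runs against an in-memory SQLite database from Python. The `gifts` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical schema and data; the original post's table is not shown here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gifts (donor_id INTEGER, fiscal_year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO gifts VALUES (?, ?, ?)",
    [
        (1, 2012, 50.0),
        (1, 2013, 75.0),
        (1, 2015, 20.0),
        (2, 2014, 100.0),
        (2, 2015, 10.0),
    ],
)

# For every row g, the subquery re-scans the table for the same donor's rows
# whose fiscal year is exactly one less than g's own fiscal year.
query = """
SELECT g.donor_id,
       g.fiscal_year,
       g.amount,
       (SELECT COALESCE(SUM(p.amount), 0)
          FROM gifts AS p
         WHERE p.donor_id = g.donor_id
           AND p.fiscal_year = g.fiscal_year - 1) AS prior_year_amount
  FROM gifts AS g
 ORDER BY g.donor_id, g.fiscal_year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The subquery’s `WHERE` clause refers to the outer row’s `fiscal_year`, so the comparison window moves with each row. Changing `g.fiscal_year - 1` to a range condition would give a "previous N years" variant.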

Summer of Data Science 2015

I was daydreaming about all of the data science learning I’m going to do this summer, now that I’m done with grad school (M.Eng. in Systems Engineering, yay!) – I’m so excited to get to choose what to work on, and not have homework deadlines in the middle of the work-week! I had a thought…

Data Visualization Project

You may have seen me tweeting about some research I did on “Data Visualization for Exploratory Data Analysis” for my Cognitive Systems Engineering course. My presentation went really well! I’m less satisfied with the paper since it was done in a hurry to complete the project deliverables, but I’m including it because it explains some…

Data Science Practice – Classifying Heart Disease

This post details a casual exploratory project I did over a few days to teach myself more about classifiers. I downloaded the Heart Disease dataset from the UCI Machine Learning repository and thought of a few different ways to approach classifying the provided data.

“MANUAL” APPROACH USING EXCEL

So first I started out by…

ML Project 4 Results

I am happy to report that I got 100% on the final project I did in the last 2 weeks for my Machine Learning grad class (which is especially great because that was 30% of my grade for the semester!) and I got some good feedback from the professor: Very good analysis and you showed…

Machine Learning Project 4

So immediately after I turned in Project 3, I started on Project 4, the final project in my Machine Learning grad class. We had a few options that the professor gave us, but could also propose our own. One of the options was learning how to implement Random Forest (an ensemble learning method using many decision trees) and analyzing a given data set, so I proposed using Random Forest on University Advancement (Development/Fundraising) data I got from my “day job”. The professor approved it, so I started learning about Random Forest Classification.
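To make "an ensemble of many decision trees" concrete, here is a small from-scratch sketch in plain Python. It is not the implementation used for the class project, which this post does not show, and a real analysis would normally rely on a library. The toy data, function names, and parameter defaults are all hypothetical. Each tree is trained on a bootstrap sample and considers a random subset of features at every split, and the forest predicts by majority vote.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, n_features, rng, depth=0, max_depth=3):
    # A leaf is a bare label; an internal node is a tuple
    # (feature, threshold, left, right). Labels must therefore not be tuples.
    if depth >= max_depth or len(set(labels)) == 1:
        return majority(labels)
    best = None
    # Random feature subset: the "random" part of each split.
    for f in rng.sample(range(len(rows[0])), n_features):
        for threshold in sorted(set(r[f] for r in rows)):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if not left or not right:
                continue
            # Size-weighted impurity of the two children (lower is better).
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return majority(labels)
    _, f, threshold, left, right = best
    return (
        f,
        threshold,
        build_tree([rows[i] for i in left], [labels[i] for i in left],
                   n_features, rng, depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right],
                   n_features, rng, depth + 1, max_depth),
    )


def predict_tree(tree, row):
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree


def train_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(rows) row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest


def predict_forest(forest, row):
    # Each tree votes; the most common label wins.
    return majority([predict_tree(tree, row) for tree in forest])


# Hypothetical toy data: two numeric features, two well-separated classes.
rows = [(1.0, 1.2), (2.0, 1.8), (1.5, 2.5), (2.5, 1.0),
        (8.0, 9.0), (9.0, 8.5), (8.5, 7.5), (7.5, 8.0)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

forest = train_forest(rows, labels)
print(predict_forest(forest, (1.0, 1.0)))
print(predict_forest(forest, (9.0, 9.0)))
```

Individual trees can be poor (a bootstrap sample may miss useful rows), but the vote across many differently-trained trees tends to be more stable than any single tree.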

ML Projects 2 & 3 Results

I was in such a rush to finish Project 3 by Sunday night that I didn’t post about the rest of my results, and before I got a chance to write about it, the professor had already graded it! I got 100% on the one I just turned in, and also just found out I got 100% on Project 2!! This makes me feel so good, especially since I didn’t do so well on the midterm, and confirms that I can do this!