Monday Silliness

June 8, 2015

OK so I’m tired enough to be a little silly right now, and I think I’m finally feeling bold enough to share my parody songs inspired by these twitter exchanges:

@matt_slotnick @dpatil haha i've said b4 that O'Jays "for the love of money" but w/"data data data…DATA" runs thru my head b4 projects :) — Data Science Renee (@BecomingDataSci) May 14, 2015

@BecomingDataSci @lukedones @matt_slotnick possible theme song, alter @bryanadams "Summer of 69" to "Summer of Data Science" — Ryan Swanstrom (@ryanswanstrom) May 27, 2015

@ryanswanstrom @becomingdatasci @matt_slotnick '69' -> "datasci" — lukemeister (@lukedones) May 27, 2015

So here goes….

The Summer of Data Sci’

To the tune of…

Lyrics:

I took my first course in machine learning
Found it all for free online
Coded ’til my fingers bled
Was the Summer of Data Sci

Me and some twitter friends
Got together and we tried real hard
To understand how data covaried
(linear dependence between two vars)

Oh when I look back now
That summer was the best endeavor
And if I had the choice
I wouldn’t change it whatsoever
Those were the best days of my life

Ain’t no use in complainin’
When you got learning to do
Spent my evenings staring at my laptop
And that’s when I downloaded you

You were free and open source
I thought that you would last forever
And then I faced a choice
I had to port you now or never
Python 2.7 to 3.5
Oh yeah
Back in the Summer of Data Sci
Ohhh

We were consumin’ APIs
Which were RESTful and RESTless
The response was undefined…
I guess nothin’ can work forever, 410, gone

And now the times are changin’
Look at everything that’s come and gone
Sometimes I import that old library
Documented it in IPython

You were free and open source
Analyzed my datasets on weather
Your model helped me understand
Bayesian time series forecasts better
Those were the best days of my life
Oh yeah
Back in the Summer of Data Sci
Uh-huh
It was the Summer of Data Sci, oh yeah
Me and my Summer of Data Sci, oh and…

For the Love of Data

To the tune of…

Lyrics:

Data, data, data, data. DATA. [6x]
Some people got to have it
Some people really need it
Listen to me y’all, make things make things, make things rad things with it
You wanna do things, do things
Do things, good things with it
Talk about raw data, data
Talk about raw data
Data files, y’all, c’mon now

For the love of data
People will knowledge-discover
For the love of data
People will track their own numbers
For the love of data
People can’t even press delete
Because they never know which algorithm just might need it
For that clean, learned-by-machine, forecasting almighty model (cross-validated)

For the love of data
People will train decision trees
For the love of data
People develop their own techniques
For the love of data
A woman will sell her precious time
For a small CSV, it contains a lot of weights
Call it clean, learned-by-machine, forecasting almighty model (talk about talk about stats)

I know that data’s generated by all people
Internet of Things, some evil
Give me a sample, brother can you data mine?
Data can drive some people out of their minds

Data data data no good no good don’t sell your soul for data
Data data data AI AI deep learning will

I know that data’s generated by all people
Internet of Things, some evil
Give me a sample, brother can you data mine?
Data can drive some people out of their minds

Data data data
Got to have it, I really need it
Data data data
Give it up, give it up, give it up, yeah.
Data data data
Gotta have it
Some people really need it
Give me give me give me raw data
Data data data
I need I need
Give me give me give me
How many strings are in this array
Don’t let don’t let don’t let data rule ya
How many neighbors optimize this k
Don’t let don’t let don’t let don’t let data fool ya yeah yeah yeah
Got to train it, then really test it
Save your code. Save your code.
Let the feeds stream stream stream stream stream.
People, don’t let data, don’t let data change you almighty model
It will keep on changing, changing up your mind
I’m tellin’...


IPython, Requests, lxml, and the NPR API

June 7, 2015

Last week, I decided to learn how to use Python to get data from an API. I started with the Codecademy “Introduction to APIs in Python” course, which got me oriented to how requests work, and the subsequent NPR API lesson showed me specifically how the NPR stories API works. Certain parts of the course assumed you knew more Python than the course itself had taught, so heads-up that there are places you will probably have to google for help, since the hints aren’t always related to what you’re stuck on. The course isn’t really a requirement for learning this stuff (and I thought it could use a lot of improvement), but it does give you a guided walk-through, which is nice when you are totally new to a topic.

Then, I tweeted about my experience, and got 2 responses encouraging me to use the requests library instead of the urllib that Codecademy used.

@BecomingDataSci the urllib api is terrible. You should take a look at http://t.co/CzIPob2tBV — Daniel Moisset (@dmoisset) June 1, 2015

@dmoisset @BecomingDataSci 2nding using of requests over urllib; esp. with HTTPS, requests tends to do saner things (e.g., cert validation) — Cheng H. Lee (@chenghlee) June 1, 2015

I decided to redo what I had learned from scratch, but using requests. I also wanted to learn how to use IPython, so I used an IPython notebook to play around with the code. Below is the HTML export of my IPython notebook, with comments explaining what I was doing. I’m sure there are better ways to do what I did (feel free to comment with suggestions!), but this was my first time doing any of this without any guidance, so I don’t mind posting it even if it’s a little ugly :)

I definitely spent a lot of time understanding the hierarchy of the NPR XML and how to loop through it and display it. If you have done something similar in a more elegant way, please point me to your code!
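
To give a rough idea of what a "request the XML, then loop through the stories" script can look like, here is a minimal sketch. It is not the code from the notebook. The endpoint URL, the query parameter names, the element names (`story`, `title`, `teaser`, `link`), and the sample XML are all assumptions made for illustration, so check them against the NPR API documentation. The parsing also uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml so the example runs without extra installs; `lxml.etree` offers the same calls used here.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree

# Assumed endpoint for the NPR stories API (not taken from the post).
NPR_QUERY_URL = "http://api.npr.org/query"

# Made-up sample shaped like the kind of XML being looped over.
# The element and attribute names are assumptions for illustration.
SAMPLE_XML = b"""<nprml>
  <list>
    <story id="101">
      <title>First made-up story</title>
      <teaser>A short teaser.</teaser>
      <link type="html">http://example.org/story/101</link>
      <link type="api">http://example.org/api/101</link>
    </story>
    <story id="102">
      <title>Second made-up story</title>
      <teaser>Another teaser.</teaser>
      <link type="html">http://example.org/story/102</link>
    </story>
  </list>
</nprml>"""


def fetch_stories_xml(api_key, topic_id, num_results=3):
    """Request stories as XML. The parameter names are assumptions."""
    import requests  # third-party; imported here so the parsing demo runs without it

    response = requests.get(
        NPR_QUERY_URL,
        params={
            "id": topic_id,
            "apiKey": api_key,
            "numResults": num_results,
            "output": "NPRML",
        },
        timeout=10,
    )
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.content  # raw bytes, ready to hand to an XML parser


def parse_stories(xml_bytes):
    """Walk every <story> element and pull out a few fields from each."""
    root = ET.fromstring(xml_bytes)
    stories = []
    for story in root.iter("story"):
        # A story can carry several <link> elements; keep the type="html" one.
        html_link = next(
            (link.text for link in story.findall("link") if link.get("type") == "html"),
            None,
        )
        stories.append(
            {
                "id": story.get("id"),
                "title": story.findtext("title"),
                "teaser": story.findtext("teaser"),
                "url": html_link,
            }
        )
    return stories


if __name__ == "__main__":
    # Parse the bundled sample instead of calling the API, so no key is needed.
    for story in parse_stories(SAMPLE_XML):
        print(story["id"], story["title"], story["url"])
```

With a real API key, the bytes returned by `fetch_stories_xml` could be passed straight to `parse_stories`; keeping the fetch and the parse in separate functions makes it easy to experiment with the XML hierarchy on a saved response before making live requests.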
Here are the main resources I used to learn how to do what is in the code:

Python requests library documentation
NPR API documentation
Python lxml library documentation
IPython videos

I also wanted to mention that there are a lot of frustrations you can run up against when you’re a Python beginner. I was having a lot of problems with seemingly basic stuff (like installing packages with pip), and it took a couple hours of googling and asking someone for help to figure out there was a problem with my PATH environment variables in Windows. I’ll post about that another time, but I just wanted to 1) encourage people not to give up if you get stuck on something that seems to be so basic that most “intro” articles don’t even cover it, and 2) encourage people writing intro articles to make some suggestions about what could go wrong and how to problem-solve.

Here’s one example: When I tried to export my IPython notebook to HTML, it gave me a 500 server error saying I needed Python packages I didn’t already have. After I installed the first, it told me I needed pandoc, so I installed that as well, but it kept giving me the same error. It turns out that you have to run IPython Notebook as an Administrator in Windows in order to get the HTML export to work properly, but the error message didn’t indicate that at all. This is the kind of frustration that may make beginners think they’re not “getting it” and give up, when in fact it’s something outside the scope of what you’re learning. Python seems to require a lot of this sort of problem-solving. (Note: on my other laptop, I installed Python and the scipy stack using Anaconda, and have had a lot fewer issues like this.)

Without further ado, here’s my IPython notebook! (I’m having issues making it look readable while embedded in WordPress, so click the link to view in a new tab for now, and I’ll fix for viewing later!)
Renee’s 1st IPython Notebook (NPR API using requests and lxml)

Here’s the actual ipynb file if you have IPython installed and want to run it yourself: First Python API Usage**

**NOTE: WordPress wouldn’t let me upload it with the IPython notebook extension for security reasons, so after you download it, change the “.txt” extension to...


Summer of Data Science 2015

May 18, 2015

I was daydreaming about all of the data science learning I’m going to do this summer, now that I’m done with grad school (M.Eng. in Systems Engineering, yay!) – I’m so excited to get to choose what to work on, and not have homework deadlines in the middle of the work-week! I had a thought while daydreaming, and tweeted this, thinking a few people might think it was fun and respond:

I'm planning to do a lot of data science learning this summer. Anyone else? Maybe we shld start a hashtag #SoDS "Summer of Data Science" :) — Data Science Renee (@BecomingDataSci) May 14, 2015

…and as you can see by the RT and Favorite count, it kind of took on a life of its own! I thought of a variation

…or maybe more fun #SODAS "Summer of Data Science". like a cool, refreshing beverage. & we'll hand off to So Hemisphere ppl in the fall :) — Data Science Renee (@BecomingDataSci) May 14, 2015

and so did some other people

@BecomingDataSci It could be #SoDaS (just add the little "a" in there for D"a"ta…) — Nicole Radziwill (@nicoleradziwill) May 14, 2015

@BecomingDataSci #DSS15 Data Science Summer 2015 — BigMikeInAustin (@BigMikeInAustin) May 14, 2015

In the end, it looks like #SoDS won…. and got a whole lot of support because of a RT by @dpatil! Thanks to him, this is what my notifications started to look like:

Too bad I was supposed to be working on writing up something for work…. that didn’t get done that night! I came back later and was really surprised by the response! I was excited by all of the new followers, and especially happy that some people appeared to have been inspired by the hashtag to do some data science learning of their own!

@BecomingDataSci @seinecle and is there something like "data science for über-beginners"? =D — Lexane Sirac (@lexanesirac) May 14, 2015

2 minutes later…

@BecomingDataSci @seinecle @clarecorthell thank you so much! I'll make sure to take part in #SoDS then! — Lexane Sirac (@lexanesirac) May 14, 2015

So it seems I started something and now I need to follow up! I’m going to tag my summer learning projects on here with the “#SoDS 2015” post category, and tweet about them (of course!) using the #SoDS hashtag on Twitter. Will you join me? :)

Here’s to an awesome Summer of Data Science! Now I’m going to try to go respond to all of your tweets!

(P.S. the hashtag just started being used by some Dutch foodies, but we’ll overwhelm that version with our data science tweets pretty soon!)

P.P.S. we even have a unicorn joining us this summer!

@BecomingDataSci @DataSkeptic count me in! #SoDS #becomingaunicorn — Data Science Unicorn (@DataScienceUni) May 14,...
