PyData DC 2016 Talk

October 11, 2016

I just got back from PyData DC, where I learned a lot, had fun, and met a bunch of awesome people! I’ll definitely write about it more later, but I wanted to share my slides here since I told the attendees they could find them on my website. I got good feedback on the talk, and I’m so glad that my message resonated with some people! The talk was recorded, and the video should be out within a few weeks! Here are the slides: Becoming a Data Scientist – Advice from my Podcast Guests and the slide notes. Update 10/26: Here is the recording of my talk, with a playlist of other talks from PyData...


Becoming A Data Scientist Podcast Episode 0: Me!

December 14, 2015

Here is the first episode of the Becoming a Data Scientist Podcast, which is also available in video form!


(sorry for the poor video quality!)

In this episode, I talk a little about the podcast, I talk about my own background, and I introduce the Data Science Learning Club. Enjoy!
(Note: Episode 1, the first interview episode, comes out Monday 12/21!)

Podcast Audio Links:
Link to podcast Episode 0 audio
Podcast’s RSS feed for podcast subscription apps
(I will distribute this out to sites like iTunes and Stitcher soon)

Podcast Video Playlist:
YouTube playlist where I’ll publish future videos

More about the Data Science Learning Club:


BPDM’s interview with….. me!

October 26, 2015

An organization based in Puerto Rico called “Broadening Participation in Data Mining” (BPDM) interviewed me over the weekend, and it’s online now! Without further ado…. Thanks to Orlando and Herbierto for having me on! (P.S. I did put up the post about Data Sources on DataSciGuide)


Becoming A Data Scientist Flipboard Magazine

October 10, 2015

I love finding and sharing good articles about data science on Twitter, but I know not everyone is on Twitter, and tweets get lost quickly in the timeline and are easy to miss. So, I’ve started sharing the best articles via a Flipboard magazine as well! Check it out! https://flipboard.com/@becomingdatasci/becoming-a-data-scientist-5ktft1lky


Data Visualization Project

May 10, 2015

You may have seen me tweeting about some research I did on “Data Visualization for Exploratory Data Analysis” for my Cognitive Systems Engineering course. My presentation went really well! I’m less satisfied with the paper since it was done in a hurry to complete the project deliverables, but I’m including it because it explains some things that aren’t obvious from the PowerPoint without my commentary.

Principles of Data Visualization for Exploratory Data Analysis [presentation – pdf]
Principles of Data Visualization for Exploratory Data Analysis [paper – pdf]

Check out the references in both documents for some good resources. I’ll include some links in the post below, too. I had a lot more material from my research that I wanted to include and just didn’t have time to in a 15-minute presentation! The professor was happy about the topic I picked because she’s teaching a class on Data Visualization next semester, so I think that worked out in my favor :)

These two books by Stephen Few covered the very basics of visualization for human perception:
Show Me the Numbers: Designing Tables and Graphs to Enlighten
Now You See It: Simple Visualization Techniques for Quantitative Analysis

Blog posts about related topics:
Six Revisions: Gestalt Laws
eagereyes: Illustration vs Visualization
Detailed visualization of NBA shot selection

Publications and articles:
IEEE Transactions on Visualization and Computer Graphics
Toward a Perceptual Science of Multidimensional Data Visualization: Bertin and Beyond by Marc Green, Ph.D.
Scagnostics by Dang and Wilkinson
Generalized Plot Matrix (GPLOM) by Im, McGuffin, Leung
UpSet: Visualization of Intersecting Sets by Lex, Gehlenborg, Strobelt, et al.

…and there are more resources in the paper and presentation files! (And if you’re REALLY interested in this topic, post a comment and I will add even more links I have bookmarked.)

I also did a project using some data from my day job related to university fundraising and major gift prospects, but unfortunately I can’t share that study here because I don’t have permission to do so. It included some cool visuals like bubble charts, and also an interesting analysis of movement through the prospect pipeline using Markov Chains. I learned a lot doing that one!

It was nice to end my final semester of grad school with two data-related projects! (Yes, I’m finally graduating! Masters of Systems Engineering! woo...


Data Sciencey Podcasts (Updated)

April 13, 2015

I’ve been listening to a lot of podcasts this semester since I am driving 1 hour each way to class twice a week, and I thought I’d share some good ones I’ve found. I started out by listening to the entire season of Serial (which I recommend!), then switched to fun and sciencey ones for my commutes after that. I found a few that are data-science-related and wanted to share them here! (The title of each section is a link to the podcast’s homepage.)

The Talking Machines

This podcast about machine learning is educational and, though academic, pretty accessible even if you’re new to the field. It is executive produced and co-hosted by Katherine Gorman along with co-host Ryan Adams, an Assistant Professor of Computer Science at Harvard. They start out by interviewing attendees and presenters from the NIPS (Neural Information Processing Systems) conference, including Hannah Wallach and Max Welling, among others. Today, I listened to this episode with Charles Sutton, who covered some interesting topics such as using machine learning and natural language processing on computer code for tasks such as understanding how different programmers involved in open source projects name variables, and suggesting naming conventions to new project participants. The larger goals of the research were really interesting to me, and Charles Sutton was really clear and easy to understand, even though he was touching on some heavy concepts. I also like how the show answers questions submitted by listeners each episode.

Partially Derivative

This show has two director/developer/data scientists from Ushahidi, Chris Albon and Jonathan Morgan, who talk about recent data science items in the news, chat about the implications, and add their opinions. I like that they link to the news articles on the podcast site so you can read up on what they’re referring to. Honestly, I didn’t enjoy this one as much as I did the others. The episode I listened to started out with each of them explaining what beer and wine they were drinking, and how much they had had, and some inside joking and laughing, which already made me cringe a bit (their latest episode is called “morning drinking edition”), but I wanted to hear them out. They talk about plenty of interesting topics, but I had already read most of the articles they referred to via Twitter. They had some good insights, including a discussion about Uber and data collection (and data selling) by companies in general, and some interesting food for thought about what all of that means for us in the future, but overall, I found their “bro-y” banter a bit annoying. However, if you don’t get a chance to keep up with data science in the news, or you enjoy feeling like you’re hanging out with some guys from school drinking and chatting about data science topics, then definitely give it a listen. I imagine a lot of people would like their style more than I did – just not my thing.

TED Radio Hour

TED Radio Hour is an NPR production where they take TED talks and group them by topic, then Guy Raz interviews the speakers and basically refactors the talks so they tell an overarching story as a group and sound good on radio. The episode I wanted to point you to is “Solve for X” because it made me think about math in a fun way, and they do incorporate some talks about machine learning algorithms into this one as well. This podcast is one of my go-tos when I want to learn something interesting that is presented in a fun and curious way.

Invisibilia

Invisibilia isn’t a tech podcast, but it does sometimes talk about technology. The episode “Our Computers, Ourselves” tells the stories of some interesting people who “let technology go to their heads”, I guess you can say. The only thing I don’t like about the show is that sometimes I think the intelligent co-hosts Alix and Lulu purposely make themselves come across ditzier than necessary, but overall I highly recommend checking this one out. Also take a listen to the episode “How to Become Batman”, which might change your mind about how expectations impact outcomes, even in scientific research.

Snap Judgment

Last but not least, I happened upon NPR’s Snap Judgment podcast with Glynn Washington because it had an episode called “Artificial Intelligence” that came up in a search. It turned out not to be about AI in the sense that developers think about AI, but it has some great storytelling involving human-computer interactions, and really made me think about the human side of all...
