Data Visualization Project

May 10, 2015

You may have seen me tweeting about some research I did on “Data Visualization for Exploratory Data Analysis” for my Cognitive Systems Engineering course. My presentation went really well! I’m less satisfied with the paper since it was done in a hurry to complete the project deliverables, but I’m including it because it explains some things that aren’t obvious from the PowerPoint without my commentary.

Principles of Data Visualization for Exploratory Data Analysis [presentation – pdf]
Principles of Data Visualization for Exploratory Data Analysis [paper – pdf]

Check out the references in both documents for some good resources. I’ll include some links in the post below, too. I had a lot more material from my research that I wanted to include but just didn’t have time for in a 15-minute presentation! The professor was happy about the topic I picked because she’s teaching a class on Data Visualization next semester, so I think that worked out in my favor :)

These two books by Stephen Few covered the very basics of visualization for human perception:
Show Me the Numbers: Designing Tables and Graphs to Enlighten
Now You See It: Simple Visualization Techniques for Quantitative Analysis

Blog posts about related topics:
Six Revisions: Gestalt Laws
eagereyes: Illustration vs Visualization
Detailed visualization of NBA shot selection

Publications and articles:
IEEE Transactions on Visualization and Computer Graphics
Toward a Perceptual Science of Multidimensional Data Visualization: Bertin and Beyond by Marc Green, Ph.D.
Scagnostics by Dang and Wilkinson
Generalized Plot Matrix (GPLOM) by Im, McGuffin, Leung
UpSet: Visualization of Intersecting Sets by Lex, Gehlenborg, Strobelt, et al.

…and there are more resources in the paper and presentation files!
(And if you’re REALLY interested in this topic, post a comment and I will add even more links I have bookmarked.)

I also did a project using some data from my day job related to university fundraising and major gift prospects, but unfortunately I can’t share that study here because I don’t have permission to do so. It included some cool visuals like bubble charts, and also an interesting analysis of movement through the prospect pipeline using Markov chains. I learned a lot doing that one!

It was nice to end my final semester of grad school with two data-related projects! (Yes, I’m finally graduating! Masters of Systems Engineering! woo...


Data Sciencey Podcasts (Updated)

April 13, 2015

I’ve been listening to a lot of podcasts this semester since I am driving 1 hour each way to class twice a week, and I thought I’d share some good ones I’ve found. I started out by listening to the entire season of Serial (which I recommend!), then switched to fun and sciencey ones for my commutes after that. I found a few that are data-science-related and wanted to share them here! (The title of each section is a link to the podcast’s homepage.)

The Talking Machines

This podcast about machine learning is educational and, though academic, is pretty accessible to people interested in learning more about the field, even if they’re new to it. It is executive produced and co-hosted by Katherine Gorman along with co-host Ryan Adams, an Assistant Professor of Computer Science at Harvard. They start out by interviewing attendees and presenters from the NIPS (Neural Information Processing Systems) conference, including Hannah Wallach and Max Welling, among others. Today, I listened to this episode with Charles Sutton, who covered some interesting topics such as using machine learning and natural language processing on computer code for tasks such as understanding how different programmers involved in open source projects name variables, and suggesting naming conventions to new project participants. The larger goals of the research were really interesting to me, and Charles Sutton was really clear and easy to understand, even though he was touching on some heavy concepts. I also like how the show answers questions submitted by listeners each episode.

Partially Derivative

This show has two director/developer/data scientists from Ushahidi, Chris Albon and Jonathan Morgan, who talk about recent data science items in the news, chat about the implications, and add their opinions. I like that they link to the news articles on the podcast site so you can read up on what they’re referring to. Honestly, I didn’t enjoy this one as much as I did the others.
The episode I listened to started out with each of them explaining what beer and wine they were drinking, and how much they had had, and some inside joking and laughing, which already made me cringe a bit (their latest episode is called “morning drinking edition”), but I wanted to hear them out. They talk about plenty of interesting topics, but I had already read most of the articles they referred to via Twitter. They had some good insights, including a discussion about Uber and data collection (and data selling) by companies in general, and some interesting food for thought about what all of that means for us in the future, but overall, I found their “bro-y” banter a bit annoying. However, if you don’t get a chance to keep up with data science in the news, or you enjoy feeling like you’re hanging out with some guys from school drinking and chatting about data science topics, then definitely give it a listen. I imagine a lot of people would like their style more than I did – just not my thing.

TED Radio Hour

TED Radio Hour is an NPR production where they take TED talks and group them by topic, then Guy Raz interviews the speakers and basically refactors the talks so they tell an overarching story as a group and sound good on radio. The episode I wanted to point you to is “Solve for X” because it made me think about math in a fun way, and they do incorporate some talks about machine learning algorithms into this one as well. This podcast is one of my go-tos when I want to learn something interesting that is presented in a fun and curious way.

Invisibilia

Invisibilia isn’t a tech podcast, but does sometimes talk about technology. The episode “Our Computers, Ourselves” tells the stories of some interesting people who “let technology go to their heads”, I guess you can say.
The only thing I don’t like about the show is that sometimes I think the intelligent co-hosts Alix and Lulu purposely make themselves come across ditzier than necessary, but overall I highly recommend checking this one out. Also take a listen to the episode “How to Become Batman”, which might change your mind about how expectations impact outcomes, even in scientific research.

Snap Judgment

Last but not least, I happened upon NPR’s Snap Judgment podcast with Glynn Washington because it had an episode called “Artificial Intelligence” that came up in a search. It turned out not to be about AI in the sense that developers think about AI, but it has some great storytelling involving human-computer interactions, and really made me think about the human side of all...


Girl Develop It! Meetup

March 2, 2015

In the past, I’ve been “semi-anonymous” on this blog, not advertising my real name or employment. However, if I’m going to start doing “appearances”, I might as well make it easy for people to match up my blog with my name! So hello, I’m Renee Marie Parilak Teate, and I’m becoming a data scientist :)

I’m about to give a talk/Q&A for our local Girl Develop It! Central Virginia chapter via Google Hangout on March 18 at 7pm EST as part of their “Day in the Life of” series featuring women in technical roles. Here’s the link:

Day in the Life of a Data Analyst PLUS “What is Data Science?” with Renee Teate

I would love for any ladies following this blog to attend and say hi!

Update

I think the meetup went well! (Other than issues with Google Hangouts On Air not allowing as many people as I had read it would allow… I need to figure out how to include “viewers” who aren’t in the limited set of “video participants” for the future.) I was nervous and ad-libbing a lot of it, trying to balance making it understandable for beginners with including some info for the more advanced viewers. It’s tough giving a talk like this for a general audience. I had fun, though! I’ve gotten good feedback from a few attendees, and we also discussed a future meetup focusing on Geospatial/GIS data, so that was exciting!

Here is the PowerPoint from the data science part of my talk: What is Data Science? (PDF)

The books and some of the courses we talked about are listed here, and there are links to the ones I have reviewed: Learning page

For those of you who weren’t able to attend, there is a recording of the meetup here: YouTube

I’ll list more links in the comments as I come across them this...


My “What is Data Science?” Talk

November 19, 2014

I got a chance to tell the undergraduate students at JMU about data science tonight! Despite the cold weather and short notice (the invites went out 2 days before), over 20 students showed up for the talk, which was hosted by the IEEE Computer Society club. The students’ majors included Integrated Science & Technology, Computer Science, Health Science, Information Analysis, Computer Information Systems, and others. It was nice to see that variety! A few professors and staff members were there as well.

[Photo: some of the audience members at my talk]

I guess they liked the talk, which was 30-40 minutes, because we ran right up to the time another group needed to use the classroom, and several students stayed afterward to ask more questions in the hallway. Their feedback ranged from “I had never really considered data science, but your talk had me interested, and now I want to look into it!” to questions from a student who had already won a machine learning competition in his computer science class. Students shared their data project ideas with me, and asked whether they were on the right track to possibly have a future in data science. One student took a photo of the books I suggested reading, and I hope several more of them come here afterwards to download the slides with all of the references and links. (I wish I’d had them done far enough in advance to have this post ready!)

If you are here and you attended my talk, please leave any feedback or questions in the comments below! I may be giving it again, so suggestions are encouraged!

Here is a link to the slides (PDF, so no animations, but the links should work): What is Data Science?
And the companion notes with extra info: Companion Notes

I was excited to get to share my love of data, and get the students thinking about “big data”, how much data they generate, privacy issues, the vast possibilities of data analysis and data science, and a possible future in this field!

Of course, in the only photo I got of me, my eyes are closed!
I’ll update if someone sends a better one.

Students: I have removed my full name and personal email from the slides for publication here, so just leave a message below in the comments if you want to get in touch, and I’ll pass my info on to you individually. (Don’t type your email in the comments, I should be able to see it behind the...
