Verena, David, Kerry, and Anthony are members of the Becoming a Data Scientist Podcast Data Science Learning Club! They appear in the order in which they joined the club, and each discusses their starting point before joining, their participation in the activities, and their advice for new data science learners.
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
I have been thinking about doing a “Becoming a Data Scientist” podcast for a long time, at least since April. The podcast would include interviews focused on how people working in various data-science-related jobs got to where they are today (how did they “become a data scientist”?). I’m getting closer to taking the dive and getting it started.
I had an idea today that would take it a step further. Imagine how book clubs work where you pick a book, go off and read it, then gather occasionally to discuss and record your thoughts. Except instead of a book club, it’s a data science learning club!
An organization based in Puerto Rico called “Broadening Participation in Data Mining” (BPDM) interviewed me over the weekend, and it’s online now! Without further ado…. Thanks to Orlando and Herbierto for having me on! (P.S. I did put up the post about Data Sources on DataSciGuide)
Last night, my husband and I watched The Imitation Game. First of all, it’s a great movie and you should see it. Secondly, there was a moment that got me thinking about the human element of machine learning.
[Spoiler Alerts – but you probably already know much of the story, and the movie is still good even if you know the historical outcome.]
I thought a moment like this might be coming when Alan Turing was first applying to work at Bletchley Park, and Denniston can’t believe he’s applying to be a Nazi codebreaker without even knowing how to speak German. Alan emphasizes that he is masterful at games and solving puzzles, and that the Nazi Enigma machine is a puzzle he wants to solve. He starts designing and building a machine that will theoretically be able to decode the Nazi radio transmissions. However, the decoder settings change every day at 12am, so the machine must solve for the settings before the stroke of midnight every day in order for the day’s messages to be decoded in time to be useful and not interfere with the next day’s decoding process. Turing can’t prove his machine will work, simply because it is taking too long to solve the daily puzzle. In the meantime, people are dying in the war, and the Nazis go on transmitting their messages over normal radio waves, believing the code is “unbreakable”.
OK, so I was actually hoping to show this to you all long ago, but I kept coming up with more and more ideas for it, so it’s not going to be “ready” to reveal for a while. I figured I’d go ahead and show it to you anyway. My main motivation is that I keep hearing people say (and sometimes feel myself) that learning to become a data scientist on your own using online resources is totally overwhelming: there are so many different possible topics to dive into, few really good guides, and lots of impostor-syndrome-inducing posts by people you follow that make you feel like they’re so far ahead of where you are and you’ll *never* get there…. But there’s so much great data science learning content online for everyone from beginners to experienced data scientists! We need a better way to navigate it. Hence my new website: “Data Sci Guide”. It will eventually have a personalized recommender system and structured learning guides and all kinds of other features to help you find the resources to go from where you are to where you want to be, but for now it’s “just” a directory / content rating site. And it’s not ready for you to interact with yet, but it’s getting there, and I’ll need your help fleshing it all out soon. So go take a look! Then come back here to give me feedback and suggestions, because you have to be registered to comment there and I haven’t turned on new user registration yet. OK go now. Don’t forget to come back! >>>> DATA SCI GUIDE.COM <<< So…. what did you think? What do you think of the overall idea and plans? What should I be sure to remember to include? Tell me below!...
Between an interview with a local TV station about my job and going through the process of hiring someone onto our team, I’ve been thinking about the bare minimum skills someone would need to have a chance at being hired as a data analyst. Maybe this would be a helpful list for someone trying to change careers and trying to decide where to focus their learning time. I posted this picture on Twitter and got some interesting responses:
@BecomingDataSci I'd include familiarity with business process in one of those columns. Can't analyze in a vacuum,. — Karen Clark (@clarkkaren) July 17, 2015
@BecomingDataSci @aflyax You've got analytical thinking & problem solving. Maybe add "adaptable to a variety of environments" as generic? — Karen Clark (@clarkkaren) July 20, 2015
@barbarafenton i mentioned that as a misconception! i spend a lot more time communicating than most people think — Data Science Renee (@BecomingDataSci) July 17, 2015
@DataSkeptic yes i think that's important, but you can get an entry level job w/just basic charting skills. was trying to keep to minimum. — Data Science Renee (@BecomingDataSci) July 17, 2015
@BecomingDataSci so e.g. "SQL" could be "data manipulation skills (e.g. SQL)" – don't get hung up on a specific tool to to the job! 2/2 — Martin Monkman (@monkmanmh) July 17, 2015
@BecomingDataSci This is great! My ready-fire-aim data science side says to add "asking forgiveness is easier than permission" to traits :P — Shannon Quinn (@SpectralFilter) July 17, 2015
@BecomingDataSci I'd add : autodidact — craig pfeifer (@aCraigPfeifer) July 17, 2015
What do you think? I’ll revisit this topic later, and I’ll also post about the conference I’m attending (APRA Data Analytics Symposium) when I have a chance to summarize. For the moment, heading back to the...