Recorded at Tom Tom Fest Applied Machine Learning Conference in Charlottesville, VA on April 11, 2019.
Episode 17 Audio
@therriaultphd on Twitter
Data and Democracy
This O’Reilly ebook by Andrew Therriault explores how political data science helps to drive everything from overall strategy and messaging to individual voter contacts and advertising.
Data Security for Data Scientists by Andrew Therriault on Medium
Ten practical tips for protecting your data (and more importantly, everyone else’s!)
World premiere
#becomingadatascientist podcast by the famous and fabulous, with guest, Andrew Therriault of @facebook @BecomingDataSci #AMLCville #datascience #machinelearning @UVADSI @TomTomFest pic.twitter.com/2EwjsUwULL
— Data Science Connect (@DataScienceATL) April 11, 2019
Our first ever live audience for Becoming a Data Scientist podcast at #AMLCville! With @therriaultphd! pic.twitter.com/JA1RpiKq0u
— Data Science Renee (@BecomingDataSci) April 11, 2019
This is a conference I’ve helped plan since the beginning, and it’s grown in 3 years from a single theater with a partial day of talks to 4 theaters with non-stop presentations all day, plus keynotes in an even larger venue!
I’m excited to announce that I will be recording a short episode of my Becoming a Data Scientist Podcast in front of a live audience for the first time at the AMLC! I’ll be interviewing Andrew Therriault, one of our keynote speakers, about how he became an Infrastructure Data Science Manager at Facebook after starting out with degrees in politics and working as the Director of Data Science for the Democratic National Committee and the Chief Data Officer for the City of Boston, among other roles.
I am featuring an AMLC speaker each day with the #AMLCville hashtag on Twitter. You can learn more about all of our speakers (still more to be added!) and get tickets on the conference website. Hope to see you there!
Podcast Audio Links:
Link to podcast Episode 16 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Data Science Learning Club Activity 16 – Genetic Algorithms
Data Science Learning Club Meet & Greet
Mentioned in the episode:
Dr. Kenneth Stanley at the University of Central Florida
Michigan State University Artificial Intelligence
BEACON NSF Science and Technology Center at MSU
Moneyball (book)
Data Science Handbook (book)
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Data Science Learning Club Activity 15 – Explain an Analysis (Communication)
Data Science Learning Club Meet & Greet
Mentioned in the episode:
NASA Knowledge (@NASAKnowledge on Twitter)
Engineering Management
Knowledge Management
Organizational Learning
Knowledge Engineering
Information Architecture
Data Analysis
Neo4j
Elasticsearch
IHS Goldfire
MongoDB
@davidmeza1 on Twitter
David Meza on LinkedIn
Southern Data Science Conference in Atlanta, GA on April 7, 2017 (Coupon code RENEE takes 15% off ticket price)
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Activity 14: Hidden Markov Models
Activity 15: Neural Nets for Text
Data Science Learning Club Meet & Greet
Mentioned in the episode:
Hadley Wickham’s Advanced R book
shinyGEO: a web-based application for analyzing gene expression omnibus datasets
You can stream or download the audio at this link (download by right-clicking on the player and choosing “Save As”), or listen to it in podcast players like iTunes and Stitcher. Enjoy!
Show Notes:
The White House Names Dr. Ed Felten as Deputy U.S. Chief Technology Officer
Edward W. Felten at Princeton University
Dr. Edward Felten on Wikipedia
White House Office of Science and Technology Policy (OSTP)
The Administration’s Report on the Future of Artificial Intelligence (White House Report from October 2016)
Artificial Intelligence, Automation, and the Economy (White House Report from December 2016)
Ed Felten on Twitter: Official / Personal
———
Other Podcasts in this Government Data Series:
I outlined my whole plan here on my Patreon Campaign. You’ll see a new page on this site soon acknowledging supporters, and I’ll update you on the progress.
Whether you can give financially, or even if you just share the campaign with your data science friends, you are helping Becoming a Data Scientist podcast, the learning club, Data Sci Guide, Jobs for New Data Scientists, and all of my websites get off the ground! Thank you!!
Episode Audio (mp3) – also available on iTunes, Stitcher, etc.
(note, there is no video for this episode)
On the panel:
Becoming a Data Scientist Podcast Interviews YouTube Playlist
or listen to/download the full audio episodes via the blog. Here are the links to the blog posts (with links to everything else), and the audio itself for those first four episodes:
Episode 0: Renee Teate (me) and intro to the podcast
Audio Only (with MP3 download link)
Episode 1: Will Kurt – English/Library Science to Data Science
Audio Only (with MP3 download link)
Episode 2: Safia Abdalla – College Student, Conference Speaker, Python/Jupyter Contributor
Audio Only (with MP3 download link)
Episode 3: Shlomo Argamon – Director of Master of Data Science program at IIT
Audio Only (with MP3 download link)
Click through to any episode for links to the RSS subscription feeds, links to the learning club activities, etc.
Enjoy!
Please fill out the survey and share it with your friends and followers on social media! The survey is a little long/detailed, but most of it is optional. I value your opinions! Thank you so much for participating!!
Podcast Audio Links:
Link to podcast Episode 13 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 13: Show & Tell
Data Science Learning Club Meet & Greet
Links to topics mentioned by Debbie in the interview:
Metis Data Science Training
[more coming soon]
Podcast Audio Links:
Link to podcast Episode 12 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Data Science Learning Club Meet & Greet
1) Verena Haunschmid
Data Science Learning Club Activity 07: Linear Regression
Verena’s Results for Linear Regression on Salary Dataset
Verena’s website
@ExpectAPatronum on Twitter
2) David Asboth
City University London MSc Data Science
Data Science Learning Club Activity 02: Creating Visuals for Exploratory Data Analysis
David’s results exploring London Underground data
Data Science Learning Club Activity 07: K-Means Clustering
David’s results using k-means to draw puppies in 3 colors
FlyLady (the house cleaning system I mentioned)
David’s website
@davidasboth on Twitter
3) Kerry Benjamin
Data Science Learning Club Activity 01: Find, Import, and Explore a Dataset
Kerry’s results for Activity 1 IGN Game Review Data exploration
Data Science Learning Club Activity 02: Creating Visuals for Exploratory Data Analysis
Kerry’s Blog Post about Activity 02 – “My First Data Set Part 2: The Fun Stuff”
Blog post about Data Camp – “The Data Science Journey Begins”
Kerry’s blog post “Getting Started in Data Science: A Beginner’s Perspective”
Kerry’s Blog “The Data Logs”
@kerry_benjamin1 on Twitter
4) Anthony Peña
molecular biology
biotechnology
Data Science Learning Club Activity 07: K-Means Clustering
Anthony’s results for Activity 07
Podcast Audio Links:
Link to podcast Episode 11 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
[learning club activity coming soon]
Data Science Learning Club Meet & Greet
Links to topics mentioned by Stephanie in the interview:
Total Domination in Graph Theory (pdf)
Some research publications by Stephanie:
Machines Watch you Surf the Web
Total domination dot-stable graphs
The University of Tennessee Knoxville Center for Intelligent Systems and Machine Learning (CISML)
UTK Distributed Intelligence Laboratory
UTK Infant Perception Action Laboratory
Natural Language Processing (NLP)
Explore Data Science (now via Metis)
Podcast Audio Links:
Link to podcast Episode 10 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
[learning club activity coming soon]
Data Science Learning Club Meet & Greet
Links to topics mentioned by Trey in the interview:
Commodore VIC-20
Bulletin Board
C++
Pascal
BASIC
Virginia Tech
Odyssey of the Mind
University of Washington Sociology
Complexity Theory and organizations
[more links to come! …sorry for all of the delays on getting this episode out! -Renee]
Podcast Audio Links:
Link to podcast Episode 9 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 9: Normalization [coming soon]
Data Science Learning Club Meet & Greet
Links to topics mentioned by Justin in the interview:
European Starling

video of starling singing
European Starling song file from Justin [1 min wav]
bird song recursive syntactic structure
Jobs for New Data Scientists website mentioned by Renee after interview
Renee interviews computational biologist, author, data scientist, and Michigan State PhD candidate Sebastian Raschka about how he became a data scientist, his current research, and about his book Python Machine Learning. In the audio interview, Sebastian also joins us to discuss k-fold cross-validation for our model evaluation Data Science Learning Club activity.
Podcast Audio Links:
Link to podcast Episode 8 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 8: Evaluation Metrics [coming soon]
Data Science Learning Club Meet & Greet
Links to topics mentioned by Sebastian in the interview:
Sebastian’s Python Machine Learning repository on GitHub
Python Machine Learning Book on DataSciGuide
scikit-learn – Voting Classifier
logistic regression (from Sebastian’s github)
regularization in logistic regression (from Sebastian’s github)
@rasbt on Twitter
Sebastian Raschka on Quora
Data scientist, author, and manager of data science teams Enda Ridge talks to us about data governance, data provenance, reproducible analysis, work pipelines and products, and people, among other topics covered in his book “Guerrilla Analytics – A Practical Approach to Working with Data: The Savvy Manager’s Guide”.
Podcast Audio Links:
Link to podcast Episode 7 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 7: Linear Regression [coming soon]
Data Science Learning Club Meet & Greet
More Show Notes Coming Soon!
Enda’s book on Amazon:
In this episode, Renee interviews Bioinformatics PhD and Data Scientist Erin Shellman about her path to becoming a data scientist, including jobs at Nordstrom Innovation Lab and Zymergen. Erin discusses school, job interviews, teaching, and eventually getting to do data science within her field of scientific expertise.
Podcast Audio Links:
Link to podcast Episode 6 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Podcast on iTunes
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 6: k-Means Clustering [coming soon]
Data Science Learning Club Meet & Greet
Bioinformatics
Evolutionary Biology
Economics Game Theory
Machine Learning
Biostatistics
Information Science
Systems Biology
Systems Modeling
Comparative Genomics
Nordstrom Innovation Lab (old innovation lab links inactive – appears to be the Nordstrom Technology People Lab now)
Jim Vallandingham (d3)
Crushed It! Landing a Data Science Job
University of Michigan Computational Medicine and Bioinformatics
R
dplyr
ggvis
ggvis interactive controls
ggplot2
R Markdown
Hadley Wickham
Elements of Statistical Learning book
BI Tech CP303 (course Erin taught at University of Washington – use arrow keys to go through slides)
GitHub repository for class
regression
classification – logistic regression, trees
market basket analysis
clustering
UW Business Intelligence Certification
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 5: Naive Bayes Classification
Data Science Learning Club Meet & Greet
Resources/topics mentioned by Clare in the interview:
Management Science and Engineering
Markov Chains
Science, Technology, and Society at Stanford
A Challenge to Data Scientists (blog post Renee mentioned)
Mattermark
Product Management
Machine Learning
Open Source Data Science Masters
Nate Silver’s book The Signal and the Noise
Linear Algebra (on Khan Academy)
Bill Howe’s Introduction to Data Science Coursera Course
Recurrent Neural Nets
Bayesian Networks
Open Source Data Science Masters on GitHub (pull requests welcome!)
summer.ai (Update 2/15 – Clare’s company is now Luminant Data, Inc.)
@ClareCorthell on Twitter
Other links:
SlideShare Slides about Open Source Data Science Masters
Talk Clare gave at Wrangle Conference about AI Design for Humans
In Episode 4 of the Becoming a Data Scientist Podcast, we meet Sherman Distin, owner of analytics consulting firm QueryBridge. We discuss his primarily self-taught path to learning the data science techniques he uses to find business insights in marketing data, and he also tells us what he thinks is the most important trait he looks for in data scientists.
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 4: Learn a New Math Concept [to be posted Tuesday]
Data Science Learning Club Meet & Greet
Resources/topics mentioned by Sherman in the interview:
QueryBridge (Sherman’s business)
Target Pregnant Customer story
Survival Analysis
Proportional Hazards Model
@ShermanDistin on Twitter
In Episode 3 of the Becoming a Data Scientist Podcast, we meet Shlomo Argamon, who is the founding director of the Master of Data Science program at Illinois Institute of Technology. He talks to us about his path to data science, including research in robotic vision and natural language processing. We also discuss the traits of a good data science student, and he gives some advice for those of us learning data science.
Podcast Audio Links:
Link to podcast Episode 3 audio
Podcast’s RSS feed for podcast subscription apps
Podcast on Stitcher
Update 1/19: You should be able to find it on iTunes now!
Podcast Video Playlist:
YouTube playlist of interview videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 3: Business Questions and Communicating Data Answers [to be updated Monday]
Data Science Learning Club Meet & Greet
Here are the links to things Shlomo references in the video:
Illinois Institute of Technology – Professional Master of Data Science Degree
machine vision
robotic mapping
Google Scholar Search for Shlomo Argamon’s publications related to robotics
“Passive map learning and visual place recognition” Doctoral Dissertation [ps.gz from yale]
probability theory
probability distributions
statistical inference
Bayesian statistics
Natural Language Processing (NLP)
Google Scholar Search for Shlomo Argamon’s publications related to language
“Automatically Categorizing Written Texts by Author Gender” [Moshe Koppel, Shlomo Argamon, and Anat Rachel Shimoni]
Weka
scikit-learn
Natural Language Toolkit (nltk)
Ethics in Data Science at IIT
Becoming a Data Scientist – A Challenge to Data Scientists (re: bias)
In Episode 2 of the Becoming a Data Scientist Podcast, we meet Safia Abdalla, who started programming and even exploring machine learning and natural language processing as a teenager, and is now a student at Northwestern University, a conference speaker and trainer, co-organizer of PyLadies Chicago, and a contributor to Project Jupyter.
Podcast Audio Links:
Link to podcast Episode 2 audio
Podcast’s RSS feed for podcast subscription apps
(I will distribute the feed out to iTunes and Pocket Casts ASAP. It’s available on Stitcher now!)
Podcast Video Playlist:
YouTube playlist where I’ll publish future videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 2: Creating visuals for exploratory data analysis
Data Science Learning Club Meet & Greet
Here are the links to things Safia references in the video:
information retrieval
Introduction to Information Retrieval by C. D. Manning, P. Raghavan, H. Schütze
natural language processing
NLTK
machine learning
Northwestern Neuroscience and Robotics Lab
PyLadies
Chicago PyLadies Meetups
mathematicalmonk’s YouTube series on machine learning
@captainsafia on Twitter
Safia’s website
Safia’s blog
JupyterDay Chicago 2016 (post by Safia on jupyter.org)
Jupyter documentation
There is also a built-in audio player here on the blog that I link to in each episode: https://www.becomingadatascientist.com/podcast/
I’m working on getting a logo now, so hopefully it won’t have a placeholder image for long :) I want to submit it to iTunes, but I have to download the dreaded iTunes desktop software in order to submit and manage it… ugh ridiculous.
Anyway, enjoy it on the blog and on Stitcher for now!
In this episode we meet Will Kurt, who talks about his path from English & Literature and Library & Information Science degrees to becoming the Lead Data Scientist at KISSmetrics. He also tells us about his probability blog, Count Bayesie, and I introduce Data Science Learning Club Activity 1. Will has some great advice for people learning data science!
Podcast Audio Links:
Link to podcast Episode 1 audio
Podcast’s RSS feed for podcast subscription apps
(I will distribute the feed out to sites like iTunes and Stitcher this week)
Podcast Video Playlist:
YouTube playlist where I’ll publish future videos
More about the Data Science Learning Club:
Data Science Learning Club Welcome Message
Learning Club Activity 1: Find and explore a dataset
Data Science Learning Club Meet & Greet
Here are the links to things Will references in the video:
Library and Information Science
Andrew Ng’s Machine Learning course on Coursera
probabilistic graphical models
Count Bayesie blog
Count Bayesie – Parameter Estimation and Hypothesis Testing
Donald Knuth
Literate programming
Claude Shannon’s Mathematical Theory of Communication
Count Bayesie – Measure Theory
Bayes’ Theorem with Lego
Voight-Kampff and Bayes Factor
Black Friday Puzzle – Markov Chains
Zen Buddhism concept of “beginner’s mind”
Count Bayesie Recommended Books on Probability and Statistics
In this episode, I talk a little about the podcast, I talk about my own background, and I introduce the Data Science Learning Club. Enjoy!
(Note: Episode 1, the first interview episode, comes out Monday 12/21!)
Podcast Audio Links:
Link to podcast Episode 0 audio
Podcast’s RSS feed for podcast subscription apps
(I will distribute this out to sites like iTunes and Stitcher soon)
Podcast Video Playlist:
YouTube playlist where I’ll publish future videos
More about the Data Science Learning Club:
Blog post about Data Science Learning Club
Learning Club Activity 0: Set up your development environment
Data Science Learning Club Meet & Greet
Here are the links with more info on things I reference in the video:
turtle Logo programming language
Carmen Sandiego
Lemmings
SimCity
JMU Integrated Science and Technology (ISAT)
Visual Basic/VB.NET/ASP.NET
MS Access
PL/SQL
Oracle Data Warehouse
IBM Cognos
CGEP UVA Systems Engineering
Systems Engineering
Linear Algebra at Khan Academy
Stochastic Simulation
Optimization
Cognitive Systems Engineering
Principles of Data Visualization for Exploratory Data Analysis
Machine Learning
Naive Bayes
K-Means
Pattern Recognition and Machine Learning (class textbook)
Summer of Data Science
API and Market Basket Analysis
Jupyter
Docker and Jupyter
Doing Data Science by Cathy O’Neil and Rachel Schutt
O’Reilly Data Science Books
(I’ll post more specific books later)
At the end of each podcast episode, I’ll be “assigning” a “Learning Activity” for the Data Science Learning Club. So that is starting tomorrow, too! There won’t be anyone teaching the content, but we’ll be exploring it together for 1-2 weeks between podcast episodes (usually 2 weeks). I’ll post some resources to get everyone started and help out data science beginners, then we’ll each explore the activity on our own with whatever tools and techniques we choose, and we can post our results so we can all learn from one another. If anyone gets stuck, you can post a question to the forum and hopefully someone will be able to help you through it.
I just got the Data Science Learning Club forum set up today, and it’s at this URL: https://www.becomingadatascientist.com/learningclub
Go check it out, register so you can participate, read the Welcome thread, and introduce yourself in the Meet & Greet section! Then tomorrow, the first learning activity will launch and you can get started.
I’m so excited about launching this podcast and data science learning club, and hope this turns out to be a valuable experience for all of us! Keep an eye out on the blog for the podcast post, which should go up tomorrow!
Renee