Comments on: The Imitation Game, and the Human Element in Data Science
https://www.becomingadatascientist.com/2015/08/08/the-imitation-game-and-the-human-element-in-data-science/
Documenting my path from "SQL Data Analyst pursuing an Engineering Master's Degree" to "Data Scientist"
Sun, 04 Oct 2015 19:01:51 +0000

By: Renee
https://www.becomingadatascientist.com/2015/08/08/the-imitation-game-and-the-human-element-in-data-science/#comment-1942
Sun, 09 Aug 2015 00:57:24 +0000

Here’s a video about the Enigma machine, and the flaw that was discovered that helped break the code:
https://www.youtube.com/watch?v=V4V2bpZlqx8

By: Renee
https://www.becomingadatascientist.com/2015/08/08/the-imitation-game-and-the-human-element-in-data-science/#comment-1941
Sat, 08 Aug 2015 23:44:50 +0000

In reply to Joerg.

Huh, interesting that you say it’s as high as 50%. I would tend to agree (and it’s similar when working as a data analyst), and I wonder if other people have found the same in their data science roles.

By: Renee
https://www.becomingadatascientist.com/2015/08/08/the-imitation-game-and-the-human-element-in-data-science/#comment-1940
Sat, 08 Aug 2015 23:43:20 +0000

In reply to Nicole.

Yep, I agree with you. And I’ve read interviews with several data scientists where they emphasize “make sure you know the question you are trying to answer, and how the answer to that question will be used before you start developing an approach”.

Also, good point about autopilot vs emergency manual mode.

By: Joerg
https://www.becomingadatascientist.com/2015/08/08/the-imitation-game-and-the-human-element-in-data-science/#comment-1939
Sat, 08 Aug 2015 21:19:43 +0000

Oh, I think that the human aspect is the most important aspect of Data Science. You need to communicate your findings, need to meet business needs, need to get the DevOps to help you with your stack, etc. I think Machine Learning is like 5%, programming is 45%, and 50% is communication in Data Science (numbers sampled from a rear end distribution).

By: Nicole
https://www.becomingadatascientist.com/2015/08/08/the-imitation-game-and-the-human-element-in-data-science/#comment-1938
Sat, 08 Aug 2015 21:16:16 +0000

Amen. And it’s brute-force-data-science that’s unfortunately going to be what “democratizes” it over the next few years… then, after a few high profile bad decisions (which hopefully don’t involve extensive litigation or threats to human health or safety) the “craft” aspect should creep back in. “Autopilot Mode” is coming (e.g. with BigML and Amazon ML services) but you still need a pilot in case of emergency. Which, in data science, could potentially be *every single time*.

This reminds me of the discussions we were having years ago in astronomy when storage was getting cheaper, and data volumes per unit time were getting bigger and bigger. Easy solution? Just archive all of it, of course! But without being able to effectively describe the original observer’s intent, and store and search that, the value of the data was pretty low. So why spend a few tens of thousands of dollars a year on storing data that really didn’t have much archival value?

I think data science is similar. We really have to cautiously examine what value using a particular model will add… and really examine it in terms of current context and envisioned context. Brute force cloud ML services can’t do that. Nor would we want them to. But guaranteed, a lot of people will be doing just that.
