Something has been bothering me about Data Science Central

So, what I’m about to write about actually occurred a few months ago, but I am reminded of it every day when I receive an email from Data Science Central or see someone tweet an article from the blog network (which includes Analytics Bridge, Big Data News, etc.), so I figured if it’s still bothering me, it’s worth writing about.

In April, I saw a post by Vincent Granville, owner of and primary author at Data Science Central, which said something like

One way we attract women and minorities to Data Science Central is to create accounts that post articles with female profiles and photos, which are not actually written by women. Can you use data science to find these 5 faux bloggers? The winner will receive $500.

I have to try to remember the original post and paraphrase here (I’m sure this is not close to the original text, but I hope I am capturing the message), because the post has now been modified to appear as if the contest was only to find the one “example” fake account with a camel avatar (current version here).

However, you can tell that the original contest was different based on the submission by “Alton” on the site, who was nice enough to hold back the names of the accounts he found in case they weren’t decoys, but was clearly trying to find more accounts than just the “camel” decoy. Below is a screenshot in case it gets modified on the site.

[Screenshot: Alton’s comment on the Data Science Central contest post]

My initial response to the original post with the fake female bloggers was, “How many sites do this? Is this sexist? It sure is off-putting for this guy Vincent Granville to post articles under fake accounts, pretending to be a woman or underrepresented minority. Do we actually fall for this kind of thing? Is it a widely accepted practice?” I scrolled down and saw that Cory Teshera had posted a comment questioning the practice and responding incredulously to the approach as well. I posted about it on Twitter, showing my surprise and asking questions about the approach (including her name and tweets here with her permission):

No one had responded. Then I checked back on the post and saw that Cory’s comments had been deleted! I couldn’t believe that I was seeing a post talking about attracting women to the site, but the first woman to comment on the approach was being silenced!

I found Cory on Twitter and asked whether she was the one that posted and whether she had deleted the comment or the site had, and she responded:

Then I let her know I might blog about it and we chatted a bit via tweets and DMs. At this point, the blog post had been modified to remove all reference to the practice. Neither of us had received responses from the site. The only response was the “silent” deletion of comments and editing of the contest post.

I was curious at this point, and started browsing Data Science Central to see if I could find any of these fake female accounts. I didn’t have to use any data science methods to find one right away. I just looked at the top featured posts, clicked on one with a female avatar, and found this article:
Good and Not So Good Companies for Data Scientists
Here is Amy’s profile: http://www.datasciencecentral.com/profile/Amy
I see “she” is blogging heavily now since I last looked. She lists no last name, so I can’t look her up anywhere else that way, but I did an image search on Google and found: “Amy” Image Search Results
Amy is spending a lot of time posting on all of the Data Science Central network sites. On the Hadoop360 site, she is listed as “Amy Cordan”. At this point, I was still holding out a glimmer of hope that Amy could be a real woman, and looked her up on LinkedIn. I would have been happy to find out she was actually working for Data Science Central as a writer. Oh look! There is an Amy Cordan on LinkedIn who is listed as a “Data Scientist” with a PhD in Computer Science from Stanford! Her photo looks a little different though… she sure has a lot of endorsements… but she only has one experience listed… “Co-Founder for Data Science Foundation”… let’s check out their site… DataShaping.com. Uh… well, this is clearly a dummy site, and the email address is apparently Vincent Granville’s. I did find this “staff” page, which strangely doesn’t list “co-founder” Amy. It appears the whole profile, including the sparse LinkedIn profile page with the Stanford PhD but no experience other than working on a data science blog, is totally fake.

Anyway, you get the point. Amy does not appear to be a real woman. What really got me is that there is an apparently off-topic response to “Amy”’s post (linked above) by Vincent Granville about how “Amazon should hire people to improve security on AWS and deal with fake reviews.” Excuse me, mister… you are replying to a post on your own site, which was written by a fake author, which was probably written by you! How hypocritical.

At this point I was totally turned off from Data Science Central, so if the intent of these fake profiles was to attract women to the site, it definitely backfired for me.

Here are my questions now. How many of the females on the site are actually real? Are there many women and minorities joining the site, and are they influenced by these fake accounts falsely making it appear as if more females are participating than actually are? Is this a common practice among technology networking websites? Does it work? Should we accept it as necessary? Has Vincent Granville made any real effort to ask females to write for Data Science Central?

Do the people endorsing “Amy”’s LinkedIn profile know it is fake? Are they all fake profiles that Vincent Granville created and had endorse each other?

I have so many questions, and am not coming up with many satisfactory answers myself, other than feeling sad and put off by it all. Please let me know what you think!

(P.S. If you’re reading this, Mr. Granville, posts like this that say things like “The first to prove or disprove our conjecture will win $500 and will have his name associated with the theorem in question” aren’t helping you attract any female readers.)

As for me, it has left a bad taste in my mouth, and I’m currently not retweeting anything that I recognize as being from the Data Science Central network, because I just can’t trust anything it produces at this point.

If anyone wants to do any analysis on the site posts, I’m sure there are algorithms out there that can determine whether they’re likely to have all been written by the same author (same typos, style, etc.). I know there is also this analysis tool which is supposed to be able to tell whether a male or female likely wrote a clip of text: Text Gender Classifier (h/t Paul Marks)
The software is of course not perfect, but the text from “Amy”’s short article above came out to 68% likely to be written by a male, while this article you’re reading right now was classified as 65% likely female.
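
For anyone who wants to try the same-author idea, here is a minimal sketch in Python of one simple approach: compare the character trigram profiles of two texts using cosine similarity. This is only an illustration under simplifying assumptions, not the method used by any tool or commenter mentioned in this post. The function names (`char_ngrams`, `cosine_similarity`, `style_similarity`) are hypothetical, and a real authorship study would need many more posts and stronger features.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    # Counter returns 0 for missing keys, so grams absent from b contribute nothing.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def style_similarity(text_a, text_b, n=3):
    """Hypothetical helper: similarity of two texts' character n-gram profiles."""
    return cosine_similarity(char_ngrams(text_a, n), char_ngrams(text_b, n))
```

A score near 1.0 means the two texts use very similar character sequences. On its own that is weak evidence, since posts on the same topic share vocabulary and will also score high, so it is best treated as a starting point for closer reading.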

I guess I’m surprised at how little “sleuthing” I needed to do to see right through all of this. I didn’t spend hours poring over the site, I clicked on the first article I saw with a female author photo, and researched that author’s profile using Google. It’s practically out there in the open, and since Mr. Granville posted the contest – which has now been edited – to identify these faux bloggers, it appears he wasn’t trying to hide the practice.

And by the way, though Vincent Granville apparently has trouble finding females in Data Science to write for his blog, they do exist and aren’t hard to find on Twitter or LinkedIn. I’ve started following the data science women I find on Twitter using a Twitter list (please suggest more in the comments!):
Women in Data Science Twitter List

Also check out Meta Brown’s “Binder fulla Women in Analytics” posts on LinkedIn!

31 Comments

  1. Alton
    Jul 7, 2014

    I’m glad I’m not the only one who has had this on their mind. Normally I’m a live and let live kind of guy but it has bothered me too. However ultimately I think that Vincent has good intentions and that his efforts are helping many but sometimes ends don’t justify the means.

    I thought it was a little arrogant to try the community on such a sensitive topic but as a good data scientist, when I have the time, I’m usually up for a good challenge. From the comments you’ll notice that Vincent agreed to award me the prize. Because I was deemed the winner I took screenshots to capture the agreement and hold him accountable. I didn’t think they would come in handy, but since he hasn’t shown the willingness to live up to his side of the deal, I don’t feel guilty for exposing his strange practices regarding fake profiles both on his own network and several other networks including LinkedIn.

    The screenshots are located at https://drive.google.com/file/d/0B4JAreDAupYgb0ZEcFNMdHM0YU0/edit?usp=sharing
    https://drive.google.com/file/d/0B4JAreDAupYgeG1iYlZzQWxQb28/edit?usp=sharing

    Also, if you are curious: to solve this problem I used a classification method on a feature set of all users of his websites. The feature set included things like time since joined, number of posts, number of likes, favorite website, number of comments, and then the frequency and nature of each of those including common text analytics like word count.

    Note that I used this feature set to also run another post which was deleted but seems to be cached at google http://webcache.googleusercontent.com/search?q=cache:ju9MLUdF3JoJ:www.datasciencecentral.com/xn/detail/6448529:Comment:162518%3Fxg_source%3Dactivity+&cd=1&hl=en&ct=clnk&gl=us

    Lastly, I can’t find the complete data set that I used, but I do have a portion of it saved for those interested in exploring these users: https://dl.dropboxusercontent.com/u/96237511/recent_DSC_members.csv

    I only spent a couple of hours on this (the bulk of it spent collecting and cleaning the data) so naturally I expected my results to be wrong. Still, here is the list of individuals that I originally found having high probability of being fake according to the training metric and feature set I had.

    http://www.datasciencecentral.com/profile/Amy

    http://www.datasciencecentral.com/profile/Alesia

    http://www.analyticbridge.com/profile/DorothyHewittSanchez

    The one with the camel icon:
    http://www.analyticbridge.com/profile/Titus

    Thanks again for your great journalism and for using best practices while paving the way for many aspiring data scientists.

    Sincerely,
    Alton Alexander
    @10altoids

    • Alesia
      Feb 12, 2015

      Hi guys,

      sorry to disappoint you, but I’m a real woman and my profile is real:
      http://www.datasciencecentral.com/profile/Alesia

      If you don’t believe it, here is my company (DataDrivenBusiness) “Contact us” page: http://www.datadrivenbiz.com/contact-us.php

      I’m running most of our analytics events in HR analytics, Insurance Analytics, Text Analytics, Smart Travel Analytics etc.

      Cheers,
      Alesia.

    • Renee
      Feb 22, 2015

      Hi Alesia!

      The fact that you’re real doesn’t “disappoint” me (unlike Amy, your LinkedIn profile looks believable). I’m curious, have you ever interacted with “Amy”? What’s your take on that account?

      Are you on Twitter? I’d like to add you to my Women in Data Science list.

      Renee

  2. ERose
    Jul 11, 2014

    I’ve noticed a very similar practice on social media sites although it appears to be more of a ploy to reach younger demographics than necessarily women or people of color.
    It seems to be commonly used as a tool in politics – ie: one side or the other in a particular election will create fake profiles to engage in the political discussions on Facebook.
    Luckily, enough goes into an authentic online presence that aping one realistically becomes pretty difficult after about 2-3 weeks, especially if anyone does even pretty basic Google work to check you out – as you prove here.
    Anything along these lines definitely turns me off whoever does it.

    On a feminist note here, I wonder how many people have tried to contact “Amy” or “Alesia” as speakers for an event and took their inevitable refusal as evidence that it’s hard to find female speakers? I wonder how many people have seen their LinkedIn profiles and took their lack of experience as evidence that women don’t bother to get involved in the field? I am even less cool with the legitimate potential harm to real diversity efforts than I am with someone attempting to manipulate me with a false effort.

    • Ellie Kesselman
      Sep 20, 2014

      I have been active on Analytic Bridge since 2007. I thought his Ning websites might be useful in my career and job searches. I have education in operations research and math and work in quantitative risk analysis. I didn’t care for Vincent Granville’s anti-vaccination attitude, nor his dislike of the US Census Bureau, both of which he stated on Analytic Bridge. These are my profiles on his websites, and yes, I am real, not a Granville figment!

      http://www.datasciencecentral.com/profile/LKW
      http://www.analyticbridge.com/profile/lek

      I was quite angry about the Titus fake profile, as I had interacted with Titus (him?), not realizing. I thought he was real. Now I feel like a fool.

      As for Amy, I became suspicious in March 2013 when I tried to contact her, and realized she didn’t seem to exist. I came to the same conclusion about Granville’s Amy Cordon today. This post was the third search result returned for her name by Google! I am in agreement with author Renee and E Rose. This dishonest behavior by Granville is unprofessional, regardless of whether or not it pertains to women. Titus was an elderly man, I thought. The admin of a professional group shouldn’t deliberately deceive members, then arrogantly play guessing games with identity. Talk about losing trust! That he would do this with women for his absurd reasons, including creating fraudulent profiles on LinkedIn, is contemptible and violates LinkedIn Terms of Service, at a minimum.

  3. Renee
    Sep 21, 2014

    Thanks for your comments, everyone. I appreciate your additions to this conversation!

    Ellie, your comment about trust is an important one. I think a key factor that will help bring more women into tech is knowing they can trust the people they’re working with to have their best interests in mind. When a major website in the field is faking female profiles in order to appear more diverse, it breaks that trust and drives women away from what could otherwise be a valuable networking resource.

    I’ve been seriously thinking about starting my own site that can serve as an alternative to Data Science Central after I finish grad school (in the spring).

    • Randy Bartlett
      Nov 30, 2014

      Renee,

      Please consider joining our LinkedIn Group: About Data Analysis, as a prelude to your introducing your alternative.

      A number of us met on blogs like Data Science Central. What we saw led us to found our own group, About Data Analysis. There are so many founders watching everything that systematic fraud is unlikely. We are open to more founders/managers too.

    • Renee
      Dec 1, 2014

      I’ll check it out – thanks!

    • Randy Bartlett
      Dec 2, 2014

      Renee,
      I put a link to this discussion about Data Science Central into the middle of our discussion about same.
      https://www.linkedin.com/groups/Data-science-versus-statistics-solve-8156839.S.5932985228309069826?trk=groups_items_see_more-0-b-cmr
      By Diego Kuonen

      RE: Amy Cordon
      RESP: We noticed that Amy never answered our emails. Then we found that she has at least three completely different pictures. One is from an Obamacare ad and that girl was named in the media—her name is not Amy.

    • Renee
      Dec 10, 2014

      Thanks for posting a link to this discussion. Your info on “Amy” combined with the questions in the LinkedIn discussion cast even more doubt and appear to confirm some of my suspicions. So sketchy…

      Also, I agree with Alton’s comment on Twitter:
      https://twitter.com/10altoids/status/540249639992041473

  4. niubius
    Nov 22, 2014

    Thanks for the very enlightening post. What little I have read from Granville/DSC posts, has been off-putting at best. He frequently makes bizarre & uninformed generalizations about various disciplines of quantitative analysis (“traditional statisticians don’t like newer machine learning algorithms…” huh???); although more often than not I find myself getting lost in his rambling and incoherent writing style. Hearing that he suggests that companies post fake profiles to attract women/minorities is certainly disgusting, but not surprising. This is just more evidence to stay away from whatever bottles of snake oil this guy peddles.

    • Randy Bartlett
      Dec 2, 2014

      See below.

  5. Randy Bartlett
    Nov 30, 2014

    Niubius,
    RE: He frequently makes bizarre & uninformed generalizations about various disciplines of quantitative analysis (“traditional statisticians don’t like newer machine learning algorithms..” huh?)
    RESP: You are correct. The term ‘Machine Learning’ was coined by statisticians. Statistical ML is the part for analyzing data. IT ML is the other part for managing data. The term ML just describes the mechanism, not the application. Traditional statisticians deal in the application of analyzing data and are masters of the rebranded Statistical ML tool box.

  6. Randy Bartlett
    Dec 6, 2014

    Censorship

    Here is another LinkedIn group that practices censorship: ‘Statistics And Analytics Consultants Group.’

  7. Anonyme
    Dec 9, 2014

    @niubius:
    “He frequently makes bizarre & uninformed generalizations about various disciplines of quantitative analysis (“traditional statisticians don’t like newer machine learning algorithms…” huh???); although more often than not I find myself getting lost in his rambling and incoherent writing style.”

    +1 I am happy to see that I am not the only one to think that of him.

    • Renee
      Dec 10, 2014

      Sounds like word is starting to get around as more and more people get suspicious of Data Science Central’s content.

  8. Renee
    Dec 10, 2014

    A twitter conversation following up on this topic:
    https://twitter.com/EllieAsksWhy/status/542369298774126593

  9. Hubart
    Dec 10, 2014

    I checked Vincent Granville’s background, he does not even have a college degree, just high school. Cambridge, PhD, patents, VC funding – it’s all fake. Maybe he/she/it is fake too, maybe a robot. But you can say the same thing about all of us here, how many are real?

    So why don’t we use real data science to beat him? For now, we are just a little group of complainers who claim we know better than VG, but what about applying our data science knowledge to make Renee’s blog followed by one million people, rather than a couple dozen? That’s the only way to discredit him, Amy and all the fakes. Making a business out of trashing someone else is not the best way to become successful, but what else can we do?

    • Renee
      Jan 6, 2015

      I’m seriously considering starting an alternative to DSC… in which case I will be very happy to get millions of followers! I’ll keep you all updated on that front :)

      (Not likely to happen until at least May when I finish grad school)

  10. Ronald Stanzach
    Dec 11, 2014

    Hi Hubart,

    I’m going to produce a list of fake data science profiles. Please contact me at rstanzach@datascience.stanford.edu, I would be happy to read your thoughts about Vincent Granville, Amy Cordon and a few others. Claiming to have a PhD when you don’t even have a college degree is very unethical. These people should be barred from ever earning a college degree.

    Best regards,
    Ronald

  11. Meta Brown
    Dec 15, 2014

    My, I’m pretty late to the party on this discussion, but how enlightening!
    I had noticed all the suspicious female profiles that Alton mentioned here (no algorithms required, you can see with the naked eye that they don’t look real), but it never before occurred to me that Vincent would fake profiles, let alone publicly admit it. It’s a shame that he would do that and taint his reputation.
    On another note, thank you for mentioning Meta’s Binder Fulla Women in Analytics! I have a mammoth post on women authors in analytics in the works.

    • Renee
      Jan 6, 2015

      Happy to mention your “Binder”, and I look forward to your upcoming post!

  12. David Corliss
    Dec 15, 2014

    Went looking for his dissertation – couldn’t find it. Maybe I’m not very good at this kind of thing (although I have been fact-checking job candidates in analytics for a number of years now, so I shouldn’t be entirely clueless). Can anyone else find it? His LinkedIn Bio says Facultés universitaires ‘Notre-Dame de la Paix’
    Ph.D., Statistics, Mathematics, Science
    1983 – 1993

    I also looked into the patents he lists. Now, I *know* I’m no good at reading patent documents. I can see the application dates on the patents and “publication date” – 18 months after the application, the US Patent Bureau “publishes” patents – prior to that, the contents are secret. I don’t see any evidence that any of the patents were granted, but maybe I just don’t know how or where to look.

  13. Rick
    Dec 16, 2014

    Glad I saw this article before joining the Data Science Central website. It is unethical to attract people using dubious practices, and as one commenter points out, faking profiles on LinkedIn violates their terms and conditions. This makes Data Science Central and its founder nothing other than a dishonest commercial website trying to trick people into buying their products.

  14. System Administrator
    Jan 2, 2015

    I read this thread with growing personal interest. I’ll explain…

    Years ago, I was surfing Craigslist (w4m) and came upon a posting that had a photo of a luscious babe – self-described as a “fractal expert”. Fractals were big in the nineties.

    I wrote to the poster, and received a reply from “Amy Cordan”. However, I was never able to raise up a subsequent response from “her”. I did some web surfing and came up with a link of some sort indicating that “Amy” worked for one Vincent Granville….

    Soooo… being slightly psychotic, I found Vincent’s home phone number up there in the northwest, and I called him to inquire about Amy. He became instantly and intensely agitated, and yelled into the phone “Don’t ever mention her name again to me!” and hung up.

    This was all a long time ago, maybe ten years or so…

    I still have a copy of the fractal, and a sexy photo of Amy in my Temp directory…

    Write to me if you want to see them.

    I also count both Vincent and “Amy” among my LinkedIn contacts. I wonder if Vincent will figure out which one I am. I know that, if he does, Amy will too.

    With odds of .002%, the exercise will prove just how sharp he is.

    • Renee
      Jan 6, 2015

      wow…. so this has been going on for quite a while then!

  15. System Administrator
    Jan 2, 2015

    I can’t believe my attention has been so intensely hijacked by this thread.

    The result of my Sherlocking is:

    1) There is a real Amy Cordan. Her present name is Amy Henriques and you can find her on LinkedIn. You can also verify her other surnames with a simple veromi.net search (specify Pennsylvania).

    2) Paris Granville is married to Vincent. You can find her on LinkedIn as well, listed as working for the Washington Office of Superintendent of Public Instruction. Interestingly, though she does not mention that she is also employed by her husband’s company, she does show that Amy Cordan has endorsed her French and other skillsets.

    3) The connection I have discovered is that both Amy and Paris studied French at the University of Northern Iowa at the same time (1994 – 1998). There are other nuances involving the French connection, but that one grabs me best.

    So what do we have here, folks? Did Amy have her name appropriated by Vincent or Paris? Or does Amy have a second, hidden identity? I doubt the latter given that Amy’s entire formal education is in the foreign language arts.

    It is all almost as intriguing as it is stupid. I feel my interest waning as my mattress beckons.

    I’ll check back to see if anyone contributes an epilogue… or epitaph… or even some epinephrine if you have any.

    • Renee
      Jan 6, 2015

      It’s hard to imagine someone wouldn’t know that a contact is using their former name to create a public profile… I see her LinkedIn profile is pretty sparse, though, so maybe she just isn’t on the site much and isn’t aware of the other.

      She’s actually in my “day job” area of work (development/fundraising), so maybe I’ll contact her and find out whether she knows…

  16. Renee
    Jan 6, 2015

    Wow, sorry, I just logged in to approve comments and didn’t realize interest in this article had kind of taken off of late!

    Thanks for your research, everyone. This is… interesting!

  17. Sal DiStefano
    Jan 8, 2015

    I have avoided DSC and the other related sites as well as Mr. Granville due to the lack of integrity and cloud of suspicion about the content there. Interesting to find this and learn of the practices being used on the site. Very interesting.

  18. Renee
    Mar 18, 2015

    I regularly open articles on Data Science Central by accident because someone will share something on Twitter and I’ll click through the short URL without realizing what I’m clicking on.

    Did that today and it was a post by “Amy”. Apparently her handle on DSC is now “Data Science Girl”. Ugh.

    Well, maybe Mr. Granville saw us calling him out for the fake LinkedIn profiles and such and decided to anonymize “Amy”.
