  vincent64 Feb 15, 2013 2:18 AM

    66 job interview questions for data scientists

    Can you answer them? Google "66 job interview questions for data scientists" to read the questions and check whether you are a real data scientist. Google data scientists are expected to correctly answer at least 40 questions. What is the number for Facebook employees? Compare yourself with your peers.

    • Answer to question #10:
      Not sure if the problem of fuzzy merging can be addressed within the framework of traditional databases. Say you have a table A with 10,000 users (key is user ID) and a table B with 50,000 users (key is user ID). You could create a user mapping table C with three fields:
      userID (= key),
      Alternate_UserID (this field would also be a user ID) and
      Probability (probability that userID = Alternate_UserID).
      This table would be populated after some machine learning algorithm has been applied to tables A and B to identify similar users and the probability that they match. Make sure that you only include (in table C) records where the probability is above (say) 0.25, otherwise you risk exploding your database: the full cross product of A and B is 10,000 x 50,000 = 500 million candidate pairs.
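
A minimal Python sketch of how table C might be populated. Everything here is an assumption for illustration: the sample users, the function names, and above all the scoring step, which uses string similarity from the standard library's `difflib` as a stand-in for the machine learning algorithm. That score is not a calibrated probability; it only shows where a real matching model would plug in.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.25  # cutoff suggested in the answer above


def match_probability(name_a, name_b):
    """Stand-in score in [0, 1]: string similarity, NOT a calibrated probability.

    A real system would replace this with the output of a matching model.
    """
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def build_mapping_table(table_a, table_b, threshold=THRESHOLD):
    """Return rows (userID, Alternate_UserID, Probability) scoring above threshold."""
    rows = []
    for user_id, name_a in table_a.items():
        for alt_id, name_b in table_b.items():
            p = match_probability(name_a, name_b)
            if p > threshold:  # drop unlikely pairs to keep table C small
                rows.append((user_id, alt_id, p))
    return rows


# Hypothetical sample data: user ID -> display name.
table_a = {"A1": "John Smith", "A2": "Maria Garcia"}
table_b = {"B1": "Jon Smith", "B2": "M. Garcia", "B3": "Wei Zhang"}

table_c = build_mapping_table(table_a, table_b)
for user_id, alt_id, p in table_c:
    print(user_id, alt_id, round(p, 2))
```

The sketch compares every pair, which is fine for a handful of users but would mean 500 million comparisons for the table sizes above, so in practice the candidate pairs would have to be narrowed down before scoring.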

    • We have added a few more questions. We are now at 70.

