Everyone likes a good list. There are some articles going around which talk about why Data Science doesn’t add business value. As a data scientist who loves people, I wanted to add my voice to the conversation. Here are three things that make it hard to work with a team of Data Scientists (& Engineers).
- Data Scientists are less likely to be mobilized on incomplete information.
A data scientist, or perhaps a PhD in general, wants to fully understand the problem before choosing a solution. This is best explained in an example: There may be one business rule which turns a problem from linear to non-linear. But, the effect of that non-linear portion is small, so the business doesn’t bother to mention it until part way through the build process. The business doesn’t understand that the data scientists might literally need to start over to incorporate that new feature. As a result, data scientists have developed a healthy suspicion of project requests. No one wants to start over because the problem wasn’t accurately described first. So, the team stalls until they believe they understand the full problem. And that can take a long time. It’s a big challenge to begin a problem fast and get quick wins while simultaneously going slowly enough to protect from future disruptions.
- Data Scientists will not believe something until they see it with their own eyes.
This personality quirk is very important to their job. It means they question everything, validate unknowns and solve “unsolvable” problems. (After all, if you believe your colleague who says that the problem is unsolvable… then you aren’t going to be the one to solve it!) However, it’s challenging to have a team that won’t accept secondhand knowledge. Teams are forced to include the DS in every meeting in order to build the requisite business knowledge. Meanwhile, the DS might be pushing the meeting down a tangent which is not its main focus. This, in combination with the first point, is a hard problem.
- Data Scientists require leaders who are Triple Threats.
In the performing arts a Triple Threat is someone who can sing, dance and act. In Data Science, a triple threat is someone who can understand the Mathematics, the Business and the Communication necessary to be a liaison between the first two sets of people. And often these traits are negatively correlated. People who are good at Math are certainly perceived to be less good at people. Thus, Triple Threats are rare!
Incorporating mathematicians into the workplace is more valuable than ever. Finding and acquiring a triple threat can be a challenging prospect, but it is something which companies should not shy away from.
What can we do about these challenges? Have you made progress on solving any of these challenges? What do you think are the biggest challenges facing data scientists right now?
As a gamer, I find the holidays give me opportunities to play games with non-gamers. People like my family. Or, more relevantly to this article, people like my niece. Here’s what you need to know about my niece: she is a 6-year-old who really loves to play games, born of gamer parents. This year we played Simon’s Cat. And because she is 6, we played Simon’s Cat a lot. (The hedgehogs were my favorite.) And, as a mathematician and a gamer, I have to say that Simon’s Cat has a fair amount of game for its simple rules system.
Playing games with a 6 year old, as an adult, can be challenging. Mostly because they aren’t playing games at this age. They are playing experiences. Chutes and Ladders is most definitely an experience, not a game. Despite all those fabulous ladders and exciting chutes, the game is just a very complicated randomizer. It’s basically the Rube Goldberg machine of coin flips. It hurts my brain when a child is sad because they “lost” experiences like this. I just want to say, “You only had 1/2 a chance. There was literally nothing you could do to avoid this fate.”
I think a game gets to be called a game if and only if the choices you make as a player influence your ability to win or lose. So, the question of the hour is: Is Simon’s Cat a game or an experience? Let’s cover the components and the rules. Here are all the cards in the deck arranged in a pleasing way:
You are dealt a portion of the deck. Whoever has the Pink Cat 3 plays it to the center. Next, you go around the table and if you have a card that is the same color or the same number as the last card played, then you can play it. If more than one card is playable, then you pick which one to play. If you don’t have any cards that you can play, you collect all the cards from the middle of the table. These cards all count as 1 mess that you had to clean up. If you collected a mess, then you play any card you want to the table. Suffice it to say, you play a card every time it’s your turn, but you want to collect as few messes as possible. Once all the cards are played, the player with the fewest messes wins! Got it? If you want the official rules, you can find those here.
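For the programmers in the audience, here is a minimal sketch of a single turn as I described it above. It is my paraphrase of the rules, not the official ones. The names `playable_cards`, `take_turn` and `choose` are made up for illustration, and a card is just a `(color, number)` pair.

```python
def playable_cards(hand, top_card):
    """Cards in `hand` that share a color or a number with the last card played."""
    color, number = top_card
    return [card for card in hand if card[0] == color or card[1] == number]


def take_turn(hand, pile, messes, choose=lambda options: options[0]):
    """Play one card from `hand` onto `pile` and return the updated mess count.

    `choose` is a hypothetical hook for the player's strategy; by default it
    simply takes the first playable card.
    """
    options = playable_cards(hand, pile[-1]) if pile else list(hand)
    if not options:
        messes += 1           # the whole pile counts as a single mess
        pile.clear()          # the collected cards leave the table
        options = list(hand)  # after a mess, any card may be played
    card = choose(options)
    hand.remove(card)
    pile.append(card)
    return messes


hand = [("Blue", 3), ("Green", 2)]
pile = [("Yellow", 3)]
messes = take_turn(hand, pile, 0)       # Blue 3 matches the number 3
messes = take_turn(hand, pile, messes)  # Green 2 matches nothing: one mess
print(messes, pile)
```

The only place a player has any say is the `choose` step, which is exactly where the question of game versus experience lives.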
A game is a game if and only if the choices you make as a player influence your ability to win or lose.
In order for this to be a real game, my choices have to influence my chances of winning. Thus, there must be some strategy for the order in which I play my cards that is better than another. Is my chance of winning higher when I use a good strategy than when I randomly choose a usable card?
A slightly different way to ask this question is: Are some cards more valuable than others? Can some cards be used in more situations? If there are cards which are more valuable, then I want to keep those in my hand as long as possible for increased flexibility in the late game. However, if all cards are equally valuable, then there won’t be a ‘better’ or ‘worse’ strategy and Simon’s Cat isn’t a real game (as far as I’m concerned anyways!)
Since I can only play a card if the previous card shares a color or number, I want to know which card has the most other cards which share a number or color: the most similar cards. Let’s consider the Gnomes. There are only two Gnomes in the game. At first, this may appear to be valuable. (Because you can block others perhaps?) But remember, your main goal is to not get messes. So, you are most concerned with whether a card is playable by you or not. The Green Gnome 1 is only playable off of the Green Gnome 2, Orange Mouse 1, and Purple Dog 1. This means there are three cards in the deck which can precede the Green Gnome 1.
What about my personal favorite, the Yellow Hedgehog 3? Yellow Hedgehog 3 can play off of any of the five other yellow cards or any of the four other 3s. Thus, there are nine cards that a Yellow Hedgehog 3 can play off of. This means that between Green Gnome 1, with three similar cards and Yellow Hedgehog 3, with nine similar cards, the Yellow Hedgehog 3 is a more valuable card. Except, these can’t be used in the same situation. So, we’ve shown that cards have different valuable-ness, but we haven’t shown that this will ever matter.
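If you would rather let a computer do the counting, here is a small sketch. Fair warning: the deck layout below is my assumption, reconstructed so that it matches the counts quoted in this article. The Purple Dog and Pink Cat number ranges in particular are guesses, so check them against a real deck. The helper name `similar_cards` is also just for illustration.

```python
# Hypothetical deck layout, inferred from the counts in the article.
# The Purple Dog and Pink Cat ranges are assumptions, not taken from the rules.
SUITS = {
    "Pink Cat": range(3, 13),
    "Blue Kitten": range(3, 11),
    "Yellow Hedgehog": range(3, 9),
    "Purple Dog": range(1, 7),
    "Orange Mouse": range(1, 5),
    "Green Gnome": range(1, 3),
}
DECK = [(suit, number) for suit, numbers in SUITS.items() for number in numbers]


def similar_cards(card, deck=DECK):
    """Every other card that shares a color (suit) or a number with `card`."""
    suit, number = card
    return [c for c in deck if c != card and (c[0] == suit or c[1] == number)]


for card in [("Green Gnome", 1), ("Yellow Hedgehog", 3)]:
    print(card, len(similar_cards(card)))
# Green Gnome 1 has 3 similar cards, Yellow Hedgehog 3 has 9
```

Under that assumed layout, the counts come out the same as the ones I worked out by hand.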
Let’s consider an example. Let’s say the card played before my turn was a Yellow Hedgehog 3. I have a Blue Kitten 3 in my hand and an Orange Mouse 3. Which one should I play? Well, the Blue Kitten 3 has eleven similar cards. Orange Mouse 3 only has seven. Assuming we don’t know any more information about what was played previously, it would be a better choice to play the Orange Mouse 3 because there are more cards that can trigger my Blue Kitten 3 than my Orange Mouse 3. So, in this moment of our investigation, we know that there will be a strategy which is better than random. That means Simon’s Cat is an actual game! I can make a better or worse decision. I can impact my fate! Thank you, Steve Jackson Games, for making a simple game (without reading!) that is still a game. Seriously. Thank you.
Before we go into the bonus lightning round, I have to insert an aside for the other gamer mathematicians out there, for surely you have already worked out that a card’s value is not just a function of the total cards in the deck, as I presented above. It also depends on which cards have already been played. Returning to my example: if all the other Blue cards had already been played, then (in that moment) the Blue Kitten 3 is actually the better play because Blue Kitten 3 is no longer at the top of her game. All her friends have already been played to the table! So, she is less valuable than the Orange Mouse 3, which I would now rather keep. To those of you who thought of this, excellent work! Your insight also further proves that there is a real game to be had here.
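This aside can be sketched by counting only the similar cards that have not been played yet. As before, the deck layout is an assumption reconstructed from the counts in this article (the Purple Dog and Pink Cat ranges are guesses), and `remaining_similar` is a made-up helper name. It is also a simplification: it does not care whose hand the unplayed cards are in.

```python
# Hypothetical deck layout, inferred from the counts in the article.
# The Purple Dog and Pink Cat ranges are assumptions, not taken from the rules.
SUITS = {
    "Pink Cat": range(3, 13),
    "Blue Kitten": range(3, 11),
    "Yellow Hedgehog": range(3, 9),
    "Purple Dog": range(1, 7),
    "Orange Mouse": range(1, 5),
    "Green Gnome": range(1, 3),
}
DECK = [(suit, number) for suit, numbers in SUITS.items() for number in numbers]


def remaining_similar(card, played=frozenset()):
    """Count unplayed cards that share a color or a number with `card`."""
    suit, number = card
    return sum(
        1
        for c in DECK
        if c != card and c not in played and (c[0] == suit or c[1] == number)
    )


kitten, mouse = ("Blue Kitten", 3), ("Orange Mouse", 3)

# Nothing known about earlier plays: the whole-deck counts from the example.
print(remaining_similar(kitten), remaining_similar(mouse))  # 11 7

# Every other Blue card is already on the table.
other_blues = {c for c in DECK if c[0] == "Blue Kitten" and c != kitten}
print(remaining_similar(kitten, other_blues), remaining_similar(mouse, other_blues))  # 4 7
```

Once her friends are gone, Blue Kitten 3 drops from eleven similar cards to four, which puts her below Orange Mouse 3 and its seven.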
Now for the first bonus round:
Bonus Question: Which card begins the game as the most valuable card? Can you figure it out by looking at all the available cards in the deck? You can go look. Take a guess!
Answer: Hopefully you figured out it had to be a Pink Cat because Pink has the most cards in its color. And the most valuable card should also be a 3 or 4 so it shares the most colors across a single number. Thus, the 2 most valuable cards are the Pink Cat 3 and Pink Cat 4. Except you will never get to play the Pink Cat 3 on another card, because it must be played first. Thus, the most powerful card is the Pink Cat 4. Notably, Pink Cat 4 is the only Pink Cat who isn’t doing something crazy in the graphic on the card. Therefore, I am left with no other option than to assume that Simon’s real cat is Pink Cat 4.
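If you want to check the bonus answer without squinting at the cards, here is a sketch that ranks every card by its number of similar cards at the start of the game. The deck layout is once again my assumption, chosen to agree with the counts in this article (the Purple Dog and Pink Cat ranges are guesses), and `similarity` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical deck layout, inferred from the counts in the article.
# The Purple Dog and Pink Cat ranges are assumptions, not taken from the rules.
SUITS = {
    "Pink Cat": range(3, 13),
    "Blue Kitten": range(3, 11),
    "Yellow Hedgehog": range(3, 9),
    "Purple Dog": range(1, 7),
    "Orange Mouse": range(1, 5),
    "Green Gnome": range(1, 3),
}
DECK = [(suit, number) for suit, numbers in SUITS.items() for number in numbers]

color_sizes = Counter(suit for suit, _ in DECK)
number_sizes = Counter(number for _, number in DECK)


def similarity(card):
    """Other cards of the same color plus other cards with the same number."""
    suit, number = card
    return (color_sizes[suit] - 1) + (number_sizes[number] - 1)


best = max(similarity(card) for card in DECK)
top = [card for card in DECK if similarity(card) == best]
print(top, best)  # [('Pink Cat', 3), ('Pink Cat', 4)] 13

# Pink Cat 3 always opens the game, so it never gets played onto another card.
print([card for card in top if card != ("Pink Cat", 3)])  # [('Pink Cat', 4)]
```

With this layout the two Pink Cats tie for first place, and removing the opening card leaves Pink Cat 4 alone at the top.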
And the final bonus round:
Reader be warned! It is very possible for your hand and the preceding cards to line up in a way which allows no choices. In this situation, we, the adults, suck it up and “play” our gaming experience. Simon’s Cat has more game than most of the games I played over the holidays, but it’s still a game which can be explained in 3 sentences. It’s bound to have some flaws. For a game which can be played with pre-readers, it gets high marks from me!