To build trust in data science, work together
As data science systems become more widespread, effectively governing and managing them has become a top priority for practitioners and researchers. While data science allows researchers to chart new frontiers, it requires varied forms of discretion and interpretation to ensure its credibility. Central to this is the notion of trust – how can we reliably assess the trustworthiness of data, algorithms and models?
The kinds of data scientist
In 2012, HBR dubbed data scientist “the sexiest job of the 21st century”. It is also, arguably, the vaguest. To hire the right people for the right roles, it’s important to distinguish between different types of data scientist. There are plenty of different distinctions that one can draw, of course, and any attempt to group data scientists into different buckets is by necessity an oversimplification. Nonetheless, I find it helpful to distinguish between the deliverables they create. One type of data scientist creates output for humans to consume, in the form of product and strategy recommendations. They are decision scientists. The other creates output for machines to consume, such as models, training data, and algorithms. They are modeling scientists.
Three reasons why mothers should consider a career in data science
For many women, the time during their pregnancy is one of overwhelming happiness, and at times, worry. We worry about things like childbirth and not knowing what to do with our baby after he or she is born. Women with careers have an added worry; we think about how this adorable new addition to our family will impact our careers.
One thing that I’ve discovered over the past four years is that having certain skills can reduce uncertainty around our careers. I’m a mom of two little girls and have a career in data that has provided me with more flexibility and less stress. Below, I outline the three reasons why mothers should consider a career in data science.
Why you shouldn’t be a data science generalist
I work at a data science mentorship startup, and I’ve found there’s a single piece of advice that I catch myself giving over and over again to aspiring mentees. And it’s really not what I would have expected it to be.
Rather than suggesting a new library or tool, or some resume hack, I find myself recommending that they first think about what kind of data scientist they want to be.
The reason this is crucial is that data science isn’t a single, well-defined field, and companies don’t hire generic, jack-of-all-trades “data scientists”, but rather individuals with very specialized skill sets.
To see why, just imagine that you’re a company trying to hire a data scientist. You almost certainly have a fairly well-defined problem in mind that you need help with, and that problem is going to require some fairly specific technical know-how and subject matter expertise. For example, some companies apply simple models to large datasets, some apply complex models to small ones, some need to train their models on the fly, and some don’t use (conventional) models at all.
Each of these calls for a completely different skill set, so it’s especially odd that the advice that aspiring data scientists receive tends to be so generic: “learn how to use Python, build some classification/regression/clustering projects, and start applying for jobs.”