Hello, Data Science Community - my name is Mike Tamir, Data Science faculty member at UC Berkeley and Head of Data Science at Uber Advanced Technologies Group. As a continuation of the IBM Ask A Data Scientist Program, I sat down for an interview with Isaiah Brown, co-manager of the Data Science Community, on topics ranging from career advice to the future of data science. Feel free to ask more questions in the comments and stay tuned for more content coming soon.
What hard and soft skills do you see as paramount to success in the world of data science? For example, what is that one language or software you believe every data scientist should develop fluency in?
Python has definitely become the standard coding language for data science. For hard skills, a solid understanding of machine learning, statistics, and the fundamentals of linear algebra and multidimensional calculus is also a must.
For professionals from different fields interested in opportunities in data science, what skills tend to translate well from other professions?
Professionals from other fields, including physics, bioinformatics, and statistics, tend to do well as data scientists. Often this is due to their strong mathematical literacy, their ability to apply that literacy, and their comfort in working with stochastic processes.
Not only are the knowledge and the use cases for data science continuing to grow, but so is the number of people interested in landing data science jobs. How would you recommend aspiring data scientists differentiate themselves from the pack?
The absolute best way is to work on a project to showcase your skills. This will develop experience, give you something to talk about in your interviews, and, hopefully, showcase something remarkable that you were able to create.
What was the most surprising data science trend of 2018?
For the past 3 years, with the growth of so many deep learning development ecosystems, we have seen a trend toward using deep learning more and more in production systems. While any solution should ultimately depend on the data, looking back we may point to 2018 as the year when using DL transitioned from the exception to the rule for modern technology companies with sufficient data. For specific algorithms, I've been especially excited by the resurgence in graph-based algorithms like GraphSAGE, as well as the improvements in contextualized word embeddings that Google and OpenAI have been working on.
Which data science trends do you expect to dominate 2019?
I expect the transition we've seen in 2018 to continue into 2019. Deep learning should continue to dominate in 2019, alongside important advances that couple traditional approaches with reinforcement learning.
Which data science technologies do you expect to gain traction in 2019?
I'd like to see deeper work in AutoML. In 2017 we saw a lot of attention on AutoML, but I don't think that we have begun to fully take advantage of the potential here.
Which data science technologies do you expect to decline in importance in 2019?
I'd like to see efforts like ONNX start to make it easier to transition from one DL development ecosystem to another. This may not mean that technologies decline, but it should allow for more specialization, as we saw, for example, with Hadoop and Spark.
What will the data science landscape look like 5 years from now?
On that time scale, we could see a lot of change. While it's hard to speculate about what kinds of algorithms or techniques will be in vogue by then, I am optimistic that current trends which allow us to abstract and improve the ease of development will mean that in 5 years one will be able to easily implement techniques that by today's standards take quite a lot of time. (But this is a safe bet for any 5-year period.)