Machine Learning Blueprint Newsletter, Edition 11, 12/17/17


December 17, 2017

Hi *|FNAME|*,
Happy New Year from MLBP! Here are the greatest hits of 2017. Please forward this to your friends and help us grow our 12k-person audience as we enter the new year!
2017 Machine Learning Trends & Greatest Hits
2017 was a big year for advances in deep learning, from visual reasoning and the promise of Hinton's new capsule network techniques in computer vision to continued acceleration in text applications.
As adoption of deep learning in industry has taken hold, the landscape of open source deep learning frameworks started to crystallize in 2017. Amazon, Microsoft, and closely affiliated Facebook projects like PyTorch have banded together around the ONNX model-interchange format, challenging the dominance of the Google TensorFlow ecosystem. The popularity of PyTorch in particular has been a good thing, not just for developers, but also in spurring TensorFlow to break out of its static-graph paradigm and offer dynamic (eager) execution options.
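To illustrate what "dynamic execution" buys developers, here is a minimal NumPy sketch (function names and shapes are illustrative, not from any framework): in a define-by-run style, ordinary Python control flow shapes the computation at runtime, which a static graph must express with special graph ops.

```python
import numpy as np

def dynamic_forward(x, w, n_steps):
    # Define-by-run (eager/PyTorch-style) execution: the number of matrix
    # multiplies depends on runtime data, with no graph compiled in advance.
    h = x
    for _ in range(n_steps):          # data-dependent loop length
        h = np.tanh(h @ w)            # one "layer" per step
    return h

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
x = rng.normal(size=(1, 4))

# The same function handles different depths without rebuilding anything,
# which is awkward to express in a static-graph runtime.
shallow = dynamic_forward(x, w, 1)
deep = dynamic_forward(x, w, 5)
```

The point is not the arithmetic but the control flow: in a static-graph framework the loop itself would have to be encoded into the graph ahead of time.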
We have also seen new offerings from a number of public cloud providers for GPUs targeted at deep learning applications. This helped fuel the growth of deep learning exploration and productization, since developers no longer need to invest in hardware themselves, or can simply use an API for access to a pre-trained net.
Interestingly, NVIDIA's new EULA prohibits its consumer GPUs from being used in data-center and public cloud infrastructure. At best this could be a temporary roadblock for AWS's ambitions. Perhaps it hints at NVIDIA launching its own GPU public cloud offering in 2018, and will push developers towards the more interoperable ONNX ecosystem, which need not rely on CUDA?
What stood out here was the sunset of Theano, the deep learning framework. Its obsolescence, marked by the retirement of active development, is not entirely bad news: it was born and maintained in academia before commercial players got involved. Google and others taking the lead on open source deep learning frameworks signals that industry is taking deep learning much more seriously, as industry does not generally rely on academic-grade tools.
Adversarial ML showed us that we may not understand the tools currently being built and deployed in the wild as deeply as previously thought. Several defense mechanisms were published, but they were frequently subverted soon after. While the threat may be overplayed, since adversarial attacks appear most effective when the attacker has access to the model weights (the white-box setting), it cannot be ignored, and will hopefully push us towards deeper understanding as the best defense.
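To make the white-box point concrete, here is a minimal NumPy sketch of the fast gradient sign method (FGSM), one of the best-known attacks, run against a toy logistic model. The weights, input, and epsilon are made up for illustration; real attacks target deep networks via autodiff.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, eps):
    """Fast Gradient Sign Method against a toy logistic-regression model.

    The gradient of the logistic loss -log(sigmoid(y * w.x)) with respect
    to the input x is -y * sigmoid(-y * w.x) * w; stepping in the direction
    of its sign increases the loss. Note the attack needs w: this is the
    white-box setting mentioned above.
    """
    margin = y * np.dot(w, x)
    grad_x = -y * sigmoid(-margin) * w
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0, 0.5])      # known model weights (white-box)
x = np.array([0.3, -0.4, 1.0])      # a correctly classified input
y = 1.0                             # its true label

x_adv = fgsm(x, y, w, eps=0.5)
p_clean = sigmoid(y * np.dot(w, x))      # confidence on the clean input
p_adv = sigmoid(y * np.dot(w, x_adv))    # confidence after perturbation
```

A small, bounded perturbation of the input is enough to flip the model's decision, which is exactly why published defenses kept getting subverted.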

The Power of Seq2Seq in 2017
While the core techniques were explored in earlier years, 2017 was a big year for attention-enhanced LSTM encoder-decoder methods across a multitude of impressive use cases: abstractive text summarization (and enrichment), emotional chatting machines, machine translation, and even applications in organic chemistry (a NIPS 2017 award-winning paper).
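The attention mechanism at the heart of these encoder-decoder models can be sketched in a few lines of NumPy. This is the basic dot-product variant with made-up shapes, not any particular paper's formulation: score each encoder state against the current decoder state, softmax the scores, and take the weighted sum as the context vector.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attend(encoder_states, decoder_state):
    """One step of dot-product attention for an encoder-decoder model.

    encoder_states: (T, d) hidden states over the source sequence
    decoder_state:  (d,) current decoder hidden state
    Returns the context vector and the attention weights.
    """
    scores = encoder_states @ decoder_state   # (T,) alignment scores
    weights = softmax(scores)                 # attention distribution
    context = weights @ encoder_states        # (d,) weighted sum of states
    return context, weights

rng = np.random.default_rng(1)
enc = rng.normal(size=(6, 8))    # 6 source positions, hidden size 8
dec = rng.normal(size=(8,))      # current decoder hidden state

context, weights = attend(enc, dec)
```

At each decoding step the decoder gets a fresh context vector, letting it focus on different parts of the source; this is what lifted plain seq2seq LSTMs to the results above.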
Other Notable Posts and Trends in 2017
Self-learning algorithms, in particular those incorporating reinforcement learning and evolutionary algorithms to guide ML pipeline construction and deep learning architecture search, have seen a lot of research attention this year. We can likely expect these techniques to push into industry in the coming years.
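The evolutionary flavor of this search can be sketched with the standard library alone. Everything here is a toy stand-in: the "architecture" is a (layers, width) pair and the fitness function substitutes a cheap formula for the real train-and-validate step, which is the expensive part in practice.

```python
import random

def fitness(arch):
    # Stand-in for "train a model with this architecture and measure
    # validation accuracy"; this toy objective prefers ~3 layers of width ~64.
    layers, width = arch
    return -((layers - 3) ** 2 + ((width - 64) / 16.0) ** 2)

def mutate(arch, rng):
    # Small random edits to an architecture, mimicking evolutionary search.
    layers, width = arch
    return (max(1, layers + rng.choice([-1, 0, 1])),
            max(8, width + rng.choice([-16, 0, 16])))

def evolve(generations=30, pop_size=8, seed=0):
    rng = random.Random(seed)
    pop = [(rng.randint(1, 6), rng.choice([16, 32, 64, 128]))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                       # keep the fittest
        pop = parents + [mutate(rng.choice(parents), rng)    # refill by mutation
                         for _ in parents]
    return max(pop, key=fitness)

best = evolve()
```

Published systems replace the toy fitness with full training runs and add crossover, aging, or RL controllers, but the select-mutate loop is the same shape.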
After a period of dormancy following the acquisition, Apple is using Turi to make training neural networks on Apple machines more accessible, leveraging Turi's proprietary data formats and Apple's new Metal hardware provisions. Unlike the ML-as-an-API offerings from AWS and Google Cloud, this framework lets developers directly deploy algorithms that perform inference on the device. This was previously an enormous hurdle, as it required domain expertise from the disparate backgrounds of application development and machine learning.
The writing is on the wall for Python 2.7, as the NumPy dev team will not be supporting it moving forward. NumPy underpins many of the numerical, mathematical, and data-processing libraries available in Python, and without it as a catalyst for change, others might have been slow to consider dropping support as well.
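For library maintainers following NumPy's lead, the standard way to signal the drop is the `python_requires` metadata field, which pip (9 and later) honors by resolving Python 2.7 installs to the last compatible release. The package name and version below are hypothetical.

```python
# setup.py excerpt: declare that the package no longer supports Python 2,
# so pip on Python 2.7 falls back to the last compatible release instead
# of installing an incompatible one. (Illustrative package name/version.)
from setuptools import setup

setup(
    name="example-lib",
    version="1.0.0",
    python_requires=">=3.4",
)
```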
A venture capitalist from FirstMark Capital gives his viewpoint on what it takes to build a successful ML-based startup, touching on a number of dimensions and helping to define the opportunity for shrewd movers this past year and into the future.
Apache Spark already made it easy for data engineers and scientists to work with big data, and the DataFrame/Dataset paradigm made it even more approachable for those coming from data analysis. However, Spark Streaming had always relied on the traditional RDD paradigm, until this year, when streaming was brought to DataFrames. On top of this, we saw Databricks also offer the ability to run deep learning models through UDFs, and others offer the ability to distribute TensorFlow across a Spark cluster.
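The UDF idea itself is simple enough to sketch without Spark. The "model" below is a toy linear stub and all names are illustrative; real code would register the function with `pyspark.sql.functions.udf` (or a pandas UDF for batching) and load a trained network once per executor rather than a weight tuple.

```python
# Framework-free sketch of the UDF pattern: wrap model inference in a
# plain function, then map it over rows, which is what registering it
# as a Spark UDF does at cluster scale.

def make_inference_udf(weights):
    def predict(features):
        # Toy linear "model": dot product + threshold. A real UDF would
        # call into a loaded deep learning model here.
        score = sum(w * x for w, x in zip(weights, features))
        return 1 if score > 0 else 0
    return predict

rows = [(0.5, 1.0), (-2.0, 0.3), (1.5, -0.2)]   # an in-memory "column"
udf = make_inference_udf(weights=(1.0, -1.0))

# In PySpark this would be df.withColumn("pred", registered_udf(cols));
# locally we just map the function over the rows.
preds = [udf(r) for r in rows]
```

The appeal is that inference becomes just another column transformation, so the deep learning model rides along with Spark's existing distribution machinery.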

