Global AI and Data Science

Wed June 19, 2019 03:17 PM

Michael Tamir

This newsletter is written and curated by Mike Tamir and Mike Mansour.

January 7, 2018

Hi all,

Please forward to your friends and help us grow our 12k-person audience as we start the new year!

Spotlight Articles

Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective

Machine Learning Blueprint's Take

Facebook shows what ML-as-an-(internal)-service looks like at its best, when the entire organization has a unified vision around machine learning. What is rarely exposed is the business logic that went into designing such a massive system. While the system may be over-engineered for most other companies, there are a number of useful considerations to take away.

[Link]

Facebook’s Virtual Assistant M Is Dead. So Are Chatbots

After years of employing human operators to assist its limited-release chatbot, M, Facebook shuts down the project due to high expense and little success in transitioning from a human-in-the-loop active learning project to a fully automated algorithm.

Machine Learning Blueprint's Take

The power of neural embedding algorithms and increasingly sophisticated deep learning architectures to transform our ability to process language has generated a lot of promise in the area of interactive chatbots. The failure of the M project is significant for two reasons. First, it highlights that while these algorithmic gains have been substantial, there is still more work to be done. Second, there is a lesson to be learned here about unbounded active learning projects. The strategy of having humans provide labels while supplementing an ML pipeline that is still learning is a compelling one. This article underscores that the way such active learning projects are executed can have a large impact on whether the algorithms will ever be able to kick away the ladder of human-in-the-loop support.

[Link]
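To make the human-in-the-loop idea above concrete, here is a minimal sketch of confidence-based routing. It is a generic illustration, not a description of how M was built: the class name, the threshold value, the toy model, and the toy human are all hypothetical. The point it shows is the "ladder" problem: when incoming requests are unbounded, the share that must be escalated to a human may never shrink.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Model = Callable[[str], Tuple[str, float]]  # returns (answer, confidence)
Human = Callable[[str], str]                # returns the human-provided answer


@dataclass
class HumanInTheLoopRouter:
    """Answers automatically when the model is confident, otherwise asks a human."""
    threshold: float                                   # assumed cut-off, chosen arbitrarily
    labeled: List[Tuple[str, str]] = field(default_factory=list)
    automated: int = 0
    escalated: int = 0

    def handle(self, request: str, model: Model, human: Human) -> str:
        answer, confidence = model(request)
        if confidence >= self.threshold:
            self.automated += 1
            return answer
        # Low confidence: a human answers, and the pair is kept as training data.
        label = human(request)
        self.labeled.append((request, label))
        self.escalated += 1
        return label

    def automation_rate(self) -> float:
        total = self.automated + self.escalated
        return self.automated / total if total else 0.0


# Hypothetical stand-ins for a trained model and a human operator.
KNOWN_INTENTS = {"book a table": "restaurant_booking", "what's the weather": "weather"}


def toy_model(request: str) -> Tuple[str, float]:
    if request in KNOWN_INTENTS:
        return KNOWN_INTENTS[request], 0.9
    return "unknown", 0.1


def toy_human(request: str) -> str:
    return "human:" + request


router = HumanInTheLoopRouter(threshold=0.8)
requests = ["book a table", "plan my wedding", "what's the weather", "buy me a parrot"]
answers = [router.handle(r, toy_model, toy_human) for r in requests]
print(router.automation_rate(), len(router.labeled))
```

In this toy run, half of the requests fall outside what the model knows and go to the human. Tracking the automation rate over time is one simple way to check whether the human support is actually being phased out.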

A message from our sponsors...

Do you wish you had a development environment tailored to Data Science?

SherlockML empowers hundreds of Data Scientists by managing their projects, infrastructure, packages and deployment. It is a platform designed to support the workflow all the way from data ingestion to a deployed model.

What’s the result? Iteration cycles that are weeks rather than months. No need to worry about the tech stack. More time spent doing science, collaboratively.

Contact ben.g@sherlockml.com for more information.

[Link]

Learning Machine Learning

[1801.01078] Recent Advances in Recurrent Neural Networks - Get Up to Speed on Where State-of-the-Art has Progressed

A review paper covering the basics of RNNs, common problems encountered when working with them, and comparisons of many new variants. Useful for newcomers and for anyone who wants a refresher.

[Link]

MIT Releases Their Online Course for Deep Learning for Self-Driving Cars

Optimization Methods for Maximizing Interpretability in Deep Neural Networks

A Jupyter Notebook Lecture Series on “Practical Reinforcement Learning in the Wild”

Understanding PCA

Machine Learning News

Latest Version of TensorFlow, v1.5, is Released

Highlights from the release include:

Eager execution (preview version)
TensorFlow Lite for mobile applications (dev version)
CUDA 9 and cuDNN 7 support

[Link]

Fair and Balanced? Thoughts on Bias in Probabilistic Modeling

This writer takes an honest look at the bias problem in machine learning and arrives at two claims: bias is inherently normative, and conditional distributions exist in our data that we may find morally questionable.

[Link]
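As a small illustration of what "conditional distributions exist in our data" means in practice, the sketch below estimates the rate of a positive outcome per group. The records are made up purely for illustration and do not come from the article. The group names and the function name are likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Made-up records for illustration only: (group, outcome), where outcome is 0 or 1.
records = [
    ("a", 1), ("a", 1), ("a", 0), ("a", 1),
    ("b", 0), ("b", 1), ("b", 0), ("b", 0),
]


def conditional_rates(rows: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Estimate P(outcome = 1 | group) by simple counting."""
    totals: Counter = Counter()
    positives: Counter = Counter()
    for group, outcome in rows:
        totals[group] += 1
        positives[group] += outcome
    return {group: positives[group] / totals[group] for group in totals}


rates = conditional_rates(records)
print(rates)
```

A model fit to data like this would tend to reproduce the gap between the groups. Whether that gap counts as unacceptable bias is the normative question the article raises, and the arithmetic alone cannot answer it.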

DeepTriage - Automated Bug Triaging with Deep Learning

IBM uses bidirectional RNNs on bug reports to automatically identify a developer within an organization and assign them to address each report.

[Link]

Facebook Research Releases wav2letter, an End-to-End Speech Recognition Toolkit

Google Introduces a “Learned Image Compression Challenge” at the Next Computer Vision and Pattern Recognition Conference

Google is encouraging researchers to create state-of-the-art image compression by creating a Kaggle-like competition, providing a dataset of 1,600+ images of both professional and mobile quality for training deep networks. Official details here.

Machine Learning Blueprint's Take

This sounds oddly parallel to the premise of the popular HBO show “Silicon Valley.” It is also a nice opportunity to mention the case for learned indices, which we covered in earlier letters.

[Link]

NIPS 2017: Policy Field Notes - Research that May Have Implications for Public Policy for Machine Learning

Interesting Research

Google Researchers Take Adversarial Attacks on ML-Systems out into the Real World with Stickers: Adversarial Patch

A new approach abandons the goal of minimizing human-perceptible differences in adversarially altered images, and instead optimizes a small patch that may carry a tremendous amount of distortion, creating a highly salient feature. The authors show that patches developed on other networks can fool an unseen model, and furthermore that they can do so when the adversarial patch is printed and included in the photographed scene.

Machine Learning Blueprint's Take

This may be one of the most interesting applications of adversarial attacks to date, since it does not require access to the model the attackers are trying to fool, which has been a major limiting factor on the efficacy of these attacks in the real world. However, in the black-box case (fooling an unseen model) with a physically printed patch, it’s worth noting that the authors don’t release the full results, perhaps because they are lackluster, claiming only some effectiveness when “the patch takes up a significant fraction of the image.” The figure here shows efficacy w.r.t. non-physical patches added to existing images.
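For readers who want a feel for the mechanics, here is a minimal sketch of the patch-application step only. It assumes a grayscale image stored as nested lists, and every function name is hypothetical. It deliberately omits the optimization that makes a patch adversarial, as well as any rotation or scaling, so it is not the authors' implementation.

```python
import random
from typing import List

Image = List[List[float]]  # assumed representation: rows of grayscale pixel values


def apply_patch(image: Image, patch: Image, top: int, left: int) -> Image:
    """Return a copy of the image with the patch pasted at (top, left)."""
    out = [row[:] for row in image]
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            out[top + i][left + j] = value
    return out


def apply_at_random_location(image: Image, patch: Image, rng: random.Random) -> Image:
    """Paste the patch at a random position where it fits entirely inside the image."""
    max_top = len(image) - len(patch)
    max_left = len(image[0]) - len(patch[0])
    return apply_patch(image, patch, rng.randint(0, max_top), rng.randint(0, max_left))


def covered_fraction(image: Image, patch: Image) -> float:
    """Fraction of the image area that the patch occupies."""
    return (len(patch) * len(patch[0])) / (len(image) * len(image[0]))


image = [[0.0] * 8 for _ in range(8)]   # blank 8x8 image
patch = [[1.0, 1.0], [1.0, 1.0]]        # 2x2 patch with arbitrary contents
patched = apply_at_random_location(image, patch, random.Random(0))
print(covered_fraction(image, patch))
```

The covered fraction is the quantity the quoted caveat refers to: unlike a classic perturbation spread thinly over every pixel, a patch changes only a small region but may change it arbitrarily.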