Machine Learning Blueprint Newsletter, Edition 28, 7/22/18

This newsletter is written and curated by Mike Tamir and Mike Mansour. 

July 22, 2018

Hi all,
Hope you enjoy this week's ML Blueprint. This week is brought to you by fastdata.io, producer of the first GPU-native real-time stream processing software engine.

Spotlight Articles
There have been a number of proposed defenses against adversarial attacks on DNNs, but none work against all attack types, and most are relatively complex. This novel approach tackles the problem of neural nets assigning very high confidence to the wrong labels when presented with adversarial inputs. The root cause, the authors postulate, is that these adversarial inputs lie off the manifold of possible input data that the model was trained on. Their solution is to expose the model to "off-manifold" examples during training. Specifically, the authors take convex combinations of the target training data with out-of-scope datasets. For instance, when training a simple CNN for MNIST classification, they combine MNIST images with CIFAR images, training the model on convex combinations from both sets. Now, when presented with adversarial inputs, the model's confidence is spread more uniformly across labels instead of concentrating on the wrong label. The quantitative results are impressive by their measure of "effort," the ratio between the number of iterations required to trick the classifier and the amount of change in the output distribution. An attacker requires a significant amount of effort (iterations on adversarial input were capped at 10K) to reach a 90% confidence threshold on the wrong label, and in many cases could not fool the classifier to this degree at all. Qualitatively, the adversarial inputs now actually start to resemble the adversarial label.
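A minimal sketch of the mixing idea, assuming PyTorch (the paper's exact mixing coefficients and loss are simplified here; function names are invented for illustration):

```python
import torch
import torch.nn.functional as F

def off_manifold_batch(x_in, x_out, alpha=1.0):
    # Convex combinations of in-distribution images (e.g. MNIST) with
    # out-of-scope images (e.g. CIFAR, converted to the same shape).
    # Both batches must already share shape (B, C, H, W).
    lam = torch.distributions.Beta(alpha, alpha).sample((x_in.size(0), 1, 1, 1))
    return lam * x_in + (1 - lam) * x_out

def uniform_confidence_loss(logits):
    # Cross-entropy against the uniform distribution: pushes the model
    # toward low, evenly spread confidence on off-manifold inputs.
    return -F.log_softmax(logits, dim=1).mean()
```

Training on mixed batches with a flat label target is what drives the "uniform confidence" behavior described above.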
Machine Learning Blueprint's Take
There are two primary lines of thought worth pointing out that have not been discussed much in this space. First, teaching the classifier to give low-confidence predictions in regions where it is confused is novel. This was tackled not as an architectural issue but as a data issue; the resulting solution is more elegant and does not require additional computational complexity to be robust. In fact, there was only a 2% decrease in accuracy, which might be recovered with further hyperparameter tuning. This plays into the second point: at almost no additional complexity to the modeler, the adversary now has to expend a tremendous amount of effort to trick the model. This is a real constraint on the attacker, who has finite resources and time. These limitations are practical enough to prevent, or at least make far less feasible, an attack in the real world.
Zach Lipton (CMU) and Jacob Steinhardt (Stanford) critically review patterns in Machine Learning research. They identify the following four concerning trends:
  • Failure to distinguish between explanatory claims and speculative ones.
  • Failure to identify the sources of empirical gains (especially in the case of unnecessary neural architecture modifications vs. hyperparameter tuning).
  • Using “Mathiness” to obfuscate or impress readers and reviewers instead of using it to drive technical precision.
  • Misusing terms with colloquial connotations to sensationalize results, or overloading and conflating the usages of established technical terms.
Machine Learning Blueprint's Take
This paper is a must-read for all ML practitioners and researchers. Lipton and Steinhardt successfully identify core threats to the continued long-term growth of ML research. The four trends they identify pose a genuine risk, as the authors put it, of "stymie[ing] future research by compromising ML's intellectual foundations." One of the strongest takeaways in the paper (among many) is the seductive temptation to let performance results in DL algorithms erode scientific standards. Rapid growth in DL advances and poorly aligned incentives in academia and industry have biased publication toward performance improvements even when the work is guilty of speculation over explanation, poor identification of the true source of improvement, or mathematical or linguistic obfuscation. The authors show deep insight (and self-critical humility) in pointing out that the short-term motivations driving this bias have the potential of impeding genuine understanding of results and may ultimately do serious damage in the long term.
A message from our sponsors...
Are you having issues processing infinite amounts of data? Plasma Engine™:
  • Is the first GPU-native stream and hyper-batch processing software
  • Natively uses Apache Arrow, a vectorized columnar in-memory data format
  • Is fully compatible with Apache Spark workloads, no code changes required
Learning Machine Learning
There are some tweeters out in the wild who publish great content but don't have internet fame yet; finding them is a challenge. This technique uses simple metrics, translated to code, along with some ETL, to identify underrated users.
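As a toy illustration of the "simple metrics" spirit (the article's actual metrics and ETL are its own; everything below is invented for illustration):

```python
# Rank users by engagement relative to audience size: high engagement on a
# small follower count suggests an underrated account.
def underrated_score(avg_likes, avg_retweets, followers):
    engagement = avg_likes + 2 * avg_retweets   # weight retweets higher (assumption)
    return engagement / max(followers, 1)

users = [
    {"handle": "@hidden_gem", "avg_likes": 120, "avg_retweets": 30, "followers": 900},
    {"handle": "@megastar",   "avg_likes": 400, "avg_retweets": 90, "followers": 250_000},
]
ranked = sorted(users, key=lambda u: underrated_score(
    u["avg_likes"], u["avg_retweets"], u["followers"]), reverse=True)
print([u["handle"] for u in ranked])  # ['@hidden_gem', '@megastar']
```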
Machine Learning Blueprint's Take
The author may not have done anything fancy here, but it speaks to the merit of using simple metrics as a starting point for an ML project. It echoes some of the maxims from Google's "Rules of Machine Learning," which stresses the importance of starting with simple features and metrics. Given the results here, it makes a good springboard for investigating how other signals, perhaps from NLP, might enhance the results.
A while back, Google released a first stab at generating educational machine learning content, focused on some of the tasks they solve there. The project appears to have been formalized, cleaned up, and completed; they're offering a number of "seeds," basically starting points or examples of individual algorithms or of solutions to various problems. There are currently 57 seeds, and they're all hosted in Colaboratory notebooks, meaning free GPU/CPU resources!
Machine Learning News
Guido van Rossum was the leader of the Python project, often having the final say in important decisions, and is credited with setting the open and friendly culture of the community that likely led to its great success. Citing particularly nasty community behavior around PEP 572 (assignment expressions in conditional statements), he is "taking a permanent vacation… You are all on your own now."
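For context, PEP 572 introduces assignment expressions (the "walrus" operator), letting a value be bound inside a condition; the syntax ultimately shipped in Python 3.8. A minimal example:

```python
import re

# Bind and test in one step: assign the match object and branch on it.
if (match := re.search(r"\d+", "Edition 28")) is not None:
    print(match.group())  # -> 28
```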
Machine Learning Blueprint's Take
This is by no means a death knell for the language, but it is important to know about, since it may introduce a slight change to how PEPs are handled and how Python evolves over time. Guido's departure was inevitable at some point, and he has established a strong sense of how things should be done in this community that should carry over for the long run, hopefully providing stability. Also, let's not forget that there are other excellent core devs maintaining the project; Python is not going to turn into some Mad Max-style race to the bottom, but read some commentary to make your own call on that.
The NIPS conference finds its name unfortunately similar to certain anatomical terms, which makes attendees uncomfortable amidst the growing diversity of the machine learning community. The conference also believes its subject matter may have outgrown the name, so it is soliciting public input for a new one. Neural McNeuralFace?
Machine Learning Blueprint's Take
It might be easy to think this change is unnecessary, but it reflects a positive and much-needed shift in the ML community's diversity, both in gender and in thought. Fostering an environment of inclusion will encourage brighter and different thinkers to join our space, while also helping to prevent embarrassing research born of myopia.
Sponsored Content
TuSimple selects WekaIO to fuel artificial intelligence (AI) for autonomous fleet vehicle machine learning. TuSimple compared WekaIO Matrix™ against standard NAS solutions and legacy file systems, and found that Matrix delivered better scalability and performance.
WekaIO leapfrogs legacy storage infrastructures and future-proofs datacenters by delivering the world’s fastest parallel file system with the most flexible deployment options—on-premises, cloud, or cloud bursting. Matrix software is ideally suited for latency-sensitive business applications at scale such as AI and machine learning.
Dubbed "The Machine," its confidence in investment decisions is almost always respected. It was born out of GV's initial lack of VC expertise but large engineering talent that wanted to leverage those abilities with Google's massive computing infrastructure and data. "The Machine" evaluates round size, syndicate partners, past investors, industry sector, and the delta between prior and current valuation; it only green-lights deals that score above an 8/10.
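GV's model is proprietary, but the description suggests a gated scoring function over deal features. A purely hypothetical sketch (invented features, weights, and scale):

```python
# Score a deal on a 0-10 scale from normalized feature signals, then gate at 8.
def deal_score(features, weights):
    raw = sum(weights[k] * features[k] for k in weights)
    return 10 * raw / sum(weights.values())

features = {"round_size": 0.7, "syndicate": 0.9, "past_investors": 0.8,
            "sector": 0.6, "valuation_delta": 0.9}  # each signal scaled to [0, 1]
weights = dict.fromkeys(features, 1.0)
if deal_score(features, weights) > 8.0:
    print("green light")
else:
    print("pass")  # 7.8 here, so this deal would be passed on
```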
When developing new compounds, it is unknown how new combinations of chemicals will interact with living cells. New modeling techniques will allow researchers to publish toxicities for 40,000 chemicals by next year, reducing the extent of animal testing required.
Interesting Research
This work presents a scalable solution to open-vocabulary visual speech recognition. The authors constructed the largest existing visual speech recognition dataset, consisting of pairs of text and 3,886 hours of video clips of faces speaking. In tandem, they designed and trained an integrated lipreading system consisting of a video processing pipeline that maps raw video to stable videos of lips and sequences of phonemes, a scalable deep neural network that maps the lip videos to sequences of phoneme distributions, and a production-level speech decoder that outputs sequences of words. Their system achieves a word error rate (WER) of 40.9% as measured on a held-out set. In comparison, professional lipreaders achieve either 86.4% or 92.9% WER on the same dataset when given access to additional contextual information. The approach also significantly improves on other algorithmic lipreading approaches, including variants of LipNet and of Watch, Attend, and Spell (WAS), which achieve only 89.8% and 76.8% WER respectively.
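For reference, word error rate is the word-level edit distance between hypothesis and reference, normalized by reference length; a minimal implementation:

```python
# WER = (substitutions + deletions + insertions) / reference word count,
# computed with the standard Levenshtein dynamic program over words.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[-1][-1] / max(len(ref), 1)

print(wer("the cat sat", "the cat sit"))  # 1 substitution / 3 words ≈ 0.33
```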

