Machine Learning Blueprint Newsletter, Edition 20, 4/29/18

This newsletter is written and curated by Mike Tamir and Mike Mansour. 

April 29, 2018

Hi all,
Hope you enjoy this week's ML Blueprint. This week is brought to you by fastdata.io.

Spotlight Articles
Chaos theorist Edward Ott and his team of researchers from the University of Maryland generate breakthrough results with artificial neural networks to predict the evolution of chaotic systems. Specifically, the algorithm was able to successfully predict the evolution of chaotic flame-front systems up to eight “Lyapunov times,” a technical measurement of how long it takes for a chaotic system to exponentially diverge.
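The “Lyapunov time” mentioned above can be made concrete with a toy chaotic system. The sketch below (our own illustration, not the researchers’ model) estimates the Lyapunov exponent of the logistic map at r = 4, whose true value is ln 2; the Lyapunov time is roughly the inverse of that exponent.

```python
import math

def logistic(x, r=4.0):
    """One step of the chaotic logistic map."""
    return r * x * (1.0 - x)

def separations(x0, eps=1e-9, steps=40):
    """Track how fast two nearby trajectories diverge."""
    a, b = x0, x0 + eps
    seps = []
    for _ in range(steps):
        a, b = logistic(a), logistic(b)
        seps.append(abs(a - b))
    return seps

seps = separations(0.2)
# While the separation is still tiny, it grows roughly like e^(lambda * t);
# averaging log-ratios over the early steps estimates the Lyapunov exponent
# lambda (ln 2 ~ 0.693 for r = 4). The Lyapunov time is ~ 1/lambda.
lam = sum(math.log(seps[i + 1] / seps[i]) for i in range(20)) / 20
print(f"estimated Lyapunov exponent: {lam:.2f}")
```

Predicting eight Lyapunov times, as Ott’s team did, means staying accurate long past the horizon where this exponential divergence would normally swamp any small error in the initial conditions.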
Machine Learning Blueprint's Take
This result represents new territory for deep learning applications. The article’s claim that “Ott and company’s results suggest you don’t need the equations — only data” strikes a controversial tone. While deep learning algorithms are unlikely to replace all analytic approaches to describing physical systems, using them to supplement cases in which a system’s evolution is too chaotic for traditional methods does open doors those methods have left closed.
An exploration of bias in different pre-trained neural word embeddings like Word2Vec and GloVe. The article reviews the Word Embedding Association Test (WEAT), recently proposed by Caliskan et al. (link). The method uses cosine similarity to compare a set of embeddings, such as male vs. female names, with target concepts, such as pleasant vs. unpleasant, allowing the user to assign a bias score based on the imbalance in affinity.
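The WEAT score described above can be sketched in a few lines. This is a toy illustration of the cosine-similarity comparison, with made-up 2-d vectors standing in for real Word2Vec/GloVe embeddings; the sign of the score indicates which target set leans toward which attribute set.

```python
import math

def cos(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def assoc(w, A, B):
    """s(w, A, B): mean cosine to attribute set A minus mean cosine to B."""
    return (sum(cos(w, a) for a in A) / len(A)
            - sum(cos(w, b) for b in B) / len(B))

def weat_score(X, Y, A, B):
    """Differential association of target sets X, Y with attributes A, B."""
    return sum(assoc(x, A, B) for x in X) - sum(assoc(y, A, B) for y in Y)

# Toy 2-d "embeddings" (illustrative only, not real pretrained vectors)
male = [[1.0, 0.1], [0.9, 0.2]]        # X: male names
female = [[0.1, 1.0], [0.2, 0.9]]      # Y: female names
pleasant = [[0.0, 1.0]]                # A
unpleasant = [[1.0, 0.0]]              # B

score = weat_score(male, female, pleasant, unpleasant)
print(f"WEAT bias score: {score:.2f}")
```

In real use, X, Y, A, and B are lists of actual embedding vectors looked up from the pretrained model, and the raw score is usually accompanied by a permutation test and an effect size.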
Machine Learning Blueprint's Take
Google has been making a lot of noise about its dedication to removing bias from its algorithms after a series of unintentionally biased results drew attention last year. As the pretrained options in TensorFlow Hub coming out of Google continue to expand ready availability to the ML community, it is encouraging to see Google make an effort to accompany these embeddings with actionable tools for detecting known biases in such algorithms.
A message from our sponsors...
Traditional storage systems that feed the GPU servers for machine learning workloads can be too slow or have insufficient throughput to keep pace with the GPU, resulting in GPU starvation.
WekaIO Matrix software delivers over 6x more data to the GPU than NFS and 2x more than a local NVMe SSD, accelerating machine learning training epochs.
Leverage the lessons learned from one major autonomous driving vehicle manufacturer.
A lurking issue in machine learning applications is the underlying reward function being optimized. Business demands might encourage an engineer to maximize a user’s “time spent on a page,” but this is not necessarily in the user’s best interest; there does not yet exist an ethical guideline for reward functions. Designing a new cost function can be difficult and sometimes comes with unintended consequences, as when agents in simulations learn perverse strategies that exploit a bug in the engine. Reward shaping matters more as supervised ML applications get released into systems with feedback, effectively turning them into RL systems where distributions change or inputs are adversarial.
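The “time spent on a page” example above can be sketched concretely. The session fields and weights below are entirely made up; the point is only the structural difference between a reward that optimizes a raw engagement proxy and one shaped to include proxies for actual user benefit.

```python
# Hypothetical session log for a content recommender (illustrative only).
session = {
    "seconds_on_page": 300,
    "reported_satisfaction": 0.2,   # 0..1, e.g. from an explicit rating
    "returned_next_day": False,
}

def naive_reward(s):
    """Optimizes raw engagement: more time on page is always 'better'."""
    return s["seconds_on_page"]

def shaped_reward(s, w_time=0.01, w_sat=5.0, w_return=2.0):
    """Blends engagement with proxies for user benefit.
    The weights are made-up knobs a designer would have to justify."""
    return (w_time * s["seconds_on_page"]
            + w_sat * s["reported_satisfaction"]
            + w_return * float(s["returned_next_day"]))

print(naive_reward(session))    # 300: a long, unsatisfying session scores high
print(shaped_reward(session))   # 4.0: the same session scores poorly
```

An agent maximizing the first function happily serves clickbait; under the second, a long session with low satisfaction and no return visit is no longer a win. Choosing those weights is exactly the ethical design problem the author is pointing at.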
Machine Learning Blueprint's Take
The last point, about ML systems becoming RL systems once feedback is incorporated, is the author’s main reason for further studying this area. (Note: he has a great multi-part series covering RL and dynamic programming fundamentals.) It’s important that engineers understand this area well in order to create ethically sound applications that maximize human happiness and avoid incentivizing adverse and perverse behaviors. This is along the lines of François Chollet’s recent opinion, covered in one of our past Spotlights, that humans are now the RL agents, feeding on signals from ML algorithms; the difference is that this author is trying to educate fellow engineers and encourage them to reconsider their reward functions.

Learning Machine Learning
If you ever wanted a medium-depth understanding of all the ML components of a self-driving vehicle under realistic constraints, along with their hardware implementations, look no further. The researchers dive into the state-of-the-art models used for each autonomous function and how they’re tied together. Importantly, they cover the hardware implications of achieving the requisite inference latency while still maintaining power efficiency. Everything used here is open source too, so it’s reproducible! (Except for the referenced 10 TB of highly featurized maps.)
Machine Learning Blueprint's Take
It turns out that just throwing a heap of GPU computing power in your trunk is not realistically deployable, unless a short-range mobile sauna is the goal. Power efficiency and the increased cooling required to maintain usable driving range and a comfortable climate, while still achieving acceptable latency, force designers to choose the right mix of ASICs, GPUs, and CPUs for each ML model. This study was very well done, exploring the possible hardware configurations thoroughly.
MobileNet v1 was a CNN optimized to run on mobile-phone architectures; it used depthwise separable convolutions, resulting in a roughly 9x reduction in work. V2, as explained here, brings a 20% parameter reduction and a 48% reduction in multiply-accumulate operations by introducing an expansion layer and a projection layer in each block. This effectively allows the depthwise separable convolution to operate on a lower-dimensional tensor. MobileNet v2 also borrows from ResNet by implementing residual connections, helping gradients flow more smoothly.
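The ~9x saving from depthwise separable convolutions falls straight out of the multiply-accumulate counts. The sketch below compares a standard 3x3 convolution against a depthwise 3x3 plus 1x1 pointwise pair for an illustrative layer shape (the dimensions are arbitrary examples, not taken from the MobileNet papers):

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a standard k x k convolution
    (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k

def dw_separable_macs(h, w, c_in, c_out, k):
    """Depthwise k x k conv per input channel, then a 1 x 1
    pointwise convolution to mix channels."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

h = w = 56
c_in, c_out, k = 128, 128, 3  # example layer shape
std = conv_macs(h, w, c_in, c_out, k)
sep = dw_separable_macs(h, w, c_in, c_out, k)
print(f"standard: {std:,}  separable: {sep:,}  saving: {std / sep:.1f}x")
```

The ratio works out to (k²·c_out)/(k² + c_out), which approaches k² = 9 for a 3x3 kernel as the output channel count grows, matching the headline figure for v1.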
A tutorial for artists explaining how to generate “artistic output” using an RNN. They demonstrate how to generate handwriting from some seed input and, importantly, how to draw the resulting output in a web browser using p5.js, with an emphasis on interactivity. It’s meant mainly to inspire artistic creation with machine learning as a tool.

Machine Learning News
A number of EU countries have formed an AI Commission that will help coordinate efforts to maximize investment impact, foster cooperation, and collectively define the development roadmap in the members’ interests. They want to create an environment that attracts top talent, money, and data, instead of losing them to other countries. On the immediate horizon, the Commission plans to invest $1.5B of public funds by 2020, with hopes of reaching $20B through the participation of member states. More details can be found here.
This benchmark challenges models on their ability to solve a combination of tasks, including question answering, sentiment analysis, and textual entailment, with the goal of encouraging models that are as general as possible. The authors make sure the dataset covers areas that are currently difficult for models, like incorporating world knowledge or handling lexical entailment and negation.
Most of the tools data scientists use on a day-to-day basis are not all that new (backprop came out of NASA’s Apollo missions). They’re rooted in well-established fields, but have been rebranded as “AI,” a term coined in the late ’50s to describe human-imitative intelligence. AI comes with a ton of hype and promises that grab media attention -- however, the problems being tackled by human-imitative intelligence might not be all that important or useful for humankind, and might actually be a distraction. Jordan thinks we should be focusing on IA (Intelligence Augmentation) and II (Intelligent Infrastructure) instead.
Machine Learning Blueprint's Take
Jordan’s comments are admittedly a breath of fresh air amidst all the media hype around this industry. However, as distracting as the hype may be at times, it can have positive side effects. The additional attention brings increased interest and relevance to the field and, more importantly, attracts additional talent. Without that attention, fewer people might choose early on to study areas that lead them to this field, and academic institutions might invest less in ML-related departments. The media hype around “AI” can be seen as a double-edged sword in that respect.
Researchers use bacteria to generate certain proteins, used for testing the effectiveness of pharmaceutical remedies, by inserting DNA into the cells. The problem is that getting the bacteria to generate usable amounts of protein is difficult and depends on the injected DNA. They share a method for optimizing the DNA to maximize the amount of protein generated, making the research much faster.
Facebook is recruiting SoC (System-on-a-Chip) engineers; like an ASIC, an SoC is purpose-built, but rather than implementing a single function, it integrates an entire system’s functionality onto a single piece of silicon.
Interesting Research
Researchers at Uber’s AI labs have developed a novel technique, which they call differentiable plasticity, that increases the ability of a neural net to adapt its connections in response to ongoing experience. Instead of learning only fixed weights, in this architecture the value passed between neurons is separated into a combination of a fixed weight and a “plastic” component that is responsive to earlier layer inputs and outputs. The authors close with the suggestion that such techniques might be leveraged in future work to enhance the performance of traditional non-plastic architectures or neural units like LSTMs.
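The fixed-plus-plastic decomposition can be sketched in a few lines. Below is a toy forward pass in the spirit of the paper (random stand-in values for the trained parameters, a scalar plasticity rate, and a single recurrent layer; the real method learns w, alpha, and eta by gradient descent):

```python
import math, random

random.seed(0)
N = 4  # toy layer size

# Learned, fixed parameters (random stand-ins for trained values):
w = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
alpha = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
eta = 0.1                              # plasticity rate
hebb = [[0.0] * N for _ in range(N)]   # fast, experience-dependent state

def step(x):
    """One forward step: each connection = fixed weight + plastic term."""
    y = [math.tanh(sum((w[i][j] + alpha[i][j] * hebb[i][j]) * x[i]
                       for i in range(N)))
         for j in range(N)]
    # Hebbian trace update: co-active input/output pairs are strengthened.
    for i in range(N):
        for j in range(N):
            hebb[i][j] = (1 - eta) * hebb[i][j] + eta * x[i] * y[j]
    return y

y1 = step([1.0, 0.0, 0.0, 0.0])
y2 = step([1.0, 0.0, 0.0, 0.0])  # same input, but the plastic state has moved
print(y1 != y2)
```

The key property shows up in the last two lines: the same input produces different outputs, because the Hebbian trace carries recent experience. Since the update rule is differentiable, alpha and eta can be trained with backprop just like the fixed weights.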
This model of learning seems somewhat obvious after the fact, and mimics how we learn new tasks by reducing a problem to an already-solved one. The technique learns in reverse, starting the agent close to its intended reward and iteratively stepping it further away before retraining. In this way, the agent learns stages of the state space and acquires rewards, ultimately decreasing the number of iterations required. However, picking the starting points matters; the authors pick points that have a medium probability of reaching the reward. Using Brownian motion to generate points N steps away from the goal, they then run inference to evaluate each point’s probability of reaching the reward. High-probability points would be too easy and would not impart any state-space information to the agent, while low-probability points would just waste computational cycles. The approach also lends itself to parallelization. Check out the blog for an intuitive video and examples.
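The propose-then-filter loop can be sketched on a toy 1-D world. Everything here is illustrative (a random-walk “agent” standing in for the partially trained policy, made-up thresholds and step counts), but it shows the structure: random-walk outward from the goal to propose start states, estimate each state’s success probability, and keep only the medium-difficulty ones.

```python
import random

random.seed(1)
GOAL, T = 0, 10  # goal state and episode horizon in a toy 1-D world

def rollout(start):
    """Stand-in for the current policy: a random walk.
    Returns 1 if it reaches the goal within the horizon."""
    pos = start
    for _ in range(T):
        pos += random.choice([-1, 1])
        if pos == GOAL:
            return 1
    return 0

def success_prob(start, n=200):
    """Monte Carlo estimate of the chance of reaching the goal."""
    return sum(rollout(start) for _ in range(n)) / n

def propose_starts(n_steps=5, n_proposals=30):
    """Brownian-motion-style walks outward from the goal
    propose candidate start states."""
    starts = set()
    for _ in range(n_proposals):
        pos = GOAL
        for _ in range(n_steps):
            pos += random.choice([-1, 1])
        starts.add(pos)
    return starts

# Keep only "medium difficulty" starts: not trivial, not hopeless.
curriculum = [s for s in propose_starts()
              if 0.1 <= success_prob(s) <= 0.9]
print(sorted(curriculum))
```

In the real method the policy improves between rounds, so the band of medium-probability states naturally migrates outward from the goal; and because each proposal and each rollout is independent, both steps parallelize trivially, as the authors note.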
Machine Learning Blueprint's Take
The reverse curriculum approach makes sense for a limited set of tasks. Some places it might fail are where the dynamics of the system change depending on the state space, or where one-way situations appear: for instance, a video game where adversaries become increasingly intelligent or gain new powers, or where a previous state becomes inaccessible behind a one-way door. A pseudo-version of this setup, albeit probably less effective, might be to initialize an agent from a number of different save points in a video game.
