Global AI and Data Science

Wed June 19, 2019 03:43 PM

Michael Tamir

This newsletter is written and curated by Mike Tamir and Mike Mansour.

April 1, 2018

Hi all,

Hope you enjoy this week's ML Blueprint. This week is brought to you by Fast Data IO.

Spotlight Articles

Learning Inside of Dreams for RL Agents

A detailed but clearly articulated exploration of using generative neural networks to enhance reinforcement learning (RL) models that play video games. Using variational autoencoders, the authors equip the RL agent with a "world model" in advance of the RL process. They then take this a step further, using the agents' world models to artificially generate "dream training" environments in which the RL agent trains. Their experimental results show that not only does a world model enable the agent to outperform other leading models, but that agents allowed to train in dream environments made artificially "noisier" than reality learn even more successfully.

Machine Learning Blueprint's Take

These results are very suggestive, and the authors do not hesitate to compare their "dream training" results with the neuroscience literature on "hippocampal replay," which studies how the brain replays recent experiences when animals sleep. As with many breakthrough results in recent deep RL, it is probably too soon to draw conclusions about what these models suggest about human processes. However, it will be interesting to see where this work extends and how dream training might affect performance in increasingly complex environment spaces.

[Link]

Hacking Human Behavior Should be What Concerns You About AI

AI has the power to influence human behavior, mainly by controlling the feed of information that we receive. This becomes a feedback loop that can be optimized w.r.t. achieving a certain state via action, where humans are now the agent in a global reinforcement learning experiment. This means that whoever controls this information can manipulate users into desired behavioral patterns.

Machine Learning Blueprint's Take

Clearly spurred by the ongoing Facebook revelations and Russian election interference, Francois Chollet frames the AI threat as a tool that can manipulate humans at scale. We’ve long been doing this to each other by way of well-known psychological holes in human thought patterns, which advertising exploits well. But optimized information control is much worse than advertising, because users are not approaching the information as advertising and hence may not have their guard up. As the information flows around us all become optimized for some goal, it will be exhausting to be wary all the time; hence Francois proposes that users should ultimately control those info flows and tune the goals / loss function of that optimization for their own wellbeing.

[Link]

A message from our sponsors...

Now witness the real-time power of this fully GPU-armed and operational stream processing engine.

FDIO Engine™ can process Spark Structured Streaming workloads up to 1000x faster than Apache Spark, as it runs natively on NVIDIA GPUs and Apache Arrow.

To put 1000x performance in perspective, FDIO processes 1 Terabyte on a single AWS instance in 35 seconds, where Spark takes over 9 hours.

Sign up for a Test Flight today at fastdata.io

Learning Machine Learning

Classification Interpretability with Generative Adversarial Networks

Square is using this method to evaluate users flagged by some detection model (probably not a NN) and give better reasons for why they are anomalous, with the requirement of doing so very quickly. They train a GAN to generate synthetic “good” users, then compare a real “bad” user to its K nearest synthetic “good” neighbors. Next, the features of the real bad user are compared against the average of those K synthetic good users to identify the feature that most likely triggered the anomaly.
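The comparison step can be sketched in a few lines. This is only a minimal illustration, not Square's implementation: the function name `explain_anomaly` is hypothetical, the "synthetic good users" are passed in as plain lists standing in for GAN output, and features are assumed to be numeric and already on a comparable scale.

```python
import math


def explain_anomaly(bad_user, synthetic_good_users, k=3):
    """Rank features of a flagged user by how far they sit from nearby 'good' users.

    Hypothetical helper: returns (feature_index, gap) pairs, largest gap first.
    Assumes every user is a list of numeric, comparably scaled features.
    """
    # Find the k synthetic good users closest to the flagged user (Euclidean distance).
    nearest = sorted(
        synthetic_good_users, key=lambda good: math.dist(bad_user, good)
    )[:k]
    # Average those neighbors feature by feature.
    centroid = [sum(column) / len(nearest) for column in zip(*nearest)]
    # The feature with the largest gap is the most likely trigger.
    gaps = [abs(b - c) for b, c in zip(bad_user, centroid)]
    return sorted(enumerate(gaps), key=lambda pair: pair[1], reverse=True)
```

Because the synthetic users can be generated ahead of time, only the neighbor lookup and the subtraction would need to happen when a user is flagged.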

Machine Learning Blueprint's Take

This is useful when one does not know which features contributed most to a particular classification. A big upside of this method is that it is fast, which matters in huge production systems that make large numbers of classifications. All the comparison data can be generated offline.

[Link]

Learning to Navigate Without Maps

This research attempts to mimic the way humans learn to navigate a new environment; the problem is akin to maze navigation. The authors train a three-part network to traverse a city using Google Maps Street View: first a CNN that processes and featurizes visual inputs, then a locale-specific RNN that learns the environment, and lastly an RNN that learns a navigation policy over actions. In the end, they are able to navigate successfully across several different cities. The model can be transferred to a new city by freezing the first and third components and retraining the locale-specific RNN.

A Linear Algebra Companion Guide to the Goodfellow-Bengio Deep Learning Book

A fellow machine learning denizen constructed a Jupyter Notebook series to jump-start anyone trying to get into Deep Learning through Goodfellow and Bengio’s pretty good, freely available book.

[Link]

Learning Location Embeddings

Machine Learning News

France’s AI Strategy

Emmanuel Macron realizes that the US and China are leading on AI, and he is creating a plan that tacks between the two powerhouses. He believes that AI will have profound effects on democracy; to ensure that the effects are positive, both government and private institutions should be transparent about algorithms and open portals to their data. His country will accept AI integration on all fronts except military applications.

Machine Learning Blueprint's Take

This approach to governing AI might create a homegrown, European-centered ecosystem that obeys their guidelines, and could exclude foreign players that don’t want to be so open (e.g., government-backed Chinese companies). This could be a good thing for their citizens, and actually for the world, if the EU can encourage more players to abide by their rules.

[Link]

Highlights from TensorFlow Summit

The TF dev team announced a number of new and upcoming features: TensorFlow Hub for sharing modules, tools for better visualization of evaluation metrics, enhanced NVIDIA GPU and Intel CPU support, TF in JavaScript, and upcoming libraries for genomics.

[Link]

Predicting Bipolar States with Phone Data and Neural Networks

If you are not already familiar with Alibaba beyond its role as an online retailer: it also has a massive cloud AI strategy that rivals those of US giants Amazon and Google. Alibaba is slowly catching up in R&D spending, having spent $2.6B last year while Google and Amazon spent ~$15B each. As for its cloud strategy, on top of offering off-the-shelf ML APIs, Alibaba supports all major ML frameworks on its hardware, eliminating any lock-in (though this may be moot given the proliferation of ONNX).

[Link]

China’s Rolled Out Their AI-Based Social Credit System

Leveraging facial recognition algorithms, the government is able to identify jaywalking citizens, send them a fine, publicly shame them, and affect their ability to get a job or loan. They’ve already caught 14,000 criminal jaywalkers at a single intersection in Shenzhen.

Machine Learning Blueprint's Take

Piggybacking off the spotlight article on using ML to affect behavior, this is a much more direct approach. However, it likely showcases only the tip of the iceberg of techniques they use to influence behavior. China is well known for controlling information flows to direct behavior, but no one has explored their techniques for doing this with AI.

The NIH Starts Taking Data Science Seriously - Publishes Strategic Plan and RFI

NIH published their strategic plan for bringing AI into their domain to help push the research boundary. They requested feedback from data science community stakeholders about the plan.

[Link]

Stanford Releases a New Deep Learning Benchmark Focused on Cost & Time

March Madness is Over - Cue the Statistical Analyses

Apple Poaches Google’s AI Chief

Interesting Research

Code2vec

The authors find a useful way to vectorize a series of paths through the Abstract Syntax Tree (AST) of a piece of code. The vectorized representation can be used to describe certain properties of the code. The authors measure the algorithm’s effectiveness by how well it can predict a method’s name.
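To make "paths through the AST" concrete, here is a rough sketch that pulls leaf-to-leaf paths out of a small function. It is not the authors' extractor: it assumes Python's built-in `ast` module as the parser, treats only names, arguments, and constants as leaves, and the function names `_leaves` and `path_contexts` are hypothetical. It also omits the vectorization step, and the path string drops the up/down direction markers a real system would likely keep.

```python
import ast
from itertools import combinations


def _leaves(tree):
    """Collect (token, [nodes from root to leaf]) for every name, argument, or constant."""
    found = []

    def walk(node, trail):
        trail = trail + [node]
        if isinstance(node, ast.Name):
            found.append((node.id, trail))
        elif isinstance(node, ast.arg):
            found.append((node.arg, trail))
        elif isinstance(node, ast.Constant):
            found.append((repr(node.value), trail))
        else:
            for child in ast.iter_child_nodes(node):
                walk(child, trail)

    walk(tree, [])
    return found


def path_contexts(source):
    """Return (leaf_a, path, leaf_b) triples, one per pair of leaves in the source."""
    contexts = []
    for (tok_a, trail_a), (tok_b, trail_b) in combinations(_leaves(ast.parse(source)), 2):
        # Count the shared ancestors; the last shared node is the lowest common ancestor.
        shared = 0
        while trail_a[shared] is trail_b[shared]:
            shared += 1
        # Climb from leaf A to the common ancestor, then descend to leaf B.
        climb = list(reversed(trail_a[shared:])) + [trail_a[shared - 1]]
        path = ">".join(type(n).__name__ for n in climb + trail_b[shared:])
        contexts.append((tok_a, path, tok_b))
    return contexts
```

For `def add(a, b): return a + b`, one of the resulting triples links the two operands through their shared `BinOp` node. Triples like these are the raw material that would then be embedded into vectors.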

[Link]

Forward-Backward Reinforcement Learning

An Analysis of Neural Language Modeling at Multiple Scales

Machine Learning Blueprint Newsletter, Edition 18, 4/1/18
