Machine Learning Blueprint Newsletter, Edition 30, 8/23/18

This newsletter is written and curated by Mike Tamir and Mike Mansour. 

August 23, 2018

Hi all,
Hope you enjoy this week's ML Blueprint. We talk about the failings of self-regulation, introduce a new visual debugger for Jupyter notebooks, explore adding coordinates into CNNs for translational variance, and explain some new research papers.

Spotlight Articles
DeepMind is starting a new effort to research frameworks for safety in machine learning systems. They’re focusing on 3 main areas:
  • Specification: Defining the purpose of the system, ensuring its behavior matches expectations, and avoiding perverse outcomes from ill-conceived reward functions
  • Robustness: Making sure systems can handle distribution shifts from unpredictable environments, and that agent exploration is safe
  • Assurance: Monitoring systems, predicting agent behavior, and creating effective kill switches or human overrides. The kill-switch/human-override problem is particularly challenging: the mechanism must not adversely affect the reward function or give the agent an incentive to evade shutdown.
Machine Learning Blueprint's Take
Having a unified approach to thinking about safety is important because those interacting with these ML systems should be able to consistently address behavioral issues and make design choices. A consistent “blueprint” for designing these systems will lead to fewer design errors and misunderstandings. I don’t build buildings, but I’d imagine there is an agreed-upon set of standards for architecting buildings that builders can easily understand and follow. There’s no reason this shouldn’t apply to ML systems. The same goes for emergency building access by first responders: they too should have a standardized way of accessing controls and stopping processes.
Dijkstra’s algorithm and related algorithms like Bellman-Ford can all solve the shortest-path problem efficiently because they share a common principle: relaxation. This technique starts with overestimated costs and then iteratively relaxes (lowers) the path-cost estimates. The point of this article, however, is that several other CS areas use this same technique to find optimal paths, with or without direct knowledge of Dijkstra's. The author shows, in great technical detail, how these principles apply to currency arbitrage, Soft-Q reinforcement learning, and recent advancements in ray tracing for graphics rendering.
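The relaxation principle can be sketched in a few lines of Python. This is a toy graph and implementation of our own for illustration, not the article's code:

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path costs from `source` via iterative edge relaxation.

    `graph` maps each node to a list of (neighbor, weight) pairs.
    """
    # Start with every cost overestimated as infinity.
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry; a cheaper path was already found
        for v, w in graph[u]:
            # Relaxation step: lower the estimate if this path is cheaper.
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 6)],
    "C": [("D", 3)],
    "D": [],
}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 6}
```

Bellman-Ford applies the same relaxation step, just sweeping over all edges repeatedly instead of using a priority queue.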
A message from our sponsors...
WekaIO Matrix™, the world’s fastest distributed file system for machine learning workloads, has been named a Cool Vendor by Gartner, Inc., in the Gartner Cool Vendors in Storage Technologies, 2018 report.
Matrix is the first and only NVMe-native shared and distributed file system written to support new high-performance workloads in machine learning. The demands of 2018 workloads cannot be met with a thirty-year-old protocol such as NFS, which was designed when networks were slow relative to the storage media.

Learning Machine Learning
Yoshua Bengio shares many of his hard-learned lessons from establishing a research lab as a budding academic in this interview. He offers suggestions to help rising researchers avoid repeating his mistakes, and guidance on forging the best path to becoming a subject-matter expert in a narrow area.
Netflix takes Jupyter Notebooks seriously across a number of data-related roles; so much so that they’ve built their workflows around them. Learn how they use different templates, scheduling engines, and open source tools to support a data-focused workforce. Part I explains how they use and integrate notebooks across different roles, while Part II explains their compute infrastructure and scheduling engines.
Reducing numerical precision (i.e., how many bits are used to represent a floating-point number in memory) has been one of the keys to getting larger, more complex neural networks to train faster and fit into memory. DNNs don’t seem to need much precision, compared to, say, a physics simulation of a complex system. This walkthrough explains the intuition behind this idea and some methods for implementing it in your TensorFlow models.
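As a quick illustration of the idea (using NumPy for simplicity rather than the walkthrough's TensorFlow code), halving the precision halves the memory footprint while introducing only a small rounding error per value:

```python
import numpy as np

# The same weights stored at full (32-bit) and half (16-bit) precision.
weights32 = np.random.randn(1000).astype(np.float32)
weights16 = weights32.astype(np.float16)

# Half precision uses exactly half the memory.
print(weights32.nbytes, weights16.nbytes)  # 4000 2000

# The cost: a small per-value rounding error (float16 keeps ~3 decimal digits).
max_error = np.max(np.abs(weights32 - weights16.astype(np.float32)))
print(max_error)
```

In practice, mixed-precision training keeps a full-precision master copy of the weights and uses loss scaling to avoid underflow in the gradients; the walkthrough covers those details.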

Machine Learning News
Linus Torvalds has historically been known for his rather… vibrant responses on the Linux kernel mailing list. Citing a lifetime of low emotional IQ, he is stepping back from the project for a time to work on his behavior, though who knows how long that will be. The Linux Foundation is also replacing its "Code of Conflict" with a Code of Conduct to reflect the community's move toward a more civil and friendly standard of behavior across all of its projects.
Machine Learning Blueprint's Take
“Community Conduct” has been a big theme this year. We’ve seen Guido van Rossum step down as BDFL of Python over the unpleasantries in community behavior there, and NIPS is revising its name to be more palatable to the wider deep learning community. This will probably continue through more corners of the software and ML world as the field becomes more mainstream, and it’s a positive change in tone that will further attract talent to the area.
Google releases a new search category for publicly available datasets. One of its powerful aspects is the sometimes rich metadata extracted and presented in a common format, such as licensing and available data formats, along with general descriptions.
Machine Learning Blueprint's Take
Increasing the ease of finding data can be considered a small equalizer in the ML world. It might also lead to algorithms being trained on a more diverse selection of data, instead of just the most popular academic datasets, which would be objectively beneficial.
Facebook is cheerleading dual affiliation, a practice in which academics work in industry for part of their time. They claim this brings researchers out of their silos and helps advance AI research overall. Dual affiliation is realistic in this field since so many participants are interested in making research and tools open source. A counterpoint article claims this will actually harm academia: working for industry will stifle the curiosity-driven research that has no profit-generating motive behind it. Furthermore, dual affiliation tears academics away from their communities, stripping mentorship opportunities from rising students.
This open source package from Facebook for time series forecasting and seasonality analysis looks powerful. An interesting feature lets you inject domain knowledge you may have about seasonality, such as changepoints and holidays. It takes Pandas DataFrames as input and produces great visualizations with confidence intervals. They provide a short getting-started tutorial that showcases the available tools.
Snorkel is a tool for generating noisy labels for vast amounts of data using user-defined labeling functions, which in turn help to train learned labeling models. The resulting labels are surprisingly effective, and the overall framework is relatively easy to use. See a summary of their paper here.
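The core idea of labeling functions can be sketched in plain Python. This is a toy illustration of weak supervision with made-up heuristics, not Snorkel's actual API; Snorkel learns to weight and denoise the functions' votes rather than taking a simple majority:

```python
# Each labeling function votes POSITIVE (1), NEGATIVE (0), or ABSTAIN (-1).
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

# Hypothetical heuristics for flagging customer complaints.
def lf_contains_refund(text):
    return POSITIVE if "refund" in text.lower() else ABSTAIN

def lf_contains_thanks(text):
    return NEGATIVE if "thanks" in text.lower() else ABSTAIN

def lf_has_exclamation(text):
    return POSITIVE if "!" in text else ABSTAIN

def majority_label(text, lfs):
    """Combine noisy labeling-function votes by simple majority."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

lfs = [lf_contains_refund, lf_contains_thanks, lf_has_exclamation]
print(majority_label("I want a refund now!", lfs))  # 1
print(majority_label("Thanks for the help", lfs))   # 0
```

The noisy labels produced this way can then train a downstream classifier that generalizes beyond the heuristics themselves.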
Interesting Research
Not the usual machine learning research, but a security researcher recently found that a number of instances of ROS, an open source robot operating system (https://en.wikipedia.org/wiki/Robot_Operating_System), were left totally exposed to the internet. This means that cameras, sensors, and, worst of all, potentially control of the robot were vulnerable to anyone in the world. Like a good security researcher, they provide suggestions and tools for securing a ROS environment.
Machine Learning Blueprint's Take
Security is usually not at the top of ML engineers’ minds when deploying algorithms or robotics out into the wild. Worse, it’s possible that some ML frameworks and systems have complicated configurations that could open up holes in a system, or vulnerabilities that might remain unchecked. Arguably, complex ML systems may be difficult for a security researcher or analyst to fully understand and vet; perhaps we’ll see a new field emerge for securing these types of environments.
A new training method for learning sentence encoders takes an unusual approach: the encoder is trained on the binary classification task of fake-sentence detection. The fake sentences are generated with either WordShuffle or WordDrop, and a bidirectional LSTM serves as the encoder. This method requires significantly less data and trains in 20 hours instead of weeks, and systems trained with this approach achieve better language-modeling scores.
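The two corruption strategies can be sketched as follows. This is our own minimal reading of the names, and the paper's exact parameters may differ: we assume WordShuffle perturbs word order within a bounded window and WordDrop removes each word with a fixed probability:

```python
import random

def word_shuffle(words, k=3, rng=random):
    """Perturb word order so each word moves at most ~k positions (assumed behavior)."""
    # Sort by original index plus uniform noise in [0, k], a common way
    # to implement a bounded shuffle.
    keys = [i + rng.uniform(0, k) for i in range(len(words))]
    return [w for _, w in sorted(zip(keys, words))]

def word_drop(words, p=0.1, rng=random):
    """Remove each word independently with probability p (assumed behavior)."""
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else words  # never return an empty sentence

sentence = "the cat sat on the mat".split()
print(word_shuffle(sentence, k=3))
print(word_drop(sentence, p=0.3))
```

Corrupted sentences become the negative class, and real corpus sentences the positive class, so the supervision signal is essentially free.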

