With over 20,000 members, IBM's Data Science Community is the place to go to advance your learning on AI and Data Science topics. Connect with experienced Data Science and AI practitioners, learn alongside others pursuing the same career objectives, and share your experience and expertise. IBM is partnering with O'Reilly to offer new members a free download of Language Models in Plain English by Austin Eovito and Marina Danilevsky.
Join the community by clicking the "Get Offer" button below, and you will receive an email with instructions for downloading your exclusive copy of Language Models in Plain English.
Recent advances in machine learning have lowered the barriers to creating and using ML models, but understanding what these models are doing has only become more difficult. We discuss these technological advances with little grasp of how they work, and struggle to develop a comfortable intuition for new capabilities.
In this report, authors Austin Eovito and Marina Danilevsky from IBM focus on how to think about neural network-based language model architectures. They guide you through various models (neural networks, RNN/LSTM, encoder-decoder, attention/transformers) to convey a sense of their abilities without getting entangled in the complex details. The report uses simple examples of how humans approach language in specific applications to explore and compare how different neural network-based language models work.
This report will empower you to better understand how machines process language. You will:
- Dive deep into a language model's basic task of predicting the next word, and use it as a lens for understanding neural network language models
- Explore the encoder-decoder architecture through abstractive text summarization
- Use machine translation to understand the attention mechanism and the transformer architecture
- Examine the current state of machine language understanding to discern what these language models do well, along with their risks and weaknesses
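To make the "predict the next word" task concrete: a language model assigns likelihoods to possible next words given the words so far. The report covers neural approaches; the sketch below is only a toy count-based bigram predictor (all names and the tiny corpus are illustrative assumptions, not from the report) that shows the task itself in a few lines.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    tokens = corpus.lower().split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy corpus for illustration only
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

Neural language models replace these raw counts with learned representations, which is what lets them generalize to word sequences never seen in training.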