
IBM TechXchange Generative AI User Group Sponsored by IBM US Financial Services Market


Generative AI exists because of the transformer -- This is how it works...


    User Group Leader
    Posted Tue September 19, 2023 10:43 AM

    This is how it writes, works, learns, thinks and hallucinates

    "Over the past few years, we have taken a gigantic leap forward in our decades-long quest to build intelligent machines: the advent of the large language model, or LLM.

    This technology, based on research that tries to model the human brain, has led to a new field known as generative AI - software that can create plausible and sophisticated text, images and computer code at a level that mimics human ability.

    Businesses around the world have begun to experiment with the new technology in the belief it could transform media, finance, law and professional services, as well as public services such as education. The LLM is underpinned by a scientific development known as the transformer model, introduced by Google researchers in 2017.

    "While we've always understood the breakthrough nature of our transformer work, several years later, we're energised by its enduring potential across new fields, from healthcare to robotics and security, enhancing human creativity, and more," says Slav Petrov, a senior researcher at Google, who works on building AI models, including LLMs.

    LLMs' touted benefits - the ability to increase productivity by writing and analysing text - are also why they pose a threat to workers. According to Goldman Sachs, the technology could expose the equivalent of 300mn full-time workers across big economies to automation, leading to widespread unemployment.

    As the technology is rapidly woven into our lives, understanding how LLMs generate text means understanding why these models are such versatile cognitive engines - and what else they can help create.

    To write text, LLMs must first translate words into a language they understand.

    First, a block of words is broken into tokens - basic units that can be encoded. Tokens often represent fractions of words, but in this example we'll turn each full word into a token.
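
    To make that concrete, here is a minimal sketch of word-level tokenisation in Python. The sentence, the toy vocabulary and the variable names are illustrative assumptions; production LLMs use subword tokenisers such as byte-pair encoding, with a vocabulary fixed before training.

        # A minimal sketch of word-level tokenisation: each distinct word
        # is assigned an integer id the first time it is seen.
        text = "to grasp a word's meaning llms observe it in context"
        words = text.split()

        vocab = {}                      # word -> integer token id
        for word in words:
            vocab.setdefault(word, len(vocab))

        token_ids = [vocab[w] for w in words]
        print(token_ids)                # one integer per word, e.g. [0, 1, 2, ...]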

    In order to grasp a word's meaning - 'work', in our example - LLMs first observe it in context using enormous sets of training data, taking note of nearby words. These datasets are based on collating text published on the internet, with new LLMs trained using billions of words.

    Eventually, we end up with a huge set of the words found alongside 'work' in the training data, as well as those that weren't found near it.

    As the model processes this set of words, it produces a vector - or list of values - and adjusts it based on each word's proximity to 'work' in the training data. This vector is known as a word embedding.
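
    A crude way to see the intuition (not how real models do it - they learn dense vectors during training rather than counting) is to tally which words co-occur near 'work' in a tiny made-up corpus; the sentences and window size below are illustrative assumptions.

        # A toy illustration of meaning-from-context: count the words that
        # appear within one position of 'work' in a made-up corpus.
        from collections import Counter

        corpus = [
            "hard work pays off",
            "she went to work early",
            "the work was finished quickly",
        ]

        window = 1                      # look one word either side
        context = Counter()
        for sentence in corpus:
            words = sentence.split()
            for i, w in enumerate(words):
                if w == "work":
                    context.update(words[max(0, i - window):i])
                    context.update(words[i + 1:i + 1 + window])

        # A crude 'embedding' for 'work': one count per context word.
        print(sorted(context.items()))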

    A word embedding can have hundreds of values, each representing a different aspect of a word's meaning. Just as you might describe a house by its characteristics - type, location, bedrooms, bathrooms, storeys - the values in an embedding quantify a word's linguistic features.

    The way these characteristics are derived means we don't know exactly what each value represents, but words we expect to be used in comparable ways often have similar-looking embeddings.

    A pair of words like sea and ocean, for example, may not be used in identical contexts ('all at ocean' isn't a direct substitute for 'all at sea'), but their meanings are close to each other, and embeddings allow us to quantify that closeness.
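
    One common way to quantify that closeness is cosine similarity between the two vectors. The four-value embeddings below are invented for illustration - real embeddings have hundreds of values - but the calculation is the same at any length.

        # A minimal sketch of measuring embedding closeness with cosine
        # similarity: the cosine of the angle between two vectors.
        import math

        def cosine_similarity(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm_a = math.sqrt(sum(x * x for x in a))
            norm_b = math.sqrt(sum(x * x for x in b))
            return dot / (norm_a * norm_b)

        sea   = [0.8, 0.1, 0.9, 0.3]    # hypothetical embedding values
        ocean = [0.7, 0.2, 0.8, 0.4]
        boat  = [0.2, 0.9, 0.3, 0.8]

        print(cosine_similarity(sea, ocean))  # near 1.0: similar meanings
        print(cosine_similarity(sea, boat))   # lower: related but less similar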

    By reducing the hundreds of values each embedding contains to just two, we can see the distances between these words more clearly.

    We might spot clusters of pronouns, or modes of transportation, and being able to quantify words in this way is the first step in a model generating text."
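
    A standard way to perform that reduction is principal component analysis. The sketch below, using random stand-in embeddings rather than real model output, projects each vector onto its top two components so every word gets an (x, y) point that could be plotted.

        # A minimal sketch of reducing embeddings to two values with PCA,
        # implemented via numpy's SVD on mean-centred data.
        import numpy as np

        rng = np.random.default_rng(0)
        embeddings = rng.normal(size=(10, 300))    # 10 words, 300 values each

        centred = embeddings - embeddings.mean(axis=0)
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        points_2d = centred @ vt[:2].T             # keep the top 2 components

        print(points_2d.shape)                     # (10, 2): one (x, y) per word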

    Read more here: https://ig.ft.com/generative-ai/



    ------------------------------
    Kaitlyn Arnold
    ------------------------------