Global AI and Data Science

  • 1.  Introducing (NLTK)

    Posted Fri April 10, 2020 09:58 AM
    Introducing (NLTK)
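    The Natural Language Toolkit (NLTK) is a Python library for building programs that analyze text. Install it from the command line with pip:
    pip install nltk

    Then, from within Python, download the corpora and models it uses:
    import nltk
    nltk.download()  # opens the interactive downloader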

    Before analyzing text, we often filter out words that carry little meaning on their own, such as "the", "and" or "is", so that the remaining words are easier for a program to work with. In natural language processing (NLP), these words are called stop words.

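    import nltk
    from nltk.corpus import stopwords

    nltk.download('stopwords')  # needed once; fetches the stop word lists

    # Corpus file IDs are lowercase language names
    print(set(stopwords.words('arabic')))
    print(set(stopwords.words('english')))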

    How can we remove the stop words from our own text? The example below shows how we can perform this task:

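    import nltk
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize

    # Tokenizer models; newer NLTK releases may ask for 'punkt_tab' instead
    nltk.download('punkt')
    nltk.download('stopwords')

    data = 'All work and no play'

    stop_words = set(stopwords.words('english'))
    words = word_tokenize(data)

    words_filtered = []

    for w in words:
        # The stop word list is lowercase, so compare in lowercase
        if w.lower() not in stop_words:
            words_filtered.append(w)

    print(words_filtered)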
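
The filtering step can also be wrapped in a small reusable function. The sketch below is only an illustration: the function name `remove_stop_words`, its parameters, and the tiny hand-written stop list are assumptions made for this example and are not part of NLTK. It splits on whitespace by default so that it runs without any downloads.

```python
def remove_stop_words(text, stop_words, tokenize=str.split):
    """Return the tokens of `text` that are not in `stop_words`.

    The comparison is case-insensitive, so 'All' matches the stop word 'all'.
    `tokenize` is any callable that turns a string into a list of tokens.
    """
    lowered = {w.lower() for w in stop_words}
    return [tok for tok in tokenize(text) if tok.lower() not in lowered]


# Tiny hand-written stop list, for illustration only (not NLTK's list).
demo_stop_words = {"all", "and", "no"}

filtered = remove_stop_words("All work and no play", demo_stop_words)
print(filtered)
```

To use it with NLTK, pass `set(stopwords.words('english'))` as `stop_words` and `word_tokenize` as `tokenize`.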
    
    
    
    

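    Treebank

    NLTK includes a sample of the Penn Treebank corpus. The example below loads the first parsed sentence of one file and draws its syntax tree:

    import nltk
    nltk.download('treebank')
    from nltk.corpus import treebank

    # First parsed sentence of the file wsj_0001.mrg
    t = treebank.parsed_sents('wsj_0001.mrg')[0]
    t.draw()  # opens a window showing the tree (requires Tkinter)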

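    WordNet

    WordNet is a lexical database of English that groups words into sets of synonyms (synsets). The example below prints a definition, then collects synonyms and antonyms:

    import nltk
    from nltk.corpus import wordnet

    nltk.download('wordnet')

    # Assumes at least one synset is found for the word
    syn = wordnet.synsets('NLP')
    print(syn[0].definition())

    # Collect the lemma names of every synset of 'computer'
    synonyms = []
    for syn in wordnet.synsets('computer'):
        for lemma in syn.lemmas():
            synonyms.append(lemma.name())
    print(synonyms)

    # Collect the first antonym of each lemma of 'small' that has one
    antonyms = []
    for syn in wordnet.synsets('small'):
        for lemma in syn.lemmas():
            if lemma.antonyms():
                antonyms.append(lemma.antonyms()[0].name())
    print(antonyms)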
    
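    Stemming

    A stemmer reduces a word to its stem by stripping suffixes. NLTK provides the Porter stemmer:

    from nltk.stem import PorterStemmer

    stemmer = PorterStemmer()
    print(stemmer.stem('Working'))  # the stemmer lowercases its input by default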
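
To see why stemming is useful, it helps to run several inflected forms of the same word through the stemmer. This is a minimal sketch that assumes NLTK is installed; the word list is chosen only for illustration. The Porter stemmer needs no corpus download.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Different inflected forms of the same verb
words = ["working", "worked", "works"]

# Each form is reduced to the same stem
stems = [stemmer.stem(w) for w in words]
print(stems)  # ['work', 'work', 'work']
```

Because all three forms map to one stem, a search or word count can treat them as the same term.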



    ------------------------------
    Mostafa Nabieh
    ------------------------------



    #DataandAILearning
    #AIandDSSkills


  • 2.  RE: Introducing (NLTK)

    Posted Mon April 13, 2020 09:37 AM
    Absolutely amazing!!!

    ------------------------------
    Dângelo Fernandes de Oliveira
    ------------------------------