Word frequency analysis

Congratulations! You have just joined the PyBooks team. PyBooks is building a book recommendation system and wants to find patterns and trends in text to improve its recommendations.

To get started, you need to understand how often each word occurs in a given text and remove the rare words.

Keep in mind that typical real-world datasets will be larger than this example.

This exercise is part of the course Deep Learning for Text with PyTorch.


Exercise instructions

Import get_tokenizer from torchtext and FreqDist from the nltk library.
Initialize a tokenizer for English and tokenize the given text.
Compute the frequency distribution of tokens and remove rare words using a list comprehension.

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# Import the necessary functions
from torchtext.data.utils import ____
from nltk.probability import ____

text = "In the city of Dataville, a data analyst named Alex explores hidden insights within vast data. With determination, Alex uncovers patterns, cleanses the data, and unlocks innovation. Join this adventure to unleash the power of data-driven decisions."

# Initialize the tokenizer and tokenize the text
tokenizer = ____("basic_english")
tokens = tokenizer(____)

threshold = 1
# Remove rare words and print common tokens
freq_dist = ____(____)
common_tokens = [token for token in tokens if ____[token] > ____]
print(common_tokens)
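
For orientation, here is a minimal standard-library sketch of the same technique; it is not the exercise's expected solution. It assumes a hypothetical helper, simple_tokenize, that only roughly approximates the basic_english tokenizer (lowercasing and splitting off common punctuation). It also uses collections.Counter in place of nltk's FreqDist, which is itself a Counter subclass and offers the same token-to-count lookup.

```python
import re
from collections import Counter

text = (
    "In the city of Dataville, a data analyst named Alex explores hidden "
    "insights within vast data. With determination, Alex uncovers patterns, "
    "cleanses the data, and unlocks innovation. Join this adventure to "
    "unleash the power of data-driven decisions."
)


def simple_tokenize(s):
    """Hypothetical stand-in for a basic English tokenizer (an assumption,
    not torchtext's implementation): lowercase the text, pad common
    punctuation with spaces, then split on whitespace."""
    padded = re.sub(r"([.,!?()])", r" \1 ", s.lower())
    return padded.split()


tokens = simple_tokenize(text)

# Counter maps each token to the number of times it occurs.
freq_dist = Counter(tokens)

# Keep only tokens that occur more often than the threshold.
threshold = 1
common_tokens = [token for token in tokens if freq_dist[token] > threshold]

print(freq_dist.most_common(5))
print(common_tokens)
```

With threshold = 1, only tokens that occur at least twice survive. Punctuation marks count as tokens in this sketch, so frequent commas and periods are kept alongside words such as "data" and "the".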


This chapter introduces you to deep learning for text and its applications. Learn how to use PyTorch for text processing and get hands-on experience with techniques such as tokenization, stemming, stopword removal, and more. Understand the importance of encoding text data and implement encoding techniques using PyTorch. Finally, consolidate your knowledge by building a text processing pipeline combining these techniques.

Exercise 1: Introduction to preprocessing for text
Exercise 2: Word frequency analysis (current exercise)
Exercise 3: Preprocessing text
Exercise 4: Encoding text data
Exercise 5: One-hot encoded book titles
Exercise 6: Bag-of-words for book titles
Exercise 7: Applying TF-IDF to book descriptions
Exercise 8: Introduction to building a text processing pipeline
Exercise 9: Shakespearean language preprocessing pipeline
Exercise 10: Shakespearean language encoder

Explore text classification and its role in Natural Language Processing (NLP). Apply your skills to implement word embeddings and develop both Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) for text classification using PyTorch, and understand how to evaluate your models using suitable metrics.

Exercise 1: Overview of Text Classification
Exercise 2: Embedding in PyTorch
Exercise 3: Categorizing text classification tasks
Exercise 4: Convolutional neural networks for text classification
Exercise 5: Build a CNN model for text
Exercise 6: Train a CNN model for text
Exercise 7: Testing the Sentiment Analysis CNN Model
Exercise 8: Recurrent neural networks for text classification
Exercise 9: Building an RNN model for text
Exercise 10: Building an LSTM model for text
Exercise 11: Building a GRU model for text
Exercise 12: Evaluation metrics for text classification
Exercise 13: Evaluating RNN classification models
Exercise 14: Evaluating the model's performance
Exercise 15: Comparing models

Venture into the exciting world of text generation and its applications in NLP. Understand how to leverage Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and pre-trained models for text generation tasks using PyTorch. Alongside, you'll learn to evaluate the performance of your models using relevant metrics.

Exercise 1: Introduction to text generation
Exercise 2: Creating a RNN model for text generation
Exercise 3: Text generation using RNN - Training and Generation
Exercise 4: Generative adversarial networks for text generation
Exercise 5: Building a generator and discriminator
Exercise 6: Training a GAN model
Exercise 7: Pre-trained models for text generation
Exercise 8: Text completion with pre-trained GPT-2 models
Exercise 9: Language translation with pretrained PyTorch model
Exercise 10: Evaluation metrics for text generation
Exercise 11: Evaluating pretrained text generation model
Exercise 12: Understanding text generation metrics

Understand the concept of transfer learning and its application in text classification. Explore Transformers, their architecture, and how to use them for text classification and generation tasks. You will also delve into attention mechanisms and their role in text processing. Finally, understand the potential impacts of adversarial attacks on text classification models and learn how to protect your models.

Exercise 1: Transfer learning for text classification
Exercise 2: Transfer learning using BERT
Exercise 3: Evaluating the BERT model
Exercise 4: Transformers for text processing
Exercise 5: Creating a transformer model
Exercise 6: Training and testing the Transformer model
Exercise 7: Attention mechanisms for text processing
Exercise 8: Creating a RNN model with attention
Exercise 9: Training and testing the RNN model with attention
Exercise 10: Adversarial attacks on text classification models
Exercise 11: Adversarial attack classification
Exercise 12: Safeguarding AI at PyBooks
Exercise 13: Wrap-up