Finding files

You are not satisfied with the cleaning of your tweet dataset. There are still extra strings that carry no sentiment. Among them are strings that refer to text file names.

You also find a way to detect them:

They appear at the start of the string.
They always start with a sequence of 2 or 3 uppercase or lowercase vowels (a e i o u).
They always end with txt.
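
The three rules above can be turned into a pattern piece by piece. The sketch below is one possible reading of them, not necessarily the course's reference solution; the name `file_pattern` and the test strings are made up for illustration.

```python
import re

# One way to encode the three rules:
#   ^              anchor at the start of the string
#   [aeiouAEIOU]   a single vowel, upper or lower case
#   {2,3}          repeated two or three times
#   .+             one or more further characters (the rest of the name)
#   txt            the literal ending
file_pattern = r"^[aeiouAEIOU]{2,3}.+txt"

# Hypothetical strings, not taken from the course dataset
candidates = [
    "aemyfile.txt is attached",  # starts with the vowels "ae"
    "OUreport.txt",              # uppercase vowels also qualify
    "myfile.txt",                # starts with a consonant
    "see aemyfile.txt",          # file name is not at the start
]

results = [bool(re.match(file_pattern, s)) for s in candidates]
print(results)  # [True, True, False, False]
```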

You are not sure whether you should remove them right away. So you write a script to find them and store them in a separate dataset.
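
A minimal sketch of that idea, assuming the separate dataset is just a Python list: the matches are collected without touching the original strings. The pattern, the `tweets` list, and the `found_files` name are all illustrative assumptions.

```python
import re

# Assumed pattern: 2-3 leading vowels, then anything, ending in "txt"
file_pattern = r"^[aeiouAEIOU]{2,3}.+txt"

# Hypothetical stand-in for the tweet data
tweets = [
    "eunotes.txt finally finished the draft",
    "Good morning, no files here",
    "IOplan.txt heading home now",
]

# Collect the file names separately; the tweets themselves stay unchanged
found_files = []
for tweet in tweets:
    match = re.search(file_pattern, tweet)
    if match:
        found_files.append(match.group())

print(found_files)  # ['eunotes.txt', 'IOplan.txt']
```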

You write down some metacharacters to help you: ^ anchors the match to the start of the string, and . matches any character.
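
To see what these two metacharacters do, here is a small illustration on a made-up string (the `sample` value is hypothetical):

```python
import re

sample = "cat hat cot"  # hypothetical text

# "." stands for any single character, so ".at" matches "cat" and "hat"
unanchored = re.findall(r".at", sample)

# "^" pins the match to the start of the string, so only "cat" qualifies
anchored = re.findall(r"^.at", sample)

print(unanchored)  # ['cat', 'hat']
print(anchored)    # ['cat']
```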

The variable sentiment_analysis, which contains the text of two tweets, and the re module are already loaded in your session. You can print it to view it in the IPython Shell.

This exercise is part of the course

Regular Expressions in Python

Exercise instructions

Write a regex that matches the pattern of the text file names, for example, aemyfile.txt.
Find all matches of the regex in the elements of sentiment_analysis. Print out the result.
Replace all matches of the regex with an empty string "". Print out the result.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Write a regex to match text file name
regex = ____"____[____]{____}____txt"

for text in sentiment_analysis:
	# Find all matches of the regex
	print(re.____(____, ____))
    
	# Replace all matches with empty string
	print(re.____(____, ____, ____))
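
For reference, one way the blanks might be filled in is sketched below. It is a possible solution under the rules stated in the exercise, not necessarily the official one, and the two strings in `sentiment_analysis` are hypothetical stand-ins for the preloaded variable.

```python
import re

# Hypothetical stand-in for the preloaded sentiment_analysis variable
sentiment_analysis = [
    "eunotes.txt finally finished the draft",
    "IOplan.txt heading home now",
]

# 2-3 vowels at the start, then any characters, ending in "txt"
regex = r"^[aeiouAEIOU]{2,3}.+txt"

for text in sentiment_analysis:
    # Find all matches of the regex
    print(re.findall(regex, text))

    # Replace all matches with an empty string
    print(re.sub(regex, "", text))

# The same results collected into lists for inspection
matches = [re.findall(regex, text) for text in sentiment_analysis]
cleaned = [re.sub(regex, "", text) for text in sentiment_analysis]
```

Note that `.+` is greedy: if "txt" appeared again later in a tweet, the match would extend to that last occurrence. The space that followed the file name is also left at the start of each cleaned string.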

Start your journey into the regular expression world! From slicing and concatenating, adjusting the case, and removing spaces, to finding and replacing strings, you will learn how to master basic operations for string manipulation using a movie review dataset.

Exercise 1: Introduction to string manipulation
Exercise 2: First day!
Exercise 3: Artificial reviews
Exercise 4: Palindromes
Exercise 5: String operations
Exercise 6: Normalizing reviews
Exercise 7: Time to join!
Exercise 8: Split lines or split the line?
Exercise 9: Finding and replacing
Exercise 10: Finding a substring
Exercise 11: Where's the word?
Exercise 12: Replacing negations

Following your journey, you will learn the main approaches that can be used to format or interpolate strings in Python using a dataset containing information scraped from the web. You will explore the advantages and disadvantages of using positional formatting, embedding expressions inside string constants, and using the Template class.

Exercise 1: Positional formatting
Exercise 2: Put it in order!
Exercise 3: Calling by its name
Exercise 4: What day is today?
Exercise 5: Formatted string literal
Exercise 6: Literally formatting
Exercise 7: Make this function
Exercise 8: On time
Exercise 9: Template method
Exercise 10: Preparing a report
Exercise 11: Identifying prices
Exercise 12: Playing safe

Time to discover the fundamental concepts of regular expressions! In this key chapter, you will learn to understand the basic concepts of regular expression syntax. Using a real dataset with tweets meant for sentiment analysis, you will learn how to apply pattern matching using normal and special characters, and greedy and lazy quantifiers.

Exercise 1: Introduction to regular expressions
Exercise 2: Are they bots?
Exercise 3: Find the numbers
Exercise 4: Match and split
Exercise 5: Repetitions
Exercise 6: Everything clean
Exercise 7: Some time ago
Exercise 8: Getting tokens
Exercise 9: Regex metacharacters
Exercise 10: Finding files (current exercise)
Exercise 11: Give me your email
Exercise 12: Invalid password
Exercise 13: Greedy vs. non-greedy matching
Exercise 14: Understanding the difference
Exercise 15: Greedy matching
Exercise 16: Lazy approach

In the last step of your journey, you will learn more complex methods of pattern matching using parentheses to group strings together or to match the same text as matched previously. Also, you will get an idea of how you can look around expressions.

Exercise 1: Capturing groups
Exercise 2: Try another name
Exercise 3: Flying home
Exercise 4: Alternation and non-capturing groups
Exercise 5: Love it!
Exercise 6: Ugh! Not for me!
Exercise 7: Backreferences
Exercise 8: Parsing PDF files
Exercise 9: Close the tag, please!
Exercise 10: Reeepeated characters
Exercise 11: Lookaround
Exercise 12: Surrounding words
Exercise 13: Filtering phone numbers
Exercise 14: Finishing line