Get startedGet started for free

Cases and spaces

1. Cases and Spaces

Two of the most common inconsistencies in text data are differences in the case of characters and in the spaces in a string. We can deal with these issues by using functions to change character case or remove spaces and by querying data with the LIKE operator.

2. Converting case

First, one of the easiest ways to handle inconsistencies in case is to convert character data to either be all upper or all lower case. The upper and lower functions do just that. The functions have no effect on punctuation or numbers.

3. Case insensitive comparisons

You can use the lower or upper function to make comparisons case insensitive. For example, the fruit data here has 8 entries corresponding to apple, but there are 6 different ways the data is entered. To select rows from the fruit table with the value apple - regardless of case - we can convert all fav_fruit values to lower case with the lower function. Then select rows where the result of the function is equal to 'apple', all lower case. Note that while we got both upper and lower case versions of apple in our 5 results, we are still missing 3 values with spaces at the beginning or end of the word apple, or with the plural apples instead of apple.

4. Case insensitive searches

The LIKE operator can help us match values of apple that might have extra spaces or s-es at the end. By using a LIKE pattern with a percentage sign before and after apple, we match fav_fruit entries where apple is anywhere in the string. Remember that with LIKE, percentage matches any number of characters, including 0, while an underscore matches exactly one character. Now we have values of apple with spaces and s-es, but only lower case. To make this query case insensitive, we can use ILIKE instead of LIKE. The I stands for insensitive. ILIKE queries take longer to run than LIKE queries, so only use them when you need to. Using ILIKE we also select variations of apple with upper case characters. All 8 variations of apple are now in the result.

5. Watch out!

Remember though that LIKE searches can match more than you may intend. Our query to select apple values would also select pineapple!

6. Trimming spaces

Turning now to the issue of extra spaces, one way to deal with them is to use a trim function to remove spaces from one or both ends of a string. Trim, or btrim, removes spaces from both ends of a string by default. rtrim removes spaces only from the right, or end, and ltrim removes spaces only from the left, or beginning. By default, the trim functions remove only the space character, not other white space characters like tabs or newlines.

7. Trimming other values

While the functions remove spaces by default, you can specify other characters that should be removed instead. You can remove a single character, such as an exclamation point, or a set of characters, all together in a single string. The trim functions are case sensitive, so in the second example, we include both an upper and lower case W.

8. Combining functions

Instead of specifying both lower and upper case versions of the same letter, we can combine functions. Remember that we can nest the call to one function inside another function. The inner function is executed first, then the result is sent to the outer function. Here, we first convert all of the characters to lower case with the lower function, then we use the trim function to remove exclamation points and lower case w's.

9. Bring order to messy text!

Now it's your turn to practice bringing order to messy text. Watch out for extra spaces and case inconsistencies in the exercises!

Create Your Free Account

or

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.