Text Strings in Excel Macros

 

« Back to videos

Additional Information:

Although spreadsheets are primarily used for calculations, they often end up containing a lot of text e.g. a list of addresses may contain state and zip codes. Excel does not know that "NY 10000" is in the same state as "NY 10001". This severely restricts your ability to use standard Excel filters when analyzing the data. In this case, you could use Excel formulas to separate the data into two e.g. "=LEFT(StateZip, 2)" and "=RIGHT(StateZip, 5)". However, this is only possible if the data is extremely consistent. Otherwise you have to use Visual Basic.

All text functions are general Visual Basic functions, and can be used outside of Excel. This is unlike the majority of functions which are used to get properties from Excel objects e.g. Cells.Address would retrieve a cells address. As these are pure VBA functions, they use the prefix VBA, so to find the length of a string, you simply enter VBA.Len(TextString).

In the video, all text functions are written with the VBA prefix. In practice, it is implied. Its inclusion enhances reliability, especially if you have chosen not to explicitly define variables. Also, typing "VBA." into the Visual Basic Editor will display a list of all Visual Basic functions. Sifting through these can give you a better idea of what else exists.

Perhaps the most important text function is the InStr function. The InStr function will locate the character at which one string exists within another string. This is best illustrated with an example. Imagine you want to separate the domain from an e-mail address. To do that, you need to find the "@" symbol in the address, and take everything to the right of it. You can find the "@" symbol using InStr e.g. SymbolLocation = VBA.InStr(1, "email@generic.com", "@") returns the value 6 as "@" is the 6th character in the string. The 1 simply means that you are looking for the first occurence of the "@" symbol or, more accurately, you start looking for the "@" symbol from the first character onwards.

To extract the domain, we need to take everything that follows SymbolLocation. We can use the MID function for this e.g. Domain = VBA.Mid("email@generic.com", SymbolLocation + 1). The best way to explore functions is to try using them and see what happens. Note that most text functions e.g. LEFT, RIGHT, TRIM and LEN have exactly the same arguments in Visual Basic as in normal Excel. Some functions have different names e.g. UPPER becomes UCASE in written code.

The above example illustrates one use for the InStr function. However, it can serve another purpose in identifying anomalous data. If an e-mail address did not contain an "@" symbol, we would not be able to extract a domain. The InStr function would then return a value of 0. So we could construct an IF statement to highlight cells containing invalid addresses e.g. If VBA.InStr(1, Cells(RowNum, EmailCol), "@") = 0 Then Cells(RowNum, EmailCol).Interior.ColorIndex = 3, should we wish to turn cells red.

Note that practical applications of Visual Basic nearly always rely on the ability to use Loops and Conditions effectively.