Thursday, 15 August 2019

Is there a way to use Regex( or comparable solution) in Excel formulas?


I have a spreadshet that requires me to do some matching,namely speaking its a list of bonds, their tickers(names) and some data about them. I want to match their tickers(names), I know how would regex to match them using regex ("[A-Z]+ \d{2,4}-\w+ \w+"), but I am not familiar with how to use them outside of VBA(yes I prefer to not use VBA, people that will use this dont' knwo VBA).


I read this piece by Microsoft, so I know that this question(statement in question not the answer) is not correct, but I never had any luck actually finding out how to implelemnt regex in Excel especially formulas(maybe in MATCH just maybe).


Lastly my an example( just the relevant column) to make my question more concrete, I want to return cell rows with tickers:


JPMCC 2012-1 A1 

Prepay %
Loss %
Credit Support
ETC...

MLMTI 2014-6 B4

Prepay %
Loss %
Credit Support
ETC...

Rows to return:


JPMCC 2012-1 A1 
MLMTI 2014-6 B4

Note that I have a lot of such bonds in this column, I need to find each one by one


Also If you find that using helper columns is usefull here I am open to that, for now I just want to avoid VBA.


If needed I'll provide further information.


Thank you


UPDATE


Data, in this case includes headers such as "Credit Support", "Cumulative Loss %", "Prepay %" etc. there are dozens of such headers but its all text text, numbers are found in different column.



Answer



Excel does not have regular expressions, but you can certainly build up most of what they can do one formula at a time. Here are some examples:


Extract first space-delimited word:


B1: =FIND(" ", A1)
C1: =LEFT(A1, B1 - 1)

Check if the first word is all non-lowercase:


D1: =EXACT(UPPER(C1), C1)

Extract second space-delimited word:


E1: =FIND(" ", A1, B1 + 1)
F1: =MID(A1, B1 + 1, E1 - B1 - 1)

Extract 4-digit year if possible:


G1: =LEFT(F1, 4) - 0

You can create as many of these extractions and tests as you need to become convinced that the data in A1 matches the ticker format, and not the header format, and then use something like this to reproduce that cell only in the right cases:


H1: IFERROR(IF((B1+E1+LEN(C1)+LEN(F1)+G1+D1)*0=1,"",A1),"")

spreadsheet showing helper columns


No comments:

Post a Comment

How can I VLOOKUP in multiple Excel documents?

I am trying to VLOOKUP reference data with around 400 seperate Excel files. Is it possible to do this in a quick way rather than doing it m...