The media team is running some ad campaigns that I need to report on. The reports need to be based on either cities or city groups. A small example of the campaign names are:
Brand-Platform-Safety-YT-DV360-Female-FTP-Non-Skippable-10Sec-Next8-Top10HHI-Android-6th jan'22
Brand-Disinfection-YT-DV360-Male-FTP-Non-Skippable-20Sec-Bengaluru-Top10HHI-Android-6th jan'22
Brand-Disinfection-YT-DV360-Female-FTP-Non-Skippable-6Sec-Bengaluru-Top10HHI-Android-6th jan'22
Brand-Disinfection-Display-DV360-Male-RTP-Display-Top10City-Top10HHI-Android-6th jan'22
Brand-Platform-Safety-YT-DV360-Female-FTP-Non-Skippable-10Sec-Next12-Top10HHI-Android-6th jan'22
Brand-Disinfection-YT-DV360-Female-FTP-Non-Skippable-6Sec-Kolkata-Top10HHI-Android-6th jan'22
Brand-Disinfection-YT-DV360-Male-FTP-Non-Skippable-20Sec-Kolkata-Top10HHI-Android-6th jan'22
I need to extract city names from these campaign names. They’ve setup more than a 100 campaigns based on different criteria so I’ll be difficult to do manually. Assuming that the campaign names are stored in column A in my spreadsheet, I can almost get the entire result with this formula:
=ARRAYFORMULA(IF(ROW(A:A)=1,"City",
IF(ISBLANK(A:A),,
REGEXEXTRACT(A:A,"Sec-(. )-Top"))))
There are two issues with the formula. It currently relies on a string starting with “Sec-“ and ending with “-Top” and extracts the data between them.
- The first issue is that sometimes the string will start with “Display-” instead of “Sec-“. How do I make the formula adapt to that
- The second issue is that sometimes instead of cities like “Delhi” / “Kolkata”, the media team has used city groups like “Next8” / “Top10City”. Since the word “-Top” is used, the regexexract function just picks the entire string. How do I prevent this?
Any help is greatly appreciated!
CodePudding user response:
try:
=ARRAYFORMULA({"City";
IF(ISBLANK(A2:A),,REGEXREPLACE(
IFNA(REGEXEXTRACT(A2:A, "Sec-(. )-Top"),
REGEXEXTRACT(A2:A, "Display-(. )-Top")), ".*Display-", ))})

