Home > Back-end >  splitting underscores in Excel
splitting underscores in Excel

Time:02-05

I'm fairly new to Excel and need some assistance. I have a Column that has a list of files that look like:

12345_v1.0_TEST_Name [12345]_01.01.2022.html
45321_v55.9_Some Name Here [64398]_07.15.2018.html
56871_v14.2_Test[64398]_10.30.2019.html

Each file name can be different depending on what output is provided to me.

Note: There are other random files in the same format, however where it says Test_Name there could be an underscore and sometimes no underscore. Would like that to be ignored in the formula or vba. Files also can change but will be in the same format.

I need some help with a formula or vba that splits the underscores and outputs the data into their own cells:

Column C    12345
Column D    v1.0
Column E    TEST_Name [12345]
Column F    01.01.2022
Column G    .html 

CodePudding user response:

Is this what you are trying to achieve, although there might be more eloquent way to use a formula, and solve this, however you may try using this as well,

FORMULA USED IN CELL C1

=IF(LEN($B2)-LEN(SUBSTITUTE($B2,"_","")) 1>4,TRIM(MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($B2,".html","_.html"),"_"," ",3),"_",REPT(" ",100)),COLUMN(A2)*99-98,100)),TRIM(MID(SUBSTITUTE(SUBSTITUTE($B2,".html","_.html"),"_",REPT(" ",100)),COLUMN(A2)*99-98,100)))

Fill Down & Fill Across !

SPLITTING_UNDERSCORES

CodePudding user response:

There are other random files in the same format.....Files also can change but will be in the same format.

So, assuming the files indeed will be in the same format, we can brake this query down into the following requirements:

  • Change the 1st and 2nd occurence and the very last of the underscore into anything to split on;
  • Change the dot before the file-extension into anything to split on under the assumption we don't know if this would be '.html' or any other extension.

Since you have Microsoft365 we can use dynamic arrays and some basic functions to retrieve what you want:

enter image description here

=LET(X,SEARCH("_??.??.????.",A1),Y,"</s><s>",TRANSPOSE(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(REPLACE(A1,X,12,Y&MID(A1,X 1,10)&Y),"_",Y,2),"_",Y,1)&"</s></t>","//s")))

To break this down a little bit:

  • SEARCH("_??.??.????.",A1) - This part will make sure that we find the position of the very last underscore upto the dot before the file extension assuming you don't have any other date in your filenames in this specific format;
  • SUBSTITUTE() - We can use this formula to specifically change the 1st and 2nd instances of the underscore to anything we can split on;
  • FILTERXML() - You may notice we used valid xml start/end-tags to split our data using this function.
  • TRANSPOSE() - This last function will now spill the returned array over the columns instead of rows.
  •  Tags:  
  • Related