Regular Expression to remove specific word from a string

Time:01-28

I want to remove specific words, with or without a trailing dot (Pvt., Ltd., Pvt, Ltd, Pte., Pte, Co., Co, Private Limited, Inc., Incorporated), from a string and keep the rest of the text.

I have tried using

"\(|\)|-|\.|Pvt|Ltd|Incorporated|Pte|Inc|Co|Private|\s"

but it's not working.

Example text:

0.5Bn FinHealth Pvt. Ltd.Inc. Pte.Co.Private Limited Incorporated,
0.5Bn FinHealth Ltd.,
1MG Technologies Pvt. Ltd.,

I need help to improve the regex.

CodePudding user response:

Maybe give the following pattern a try:

(?:\s*\b(?:(?:Pvt|Ltd|Pte|Co)\.?|Inc\.|Incorporated|Private Limited))


  • (?: - Open 1st non-capture group;
    • \s* - 0+ (greedy) whitespace characters;
    • \b - A word-boundary;
    • (?: - Open a nested 2nd non-capture group;
      • (?:Pvt|Ltd|Pte|Co) - A 3rd nested non-capture group listing the alternatives that may be followed by an optional dot;
      • \.? - An optional literal dot;
      • | - Or;
      • Inc\. - Literally match 'Inc.';
      • | - Or;
      • Incorporated - Literally match 'Incorporated';
      • | - Or;
      • Private Limited - Literally match 'Private Limited';
      • )) - Close the nested and the outer non-capture groups; the outer group is matched exactly once.

Replace matches with empty string.
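
As a minimal sketch of that replacement step, here is the pattern applied to the example text from the question. Python is an assumption (the question does not name a language), and the names SUFFIX_RE and strip_suffixes are made up for illustration.

```python
import re

# Pattern from the answer, unchanged. The space in "Private Limited" is literal.
SUFFIX_RE = re.compile(
    r"(?:\s*\b(?:(?:Pvt|Ltd|Pte|Co)\.?|Inc\.|Incorporated|Private Limited))"
)


def strip_suffixes(name: str) -> str:
    """Replace every company-suffix match in `name` with an empty string."""
    return SUFFIX_RE.sub("", name)


# Example lines taken from the question.
samples = [
    "0.5Bn FinHealth Pvt. Ltd.Inc. Pte.Co.Private Limited Incorporated,",
    "0.5Bn FinHealth Ltd.,",
    "1MG Technologies Pvt. Ltd.,",
]

for sample in samples:
    print(strip_suffixes(sample))
# prints:
# 0.5Bn FinHealth,
# 0.5Bn FinHealth,
# 1MG Technologies,
```

One caveat: the pattern has no boundary after the alternatives, so Co also matches the first two letters of a longer word such as Company.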

Note: I was unsure what you meant to do with \(|\)|-|\. but my guess is that you want to remove certain stand-alone characters. If so, you can add a character class, for example [().-], as another alternative in the pattern.
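
A rough sketch of that idea, again assuming Python: the character class is appended as one more top-level alternative. COMBINED_RE, strip_all and the first sample string are hypothetical and not from the question.

```python
import re

# Suffix alternatives from the answer, plus [().-] as an extra alternative.
# Inside a character class, "(", ")" and "." are literal, and a trailing "-"
# is literal as well.
COMBINED_RE = re.compile(
    r"\s*\b(?:(?:Pvt|Ltd|Pte|Co)\.?|Inc\.|Incorporated|Private Limited)|[().-]"
)


def strip_all(name: str) -> str:
    """Remove company suffixes and any stand-alone (, ), . or - characters."""
    return COMBINED_RE.sub("", name)


print(strip_all("Acme-Tech (India) Pvt. Ltd."))  # prints: AcmeTech India
print(strip_all("0.5Bn FinHealth Ltd.,"))        # prints: 05Bn FinHealth,
```

The second call shows the trade-off: the class removes every dot, including the one in 0.5Bn, so this only fits if such characters should never survive.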
