Regex: How to get rid of trailing newline symbol in the match?

Time:01-07

I was looking for a regular expression that will match the first date that is NOT followed by a line that starts with the letter 'R' (I want to get the last commit date from git log, excluding commits that only change the path).

2021-12-07T16:39:43+01:00
M       test.md

2021-12-07T16:18:59+01:00
R100    old.test.md  test.md

2021-12-07T15:37:15+01:00
A       old.test.md

After a significant struggle, I got this regex (I don't care much about the date validity, I trust Git):

/20[\d-T:.Z+]+(?:\r?\n)(?!R)/

It works quite well, but it leaves a trailing newline character \n. No big deal, it can be replaced; but safely chaining match with replace makes the code incredibly ugly (I am using Node.js):

import { execSync } from "child_process"

const allAuthorDates = execSync(
  `git log --follow --name-status --pretty=format:%aI -- test.md`
).toString()

const lastEditExceptPathChangeDateMatch = allAuthorDates.match(
  /20[\d-T:.Z+]+(?:\r?\n)(?!R)/
)

console.log(lastEditExceptPathChangeDateMatch)

/*
[
  '2021-12-07T16:39:43+01:00\n', <-- how to get rid of that \n?
  index: 0,
  input: '2021-12-07T16:39:43+01:00\nA\ttest.md\n',
  groups: undefined
]
*/

Because of the trailing newline character, I am forced to use the horrible let result = (str.match(/…/) || [""])[0].replace(…), where the || [""] is there to prevent a runtime error in the (probably unlikely) case that match returns null.

Would somebody be so kind as to advise me how to edit the regex to get rid of the trailing \n while matching, to make the subsequent replace unnecessary?

Answer:

You need to match the end of the line with $ (making sure you add the m flag) and then use the negative lookahead with the \r?\n pattern moved into the lookahead right before R:

/20[\d-T:.Z+]+$(?!\r?\nR)/m

Or, if you want to spell out the pattern:

/^20\d{2}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}$(?!\r?\nR)/gm

Here, I also added the g flag to match multiple occurrences.
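As a rough illustration of the g variant, here is a minimal sketch run against the sample log from the question. The variable names are made up for this example, and the log is hard-coded rather than read from git:

```javascript
// Sample `git log --name-status --pretty=format:%aI` output, copied from the question.
const log = [
  "2021-12-07T16:39:43+01:00",
  "M       test.md",
  "",
  "2021-12-07T16:18:59+01:00",
  "R100    old.test.md  test.md",
  "",
  "2021-12-07T15:37:15+01:00",
  "A       old.test.md",
].join("\n");

// Spelled-out date pattern; the lookahead rejects dates whose next line starts with R.
const dateNotFollowedByRename =
  /^20\d{2}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}$(?!\r?\nR)/gm;

// With the g flag, match() returns every full match, or null when there is none.
const nonRenameDates = log.match(dateNotFollowedByRename) ?? [];

console.log(nonRenameDates);
// Logs the 16:39:43 and 15:37:15 dates; the 16:18:59 rename commit is skipped.
```

None of the returned strings ends with a line break, so the first element can be used directly as the last content-change date.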

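The $(?!\r?\nR) part matches the position at the end of a line (right before a CR or LF char, or at the end of the string) and then makes sure that this position is not immediately followed by an optional CR char, an LF char and R. Since neither $ nor the lookahead consumes any characters, the line break never becomes part of the match.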
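To show how this removes the need for the replace call, here is a small sketch. The helper name lastContentChangeDate is hypothetical, and it takes the git log output as a string instead of calling execSync. Optional chaining covers the null case that || [""] handled in the question:

```javascript
// Hypothetical helper: returns the first date that is not followed by an
// "R..." (rename) line, or "" when no such date exists.
function lastContentChangeDate(gitLogOutput) {
  // $ with the m flag stops before the line break, so the break is never captured.
  return gitLogOutput.match(/20[\d-T:.Z+]+$(?!\r?\nR)/m)?.[0] ?? "";
}

// LF line endings.
console.log(JSON.stringify(
  lastContentChangeDate("2021-12-07T16:39:43+01:00\nM\ttest.md\n")
));

// CRLF line endings: neither \r nor \n ends up in the result.
console.log(JSON.stringify(
  lastContentChangeDate("2021-12-07T16:39:43+01:00\r\nM\ttest.md\r\n")
));

// Only a rename commit: match() returns null, so the helper falls back to "".
console.log(JSON.stringify(
  lastContentChangeDate("2021-12-07T16:18:59+01:00\nR100\told.test.md\ttest.md")
));
```

Whether an empty string is the right fallback depends on the caller; returning null or throwing would work just as well.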
