Regular expression only finding first match

Time:02-04

I'm working on something that is similar to other designs I've done, but for some reason, it's only finding the first key/value pair, whereas other ones found all of them. It looks good in regex101.com, which is where I typically test these.

I'm parsing C code to build a reference spreadsheet for error tracking across a system; the results either go into the spreadsheet or are used as keys to look up info in another file. I do something similar for about 20 files, plus there's other data coming from a SQL query or an Access/MDB file. The data for this file looks like this:

m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seFatal),
    HOP_FATAL_ERROR ));
m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seNotSelected), 
    HOP_NOT_SELECTED));
m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seCoverOpen),
    HOP_COVER_OPEN ));
m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seLeverPosition),
    HOP_LEVER_POSITION ));
m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seJam),
    HOP_JAM ));

I read this as a string from the file (looks good), and feed it into this Function as $fileContent:

Function Get-Contents60{
  [cmdletbinding()]
  Param ([string]$fileContent)
  Process
  {
     
            #m_ErrorMap.insert(make_pair(
            #MAKEWORD(scError,seJam),
            #HOP_JAM ));

     # construct regex
     switch -Regex ($fileContent -split '\r?\n') {  #this is splitting on each line               test regex with https://regex101.com/
      'MAKEWORD["("][\w]+,(\w+)[")"],' { #seJam
          # add relevant key to key collection
          $keys = $Matches[1] } #only match once
          ',(HOP.*?)[\s]' { #  HOP_JAM
          # we've reached the relevant error, set it for all relevant keys
          foreach($key in $keys){
              Write-Host "60 key: $key"
              Write-Host "Matches[0]: $($Matches[0]) Matches[1]: $($Matches[1])"
              $errorMap[$key] = $($Matches[1])
              Write-Host "60 key: $key ... value: $($errorMap[$key])"
          }
      }
      'break' {
          # reset/clear key collection
          $keys = @()
      }    
    }#switch


    #Write-Host "result:" $result -ForegroundColor Green
    #$result;
    return $errorMap


     
  }#End of Process
}#End of Function

I stepped through it in VSCode, and it's finding the first key/value pair, and after that it's not finding anything. I looked at it in regex101.com, and it's finding line endings/breaks, and the MAKEWORD regex and HOP regex are finding what they should on each line.

I'm not sure if the issue is that they aren't all in the same line, and maybe I need to change it so it doesn't break on newline and breaks on something else for each key/value pair? I'm a little fuzzy on this.

I'm using PowerShell 5.1, and VSCode.

Update:

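I modified Theo's answer and it worked great. I had simplified the class name from m_HopErrorMap to m_ErrorMap for this question, and with the real name the HOP pattern was also matching the "Hop" in m_HopErrorMap on every insert line (switch -Regex is case-insensitive by default). I changed the pattern slightly, to HOP_, and Theo's approach works.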

function Get-Contents60{
    [cmdletbinding()]
    Param ([string]$fileContent)

    # create an ordered hashtable to store the results
    $errorMap = [ordered]@{}
    # process the lines one-by-one
    switch -Regex ($fileContent -split '\r?\n') {
        'MAKEWORD\([^,]+,([^)]+)\),' { # seJam, seFatal etc.
            $key = $matches[1]
        }
        '(HOP_[^)]+)' {
            $errorMap[$key] = $matches[1].Trim()
        }
    }
    # output the completed data as object
    [PsCustomObject]$errorMap
    return $errorMap
}
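
One way to see why the real class name tripped up the HOP pattern is to test a single line by hand. This is only a sketch: the sample line is reconstructed from the class name given in the update, and the variable names are made up for illustration.

```powershell
# A line as it would look with the unsimplified class name (assumed from the update).
$line = 'm_HopErrorMap.insert(make_pair('

# -match is case-insensitive, like switch -Regex without -CaseSensitive,
# so 'HOP' matches the 'Hop' inside m_HopErrorMap.
$loose    = $line -match '(HOP[^)]+)'
$captured = $Matches[1]

# Requiring the underscore, as in the update, no longer matches this line.
$withUnderscore = $line -match '(HOP_[^)]+)'

# A case-sensitive match (-cmatch) would skip it as well.
$caseSensitive = $line -cmatch '(HOP[^)]+)'

"loose=$loose captured=$captured withUnderscore=$withUnderscore caseSensitive=$caseSensitive"
```

Either change keeps the insert line from overwriting the value stored for the previous key.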

Answer:

I would simplify your function to

function Get-Contents60{
    [cmdletbinding()]
    Param ([string]$fileContent)

    # create an ordered hashtable to store the results
    $errorMap = [ordered]@{}
    # process the lines one-by-one
    switch -Regex ($fileContent -split '\r?\n') {
        'MAKEWORD\([^,]+,([^)]+)\),' { # seJam, seFatal etc.
            $key = $matches[1]
        }
        '(HOP[^)]+)' {
            $errorMap[$key] = $matches[1].Trim()
        }
    }
    # output the completed data as object
    [PsCustomObject]$errorMap
}

Then, using your example text (I'm using a here-string for it; in real life you would load the file content with $c = Get-Content -Path 'X:\TheErrors.txt' -Raw), call the function:

$result = Get-Contents60 -fileContent $c
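
The here-string mentioned above is not shown in the thread, so the following is a minimal sketch of what that test setup might look like. The sample is shortened to two entries from the question, and the function is repeated only so the snippet runs on its own.

```powershell
# The answer's function, repeated so the sketch is self-contained.
function Get-Contents60 {
    [CmdletBinding()]
    Param ([string]$fileContent)

    $errorMap = [ordered]@{}
    switch -Regex ($fileContent -split '\r?\n') {
        'MAKEWORD\([^,]+,([^)]+)\),' { $key = $Matches[1] }
        '(HOP[^)]+)'                 { $errorMap[$key] = $Matches[1].Trim() }
    }
    [PSCustomObject]$errorMap
}

# Two entries from the question as a single-quoted here-string
# (single quotes mean nothing inside is expanded).
$c = @'
m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seFatal),
    HOP_FATAL_ERROR ));
m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seJam),
    HOP_JAM ));
'@

$result = Get-Contents60 -fileContent $c
$result.seFatal   # HOP_FATAL_ERROR
$result.seJam     # HOP_JAM
```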

To display on screen

$result | Format-Table -AutoSize

giving you

seFatal         seNotSelected    seCoverOpen    seLeverPosition    seJam  
-------         -------------    -----------    ---------------    -----  
HOP_FATAL_ERROR HOP_NOT_SELECTED HOP_COVER_OPEN HOP_LEVER_POSITION HOP_JAM
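
On the question of whether to avoid splitting on newlines: a different approach, not the one used in the answer above, is to match each key/value pair across the line break in one pass over the raw text. The sketch below assumes every value starts with HOP_ and uses a hypothetical $pattern variable; note that [regex]::Matches is case-sensitive by default, unlike switch -Regex.

```powershell
# Shortened sample from the question, including the trailing space after one comma.
$c = @'
m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seFatal),
    HOP_FATAL_ERROR ));
m_ErrorMap.insert(make_pair(
    MAKEWORD(scError,seNotSelected), 
    HOP_NOT_SELECTED));
'@

# \s* also matches line breaks, so one pattern covers the MAKEWORD line
# and the HOP_ line that follows it.
$pattern = 'MAKEWORD\([^,]+,\s*(\w+)\)\s*,\s*(HOP_\w+)'

$errorMap = [ordered]@{}
foreach ($m in [regex]::Matches($c, $pattern)) {
    # Group 1 is the key (seFatal, ...), group 2 the value (HOP_FATAL_ERROR, ...).
    $errorMap[$m.Groups[1].Value] = $m.Groups[2].Value
}
$errorMap
```

This keeps the key and its value in a single match, so there is no state carried between lines, at the cost of a longer pattern.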