Home > Mobile >  Extract sentence containing specific word
Extract sentence containing specific word

Time:01-05

I've seen a variation of this question being asked; however, I cannot find my exact problem. I am trying to extract every sentence that contains a specific word and paste the sentence to the column on the right of A1. In the example below, the key word is cold.

Example

Column A1 - (What I have):
It is very cold outside. I want to go skiing. I love a cold vacation. I love the snow.

Column A2 - (what I want to see):
It is very cold outside. I love a cold vacation.

Can anyone assist? VBA appears to be best. I also wouldn't mind typing in my keyword in a cell and have a VBA code that extracts every sentence containing the keyword. But anything helps!

CodePudding user response:

FILTERXML() will assist you. Try-

=TEXTJOIN(". ",TRUE,FILTERXML("<t><s>"&SUBSTITUTE(A1,".","</s><s>")&"</s></t>","//s[contains(., 'cold')]"))

More about FILTERXML() enter image description here

CodePudding user response:

If you want a VBA solution, please use the next function:

Function extractSentences(strVal As String, keyWord As String) As Variant
   Dim arr, arrFini As Long, i As Long, k As Long
   If InStr(strVal, keyWord) = 0 Then extractSentences = Array(""): Exit Function
   arr = Split(strVal, ". ")
   If UBound(arr) = -1 Then extractSentences = Array(""): Exit Function
   ReDim arrFin(UBound(arr))
   For i = 0 To UBound(arr)
        If InStr(arr(i), keyWord) > 0 Then
            arrFin(k) = arr(i): k = k   1
        End If
   Next i
   If k > 0 Then
        ReDim Preserve arrFin(k - 1)
        If Right(arrFin(UBound(arrFin)), 1) <> "." Then arrFin(UBound(arrFin)) = arrFin(UBound(arrFin)) & "."
        extractSentences = arrFin
   End If
End Function

It can be used to analyze column A:A and return in B:B, in the next way:

Sub testExtractSentByWord()
   Dim sh As Worksheet, lastR As Long, arr, arrS, arrFin, searchWord As String, i As Long
   
   Set sh = ActiveSheet
   lastR = sh.Range("A" & sh.rows.Count).End(xlUp).row
   arr = sh.Range("A1:A" & lastR).Value 'place the range in an array for faster iteration
   ReDim arrFin(1 To UBound(arr), 1 To 1)
   searchWord = "cold"
   For i = 1 To UBound(arr)
        arrS = extractSentences(CStr(arr(i, 1)), searchWord)
        arrFin(i, 1) = Join(arrS, ". ")
        sh.Range("B1:B" & lastR).Value = arrFin
   Next i
End Sub

CodePudding user response:

An example using a Regular Expression.

Option Explicit
Sub Demo()

    Dim regex As Object, m As Object, ar
    Dim word As String, s As String
    Dim lastrow As Long, i As Long, n As Long
    
    word = "cold"
    
    Set regex = CreateObject("vbscript.regexp")
    With regex
        .Global = True
        .MultiLine = False
        .IgnoreCase = True
        .Pattern = "([^.]*\b" & word & "\b[^.]*)"
    End With
    
    With Sheets(1)
        lastrow = .Cells(.Rows.Count, "A").End(xlUp).Row
        For i = 1 To lastrow
            s = .Cells(i, "A")
            If regex.test(s) Then
                Set m = regex.Execute(s) '
                ReDim ar(1 To m.Count)
                For n = 1 To m.Count
                    ar(n) = Trim(m.Item(n - 1).submatches(0))
                Next
                .Cells(i, "B") = Join(ar, ". ")
            End If
        Next
    End With

End Sub
  •  Tags:  
  • Related