Ignore one div class in BeautifulSoup find_all in Python 3

Time:01-05

I want to ignore one class when using find_all. I've followed the solution from "Select all divs except ones with certain classes in BeautifulSoup".

My divs are a bit different. I want to ignore description-0:

<div >...</div>
<div >
    <div class="description-0"></div>
    <div class="description-1"></div>
    <div class="description-2"></div>
</div>
<div >...</div>

Here is my code:

classToIgnore = ["description-0"]
all = soup.find_all('div', class_=lambda x: x not in classToIgnore)

It returns every other div on the page, instead of just the ones with a "description-n" class. How can I fix it?

CodePudding user response:

Use a regex that matches only the classes you want to keep, for example:

import re

from bs4 import BeautifulSoup

sample_html = """<div >...</div>
<div class="description-0"></div>
<div class="description-1"></div>
<div class="description-2"></div>
<div >...</div>"""

classes_regex = (
    BeautifulSoup(sample_html, "lxml")
    .find_all("div", {"class": (re.compile(r"description-[1-9]"))})
)
print(classes_regex)

Output:

[<div class="description-1"></div>, <div class="description-2"></div>]
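
As for why the lambda in the question matches too much: when a function is passed as class_, BeautifulSoup calls it with each class string of a tag, and with None for tags that have no class at all. Since None and any unrelated class name are "not in" the ignore list, those divs match as well. Below is a minimal sketch of a filter function that avoids this. The markup, the outer class names ("header", "wrapper", "footer") and the helper name is_wanted are assumptions made up for illustration, not taken from the question. A CSS selector alternative is included too.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the outer class names are placeholders.
html = """
<div class="header">...</div>
<div class="wrapper">
    <div class="description-0"></div>
    <div class="description-1"></div>
    <div class="description-2"></div>
</div>
<div class="footer">...</div>
"""

soup = BeautifulSoup(html, "html.parser")
class_to_ignore = {"description-0"}


def is_wanted(css_class):
    # css_class is None for tags without a class attribute,
    # so that case has to be rejected explicitly.
    return (
        css_class is not None
        and css_class.startswith("description-")
        and css_class not in class_to_ignore
    )


# Option 1: filter function passed to find_all
matches = soup.find_all("div", class_=is_wanted)
found_classes = [tag["class"] for tag in matches]

# Option 2: CSS selector (prefix match on the class attribute, minus one class)
selected = soup.select('div[class^="description-"]:not(.description-0)')
selected_classes = [tag["class"] for tag in selected]

print(found_classes)
print(selected_classes)
```

Both options keep only the description-n divs other than description-0. The prefix selector in option 2 assumes "description-..." is the first (or only) class on the tag.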