How to get an HTML tag's value in Python

I'm new to Python. Here is my code, which works on Python 2.7.5:

import urllib2
import sys

url = "mydomain.com"
usock = urllib2.urlopen(url)
data = usock.read()
usock.close()

print data

This fetches the HTML markup, and it works.

What I want to do is get the value from inside a tag. For example, I need the value "Data" from this markup:

<font class="big">Data</font>

How to do it?

asked Sep 6, 2013 at 11:38

1

You can use an HTML parser module such as BeautifulSoup:

import urllib2
from bs4 import BeautifulSoup as BS

url = "mydomain.com"
usock = urllib2.urlopen(url)
data = usock.read()
usock.close()

# Parse the markup, then look up the first <font class="big"> tag
soup = BS(data)
print soup.find('font', {'class': 'big'}).text

This finds the first <font> tag with class="big" and prints its text content.
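
One thing to watch for with this approach: find() returns None when no tag matches, so calling .text on the result raises an AttributeError. Below is a minimal Python 3 sketch of a guarded lookup. The helper name first_tag_text and the sample string are made up for illustration, and the built-in 'html.parser' backend is just one possible parser choice.

```python
from bs4 import BeautifulSoup


def first_tag_text(html, tag, css_class):
    """Return the text of the first matching tag, or None if nothing matches."""
    soup = BeautifulSoup(html, "html.parser")
    match = soup.find(tag, {"class": css_class})
    # find() returns None when no tag matches, so guard before reading the text.
    return match.get_text() if match is not None else None


# Hypothetical sample markup, standing in for the downloaded page.
sample = '<html><body><font class="big">Data</font></body></html>'
print(first_tag_text(sample, "font", "big"))    # the tag is present
print(first_tag_text(sample, "font", "small"))  # no match, so this prints None
```

The same helper could be fed the string returned by the download step instead of the inline sample.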

answered Sep 6, 2013 at 11:39

TerryA

2

Using lxml:

import urllib2
import lxml.html

url = "mydomain.com"

usock = urllib2.urlopen(url)
data = usock.read()
usock.close()

# Select every <font> element with class "big" via a CSS selector
for font in lxml.html.fromstring(data).cssselect('font.big'):
    print font.text

>>> import lxml.html
>>> root = lxml.html.fromstring('<font class="big">Data</font>')
>>> [font.text for font in root.cssselect('font.big')]
['Data']
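
If installing a third-party parser is not an option, the standard library's html.parser module can do a similar lookup. This is a different technique from the lxml code above, shown only as a rough Python 3 sketch: the class name TagTextCollector and the sample markup are hypothetical, and it assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text inside every <tag> element that has the given class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0      # greater than 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Count nested tags of the same name so the right end tag closes the match.
            if tag == self.tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        # Only text that appears inside a matching element is kept.
        if self.depth:
            self.results[-1] += data


collector = TagTextCollector("font", "big")
collector.feed('<p>skip</p><font class="big">Data</font>')
print(collector.results)
```

This avoids extra dependencies at the cost of more code; for messy real-world pages, a dedicated parser is usually more forgiving.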

answered Sep 6, 2013 at 11:40

falsetru

    Prerequisites: BeautifulSoup

    In this article, we will discuss how BeautifulSoup can be used to find a tag with a given attribute value in an HTML document.

    Approach:

    • Import the module.
    • Scrape data from a web page.
    • Parse the scraped string as HTML.
    • Use the find() function to locate the tag by its attribute value.
    • Print the result.

    Syntax: find(attr_name="value")
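
To make the syntax concrete, here is a small sketch of several attribute filters that find() accepts. The markup, the example.com URL, and the data-role attribute are all invented for this illustration; only the find() keyword forms come from BeautifulSoup itself.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, made up for this illustration.
markup = """
<div id="container">
  <p class="gfg intro">First paragraph</p>
  <a href="https://example.com/" data-role="nav">Link</a>
</div>
"""

soup = BeautifulSoup(markup, "html.parser")

by_id = soup.find(id="container")                 # match on the id attribute
by_class = soup.find(class_="gfg")                # class_ avoids the reserved word class
by_href = soup.find(href="https://example.com/")  # other attribute names work as keywords
by_data = soup.find(attrs={"data-role": "nav"})   # hyphenated names need the attrs dict
missing = soup.find(id="no-such-id")              # None when nothing matches

print(by_id.name, by_class.name, by_href.name, by_data.name, missing)
```

Note that class_="gfg" matches a tag whose class attribute lists several classes, as long as "gfg" is one of them.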

    Below are some implementations of the above approach:

    Example 1: 

    Python3

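    from bs4 import BeautifulSoup

    # Sample markup (assumed for illustration)
    markup = '<html><body><div id="container">Div Content</div></body></html>'

    soup = BeautifulSoup(markup, 'html.parser')

    # Find the first tag whose id attribute is "container"
    div_bs4 = soup.find(id="container")

    print(div_bs4.name)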

    Output:

    div

    Example 2:

    Python3

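    from bs4 import BeautifulSoup

    # Sample markup and href value are assumed for illustration
    markup = '<a href="https://example.com/">Example link</a>'
    soup = BeautifulSoup(markup, 'html.parser')
    div_bs4 = soup.find(href="https://example.com/")

    print(div_bs4.name)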

    Output:

    a

    Example 3:

    Python3

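    from bs4 import BeautifulSoup

    # Sample markup (assumed for illustration)
    markup = '<html><body><p class="gfg">Paragraph content</p></body></html>'

    soup = BeautifulSoup(markup, 'html.parser')

    # class is a reserved word in Python, so BeautifulSoup uses class_
    div_bs4 = soup.find(class_="gfg")

    print(div_bs4.name)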

    Output:

    p
