Python 2.7 Tutorial Pt 14

In this part I show you how to strip HTML tags from articles you retrieved through website scraping with Python.
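As a rough illustration of the idea, here is a minimal sketch that removes anything shaped like a tag with a regular expression. The helper name `strip_tags` and the sample markup are made up for this example, and the regex approach is an assumption rather than the exact code from the video. It is fine for simple article snippets but will not cope with every kind of malformed HTML.

```python
import re

# Matches anything that looks like an HTML tag: "<", then one or more
# characters that are not ">", then ">". This pattern is an assumption
# for illustration, not necessarily the one used in the video.
TAG_PATTERN = re.compile(r'<[^>]+>')


def strip_tags(html):
    """Return the text with everything shaped like a tag removed."""
    return TAG_PATTERN.sub('', html)


# Made-up sample markup standing in for a scraped article snippet.
article = '<p>Scraping <b>news</b> feeds with <a href="#">Python</a>.</p>'
print(strip_tags(article))
```

For messier pages, a real parser such as Beautiful Soup is usually the safer choice.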

22 responses to “Python 2.7 Tutorial Pt 14”

    I actually use PHP most of the time, but Python's Beautiful Soup has improved lately and is quite good.

    Hello again, it's been a while. I was wondering which is the best method to use for web scraping: cURL? Beautiful Soup? get_html? For example, I can block cURL requests to my site through config.ini, so I want to start scraping, but I don't know which method is the right or best one to use.

    Hi Derek,
    I have a question: how do I pass credentials to a website I want to scrape?

    from bs4 import BeautifulSoup

    They may have changed the tags a bit. Check whether the tags around the snippet have changed.

    I use your exact code but I only get the links and the titles. The code fails to output the snippet of the article. Any help? Has the feed for Huffington Post changed?

    What'd you do to fix this error importing BS?

    I have a bunch of tutorials on scraping web pages with PHP. They are in my PHP tutorial playlist on my YouTube channel.

    Hello! I am wondering whether you have or know of a tutorial on scraping pages that are generated automatically with JavaScript.

    Sorry, but I'd have to know more about how that information is checked.

    My network is behind a proxy, so when I open a web page it asks me for a username and password. Is there any way I can store the username and password in the program itself so that it doesn't ask me?
    I searched and tried the urllib2 proxy handlers, but got an error.
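One possible way to do this is to put the credentials in the proxy URL and hand it to a `ProxyHandler`. The sketch below assumes a proxy that accepts basic authentication. The host, port, username and password are placeholders, and `build_proxy_opener` is a made-up helper name. The import fallback is only there so the same code also runs on Python 3.

```python
try:
    import urllib2 as request_lib          # Python 2.7
except ImportError:
    import urllib.request as request_lib   # Python 3


def build_proxy_opener(proxy_host, proxy_port, username, password):
    """Build an opener that sends requests through an authenticating proxy.

    Assumes the proxy uses basic authentication. Special characters in the
    username or password would need URL-quoting first.
    """
    proxy_url = 'http://%s:%s@%s:%s' % (username, password,
                                        proxy_host, proxy_port)
    handler = request_lib.ProxyHandler({'http': proxy_url,
                                        'https': proxy_url})
    return request_lib.build_opener(handler)


# Placeholder values; replace them with your own proxy details.
opener = build_proxy_opener('proxy.example.com', 8080, 'user', 'secret')

# To fetch a page through the proxy (not run here):
# html = opener.open('http://example.com/').read()
```

Storing a password in the script is convenient but not very safe, so reading it from an environment variable or a prompt may be a better fit.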

    Send me an email and I'll see if I can help

    Hi Derek. I need your help. Do you have an email address? I will write a lot, and I hope you answer.

    Figured it out. Now I'm just getting an error from re.findall:

    TypeError: Expected string or buffer
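This error usually means `re.findall` was given something that is not a string, such as a parsed tag object or a list of results. The sketch below reproduces it with a stand-in class and shows one possible fix, which is to convert the object to a string first. `FakeTag` and the sample markup are made up for illustration, and the exact wording of the message differs between Python versions.

```python
import re


class FakeTag(object):
    """Hypothetical stand-in for a non-string object such as a parsed tag."""

    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup


tag = FakeTag('<a href="/story-1">One</a> <a href="/story-2">Two</a>')
pattern = r'href="([^"]+)"'

try:
    re.findall(pattern, tag)          # tag is not a string, so this fails
except TypeError as error:
    print('TypeError: %s' % error)

# Converting to a string first lets the regular expression run.
links = re.findall(pattern, str(tag))
print(links)
```

If the value is a list of results, loop over it and call `re.findall` on each item separately.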

    Are you on a Mac or a PC?

    My only question is how to make Eclipse recognize the BeautifulSoup download. I used 'python install' in the terminal, so where do these files have to go? Where do I have to put the files that came with the install? As you would expect, in Eclipse I am getting an error:
    Unresolved import: BeautifulSoup

    @entrevu To scrape anything you just need the basic concepts I covered here, along with a better understanding of regular expressions. I did a tutorial in PHP that covers advanced website scraping called Web Design and Programming Pt 24. The regular expression explanation there applies equally to regex in Python. I hope that helps.

    @ma1achite He's using Eclipse. Google "Eclipse IDE".

    @0Allhell Perform a view source in the browser to find out which tags you need to target. You can scrape anything that shows on the screen

    I am currently trying to scrape a friends list for a gaming console. The only problem, I think, is that it reads the page before the JavaScript has finished. Do you know a way to scrape it afterwards? Thanks. Nice tutorials.

    @ma1achite I use Eclipse Classic. It's free and works with almost every language.

