How to get data from inspect element of a webpage using Python
15 JANUARY 2019
Question:
I'd like to get the data from inspect element using Python. I'm able to download
the source code using BeautifulSoup but now I need the text from inspect element
of a webpage. I'd truly appreciate it if you could advise me how to do it.
Edit: By inspect element I mean the option that Google Chrome shows on right click,
called "Inspect element", which displays the code for each element of that page.
I'd like to extract that code, or just its text strings.
Answer:
If you want to fetch a web page from Python in a way that runs JavaScript, you should look into Selenium. It can drive a web browser automatically (even a headless one, such as headless Chrome or, historically, PhantomJS, so you don’t have to have a window open).
To get the HTML as the inspector shows it (the DOM after scripts have run), you’ll need to evaluate some JavaScript in the page. Simple sample code, alter to suit:
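from selenium import webdriver

# PhantomJS support was removed in Selenium 4; headless Chrome is used here instead.
options = webdriver.ChromeOptions()
options.add_argument("--headless")
driver = webdriver.Chrome(options=options)
try:
    driver.get("http://google.com")
    # The driver's view of the page source (may already include some script changes)
    html1 = driver.page_source
    # The live DOM after on-load JavaScript has run
    html2 = driver.execute_script("return document.documentElement.innerHTML;")
finally:
    driver.quit()  # always shut the browser down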
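The same execute_script call can return something narrower than the whole document. Here is a minimal sketch, assuming a hypothetical helper named element_text and a CSS selector of your choosing. Neither comes from the original answer. It relies only on Selenium passing extra positional arguments to the script as `arguments`.

```python
def element_text(driver, css_selector):
    """Return the rendered text of the first element matching css_selector.

    `driver` is assumed to be a Selenium WebDriver (anything exposing
    execute_script(script, *args) works). Returns None when nothing matches.
    """
    script = (
        "var el = document.querySelector(arguments[0]);"
        "return el ? el.innerText : null;"
    )
    # The selector is passed as an argument rather than spliced into the
    # script string, so quotes in the selector cannot break the JavaScript.
    return driver.execute_script(script, css_selector)
```

With the driver from the sample above, a call such as element_text(driver, "h1") would return the text of the first heading on the page, if there is one.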
Note 1:
If you want a specific element or elements, you actually have a couple of options – parse the HTML in Python, or write more specific JavaScript that returns what you want.
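For the "parse the HTML in Python" option, the text strings can be pulled out of the rendered HTML once you have it as a string. Below is a rough sketch using only the standard library's html.parser. The names TextExtractor and extract_text_strings and the sample markup are made up for illustration. BeautifulSoup, which the question already uses, can do the same job.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text strings, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.strings = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.strings.append(text)


def extract_text_strings(html):
    """Return the non-empty text strings found in an HTML string, in order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.strings


# Stand-in for the html2 string obtained from the browser (made-up markup).
sample_html = (
    "<html><head><script>var x = 1;</script></head>"
    "<body><h1>Hello</h1><p>Some <b>bold</b> text</p></body></html>"
)
print(extract_text_strings(sample_html))
```

In practice you would pass html2 from the Selenium sample in place of sample_html.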
Note 2:
If you actually need specific information from Chrome’s tools that is not just dynamically generated HTML, you’ll need a way to hook into Chrome itself. There is no way around that.