raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: using Selenium Python

I am a beginner at web scraping with Selenium in Python. I am trying to scrape the data that shows the annual prices for various drugs. However, I am getting an error that says:

Traceback (most recent call last):
  File "other.py", line 11, in <module>
    paths = WebDriverWait(d,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".highcharts-grid highcharts-yaxis-grid path")))
  File "wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

I am unsure what I have to do. The code I have so far is :

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains

url = 'http://abacus.realendpoints.com/ConsoleTemplate.aspx?act=qlrd&req=nav&mop=abacus!main&pk=ed5a81ad-9367-41c8-aa6b-18a08199ddcf&ab-eff=1000&ab-tox=0.1&ab-nov=1&ab-rare=1&ab-pop=1&ab-dev=1&ab-prog=1.0&ab-need=1&ab-time=1543102810'
d = webdriver.Chrome()
actions = ActionChains(d)
d.get(url)
paths = WebDriverWait(d,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".highcharts-grid highcharts-yaxis-grid path")))
results = []
for path in paths:
    actions.move_to_element(path).perform()
    actions.click_and_hold(path).perform()
    items = d.find_elements_by_css_selector('#priceChart path + text tspan')
    result = [item.text for item in items]
    if result:
        results.append(result)
d.close()
print(results)
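
A likely cause of the timeout: in the selector ".highcharts-grid highcharts-yaxis-grid path", the space before highcharts-yaxis-grid makes it a descendant selector looking for a <highcharts-yaxis-grid> tag, so nothing ever matches and the wait expires. In standard Highcharts markup both class names sit on the same <g> element, so chaining the classes without the space should work; a sketch under that assumption:

paths = WebDriverWait(d, 10).until(
    EC.presence_of_all_elements_located(
        # no space: both classes belong to the same element
        (By.CSS_SELECTOR, '.highcharts-grid.highcharts-yaxis-grid path')
    )
)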


Selenium Python

Selenium Python is one of the most widely used tools for web browser automation, offering a lot of functionality and power over a browser.

Originally published by prince mudda at dzone.com

Today, companies and applications are widely built on web-based systems. The information these systems contain is dense and requires extensive processing. Many tasks are repetitive, tedious, and consume significant amounts of time and money. Such repetitive tasks can be taken care of by web automation. Common tasks in web automation include form filling, screen scraping, data collection, transferring data between applications, website testing, and generating periodic reports.

Web Automation Tool

There are many tools available for automation, and they target a variety of skill levels. A non-programmer might simply have to record some test scripts, whereas programmers and advanced testers need more advanced libraries and scripts.

Web browser automation tools work by recording the steps involved in a transaction and then playing them back in the target web pages by injecting JavaScript and tracking the results. Macro-like web automation tools are much more flexible and sophisticated.

One of the most popular web automation testing tools is Selenium. It was originally developed at ThoughtWorks in 2004 by Jason Huggins as an internal tool. Selenium supports automation across various popular browsers, languages, and platforms.

It can be easily used on platforms such as Windows, Linux, Solaris, and macOS. It also supports mobile operating systems such as iOS, Windows Mobile, and Android.

Selenium supports different programming languages through language-specific drivers. Supported languages include C#, Java, Perl, PHP, Python, and Ruby. Test scripts can be written in any supported language and run directly in almost all modern web browsers.

Let us take a look at the main advantages of this automation tool before going into the deeper sections of Selenium.

  • Creates quicker reports.
  • Allows frequent regression testing.
  • Supports Agile workflows.
  • Countless test iterations can be run without bottlenecks.
  • Easier documentation.
  • Errors made in manual testing can be easily detected.
Initial Setup

We’ll have to set up a couple of things before we start. For functional tests to be written using Selenium WebDriver, a request must be sent to the Selenium server; test cases are then executed on different browsers. In our case, we’ll be working with Google Chrome. The very first step is to get chromedriver.exe to drive the browser. The next step is to install the selenium package using pip. If your virtual environment is already set up, simply type this at the shell command line: pip install selenium
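
As a minimal sketch of wiring the driver up (the path below is hypothetical; replace it with wherever chromedriver.exe actually lives, or omit the argument if the driver is already on your PATH):

from selenium import webdriver

# Hypothetical driver location; adjust to your system.
driver = webdriver.Chrome(executable_path=r'C:\tools\chromedriver.exe')
driver.get('https://www.google.com')
driver.quit()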

Now, we need to import the Selenium WebDriver to use it from Python. Before proceeding, let’s understand a bit more about Selenium WebDriver: it is a web-based automation testing framework that can test web pages initiated on different web browsers and operating systems.

Selenium WebDriver Client Library for Python allows us to use all the features of Selenium WebDriver and interact with Selenium Standalone Server in order to perform automated testing of both remote and distributed browser-based apps.
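
For the remote case, the Python client talks to a running Selenium Standalone Server over HTTP. A minimal sketch, assuming a server is already listening on localhost’s default port 4444:

from selenium import webdriver

# Assumes a Selenium Standalone Server is running locally on port 4444.
driver = webdriver.Remote(
    command_executor='http://127.0.0.1:4444/wd/hub',
    desired_capabilities={'browserName': 'chrome'}
)
driver.get('https://www.google.com')
driver.quit()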

To import and configure dependencies and add libraries and functionality, use the commands below to import Selenium WebDriver:

  • from selenium import webdriver
  • from selenium.webdriver.common.keys import Keys
Running Your First Selenium WebDriver Automation Script 

Let’s create a Python script with WebDriver, which uses Selenium classes and functions to automate browser interaction.

Here is a simple script to activate the browser:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()                 # start a new Chrome session
driver.get("https://www.google.com")        # navigate to the URL
assert "Google" in driver.title             # sanity-check the page title
element = driver.find_element_by_name("q")  # locate the search box
element.send_keys("gogol")                  # type the query
element.send_keys(Keys.RETURN)              # press Enter to submit
assert "No results found." not in driver.page_source
driver.close()

Running the above code will create an instance of Chrome WebDriver. The driver.get method navigates to the page address provided by the URL. The page is loaded fully before WebDriver returns control to the script. However, WebDriver may not know if the page is loaded completely if the page uses a lot of AJAX on load.
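
For such AJAX-heavy pages, an explicit wait is the usual remedy. A sketch using WebDriverWait, here waiting for the same search box the script locates below:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for the element to appear in the DOM.
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.NAME, "q"))
)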

The next line asserts that the title contains the word “Google”:

assert "Google" in driver.title

The next statement tries to locate the text input by its name attribute using the "find_element_by_name" method.

element = driver.find_element_by_name("q")

Now, we send keys. It’s similar to using your keyboard to enter keys. Use the "Keys" class imported from "selenium.webdriver.common.keys" to send special keys.

element.send_keys("gogol")
element.send_keys(Keys.RETURN)

Once the page has been submitted, you will get results, if there are any. An assertion can be made to make sure that some results are found:

assert "No results found." not in driver.page_source

The final step closes the window. In this script, the close() method is called, which will close one tab only. However, if only one tab was open, the browser will exit by default:

driver.close()
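
If you want to end the whole session instead, no matter how many windows are open, WebDriver also provides the quit() method:

driver.quit()  # closes every window and shuts down the WebDriver session
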
Locating Elements

Once the page is loaded, Selenium can interact with various elements on the page. WebDriver finds elements in a number of ways, using one of the "find_element_by_*" methods. We can use whichever method is most suitable for our case.

Locating Element by Id

You can use the "find_element_by_id" method to locate an element by its id:

element = driver.find_element_by_id('element_id')

Locating Element by Name

To locate an element by name, you can use the "find_element_by_name" method:

element = driver.find_element_by_name('element_name')

Locating Element by XPath

If there is no appropriate id or name attribute for the element you want to locate, you can use XPath. XPath can locate the element either in absolute terms or relative to an element that has an id or name attribute. For example, let’s consider a contact form:

<html>
<body>
<form id="contactForm">
<input name="name" type="text" />
<input name="email" type="email" />
<input name="phone" type="tel" />
<input name="continue" type="submit" />
</form>
</body>
</html>

To locate the form elements, you can use XPath like this:

contact_form = driver.find_element_by_xpath("/html/body/form[1]")
contact_form = driver.find_element_by_xpath("//form[1]")
contact_form = driver.find_element_by_xpath("//form[@id='contactForm']")

Locating Hyperlinks by Link Text/Partial Link Text

To locate a hyperlink, you can use "find_element_by_link_text" or "find_element_by_partial_link_text". For example:

<a href="contact.html">Contact Us</a>
contact = driver.find_element_by_link_text('Contact Us')
contact = driver.find_element_by_partial_link_text('Cont')

Locating Elements by Tag Name

To locate an element by its tag name, you can use "find_element_by_tag_name". For example: <p>Lorem ipsum dolor</p>

element = driver.find_element_by_tag_name('p')

Locating Elements by Class Name

To locate an element by class name, you can use "find_element_by_class_name". For example: <h1 class="heading">This is a heading</h1>

element = driver.find_element_by_class_name('heading')

Locating Elements by CSS Selectors

This method can be used when you need to locate an element using CSS selector syntax. For example: <p class="para">Lorem ipsum dolor sit amet, consectetur adipiscing elit</p>

paragraph = driver.find_element_by_css_selector('p.para')

These are all the methods by which we can locate elements in a browser. Let’s write a test case using Selenium.

Writing a Test Case With Selenium

Selenium is primarily used to write test cases. However, the Selenium package itself does not provide a testing tool/framework. We can use Python’s unittest module to write test cases. For our example, we will create a unittest script and save it as google_search.py.

import unittest
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

class GoogleSearch(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Chrome()

    def test_search_in_google_com(self):
        driver = self.driver
        driver.get("http://www.google.com")
        self.assertIn("Google", driver.title)
        element = driver.find_element_by_name("q")  # Google's search box is named "q"
        element.send_keys("gogol")
        element.send_keys(Keys.RETURN)
        assert "No results found." not in driver.page_source

    def tearDown(self):
        self.driver.close()

if __name__ == "__main__":
    unittest.main()

First, we import the unittest module, a built-in Python module inspired by Java’s JUnit. Then, we create a test case class called GoogleSearch, which inherits from "unittest.TestCase."

class GoogleSearch(unittest.TestCase):

The "setUp" is part of the initialization process; this method will be called before each test function you write in this test case class. An instance of Chrome WebDriver is then created.

def setUp(self):
    self.driver = webdriver.Chrome()

In the test case method, the first line creates a local reference to the driver object created in the "setUp" method.

def test_search_in_google_com(self):
    driver = self.driver

After every test method, the "tearDown" method will be called. This is the place where all cleanup actions can be done. Here, it closes the browser window.

def tearDown(self):
    self.driver.close()

The final lines can be used to run the test suite:

if __name__ == "__main__":
unittest.main()
Conclusion 

Selenium is one of the most widely used web browser automation tools, as it offers plenty of functionality and control over all major web browsers. Although it is mainly used as a testing/automation tool for production or integration environments, it can also be used as a web scraper.

===========================================================

selenium.common.exceptions.WebDriverException: Message: 'firefox' executable needs to be in PATH with GeckoDriver Firefox Selenium and Python

I am trying to open Firefox with Selenium. I tried:

from selenium import webdriver
driver=webdriver.Firefox()

But i got the following error:

selenium.common.exceptions.WebDriverException: Message: 'firefox' executable needs to be in PATH.

Selenium using Python - Geckodriver executable needs to be in PATH

I tried:

from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary('/usr/bin/firefox')
browser = webdriver.Firefox(firefox_binary=binary)

I also tried:

from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
caps = DesiredCapabilities.FIREFOX
caps['marionette'] = True
caps['binary'] = '/usr/bin/firefox'
d = webdriver.Firefox(capabilities=caps)

but it still did not work.

However, when I tried the above code with the last line replaced by

d = webdriver.Firefox(capabilities=caps, executable_path='/usr/bin/firefox')

and with Firefox closed in the background, it would open Firefox, but a simple d.get("https://www.google.com") gets stuck on the Linux home page and doesn't open anything.

After typing whereis firefox in the terminal I got /usr/bin/firefox. Also, if it matters, I use Python 2.7.

Note: I hope this isn't a duplicate of the above link, because I tried the answers there and they didn't fix it.

I installed geckodriver from GitHub and tried browser = webdriver.Firefox(executable_path="geckodriver"); I have placed the driver in the same directory.
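
A note that may explain the attempts above: executable_path should point at the geckodriver binary, not at /usr/bin/firefox, and a bare name like "geckodriver" is resolved via PATH rather than the current directory. A minimal sketch, assuming the driver was unpacked to /usr/local/bin (a hypothetical location) and marked executable with chmod +x:

from selenium import webdriver

# executable_path is the geckodriver binary, NOT the Firefox binary;
# /usr/local/bin/geckodriver is an assumed location, adjust as needed.
driver = webdriver.Firefox(executable_path='/usr/local/bin/geckodriver')
driver.get('https://www.google.com')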

Web Scraping with Selenium And Python Tutorial for Beginners

In this article on Web Scraping with Selenium And Python, you will learn about web scraping in brief and see how to extract data from a website ...

Web scraping is a fast, affordable, and reliable way to get data when you need it. What is even better, the data is usually up to date. Now, bear in mind that when scraping a website you might be violating its usage policy and can get kicked out of it. While scraping is mostly legal, there might be some exceptions depending on how you are going to use the data, so make sure you do your research before starting. For a simple personal or open-source project, however, you should be OK.

There are many ways to scrape data, but the one I prefer the most is to use Selenium. It is primarily used for testing, since what it basically does is browser automation. In simple language, it creates a robot browser that does things for you: it can get HTML data, scroll, click buttons, etc. The great advantage is that we can tell it specifically what HTML data we want, so we can organize and store the data appropriately.

Selenium is compatible with many programming languages, but this tutorial is going to focus on Python. Check this link to read Selenium (with Python) documentation.

First Steps

To install Selenium, use this simple command in your command line:

pip install selenium

If you are working in a Jupyter Notebook, you can run it right there instead of the command line. Just add an exclamation mark at the beginning:

!pip install selenium

After that all you need to do is import the necessary modules:

from selenium.webdriver import Chrome, Firefox

Other browsers are also supported but these two are the most commonly used.

Two simple commands are needed to get started:

browser = Firefox()

(or browser = Chrome() depending on your preference)

This creates an instance of a Firefox WebDriver that will allow us to access all its useful methods and attributes. We assigned it to the variable browser but you are free to choose your own name. A new blank window of the Firefox browser will be automatically opened.

Next get the URL that you want to scrape:

browser.get('https://en.wikipedia.org/wiki/Main_Page')

The get() method will open the URL in the browser and will wait until it is fully loaded.

Now you can get all the HTML information you want from this URL.

Locating Elements

There are different ways to locate elements with Selenium. Which one is best depends on the HTML structure of the page you are scraping. It can be tricky to figure out the most efficient way to access the element you want, so take your time and inspect the HTML carefully.

You can either access a single element with a chosen search parameter (you will get the first element that corresponds to your search parameter) or all the elements that match the search parameter. To get a single one, use these methods:

find_element_by_id()

find_element_by_name()

find_element_by_xpath()

find_element_by_link_text()

find_element_by_partial_link_text()

find_element_by_tag_name()

find_element_by_class_name()

find_element_by_css_selector()

To locate multiple elements, just substitute element with elements in the above methods. You will get a list of the WebElement objects located by the method.
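
As a quick illustration of the element/elements distinction (the tag name here is just an example; browser is the WebDriver instance created earlier):

first_link = browser.find_element_by_tag_name('a')   # first match only
all_links = browser.find_elements_by_tag_name('a')   # list of every match
print(len(all_links))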

Scraping Wikipedia

So let’s see how it works with the already mentioned Wikipedia page https://en.wikipedia.org/wiki/Main_Page

We have already created the browser variable containing a WebDriver instance and loaded the main Wikipedia page.

Let’s say we want to access the list of languages that this page can be translated to and store all the links to them.

After some inspection we can see that all elements have a similar structure: they are <li> elements of class 'interlanguage-link' that contain <a> with a URL and text:

<li class="interlanguage-link interwiki-bg">
  <a href="https://bg.wikipedia.org/wiki/" title="Bulgarian"
     lang="bg" hreflang="bg" class="interlanguage-link-target">
    Български
  </a>
</li>

So let’s first access all <li> elements. We can isolate them using the class name:

languages = browser.find_elements_by_class_name('interlanguage-link')

languages is a list of WebElement objects. If we print the first element of it with:

print(languages[0])

It will print something like this:

<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="73e70f48-851a-764d-8533-66f738d2bcf6", element="2a579b98-1a03-b04f-afe3-5d3da8aa9ec1")>

So to actually see what’s inside, we will need to write a for loop to access each element from the list, then access its <a> child element and get the <a>'s text and 'href' attribute.

To get the text, we can use the text attribute. To get the 'href', use the get_attribute('attribute_name') method. So the code will look like this:

language_names = [language.find_element_by_css_selector('a').text
                  for language in languages]

links = [language.find_element_by_css_selector('a').get_attribute('href')
         for language in languages]

You can print out language_names and links to see that it worked.

Scrolling

Sometimes the whole page is not loaded from the start. In this case, we can make the browser scroll down to get the HTML from the rest of the page. It is quite easy with the execute_script() method, which takes JavaScript code as a parameter:

scroll_down = "window.scrollTo(0, document.body.scrollHeight);"
browser.execute_script(scroll_down)

scrollTo(x-coord, y-coord) is a JavaScript method that scrolls to the given coordinates. In our case we are using document.body.scrollHeight, which returns the height of the <body> element.

As you might have guessed, you can make the browser execute all kinds of scripts with the execute_script() method. So if you have experience with JavaScript, you have a lot of room to experiment.
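
As an example of that room to experiment, here is a common sketch for pages that keep loading content as you scroll (infinite scroll): repeat the scroll until the page height stops growing. It assumes new content appears within the two-second pause:

import time

last_height = browser.execute_script('return document.body.scrollHeight')
while True:
    browser.execute_script('window.scrollTo(0, document.body.scrollHeight);')
    time.sleep(2)  # give new content time to load
    new_height = browser.execute_script('return document.body.scrollHeight')
    if new_height == last_height:
        break  # no new content appeared; we are at the bottom
    last_height = new_height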

Clicking

Clicking is as easy as selecting an element and applying the click() method to it. In some cases, if you know the URLs that you need to go to, you can make the browser load the pages directly from their URLs. Again, see what is more efficient.

To give an example of the click() method, let’s click on the 'Contents' link from the menu on the left.

The HTML of this link is the following:

<li id="n-contents">
<a href="/wiki/Portal:Contents" title="Guides to browsing Wikipedia">

    Contents

</a>
</li>

We have to find the <li> element with the unique id 'n-contents' first and then access its <a> child:

content_element = (browser.find_element_by_id('n-contents')
                          .find_element_by_css_selector('a'))

content_element.click()

You can see now that the browser loaded the 'Contents' page.

Downloading Images

Now, what if we decide to download images from the page? For this we will use the urllib library and a UUID generator. We will first locate all images with the CSS selector 'img', then access their 'src' attributes, create a unique id for each image, and download the images with the urlretrieve('url', 'folder/name.jpg') method. This method takes two parameters: the URL of the image and the name we want to give it, together with the folder we want to download to (if applicable).

from urllib.request import urlretrieve
from uuid import uuid4

# get the main page again
browser.get('https://en.wikipedia.org/wiki/Main_Page')

# locate the image elements
images = browser.find_elements_by_css_selector('img')

# access the src attribute of the images
src_list = [img.get_attribute('src') for img in images]

for src in src_list:
    # create a unique name for each image by using the UUID generator
    uuid = uuid4()

    # retrieve images using the URLs (the wiki_images folder must already exist)
    urlretrieve(src, f"wiki_images/{uuid}.jpg")

Adding Waiting Time Between Actions

And lastly, sometimes it is necessary to introduce some waiting time between actions in the browser, for example, when loading a lot of pages one after another. This can be done with the time module.

Let’s load 3 URLs from our links list and make the browser wait for 3 seconds before going to the next page, using the time.sleep() method.

import time

urls = links[0:3]

for url in urls:
    browser.get(url)
    # stop for 3 seconds before going to the next page
    time.sleep(3)

Closing the WebDriver

And finally, we can close our robot browser’s window with:

browser.close()

Don’t forget that browser is a variable holding the Firefox WebDriver instance we created at the beginning of the tutorial.

Code in GitHub

The code from this article is available on GitHub:

https://github.com/AnnaLara/scrapingwithselenium_basics


Originally published by Anna Zubova at dev.to
