How to get all xml/dom as text under a parent web element using selenium/python?

I have a scenario that requires working with a UI object displayed as a grid. The rows and columns are separate web elements in the XML/DOM hierarchy, reachable through multiple XPaths that follow a common pattern. Each of these elements contains text corresponding to its column type. Fetching the texts one WebElement reference at a time is slow. Is there a way to get all of this XML as text in a single shot (or at least one row at a time), so that I can save extraction time by parsing the whole XML locally?

For example, consider the XML below. How can I get the whole hierarchy underneath <div[@class='table']> as text to parse, or at least the entire first row of the table?

<div[@class='table']>
         <div[@class='rows']>
              <div[@class='row']>[0]:

Here is a sample:

<div[@class='table']>
     <div[@class='rows']>
          <div[@class='row']>
               <div[@class='col']>
                   <div[@class='element']>some_text1</div[@class='element']>
                   <div[@class='element']>some_text2</div[@class='element']>
                   <div[@class='element']>some_text3</div[@class='element']>
                   ...
               </div[@class='col']>
          </div[@class='row']>
          <div[@class='row']>
               <div[@class='col']>
                   <div[@class='element']>some_text1</div[@class='element']>
                   <div[@class='element']>some_text2</div[@class='element']>
                   <div[@class='element']>some_text3</div[@class='element']>
                   ...
               </div[@class='col']>
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
          <div[@class='row']>
               ...
          </div[@class='row']>
     </div[@class='rows']>
</div[@class='table']>
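
One possible approach, sketched under assumptions: fetch the markup of the parent element once with get_attribute("outerHTML") and parse it locally, so only a single round trip to the browser is needed. The names GridParser, parse_grid and grid_texts are hypothetical helpers, and the class names ("row", "element") are taken from the sample above; the real page may need different ones.

```python
from html.parser import HTMLParser


class GridParser(HTMLParser):
    """Collects the text of every div.element, grouped by enclosing div.row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cell texts per row
        self._stack = []    # class attribute of each currently open <div>

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        cls = dict(attrs).get("class", "")
        self._stack.append(cls)
        if cls == "row":
            self.rows.append([])

    def handle_endtag(self, tag):
        if tag == "div" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only keep text that sits directly inside a div.element of some row
        if self.rows and self._stack and self._stack[-1] == "element":
            text = data.strip()
            if text:
                self.rows[-1].append(text)


def parse_grid(outer_html):
    """Parse the table markup and return a list of rows (lists of strings)."""
    parser = GridParser()
    parser.feed(outer_html)
    parser.close()
    return parser.rows


def grid_texts(table_element):
    """table_element is assumed to be the WebElement for the table div."""
    return parse_grid(table_element.get_attribute("outerHTML"))
```

With a live driver this would be called roughly as grid_texts(driver.find_element("xpath", "//div[@class='table']")); passing a single row element instead returns just that row.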


Selenium Python

Selenium with Python is one of the most widely used tools for web browser automation, and it offers a lot of functionality and control over a browser.

Originally published by  prince mudda at  dzone.com

Today, companies and applications are widely built on web-based systems. The information these systems contain is dense and requires extensive processing. Many tasks are repetitive and tedious, and they take significant amounts of time and money. These repetitive tasks can be handled by web automation. Common web automation tasks include form filling, screen scraping, data collection, transferring data between applications, website testing, and generating periodic reports.

Web Automation Tool

There are many tools available for automation, and they require different skill levels. A non-programmer might simply record some test scripts, whereas programmers and advanced testers need more advanced libraries and scripts.

Web browser automation tools work by recording the steps involved in a transaction and then playing those steps back in the target web pages by injecting JavaScript and tracking the results. The macro-like web automation tools are much more flexible and sophisticated.

One of the most popular web automation testing tools is Selenium. It was originally developed at ThoughtWorks in 2004 by Jason Huggins as an internal tool. Selenium supports automation in various popular browsers, languages, and platforms.

It can be used on platforms such as Windows, Linux, Solaris, and Macintosh. It also supports mobile operating systems such as iOS, Windows Mobile, and Android.

Selenium supports different programming languages through drivers specific to each language. Selenium-supported languages include C#, Java, Perl, PHP, Python, and Ruby. Test scripts in Selenium can be written in any supported language and can run directly in almost all modern web browsers.

Let us take a look at the main advantages of this automation tool before going deeper into Selenium.

  • Creates quicker reports.
  • Allows frequent testing of regressions.
  • Supports Agile.
  • Countless iterations can be done without impasse.
  • Easier documentation.
  • Errors in manual testing can be easily detected.

Initial Setup

We’ll have to set up a couple of things before we start. For functional tests to be written using Selenium WebDriver, a request must be sent to the Selenium server; test cases are then executed on different browsers. In our case, we’ll be working with Google Chrome. The very first step is to get ChromeDriver (chromedriver.exe on Windows), which Selenium uses to drive the browser. The next step is to install the selenium package using pip. If your virtual environment is already set up, simply type this in the shell: pip install selenium

Now, we need to import the Selenium WebDriver to use Selenium from Python. Before proceeding, it helps to understand more about Selenium WebDriver. It is a web-based automation testing framework that can test web pages opened in different web browsers and on different operating systems.

Selenium WebDriver Client Library for Python allows us to use all the features of Selenium WebDriver and interact with Selenium Standalone Server in order to perform automated testing of both remote and distributed browser-based apps.

To make the WebDriver classes and the special-key constants available, import them as follows:

  • from selenium import webdriver
  • from selenium.webdriver.common.keys import Keys

Running Your First Selenium WebDriver Automation Script

Let’s create a Python script with WebDriver, which uses Selenium classes and functions to automate browser interaction.

Here is a simple script to activate the browser:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get("https://www.google.com")
assert "Google" in driver.title
element = driver.find_element_by_name("q")
element.send_keys("gogol")
element.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source
driver.close()

Running the above code will create an instance of Chrome WebDriver. The driver.get method navigates to the page address provided by the URL. The page is loaded fully before WebDriver returns control to the script. However, WebDriver may not know if the page is loaded completely if the page uses a lot of AJAX on load.
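
For pages that keep loading content through AJAX, a common remedy is to poll until the element you need actually appears. Selenium ships WebDriverWait in selenium.webdriver.support.ui for this; the sketch below is only a hand-rolled stand-in that shows the idea. The name wait_until and its parameters are hypothetical, not part of Selenium.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

With a live driver it could be used as wait_until(lambda: driver.find_elements("name", "q")), which returns the list of matching elements as soon as it is non-empty.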

The next line asserts that the title contains the word “Google”:

assert "Google" in driver.title

The next statement tries to locate the input text by its name attribute using the "find_element_by_name" method.

element = driver.find_element_by_name("q")

Now, we send keys. It’s similar to using your keyboard to enter keys. Use the "Keys" class imported from "selenium.webdriver.common.keys" to send special keys.

element.send_keys("gogol")
element.send_keys(Keys.RETURN)

Once the page has been submitted, you will get the results, if there are any. An assertion can be used to make sure that results were found:

assert "No results found." not in driver.page_source

The final step closes the window. In this script, the close() method is called, which will close one tab only. However, if only one tab was open, the browser will exit by default:

driver.close()

Locating Elements

Once the page is loaded, Selenium interacts with various elements on the page. There are a number of ways by which WebDriver finds elements using one of the "find_element_by_*" methods. We can use the most suitable method for our case.

Locating Element by Id

You can use the "find_element_by_id" method to locate an element by its id:

element = driver.find_element_by_id('element_id')

Locating Element by Name

To locate an element by name, you can use the "find_element_by_name" method:

element = driver.find_element_by_name('element_name')

Locating Element by XPath

If there is no appropriate id or name attribute for the item you want to locate, then you can use XPath. It can be used either to locate the element in absolute terms or relative to an element that has an id or name attribute. For example, let’s consider a contact form:

<html>
<body>
<form id="contactForm">
<input name="name" type="text" />
<input name="email" type="email" />
<input name="phone" type="tel" />
<input name="continue" type="submit" />
</form>
</body>
</html>

To locate the form elements, you can use XPath like this:

contact_form = driver.find_element_by_xpath("/html/body/form[1]")
contact_form = driver.find_element_by_xpath("//form[1]")
contact_form = driver.find_element_by_xpath("//form[@id='contactForm']")

Locating Hyperlinks by Link Text/partial Link Text

To locate a hyperlink, you can use "find_element_by_link_text" or "find_element_by_partial_link_text." For example:

<a href="contact.html">Contact Us</a>
contact = driver.find_element_by_link_text('Contact Us')
contact = driver.find_element_by_partial_link_text('Cont')

Locating Elements by Tag Name

To locate an element by its tag, you can use "find_element_by_tag_name." For example: <p>Lorem ipsum dolor</p>

element = driver.find_element_by_tag_name('p')

Locating Elements by Class Name

To locate an element by class name, you can use "find_element_by_class_name." For example: <h1 class="heading">This is a heading</h1>

element = driver.find_element_by_class_name('heading')

Locating Elements by CSS Selectors

This method can be used when you need to locate an element by CSS selector syntax. For example: <p class="para"> Lorem ipsum dolor sit amet, consectetur adipiscing elit</p>

paragraph = driver.find_element_by_css_selector('p.para')

These are all the methods by which we can locate elements in a browser. Let’s write a test case using Selenium.
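
Recent Selenium 4 releases no longer provide the find_element_by_* helpers; the same lookups go through a single find_element(by, value) call. The sketch below maps the names used above onto that call. BY_STRATEGY and find are hypothetical names introduced for illustration; the strategy strings are the values of the constants on selenium.webdriver.common.by.By.

```python
# Strategy strings, matching the constants on selenium.webdriver.common.by.By
BY_STRATEGY = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def find(driver, how, value):
    """Locate one element, e.g. find(driver, "css_selector", "p.para")."""
    try:
        by = BY_STRATEGY[how]
    except KeyError:
        raise ValueError(f"unknown locator strategy: {how!r}") from None
    # find_element(by, value) is available on the driver and on any WebElement
    return driver.find_element(by, value)
```

For example, find(driver, "name", "q") stands in for driver.find_element_by_name("q").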

Writing a Test Case With Selenium

Selenium is primarily used to write test cases. However, there is no testing tool/framework provided by the Selenium package. We can use Python’s unittest module to write test cases. For our example, we will create an example unittest script and save it as google_search.py.

import unittest
from selenium import webdriver
from selenium.webdriver.common.keys import Keys


class GoogleSearch(unittest.TestCase):

    def setUp(self):
        self.driver = webdriver.Chrome()

    def test_search_in_google_com(self):
        driver = self.driver
        driver.get("http://www.google.com")
        self.assertIn("Google", driver.title)
        # The search box is located by its name attribute, "q"
        element = driver.find_element_by_name("q")
        element.send_keys("gogol")
        element.send_keys(Keys.RETURN)
        assert "No results found." not in driver.page_source

    def tearDown(self):
        self.driver.close()


if __name__ == "__main__":
    unittest.main()

First, we import the unittest module, a built-in Python module modeled on Java’s JUnit. Then, we create a class called GoogleSearch that inherits from "unittest.TestCase."

class GoogleSearch(unittest.TestCase):

The "setUp" is part of the initialization process; this method will be called before each test function you write in this test case class. An instance of Chrome WebDriver is then created.

def setUp(self):
    self.driver = webdriver.Chrome()

In the test case method, the first line creates a local reference to the driver object created in the "setUp" method.

def test_search_in_google_com(self):
    driver = self.driver

After every test method, the "tearDown" method will be called. This is a place where all the cleanup actions can be done. The browser window is closed in the current method.

def tearDown(self):
    self.driver.close()

The final lines can be used to run the test suite:

if __name__ == "__main__":
    unittest.main()
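
The setUp/tearDown call order described above can be checked without a browser. The following is a minimal sketch that uses only the standard library; LifecycleDemo, calls and run_demo are hypothetical names, and the test bodies merely record that they ran.

```python
import unittest

calls = []  # records the order in which unittest invokes the methods


class LifecycleDemo(unittest.TestCase):

    def setUp(self):
        calls.append("setUp")

    def test_one(self):
        calls.append("test_one")

    def test_two(self):
        calls.append("test_two")

    def tearDown(self):
        calls.append("tearDown")


def run_demo():
    """Run the demo test case and return (success flag, recorded call order)."""
    calls.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LifecycleDemo)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful(), list(calls)
```

Each test method is wrapped in its own setUp and tearDown pair, which is why the Selenium example gets a fresh browser per test.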

Conclusion

Selenium is one of the most widely used web browser automation tools, as it offers plenty of functionality and control over all major web browsers. Although it is mainly used as a testing and automation tool for production or integration environments, it can also be used as a web scraper.


raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: using Selenium Python

I am a beginner in web scraping with Selenium and Python. I am trying to scrape the data that shows the annual prices for various drugs. However, I am getting this error:

Traceback (most recent call last):
  File "other.py", line 11, in <module>
    paths = WebDriverWait(d,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".highcharts-grid highcharts-yaxis-grid path")))
  File "wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

I am unsure what to do. The code I have so far is:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains

url = 'http://abacus.realendpoints.com/ConsoleTemplate.aspx?act=qlrd&req=nav&mop=abacus!main&pk=ed5a81ad-9367-41c8-aa6b-18a08199ddcf&ab-eff=1000&ab-tox=0.1&ab-nov=1&ab-rare=1&ab-pop=1&ab-dev=1&ab-prog=1.0&ab-need=1&ab-time=1543102810'
d = webdriver.Chrome()
actions = ActionChains(d)
d.get(url)
paths = WebDriverWait(d,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".highcharts-grid highcharts-yaxis-grid path")))
results = []
for path in paths:
    actions.move_to_element(path).perform()
    actions.click_and_hold(path).perform()
    items = d.find_elements_by_css_selector('#priceChart path + text tspan')
    result = [item.text for item in items]
    if result:
        results.append(result)
d.close()
print(results)


selenium.common.exceptions.WebDriverException: Message: 'firefox' executable needs to be in PATH with GeckoDriver Firefox Selenium and Python

I am trying to open Firefox with Selenium. I tried:

from selenium import webdriver
driver=webdriver.Firefox()

But I got the following error:

selenium.common.exceptions.WebDriverException: Message: 'firefox' executable needs to be in PATH.

Selenium using Python - Geckodriver executable needs to be in PATH

I tried

from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary('/usr/bin/firefox')
browser = webdriver.Firefox(firefox_binary=binary)

Also tried

from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
caps = DesiredCapabilities.FIREFOX
caps['marionette'] = True
caps['binary'] = '/usr/bin/firefox'
d = webdriver.Firefox(capabilities=caps)

but it still did not work.

However, when I replaced the last line of the code above with

d = webdriver.Firefox(capabilities=caps, executable_path='/usr/bin/firefox')

and made sure Firefox was not already running in the background, it opened Firefox, but I can't simply call d.get("https://www.google.com"): it gets stuck on the Linux homepage and doesn't open anything.

After typing whereis firefox in the terminal I got /usr/bin/firefox. Also, if it matters, I use Python 2.7.

Note: I hope this isn't a duplicate of the question linked above, because I tried its answers and they didn't fix the problem.

I installed geckodriver from GitHub and tried browser = webdriver.Firefox(executable_path="geckodriver"). I have placed the driver in the same directory.
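
A quick way to narrow this down is to check what is actually discoverable on PATH from the Python process itself. The sketch below uses only the standard library; locate_executables is a hypothetical helper name. Note that, as far as Selenium 3 is concerned, executable_path is meant to point at geckodriver rather than at the Firefox binary.

```python
import shutil


def locate_executables(names=("firefox", "geckodriver")):
    """Return {name: full path or None} for each executable looked up on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_executables().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If geckodriver is reported as not found, either move it into a directory that is on PATH or pass its full path (not just the bare name) as executable_path.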