11.2.2 Automating browser actions with Selenium

Selenium is a powerful tool for automating web browser interactions. It allows you to programmatically control a browser to simulate user actions like clicking buttons, entering text, navigating between pages, and scraping web content. Selenium is widely used for web application testing, web scraping, and automation of repetitive tasks.

1. Introduction to Selenium

Selenium is a browser automation framework with bindings for multiple programming languages, including Python, Java, C#, and Ruby. The same concepts (drivers, locators, waits) apply in each language; this section uses Python.

Selenium consists of several components:

  • Selenium WebDriver: The core of Selenium, which provides a programming interface to control browsers.
  • Selenium IDE: A browser extension for recording and playback of tests.
  • Selenium Grid: Used for running tests in parallel across multiple machines and browsers.

In Python, you interact with Selenium through the selenium package, which must be installed before use.

2. Installing Selenium

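To get started with Selenium in Python, install the selenium package. Selenium also needs a WebDriver for your browser (such as ChromeDriver for Chrome or GeckoDriver for Firefox). Selenium 4.6 and later include Selenium Manager, which obtains a matching driver automatically when none is found, so a manual download is usually only needed with older versions.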

Install Selenium:

pip install selenium

Download WebDriver (manual setup):

  • Chrome: Download ChromeDriver from the official ChromeDriver download page
  • Firefox: Download GeckoDriver from the GeckoDriver releases page

Ensure that the downloaded WebDriver is on your system's PATH, or specify its location when initializing the browser.
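
Before launching a browser, it can help to confirm what is actually installed. The sketch below is one possible check that uses only the standard library; the helper names installed_selenium_version and find_driver are made up for this illustration.

```python
import shutil
from importlib import metadata


def installed_selenium_version():
    """Return the installed selenium version string, or None if it is not installed."""
    try:
        return metadata.version("selenium")
    except metadata.PackageNotFoundError:
        return None


def find_driver(executable_name):
    """Return the full path of a driver executable found on PATH, or None."""
    return shutil.which(executable_name)


print("selenium:", installed_selenium_version())
for name in ("chromedriver", "geckodriver"):
    print(name, "->", find_driver(name))
```

A result of None for a driver is not necessarily a problem on Selenium 4.6 or later, since Selenium Manager can supply the driver at run time.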

3. Automating Browser Actions with WebDriver

To begin automating browser actions, follow these steps:

Step 1: Importing Selenium and Setting up WebDriver

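from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Initialize the WebDriver for Chrome.
# The executable_path argument was deprecated in Selenium 4 and later removed;
# pass the driver location through a Service object instead.
# On Selenium 4.6+, webdriver.Chrome() with no arguments also works.
driver = webdriver.Chrome(service=Service('path/to/chromedriver'))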

# Maximize the browser window
driver.maximize_window()

# Navigate to a URL
driver.get("https://www.example.com")
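
Scripts that run on a server or in CI often start the browser without a visible window. The following is a minimal sketch, assuming Selenium 4 and a recent Chrome that understands the --headless=new switch. The names chrome_arguments and make_chrome are hypothetical helpers, and the Selenium import is deferred into make_chrome so the rest of the block can run where Selenium is not installed.

```python
def chrome_arguments(headless=True, width=1280, height=800):
    """Build the list of Chrome command-line switches for this session."""
    args = [f"--window-size={width},{height}"]
    if headless:
        # Recent Chrome versions use the "new" headless mode.
        args.append("--headless=new")
    return args


def make_chrome(headless=True):
    """Create a Chrome WebDriver configured with the switches above."""
    from selenium import webdriver  # imported lazily; requires `pip install selenium`

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(headless=headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


print(chrome_arguments())
```

Calling make_chrome() returns a driver that can be used exactly like the one created above, for example make_chrome().get("https://www.example.com").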

4. Interacting with Web Elements

Selenium allows you to interact with web elements like buttons, text fields, links, and images. You can locate elements using various methods (ID, Name, XPath, CSS Selector, etc.).

Locating Elements:

  • By ID: driver.find_element(By.ID, "element_id")
  • By Name: driver.find_element(By.NAME, "element_name")
  • By XPath: driver.find_element(By.XPATH, "xpath_expression")
  • By CSS Selector: driver.find_element(By.CSS_SELECTOR, "css_selector")
  • By Class Name: driver.find_element(By.CLASS_NAME, "class_name")
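
Note that find_element raises NoSuchElementException when nothing matches, whereas find_elements (plural) returns a list that may be empty. One way to use this, sketched below, is to try several locators in order and take the first hit. The find_first helper and the FakeDriver class are illustrative only; FakeDriver stands in for a real WebDriver so the example runs without a browser, and the plain strings "id" and "css selector" are the values behind By.ID and By.CSS_SELECTOR.

```python
def find_first(driver, locators):
    """Return the first element matched by any (by, value) locator, or None."""
    for by, value in locators:
        matches = driver.find_elements(by, value)  # empty list when nothing matches
        if matches:
            return matches[0]
    return None


class FakeDriver:
    """Stand-in for a real WebDriver so the sketch runs without a browser."""

    def __init__(self, page):
        self.page = page  # maps (by, value) -> list of "elements"

    def find_elements(self, by, value):
        return self.page.get((by, value), [])


# By.ID == "id" and By.CSS_SELECTOR == "css selector" in Selenium.
fake_driver = FakeDriver({("css selector", "form button"): ["<button>"]})
button = find_first(fake_driver, [("id", "submit"), ("css selector", "form button")])
print(button)
```

With a real driver, the same call would look like find_first(driver, [(By.ID, "submit"), (By.CSS_SELECTOR, "form button")]).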

Example 1: Filling Out a Form

# Find the input field by its name attribute and enter text
username_field = driver.find_element(By.NAME, "username")
username_field.send_keys("my_username")

password_field = driver.find_element(By.NAME, "password")
password_field.send_keys("my_password")

# Click the submit button (assumes the button has name="submit")
submit_button = driver.find_element(By.NAME, "submit")
submit_button.click()

5. Other Actions in Selenium

Selenium allows you to perform several browser interactions:

Clicking a Button:

button = driver.find_element(By.XPATH, "//button[@id='submit']")
button.click()

Extracting Text:

heading = driver.find_element(By.TAG_NAME, "h1")
print(heading.text)  # Prints the text inside the <h1> tag
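
The same idea extends to many elements at once by combining find_elements with the text property and get_attribute. The sketch below collects the visible text and href of every link. collect_links, FakeAnchor, and FakePage are hypothetical names; the two fake classes only imitate the parts of the WebDriver and WebElement interfaces used here so the block runs without a browser.

```python
def collect_links(driver):
    """Return (text, href) pairs for every <a> element that has visible text."""
    links = []
    for anchor in driver.find_elements("tag name", "a"):  # "tag name" is By.TAG_NAME
        text = anchor.text.strip()
        if text:
            links.append((text, anchor.get_attribute("href")))
    return links


class FakeAnchor:
    """Imitates a WebElement with a text property and get_attribute()."""

    def __init__(self, text, href):
        self.text = text
        self._attrs = {"href": href}

    def get_attribute(self, name):
        return self._attrs.get(name)


class FakePage:
    """Imitates a WebDriver that only knows about <a> elements."""

    def __init__(self, anchors):
        self._anchors = anchors

    def find_elements(self, by, value):
        return self._anchors if (by, value) == ("tag name", "a") else []


page = FakePage([
    FakeAnchor(" Docs ", "https://example.com/docs"),
    FakeAnchor("", "https://example.com/icon"),  # no visible text, so it is skipped
])
print(collect_links(page))
```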

Getting the Current URL:

current_url = driver.current_url
print(current_url)

Taking a Screenshot:

driver.save_screenshot("screenshot.png")

6. Handling Alerts and Pop-ups

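Web applications often use JavaScript pop-ups such as alerts and confirmation dialogs. Selenium exposes the currently open dialog through driver.switch_to.alert, which lets you read its text and accept or dismiss it.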

Handling Alerts:

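# Switch to the alert that is currently open
alert = driver.switch_to.alert

# Retrieve the text from the alert (do this before closing it)
alert_text = alert.text
print(alert_text)

# Accept the alert (clicks OK)
alert.accept()

# Or dismiss it instead (clicks Cancel). Use one or the other:
# the alert no longer exists after either call.
# alert.dismiss()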
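
Prompt dialogs additionally accept typed input through the alert's send_keys method. The sketch below wraps that sequence in a hypothetical answer_prompt helper; FakeAlert is a stand-in that records calls so the example can run without a browser, and it is not part of Selenium.

```python
from types import SimpleNamespace


def answer_prompt(driver, reply):
    """Type `reply` into an open JavaScript prompt, confirm it, and return its message."""
    alert = driver.switch_to.alert
    message = alert.text  # read the text first; the alert is gone after accept()
    alert.send_keys(reply)
    alert.accept()
    return message


class FakeAlert:
    """Stand-in for Selenium's Alert object; records the calls it receives."""

    def __init__(self, text):
        self.text = text
        self.calls = []

    def send_keys(self, value):
        self.calls.append(("send_keys", value))

    def accept(self):
        self.calls.append(("accept",))


fake_alert = FakeAlert("Enter your name:")
fake_driver = SimpleNamespace(switch_to=SimpleNamespace(alert=fake_alert))
print(answer_prompt(fake_driver, "Ada"))
print(fake_alert.calls)
```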

7. Waits in Selenium

Sometimes, elements take time to load, so you may need to wait before interacting with them. Selenium provides implicit and explicit waits.

Implicit Wait:

# Set an implicit wait; it applies to every later find_element call on this driver
driver.implicitly_wait(10)  # Keeps looking for up to 10 seconds before raising NoSuchElementException

Explicit Wait:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait until a specific element is visible
element = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.ID, "element_id"))
)
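
Conceptually, an explicit wait is a polling loop: check a condition, sleep briefly, and give up after a deadline. The sketch below shows that idea in plain Python. It is a simplified stand-in, not Selenium's implementation; wait_until and element_ready are hypothetical names, and it raises the built-in TimeoutError where Selenium's WebDriverWait raises its own TimeoutException.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition()` repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulate an element that only "appears" on the third check.
attempts = {"count": 0}


def element_ready():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


print(wait_until(element_ready, timeout=2.0, poll_interval=0.01))
print("checks made:", attempts["count"])
```

Because the loop returns as soon as the condition holds, an explicit wait normally costs far less than its full timeout, which is why it is usually preferred over a fixed time.sleep call.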

8. Closing the Browser

Once you are done automating tasks, you should close the browser.

# Close the current window
driver.close()

# Close all windows and end the WebDriver session
driver.quit()
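
If a script raises an exception before it reaches driver.quit(), the browser process can be left running. One possible safeguard, sketched below, is a small context manager that always calls quit(). browser_session is a hypothetical helper, and FakeDriver is a stand-in whose get method fails on purpose to show that cleanup still happens.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(create_driver):
    """Yield a driver from `create_driver()` and always call quit() afterwards."""
    driver = create_driver()
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver:
    """Stand-in for a WebDriver whose page load always fails."""

    def __init__(self):
        self.quit_called = False

    def get(self, url):
        raise RuntimeError(f"could not load {url}")

    def quit(self):
        self.quit_called = True


fake = FakeDriver()
try:
    with browser_session(lambda: fake) as session_driver:
        session_driver.get("https://www.example.com")
except RuntimeError as error:
    print("error:", error)
print("quit called:", fake.quit_called)
```

With a real browser, the factory would be something like webdriver.Chrome, for example: with browser_session(webdriver.Chrome) as driver.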

9. Example: Automating Google Search

Here’s a complete example where we automate a Google search.

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Set up the WebDriver (use the correct path to your WebDriver)
driver = webdriver.Chrome(service=Service('path/to/chromedriver'))

# Open Google
driver.get("https://www.google.com")

# Find the search box and enter a query
search_box = driver.find_element(By.NAME, "q")
search_box.send_keys("Selenium Python")
search_box.send_keys(Keys.RETURN)  # Press Enter

# Wait up to 10 seconds for results to load, then print the title of the first result
first_result = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.XPATH, "//h3"))
)
print(first_result.text)

# Close the browser
driver.quit()

10. Summary

With Selenium, you can easily automate browser tasks like filling out forms, navigating between pages, extracting data, and interacting with web elements. It provides powerful methods for web automation, making it a valuable tool for testing, scraping, and automating repetitive web-based tasks.

Key Takeaways:

  • WebDriver: Controls the browser and simulates user actions.
  • Locating Elements: Use locator strategies such as ID, Name, XPath, CSS Selector, or Class Name to find elements.
  • Automation: Automate form submissions, clicks, text extraction, and more.
  • Waits: Use implicit and explicit waits for handling dynamic content.
