Web Driver Support

WebDriver utility library for Kotlin which encapsulates common logic into simple syntax.

  • Exclude the WebDriver reference from code: of course we use WebDriver, so there is no need to write it every time.
  • Remove verbose syntax like WebDriverWait(driver, ...).until(ExpectedConditions.xxx(by)) and driver.findElement(By.xxx(...)).
  • Add built-in waiting for the presence of an element instead of writing the waiting code yourself.
  • Add OCR support for elements that are not represented by text.
  • Add Template Matching support for elements that cannot be found by a standard By implementation.

class ExampleScenario(driver: WebDriver) : ExtendedWebDriver(driver) {

    fun execute() {

        maximize()

        open("https://github.com/xinaiz/web-driver-support")

        "commits".className.find().trimmedText

        val treeFiles = "file-wrap".className.waitUntilClickable().findAll("tr".tag)

        treeFiles.forEachIndexed { index, elem -> println("$index: ${elem.text}") }

        // Easily get BufferedScreenshot from element
        treeFiles[4].getBufferedScreenshot()

        // Easily find by element attributes
        "g p".attr("data-hotkey").clickWhenClickable()

        // Wait until element is clickable then click it
        "New pull request".linkText.clickWhenClickable()

        open("$currentUrl/master...develop")

        println("blankslate".className.textWhenPresent())

    }

}

Migration guide:

Search

Old syntax | New syntax
driver.findElement(By.xxx("abc")) | "abc".xxx.find()
driver.findElements(By.xxx("abc")) | "abc".xxx.findAll()
try { driver.findElement(By.xxx("abc")) } catch(ex: Throwable) { null } | "abc".xxx.findOrNull()
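
The "abc".xxx syntax reads like Kotlin extension properties on String. As a rough, self-contained illustration of that language feature (not the library's actual implementation), the sketch below uses a hypothetical Locator class in place of Selenium's By. The names Locator, id, className and attr are assumptions made for this example only.

```kotlin
// Hypothetical stand-in for Selenium's By, so the sketch runs without Selenium.
data class Locator(val strategy: String, val value: String)

// Extension properties turn a plain string into a locator, e.g. "avatar".id
val String.id: Locator get() = Locator("id", this)
val String.className: Locator get() = Locator("class name", this)

// The receiver is the attribute value and the argument is the attribute name,
// mirroring "g p".attr("data-hotkey") from the example scenario.
fun String.attr(name: String): Locator = Locator("css selector", "[$name='${this}']")
```

With these definitions, "avatar".id evaluates to Locator("id", "avatar"), which a driver wrapper could then pass on to its own find methods.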

Child Search

Old syntax | New syntax
parentElement.findElement(By.xxx("abc")) | parentElement.find("abc".xxx) or "abc".xxx.find(parentElement)
parentElement.findElements(By.xxx("abc")) | parentElement.findAll("abc".xxx) or "abc".xxx.findAll(parentElement)
try { parentElement.findElement(By.xxx("abc")) } catch(ex: Throwable) { null } | parentElement.findOrNull("abc".xxx) or "abc".xxx.findOrNull(parentElement)

Driver Methods

All WebDriver methods are available via this context. In addition, many nested methods have been flattened for simpler access. For example:

Old syntax | New syntax
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS) | implicitWait = 10 to TimeUnit.SECONDS
driver.navigate().back() | navigateBack()
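
The implicitWait = 10 to TimeUnit.SECONDS form relies on a Kotlin property whose type is a Pair. A minimal sketch of how such a property can be declared is shown below. The class TimeoutSettingsSketch and its implicitWaitMillis field are hypothetical and only store the value; a real wrapper would presumably forward it to driver.manage().timeouts().

```kotlin
import java.util.concurrent.TimeUnit

// Hypothetical class used only to illustrate the property syntax.
class TimeoutSettingsSketch {
    // Last configured implicit wait, normalized to milliseconds.
    var implicitWaitMillis: Long = 0
        private set

    // Assigning a Pair lets callers write: implicitWait = 10 to TimeUnit.SECONDS
    var implicitWait: Pair<Int, TimeUnit>
        get() = implicitWaitMillis.toInt() to TimeUnit.MILLISECONDS
        set(value) {
            val (amount, unit) = value
            implicitWaitMillis = unit.toMillis(amount.toLong())
        }
}
```

The property has no backing field of its own, so the setter is free to convert the pair into whatever form the underlying API expects.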

Executing JavaScript

Old syntax | New syntax
(driver as JavascriptExecutor).executeScript("script", args) | executeScript("script", args)
(driver as JavascriptExecutor).executeAsyncScript("script", args) | executeScriptAsync("script", args)
(driver as JavascriptExecutor).executeScript("functionName(arguments[0], arguments[1])", 42, "hello") | runFunction("functionName", 42, "hello")
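
The runFunction row implies that a script string of the form functionName(arguments[0], arguments[1]) is assembled from the function name and the number of arguments. One possible way to build that string is sketched below. The helper buildCallScript is a hypothetical name for this example, not part of the library.

```kotlin
// Builds "name(arguments[0], arguments[1], ...)" for the given number of arguments.
fun buildCallScript(functionName: String, argCount: Int): String {
    require(argCount >= 0) { "argCount must not be negative" }
    return (0 until argCount).joinToString(
        separator = ", ",
        prefix = "$functionName(",
        postfix = ")"
    ) { index -> "arguments[$index]" }
}
```

The resulting string could then be handed to executeScript together with the actual argument values.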

TODO: document remaining WebElement JavaScript utility functions

Waiting

Note: The new wait methods throw on timeout, just like the original WebDriverWait does. To avoid that, use the .orNull() syntax: when a timeout occurs, null is returned as the result of the wait instead of an exception being thrown.

Old syntax | New syntax
WebDriverWait(webDriver, 10).until(ExpectedConditions.presenceOfElementLocated(By.xxx("abc"))) | "abc".xxx.waitUntilPresent(10)
try { WebDriverWait(webDriver, 10).until(ExpectedConditions.presenceOfElementLocated(By.xxx("abc"))) } catch(ex: Throwable) { null } | "abc".xxx.wait(10).orNull().untilPresent()
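
The difference between the throwing and the null-returning variants can be illustrated with a small polling loop that does not depend on Selenium. This is only a sketch of the general pattern. The names waitUntil, waitUntilOrNull and WaitTimeoutException are hypothetical, and the library's own implementation may differ.

```kotlin
// Hypothetical exception type standing in for Selenium's timeout exception.
class WaitTimeoutException(message: String) : RuntimeException(message)

// Polls the condition until it yields a non-null value or the timeout expires.
fun <T : Any> waitUntil(timeoutMillis: Long, pollMillis: Long = 50, condition: () -> T?): T {
    val deadline = System.nanoTime() + timeoutMillis * 1_000_000
    while (true) {
        condition()?.let { return it }
        if (System.nanoTime() >= deadline) {
            throw WaitTimeoutException("Condition not met within $timeoutMillis ms")
        }
        Thread.sleep(pollMillis)
    }
}

// Same wait, but a timeout is reported as null instead of an exception.
fun <T : Any> waitUntilOrNull(timeoutMillis: Long, pollMillis: Long = 50, condition: () -> T?): T? =
    try {
        waitUntil(timeoutMillis, pollMillis, condition)
    } catch (e: WaitTimeoutException) {
        null
    }
```

Only the timeout exception is converted to null here, so unrelated failures inside the condition still surface to the caller.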

Common tips

There are many utility functions that simplify common expressions. Of course, the longer forms are still available. For example:

Full expression | Shorter expression
"avatar".id.findOrNull() != null | "avatar".id.isPresent()
"button".id.wait(15).untilClickable().click() | "button".id.clickWhenClickable(15)
"button".id.wait(15).orNull().untilClickable()?.click() | "button".id.clickWhenClickableOrNull(15)

Template matching support

If your page has a canvas element with inner controls, you normally can't click a specific control, because the controls are not present in the DOM. This library has some support for that case.

Setup

To use this functionality, the OpenCV library must be present. You can add it to the project yourself, or use a dependency that handles that for you, for example https://github.com/openpnp/opencv.

How to use

The example below shows how to use this functionality.

Let's assume that the canvas element has the id frame. To find it and take its screenshot, you would write:

val canvas = "frame".id.find()
canvas.getBufferedScreenshot()

The screenshot:

screenshot 1

Let's say you need to find the guy's face. Take a screenshot and crop it for future use. I will call the cropped image the "template":

screenshot 2

Then you can find that element like this:

val guyFaces = "/images/guy_face.png".template(canvas).findAll()

Now you are left with 2 elements:

screenshot 3

At this point you can click them, or search deeper! Another template:

screenshot 4

This time, search inside the first guy face instead of the whole canvas:

val guySmile = "/images/guy_smile.png".template(guyFaces[0]).find()

screenshot 5
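
To give an intuition for what a template search does, here is a deliberately naive sliding-window matcher over grayscale pixel grids. It is a conceptual sketch only: the library relies on OpenCV, and the names Match and findAllExact are hypothetical. It also assumes pixel-perfect equality, which real matching does not require.

```kotlin
// Top-left corner of a place where the template was found.
data class Match(val x: Int, val y: Int)

// Slides the template over the image and reports every exact occurrence.
// Both grids are indexed as grid[row][column] and hold grayscale values.
fun findAllExact(image: Array<IntArray>, template: Array<IntArray>): List<Match> {
    require(image.isNotEmpty() && template.isNotEmpty()) { "grids must not be empty" }
    val templateHeight = template.size
    val templateWidth = template[0].size
    val matches = mutableListOf<Match>()
    for (y in 0..image.size - templateHeight) {
        for (x in 0..image[0].size - templateWidth) {
            val same = (0 until templateHeight).all { dy ->
                (0 until templateWidth).all { dx -> image[y + dy][x + dx] == template[dy][dx] }
            }
            if (same) matches += Match(x, y)
        }
    }
    return matches
}
```

Searching inside a previously found match, as with guyFaces[0] above, corresponds to running the same search on a cropped sub-grid.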

Using cached screenshot

By default, when you search using the template matching method, a screenshot is taken by WebDriver each time. Taking a screenshot may be slow if done often. If you don't need the updated state of the canvas every time you search, you can store the screenshot in the utility class ScreenCache:

val canvas = "frame".id.find()
var screenCache = canvas.cacheScreen() // screenshot taken
"/images/notification.png".template(screenCache).click() // close some notification
"/images/home_button.png".template(screenCache).click() // click some button

// canvas changed (navigated to different content), screenCache is no longer valid
// now, depending on canvas content implementation, we might need to wait until new page appears
"/images/some_icon_on_home_page.png".template(canvas).waitUntilPresent()

screenCache = canvas.cacheScreen() // create new cache
if (!"/images/statistics_title.png".template(screenCache).isPresent()) {
    "/images/statistics_button.png".template(screenCache).click()
}

// etc

Handling blurry / distorted content

In many cases canvas content is not static: animation, overlay effects, lighting changes. In those cases pixel-perfect template matching will fail miserably. To overcome this, an image similarity can be specified. There are currently 5 predefined thresholds:

Name | Value | Description
Similarity.EXACT | 1.0 | Pixel-perfect match
Similarity.PRECISE | 0.9 | A slightly distorted image, small overlay effects
Similarity.DEFAULT | 0.8 | Default similarity, handles common overlay effects
Similarity.DISTORTED | 0.7 | Highly distorted image, but still recognizable
Similarity.LOW | 0.5 | Danger zone, might find something else

Similarity can be specified as follows, either with a predefined threshold or with a custom value:

"/button.png".template(canvas, similarity = Constants.Similarity.EXACT.value).find()
"/button.png".template(canvas, similarity = Constants.Similarity.PRECISE.value).find()
"/button.png".template(canvas).find()
"/button.png".template(canvas, similarity = Constants.Similarity.DISTORTED.value).find()
"/button.png".template(canvas, similarity = Constants.Similarity.LOW.value).find()
"/button.png".template(canvas, similarity = 0.95).find()
"/button.png".template(canvas, similarity = 0.40).find()

Important: If you use a similarity lower than Similarity.LOW, you might find a fish instead of an elephant.
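
To make the role of the threshold concrete, the sketch below scores two equally sized grayscale patches and compares the score with a threshold. The scoring formula (one minus the mean absolute pixel difference, scaled to 0..1) is an assumption chosen for simplicity and is not necessarily what the library or OpenCV computes. The names SimilaritySketch, similarity and isMatch are hypothetical. Only the threshold values are taken from the predefined thresholds listed above.

```kotlin
import kotlin.math.abs

// Threshold values copied from the predefined similarity levels.
object SimilaritySketch {
    const val EXACT = 1.0
    const val PRECISE = 0.9
    const val DEFAULT = 0.8
    const val DISTORTED = 0.7
    const val LOW = 0.5
}

// Simplified score in 0.0..1.0 for two patches of grayscale values (0..255).
fun similarity(a: IntArray, b: IntArray): Double {
    require(a.isNotEmpty() && a.size == b.size) { "patches must be non-empty and equally sized" }
    val totalDifference = a.indices.sumOf { abs(a[it] - b[it]) }
    val meanDifference = totalDifference.toDouble() / a.size
    return 1.0 - meanDifference / 255.0
}

// A candidate counts as a match when its score reaches the threshold.
fun isMatch(a: IntArray, b: IntArray, threshold: Double = SimilaritySketch.DEFAULT): Boolean =
    similarity(a, b) >= threshold
```

A lower threshold accepts more candidates, which is why very low values start to return unrelated regions.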

Optical Character Recognition (OCR) support

There are many occasions when text cannot be accessed because it's rendered inside a canvas or is part of an image. Because of that, this library also supports recognizing text from images. Currently the Tesseract API is used by default, but there is also generic support for any API that converts a BufferedImage to a String.

Setup

As mentioned above, the Tesseract API is used as the default OCR engine. Installation instructions can be found on the Tesseract GitHub page: https://github.com/tesseract-ocr/tesseract. To use it with Web Driver Support, initialization is required:

init {
    ocr.setDatapath("D:\\<tesseract-installation-folder>\\Tesseract-OCR\\tessdata")
    ocr.setConfigs(listOf("quiet")) // disable logs
}

The ocr property is defined in ExtendedWebDriver, and it can be initialized in the init block of a class that extends it.

How to use

OCR functionality is exposed by both the ExtendedWebDriver and ExtendedWebElement classes, but the latter is preferred. OCR is performed within the bounds of the target element.

Default OCR, no additional image processing is performed:

"body".tag.find().doOCR() // perform OCR on whole visible page

Threshold OCR. Converts the image to binary (black and white). All pixels below the lightness threshold of 180 (on a 0-255 scale) will be black, and all pixels above it will be white.

"canvas".id.find().doBinaryOCR(treshold = 180)

Use case:

upper bound threshold

For text that is hard to distinguish (blending with the background), or if parts of the background are both brighter and darker than the text, both lower and upper lightness bounds can be specified. Pixels between the bounds will become white, and all others will become black.

"canvas".id.find().doBinaryOCR(tresholdMin = 150, tresholdMax = 160)

Use case:

lower and upper bound threshold
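
The binary conversion described in this section can be sketched with plain java.awt.image classes. This is an illustration under stated assumptions, not the library's code: lightness is taken here as the average of the red, green and blue channels, the bounds are treated as inclusive, and the function names lightness and binarize are hypothetical.

```kotlin
import java.awt.image.BufferedImage

// Assumed lightness measure: average of the three color channels (0..255).
fun lightness(rgb: Int): Int {
    val red = (rgb shr 16) and 0xFF
    val green = (rgb shr 8) and 0xFF
    val blue = rgb and 0xFF
    return (red + green + blue) / 3
}

// Pixels whose lightness lies in min..max become white, all others black.
// With the default max of 255 this behaves like a single lower threshold.
fun binarize(source: BufferedImage, min: Int, max: Int = 255): BufferedImage {
    val result = BufferedImage(source.width, source.height, BufferedImage.TYPE_INT_RGB)
    for (y in 0 until source.height) {
        for (x in 0 until source.width) {
            val isWhite = lightness(source.getRGB(x, y)) in min..max
            result.setRGB(x, y, if (isWhite) 0xFFFFFF else 0x000000)
        }
    }
    return result
}
```

Calling binarize(image, min = 180) corresponds to the single-threshold case, and binarize(image, min = 150, max = 160) to the case with both bounds.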

Using OCRMode

OCR is not perfect and might confuse some characters, for example 8 and B. To reduce that, an OCRMode can be specified. It defines which characters are allowed. Currently there are 3 modes:

Name | Description | Allowed characters
OCRMode.TEXT | All ASCII characters | All ASCII characters
OCRMode.DIGITS | All digits | 0123456789
OCRMode.CUSTOM | Custom range | For example OCRMode.CUSTOM("abcde12345")

It can be used as follows:

"image".id.find().doOCR(ocrMode = OCRMode.DIGITS)

New search methods

Besides template matching, there are other new search methods. All of them are defined in the ExtendedBy class, which extends Selenium's By class (seriously):

Search method | Description | Example
ExtendedBy.classNameList(String) | Classic WebDriver doesn't allow searching by multiple class names | ExtendedBy.classNameList("unicode audiolink")
ExtendedBy.attribute(String, String) | Search by an attribute and its value | ExtendedBy.attribute("value", "quit")
ExtendedBy.template(...) | Search by an image from resources (string path), or by an existing BufferedImage | ExtendedBy.template(Example::class.java, "/images/face.png")
ExtendedBy.value(String) | Search by the value of the value attribute | ExtendedBy.value("quit")
ExtendedBy.position(Point) | Returns the element found at a position measured from the top left corner (using JavaScript) | ExtendedBy.position(Point(100, 200))

Besides that, there are also methods that return proxy elements, which are not real WebElements but are useful in combination with Template Matching, OCR, and position-related code:

Search method | Description | Example
ExtendedBy.rectangle(Rectangle) | Returns an element that is a proxy of a real WebElement, but is bounded by a rectangle inside it. Very useful when performing Template Matching / OCR in a specific area of the parent element | ExtendedBy.rectangle(Rectangle(20, 50, 100, 200))
ExtendedBy.point(Point) | Similar to ExtendedBy.rectangle, but defined only by a Point. Not suitable for Template Matching / OCR, but suitable for clicking at a specific location inside another WebElement | ExtendedBy.point(Point(200, 300))
ExtendedBy.percentRectangle(RectangleF) | Similar to ExtendedBy.rectangle, but defined relative to the parent element as percentages (all parameters: x, y, width and height) | ExtendedBy.percentRectangle(RectangleF(0.1f, 0.2f, 0.5f, 0.3f))
ExtendedBy.percentPoint(PointF) | Similar to ExtendedBy.percentRectangle, but defined only by a point. PointF(0.5f, 0.5f) is the center of the parent element | ExtendedBy.percentPoint(PointF(0.3f, 0.4f))
ExtendedBy.twoPointRectangle(TwoPointRectangle) | Gives exactly the same result as ExtendedBy.rectangle, but is defined by two points: top left and bottom right | ExtendedBy.twoPointRectangle(TwoPointRectangle(Point(100, 200), Point(200, 400)))
ExtendedBy.twoPointPercentRectangle(TwoPointRectangleF) | Gives exactly the same result as ExtendedBy.percentRectangle, but is defined by two percentage points: top left and bottom right | ExtendedBy.twoPointPercentRectangle(TwoPointRectangleF(PointF(0.1f, 0.2f), PointF(0.4f, 0.3f)))
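
The relationship between the percentage-based, two-point and plain rectangle forms comes down to simple arithmetic, sketched below. The types Rect and RectF and the functions percentToPixels and fromTwoPoints are hypothetical stand-ins written for this example. They assume fractions in the range 0.0 to 1.0 and coordinates relative to the parent's top left corner, and they are not the library's own classes.

```kotlin
import kotlin.math.roundToInt

// Hypothetical stand-ins for the library's rectangle types.
data class Rect(val x: Int, val y: Int, val width: Int, val height: Int)
data class RectF(val x: Float, val y: Float, val width: Float, val height: Float)

// Converts fractions of the parent's size into pixels inside the parent.
fun percentToPixels(parentWidth: Int, parentHeight: Int, percent: RectF): Rect = Rect(
    x = (percent.x * parentWidth).roundToInt(),
    y = (percent.y * parentHeight).roundToInt(),
    width = (percent.width * parentWidth).roundToInt(),
    height = (percent.height * parentHeight).roundToInt()
)

// Converts a top-left / bottom-right pair of points into x, y, width, height.
fun fromTwoPoints(left: Int, top: Int, right: Int, bottom: Int): Rect {
    require(right >= left && bottom >= top) { "bottom right must not precede top left" }
    return Rect(left, top, right - left, bottom - top)
}
```

For a parent that is 200 pixels wide and 100 pixels high, RectF(0.1f, 0.2f, 0.5f, 0.3f) maps to a 100 by 30 pixel area starting at (20, 20).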

Contributors

domazey
