Zum Hauptinhalt springen

ocrGetElementPositionByText

Get the position of a text on the screen. The command will search for the provided text and try to find a match based on Fuzzy Logic from Fuse.js. This means that if you might provide a selector with a typo, or the found text might not be a 100% match it will still try to give you back an element. See the logs below.

Usage

const result = await browser.ocrGetElementPositionByText("Username");

console.log("result = ", JSON.stringify(result, null, 2));

Output

Result

result = {
"dprPosition": {
"left": 373,
"top": 606,
"right": 439,
"bottom": 620
},
"filePath": ".tmp/ocr/desktop-1716658199410.png",
"matchedString": "Started",
"originalPosition": {
"left": 373,
"top": 606,
"right": 439,
"bottom": 620
},
"score": 85.71,
"searchValue": "Start3d"
}

Logs

# Still finding a match even though we searched for "Start3d" and the found text was "Started"
[0-0] 2024-05-25T17:29:59.179Z INFO webdriver: COMMAND ocrGetElementPositionByText(<object>)
......................
[0-0] 2024-05-25T17:29:59.993Z INFO @wdio/ocr-service:ocrGetElementPositionByText: Multiple matches were found based on the word "Start3d". The match "Started" with score "85.71%" will be used.

Options

text

  • Type: string
  • Mandatory: yes

The text you want to search for to click on.

Example

await browser.ocrGetElementPositionByText({ text: "WebdriverIO" });

contrast

  • Type: number
  • Mandatory: no
  • Default: 0.25

The higher the contrast, the darker the image and vice versa. This can help to find text in an image. It accepts values between -1 and 1.

Example

await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
contrast: 0.5,
});

haystack

  • Type: number
  • Mandatory: WebdriverIO.Element | ChainablePromiseElement | Rectangle

This is the search area in the screen where the OCR needs to look for text. This can be an element or a rectangle containing x, y, width and height

Example

await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
haystack: $("elementSelector"),
});

// OR
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
haystack: await $("elementSelector"),
});

// OR
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
haystack: {
x: 10,
y: 50,
width: 300,
height: 75,
},
});

language

  • Type: string
  • Mandatory: No
  • Default: eng

The language that Tesseract will recognize. More info can be found here and the supported languages can be found here.

Example

import { SUPPORTED_OCR_LANGUAGES } from "@wdio/ocr-service";
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
// Use Dutch as a language
language: SUPPORTED_OCR_LANGUAGES.DUTCH,
});

fuzzyFindOptions

You can alter the fuzzy logic to find text with the following options. This might help find a better match

fuzzyFindOptions.distance

  • Type: number
  • Mandatory: no
  • Default: 100

Determines how close the match must be to the fuzzy location (specified by location). An exact letter match which is distance characters away from the fuzzy location would score as a complete mismatch. A distance of 0 requires the match to be at the exact location specified. A distance of 1000 would require a perfect match to be within 800 characters of the location to be found using a threshold of 0.8.

Example
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
fuzzyFindOptions: {
distance: 20,
},
});

fuzzyFindOptions.location

  • Type: number
  • Mandatory: no
  • Default: 0

Determines approximately where in the text is the pattern expected to be found.

Example
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
fuzzyFindOptions: {
location: 20,
},
});

fuzzyFindOptions.threshold

  • Type: number
  • Mandatory: no
  • Default: 0.6

At what point does the matching algorithm give up. A threshold of 0 requires a perfect match (of both letters and location), a threshold of 1.0 would match anything.

Example
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
fuzzyFindOptions: {
threshold: 0.8,
},
});

fuzzyFindOptions.isCaseSensitive

  • Type: boolean
  • Mandatory: no
  • Default: false

Whether the search should be case sensitive.

Example
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
fuzzyFindOptions: {
isCaseSensitive: true,
},
});

fuzzyFindOptions.minMatchCharLength

  • Type: number
  • Mandatory: no
  • Default: 2

Only the matches whose length exceeds this value will be returned. (For instance, if you want to ignore single character matches in the result, set it to 2)

Example
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
fuzzyFindOptions: {
minMatchCharLength: 5,
},
});

fuzzyFindOptions.findAllMatches

  • Type: number
  • Mandatory: no
  • Default: false

When true, the matching function will continue to the end of a search pattern even if a perfect match has already been located in the string.

Example
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
fuzzyFindOptions: {
findAllMatches: 100,
},
});

Welcome! How can I help?

WebdriverIO AI Copilot