ocrGetElementPositionByText
Get the position of a text on the screen. The command will search for the provided text and try to find a match based on Fuzzy Logic from Fuse.js. This means that if you might provide a selector with a typo, or the found text might not be a 100% match it will still try to give you back an element. See the logs below.
Usage
const result = await browser.ocrGetElementPositionByText("Username");
console.log("result = ", JSON.stringify(result, null, 2));
Output
Result
result = {
"dprPosition": {
"left": 373,
"top": 606,
"right": 439,
"bottom": 620
},
"filePath": ".tmp/ocr/desktop-1716658199410.png",
"matchedString": "Started",
"originalPosition": {
"left": 373,
"top": 606,
"right": 439,
"bottom": 620
},
"score": 85.71,
"searchValue": "Start3d"
}
Logs
# Still finding a match even though we searched for "Start3d" and the found text was "Started"
[0-0] 2024-05-25T17:29:59.179Z INFO webdriver: COMMAND ocrGetElementPositionByText(<object>)
......................
[0-0] 2024-05-25T17:29:59.993Z INFO @wdio/ocr-service:ocrGetElementPositionByText: Multiple matches were found based on the word "Start3d". The match "Started" with score "85.71%" will be used.
Options
text
- Type:
string
- Mandatory: yes
The text you want to search for to click on.
Example
await browser.ocrGetElementPositionByText({ text: "WebdriverIO" });
contrast
- Type:
number
- Mandatory: no
- Default:
0.25
The higher the contrast, the darker the image and vice versa. This can help to find text in an image. It accepts values between -1
and 1
.
Example
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
contrast: 0.5,
});
haystack
- Type:
number
- Mandatory:
WebdriverIO.Element | ChainablePromiseElement | Rectangle
This is the search area in the screen where the OCR needs to look for text. This can be an element or a rectangle containing x
, y
, width
and height
Example
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
haystack: $("elementSelector"),
});
// OR
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
haystack: await $("elementSelector"),
});
// OR
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
haystack: {
x: 10,
y: 50,
width: 300,
height: 75,
},
});
language
- Type:
string
- Mandatory: No
- Default:
eng
The language that Tesseract will recognize. More info can be found here and the supported languages can be found here.
Example
import { SUPPORTED_OCR_LANGUAGES } from "@wdio/ocr-service";
await browser.ocrGetElementPositionByText({
text: "WebdriverIO",
// Use Dutch as a language
language: SUPPORTED_OCR_LANGUAGES.DUTCH,
});
fuzzyFindOptions
You can alter the fuzzy logic to find text with the following options. This might help find a better match