Getting Started
Installation
The easiest way is to keep @wdio/ocr-service
as a dependency in your package.json
via.
- npm
- Yarn
- pnpm
npm install @wdio/ocr-service --save-dev
yarn add @wdio/ocr-service --dev
pnpm add @wdio/ocr-service --save-dev
Instructions on how to install WebdriverIO
can be found here.
This module uses Tesseract as an OCR engine. By default, it will verify if you have a local installation of Tesseract installed on your system, if so, it will use that. If not, it will use the Node.js Tesseract.js module which is automatically installed for you.
If you want to speed up the image processing then the advice is to use a locally installed version of Tesseract. See also Test execution time.
Instruction on how to install Tesseract as a system dependency on your local system can be found here.
For installation questions/errors with Tesseract please refer to the Tesseract project.
Typescript support
Ensure that you add @wdio/ocr-service
to your tsconfig.json
configuration file.
{
"compilerOptions": {
"types": ["node", "@wdio/globals/types", "@wdio/ocr-service"]
}
}
Configuration
To use the service you need to add ocr
to your services array in wdio.conf.ts
// wdio.conf.js
exports.config = {
//...
services: [
// your other services
[
"ocr",
{
contrast: 0.25,
imagesFolder: ".tmp/",
language: "eng",
},
],
],
};
Configuration Options
contrast
- Type:
number
- Mandatory: No
- Default:
0.25
The higher the contrast, the darker the image and vice versa. This can help to find text in an image. It accepts values between -1
and 1
.
imagesFolder
- Type:
string
- Mandatory: No
- Default:
{project-root}/.tmp/ocr
The folder where the OCR results are stored.
If you provide a custom imagesFolder
, then the service will automatically add the subfolder ocr
to it.
language
- Type:
string
- Mandatory: No
- Default:
eng
The language that Tesseract will recognize. More info can be found here and the supported languages can be found here.
Logs
This module will automatically add extra logs to the WebdriverIO logs. It writes to the INFO
and WARN
logs with the name @wdio/ocr-service
.
Examples can be found below.
...............
[0-0] 2024-05-24T06:55:12.739Z INFO @wdio/ocr-service: Adding commands to global browser
[0-0] 2024-05-24T06:55:12.750Z INFO @wdio/ocr-service: Adding browser command "ocrGetText" to browser object
[0-0] 2024-05-24T06:55:12.751Z INFO @wdio/ocr-service: Adding browser command "ocrGetElementPositionByText" to browser object
[0-0] 2024-05-24T06:55:12.751Z INFO @wdio/ocr-service: Adding browser command "ocrWaitForTextDisplayed" to browser object
[0-0] 2024-05-24T06:55:12.751Z INFO @wdio/ocr-service: Adding browser command "ocrClickOnText" to browser object
[0-0] 2024-05-24T06:55:12.751Z INFO @wdio/ocr-service: Adding browser command "ocrSetValue" to browser object
...............
[0-0] 2024-05-24T06:55:13.667Z INFO @wdio/ocr-service:getData: Using system installed version of Tesseract
[0-0] 2024-05-24T06:55:14.019Z INFO @wdio/ocr-service:getData: It took '0.351s' to process the image.
[0-0] 2024-05-24T06:55:14.019Z INFO @wdio/ocr-service:getData: The following text was found through OCR:
[0-0]
[0-0] IQ Docs API Blog Contribute Community Sponsor Next-gen browser and mobile automation Welcome! How can | help? i test framework for Node.js Get Started Why WebdriverI0? View on GitHub Watch on YouTube
[0-0] 2024-05-24T06:55:14.019Z INFO @wdio/ocr-service:getData: OCR Image with found text can be found here:
[0-0]
[0-0] .tmp/ocr/desktop-1716533713585.png
[0-0] 2024-05-24T06:55:14.019Z INFO @wdio/ocr-service:ocrGetElementPositionByText: We searched for the word "Get Started" and found one match "Started" with score "63.64
...............