If I have a table image and I would like to get Campaign, Date sent, Sent to, Open from the table given several conditions:
Table width varies but the longest it can extend is the viewport width
The data in the table varies as well, in some cases, I might need RPA to scroll down and get the remaining data from the table.
The table header will remain the same, so maybe there is a way to get all the data I wanted using the table header as a pivot point?
Sometimes table might not have any lines in between as well
Is there an efficient way to do the parsing given the information above?
Thank you very much 😊😊
This is considered an advanced problem to solve using Robotiive Computer Vision feature. In order to solve this we need first to understand the tools and features to help us solve this problem:
1. Doodle
Doodle is an advanced Computer Vision in Robotiive. Doodle is mainly divided into 2 parts:
Doodle template (Where you design Computer Vision component flow to solve your computer vision problem)
Run Doodle (an Action to execute the Doodle Template created)
2. Call Task
3. Loop
More details on Doodle templates:
If you imported the script zip file to Robotiive, you would be able to open and view a template named "SetVariableFromOCR_EXTERNAL".
The primary purpose of this template is to get all OCR text from a region on your screen.
The screenshot block will take screenshots of your entire screen
SetImage block will get a region from user input and produce a set of coordinate relative to the screenshot
The crop image block will receive a set of coordinates from the setImage block and crop out a region to be processed by the next blocks
external OCR block will use an external OCR engine in which information is stored in Robotiive global config, in this case, we will use the Paddle OCR engine to extract the OCR text from the image it received from the previous block. This block will return a Recognized Rectangle data type (data type that contains all ocr text and its coordinates)
save to variable block let us store our data to a variable to be used in our Robotiive script outside of Doodle
Recognized Rectangle Data type (consisting of POI, Rectangle, and text)
[ { "poi": { "x": 0, "y": 0 }, "rectangle": { "max": { "x": 1146, "y": 506 }, "min": { "x": 1098, "y": 491 } }, "score": 0, "text": "YoY:" }, { "poi": { "x": 0, "y": 0 }, "rectangle": { "max": { "x": 788, "y": 506 }, "min": { "x": 716, "y": 491 } }, "score": 0, "text": "Datesent" } ]
Rectangle data type:
"rectangle": { "max": { "x": 788, "y": 506 }, "min": { "x": 716, "y": 491 } }
Using this nested data type returned by Doodle, we can use flow control logic to find all connections within our table.
Doodle Input/Output:
at the bottom of the Doodle template screenshot we can see two sections:
1. Doodle Input
We can let the user determine the input of our Doodle template when using RunDoodle Action in the script. As we can see in the screenshot, the input we want the user to give us are:
OCR Region (SetImage block), will let the user use their screen capture and give us a region on the screen using the screen capture tool in Run Doodle
Target String (external OCR), will let users decide what text they want OCR to search in the region captured earlier. (if the user leaves it blank, external OCR block will return everything it finds in the region)
use regex and Regex Flag will let the user decide whether or not to use Regular Expression to filter OCR results returned
2. Doodle Output
Script flow:
1. Get All text recognized rectangle list using Run Doodle
get all coordinate and text using OCR engine (for more detail refer to Doodle explanation above)
2. Find all Table header Rectangle, using the getTableHeader task
3. After finding all table header, we can use its coordinate to find all text that is in the same column
4. Joining all the result we found so far to a maplist variable
MapList Data type example:
[ [ { "key": "campaign", "value": "BoxingDay Sale2020" }, { "key": "dateSent", "value": "05/12/20" }, { "key": "sentTo", "value": "43,291" }, { "key": "open", "value": "8,658" }, { "key": "click", "value": "216" }, { "key": "yoy", "value": "+32%" } ], [ { "key": "campaign", "value": "Thanksgiving2020" }, { "key": "dateSent", "value": "30/11/20" }, { "key": "sentTo", "value": "43,291" }, { "key": "open", "value": "7,792" }, { "key": "click", "value": "194" }, { "key": "yoy", "value": "-11%" } ], [ { "key": "campaign", "value": "BackToSchool2020" }, { "key": "dateSent", "value": "27/09/20" }, { "key": "sentTo", "value": "28,017" }, { "key": "open", "value": "4,202" }, { "key": "click", "value": "105" }, { "key": "yoy", "value": "-2%" } ] ]
As we can see in the example, every map in the list would represent a row in the table with its header as the key.
Note on how to use the script after importing:
Recapture the table image
2. Set up Paddle OCR
- Name: Paddle
- URL: http://localhost:48774/ExecuteOCR
- Timeout: 200
- ImageScaleRatio: 2
Hope this instruction would help you solve the table problem using the Robotiive Computer Vision feature. If you have any questions, feel free to contact me at calvin@iscoollab.com