Scraping Information With Yarado Client

Hi topic readers,

I was wondering how I can scrape information from a website with the Yarado Client. Can someone maybe show me a screen recording so I and the rest of the community can use the scraping tool ourselves?

I want to create a use case for students. The idea is to scrape their timetables and set an alarm for the next day depending on their first lecture. It sounds like RPA can help students out with this use case, but I don’t know how to assemble the Yarabot in order to run this task… Any suggestions?

Kind regards,
Ivar Hermens

@BasKroon or @johan.van.der.blom can you help Ivar out and share the approach and outcomes here while you're at it? Thanks


Hey Ivar,

Scraping text was covered before in a workshop. The zip file of that workshop can be found here: Training Exercise - VAT task - Learn / Training Files - Yarado Community
The zip file attached to that community post includes a PDF. The PDF contains 2 tasks, and Task 2 explains how to scrape text from a website.

I will try to make a screen recording tomorrow that explains how to scrape the text and make it usable in Yarado.


Hey Ivar,

For this example I will use the website of a company that lists its opening hours:
SPAR | SPAR van Santen
As you can see, this page contains 2 sets of opening hours: one for this week and one for next week.

We use the function “Mouse Action”: right-click this step, then choose actions and input text.
We need to copy everything on the page so that we can later assign the right hours to the right variables.
The input text is: {CTRL A} {CTRL C}
Now the whole page is on our clipboard.
Next we need a function to store all of this in a variable: use “Set Variable Value”, select “Clipboard” as input and %scraped-text% as output.

image


%scraped-text% is now filled with all the information on the web page.
We need to split this value up in order to get the right opening hours for the right day.
Double-click the value of %scraped-text% to view its content. Search for “Maandag” (Monday) to find out how to isolate the value we need. We can see that the value for Maandag sits between the word “Maandag” and the word “Dinsdag” (Tuesday).

image


We use the function “Return Lines Between Text” and fill it in with these values. It should look like this:
image

The variable %maandag% is now filled with:
image

This value contains 4 rows (lines).
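
For readers who think in code, here is a rough Python sketch of what a "lines between two texts" step does. It is not how Yarado implements “Return Lines Between Text”; the function name `lines_between`, the sample text, and the choice to include the start line but not the end line are all assumptions made so that the hours end up on lines 2 and 3, as in the example above.

```python
def lines_between(text: str, start: str, end: str) -> list[str]:
    """Return the lines from the first one containing `start`
    up to (but not including) the next one containing `end`."""
    lines = text.splitlines()
    # Index of the first line that contains the start text.
    start_idx = next(i for i, line in enumerate(lines) if start in line)
    # Index of the first later line that contains the end text.
    end_idx = next(
        i for i, line in enumerate(lines) if i > start_idx and end in line
    )
    return lines[start_idx:end_idx]


# Made-up stand-in for %scraped-text%; the real page content will differ.
scraped_text = (
    "Openingstijden\n"
    "Maandag\n"
    "08:00 tot 19:00\n"
    "08:00 tot 18:00\n"
    "\n"
    "Dinsdag\n"
    "08:00 tot 19:00\n"
)

maandag = lines_between(scraped_text, "Maandag", "Dinsdag")
print(maandag)
```

With this sample, `maandag` holds 4 lines: the day name, two lines of hours, and an empty line.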
We know that line 2 holds the hours for this week and line 3 the hours for next week. We can separate these with the function “Get Text from Line Number”.

This week:
image

Next week:
image
Here line 2 is used for this week and line 3 for next week.
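
The same step can be sketched in Python as picking a line by its number. This is only an illustration under assumptions: the helper `text_from_line_number` and the variable `ma_nextweek` are hypothetical names, the sample value is made up, and line numbers are assumed to start at 1 so that line 2 is this week and line 3 is next week.

```python
def text_from_line_number(text: str, line_number: int) -> str:
    """Return the line at the given 1-based line number."""
    lines = text.splitlines()
    if not 1 <= line_number <= len(lines):
        raise ValueError(f"line {line_number} does not exist")
    return lines[line_number - 1]


# Made-up stand-in for the content of %maandag%.
maandag = "Maandag\n08:00 tot 19:00\n08:00 tot 18:00\n"

ma_thisweek = text_from_line_number(maandag, 2)  # hours for this week
ma_nextweek = text_from_line_number(maandag, 3)  # hours for next week
print(ma_thisweek)
print(ma_nextweek)
```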

The important part for your question is getting the opening time, so that you can set an alarm. The only part we need of “08:00 tot 19:00” is therefore everything in front of “ tot “
So we split the text and return everything in front of the delimiter:
image


Now the variable %ma_thisweek% contains only “08:00”.
Repeat this for every day of the week and the scraping part of the task is done!
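
As a rough Python sketch of the split step, and of repeating it for several days: the helper `text_before` and the dictionary `this_week` are hypothetical, and the hours in it are made-up sample values, not data from the actual page.

```python
def text_before(text: str, delimiter: str) -> str:
    """Return everything in front of the first occurrence of `delimiter`."""
    return text.split(delimiter, 1)[0]


# Made-up hours per day, as they might look after the line-number step.
this_week = {
    "Maandag": "08:00 tot 19:00",
    "Dinsdag": "08:00 tot 20:00",
}

# Keep only the opening time for each day.
opening_times = {
    day: text_before(hours, " tot ") for day, hours in this_week.items()
}
print(opening_times)
```

Note that the delimiter includes the spaces around “tot”, so the result has no trailing space.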

Good luck!


It’s pretty fast as well


Thank you for this easy-to-understand tutorial. I’ll pick up the assembly of the task I described above and come back to this topic pretty soon.
