Article Extraction


Description

Allows you to extract the main article from a resource page.

Where can it be used:

  • Parsing content from websites
  • Working with text

How to add this action to your project?

Using the context menu, select Add Action → Content Analysis → Article Extraction.

Or use Smart Search.

How to work with the action?

  1. Tab with the loaded page:
     a) Active - the tab currently displayed.
     b) First - the first tab on the left.
     c) By Name - specify the tab name or a variable that holds it (case-sensitive).
     d) By Number - set the tab number. Tabs are numbered from left to right starting at 0, so to target the very first tab, enter 0; subsequent tabs are 1, 2, 3, and so on.
  2. Result variable: the variable that receives the extracted article.
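
The tab-selection rules above can be modeled outside the program as a small lookup function. The Python sketch below only illustrates those rules (zero-based numbering, case-sensitive names). The `Tab` class, the `select_tab` function, and the sample tab names are hypothetical and are not part of the program itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Tab:
    """Hypothetical stand-in for a browser tab."""
    name: str
    url: str


def select_tab(tabs: List[Tab], active_index: int, mode: str,
               value: Optional[Union[str, int]] = None) -> Tab:
    """Pick a tab using one of the four modes described above."""
    if mode == "Active":
        # The tab currently displayed.
        return tabs[active_index]
    if mode == "First":
        # The leftmost tab.
        return tabs[0]
    if mode == "By Name":
        # Case-sensitive: "News" and "news" are different tabs.
        for tab in tabs:
            if tab.name == value:
                return tab
        raise LookupError(f"no tab named {value!r}")
    if mode == "By Number":
        # Zero-based, counted from left to right.
        index = int(value)
        if not 0 <= index < len(tabs):
            raise LookupError(f"no tab with number {index}")
        return tabs[index]
    raise ValueError(f"unknown mode: {mode!r}")


# Sample data for illustration only.
tabs = [Tab("Home", "https://example.com/"),
        Tab("News", "https://example.com/news"),
        Tab("news", "https://example.com/archive")]
first_by_number = select_tab(tabs, active_index=1, mode="By Number", value=0)
```

Here `first_by_number` is the leftmost tab, the same one the First mode would return.
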

Example

You need to extract the main article from the homepage https://zennolab.com/ru/

Since we're working in a single tab, select Active.

Once the action is complete, the article will be stored in the text variable.

Example use case:

Go to the page and extract the text. Put the resulting content into a list for further processing.

  1. Go to the page.
  2. Extract the main article and put it into a variable.
  3. Save it to a list.

This allows you to quickly parse text without the need for multiple tools.
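
For readers who want to see the same three steps as code, here is a rough Python sketch using only the standard library. It is not the algorithm the action uses internally, which is not documented here. As an assumed stand-in heuristic, it treats the container (`article`, `main`, `section`, `div`, or `body`) holding the most paragraph text as the main article. All function names are hypothetical.

```python
import urllib.request
from html.parser import HTMLParser

CONTAINERS = {"article", "main", "section", "div", "body"}


class _ParagraphCollector(HTMLParser):
    """Groups <p> text by the nearest enclosing container element."""

    def __init__(self):
        super().__init__()
        self._stack = []    # ids of the currently open containers
        self._next_id = 0
        self._in_p = False
        self._current = []  # text pieces of the paragraph being read
        self.blocks = {}    # container id -> list of paragraph strings

    def _flush(self):
        # Store the paragraph read so far under its owning container.
        if self._in_p:
            text = " ".join("".join(self._current).split())
            if text:
                owner = self._stack[-1] if self._stack else -1
                self.blocks.setdefault(owner, []).append(text)
        self._in_p = False
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in CONTAINERS:
            self._flush()
            self._stack.append(self._next_id)
            self._next_id += 1
        elif tag == "p":
            self._flush()
            self._in_p = True

    def handle_endtag(self, tag):
        if tag in CONTAINERS:
            self._flush()
            if self._stack:
                self._stack.pop()
        elif tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._in_p:
            self._current.append(data)


def extract_main_article(html: str) -> str:
    """Step 2: return the paragraphs of the most text-heavy container."""
    parser = _ParagraphCollector()
    parser.feed(html)
    parser.close()
    if not parser.blocks:
        return ""
    best = max(parser.blocks.values(), key=lambda ps: sum(len(p) for p in ps))
    return "\n\n".join(best)


def fetch_html(url: str) -> str:
    """Step 1: download the page (no JavaScript rendering, unlike a real tab)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def collect_articles(urls, fetch=fetch_html):
    """Step 3: extract each page's article and save it to a list."""
    articles = []
    for url in urls:
        articles.append(extract_main_article(fetch(url)))
    return articles
```

Calling `collect_articles(["https://zennolab.com/ru/"])` would download the page from the example above and return a one-item list. The result will likely differ from what the action produces, since the heuristic here is much simpler.
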

See also:

  1. Content Creation
  2. Context Recognition
  3. Variables Window
  4. Tab Management