The Extract Text from TXT, XML, RSS, and JSON step converts structured or raw files into plain text, removing technical elements such as tags, keys, and syntax. With this, Tess transforms complex data into readable content ready for analysis by AI agents.

What is the Step?

This step is part of the Document Processing category, responsible for cleaning and simplifying data from different formats. In practice, it:
  • Reads TXT, XML, RSS, and JSON files
  • Removes:
    • XML tags
    • JSON structures (keys, arrays)
    • RSS metadata
  • Keeps only the relevant semantic content
  • Delivers a clean block of text in the agent’s context
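To make the removal of tags concrete, here is a minimal sketch in Python of stripping an XML or RSS document down to its text. It is not Tess's actual implementation, which is not documented here. The function name `xml_to_text` and the sample feed are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def xml_to_text(xml_source: str) -> str:
    """Return only the text content of an XML/RSS document, one fragment per line.

    Illustrative only: this is an assumption about how tag removal could work,
    not the step's real code.
    """
    root = ET.fromstring(xml_source)
    # itertext() walks the tree in document order and yields text nodes only,
    # so element names and attributes are dropped.
    fragments = (fragment.strip() for fragment in root.itertext())
    return "\n".join(fragment for fragment in fragments if fragment)


# Hypothetical RSS sample, invented for this example.
sample_feed = """<rss version="2.0"><channel>
  <title>Market News</title>
  <item><title>Rates hold steady</title><description>The central bank kept rates unchanged.</description></item>
</channel></rss>"""

print(xml_to_text(sample_feed))
```

In this sketch the feed title and the item title both come out as plain lines, with nothing left to mark which one was the channel and which one was the item.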

Where to find it

  1. Go to AI Studio
  2. Click on Add AI Step
  3. Select Document Processing
  4. Choose Extract Text from TXT, XML, RSS, and JSON

How to use it

Configuration fields

| Field | Required | Description |
| --- | --- | --- |
| Step Name | Yes | Internal step name (alphanumeric characters only). Used to reference the output in prompts |
| File URL | Yes | Direct URL of the file (TXT, XML, RSS, or JSON) or input variable (e.g.: {{json}}) |

About the Output

The result is a continuous block of plain text, without any original technical structure.

What is kept:

  • Semantic content (names, descriptions, values)
  • All text relevant for human reading

What is removed:

  • XML tags (<tag>)
  • JSON structures ({}, [])
  • RSS metadata
  • Technical syntax
Important: The original structure (hierarchy, nesting) is lost; the content becomes linear.
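The loss of hierarchy is easiest to see with nested JSON. The sketch below is a rough Python approximation of the behavior described in this section (keys and brackets dropped, values kept), not the step's real code. The helper `json_to_text` and the CRM sample are hypothetical.

```python
import json


def json_to_text(json_source: str) -> str:
    """Flatten a JSON document into a single line containing only its values.

    Illustrative only: this is an assumption about the described behavior,
    not the step's real implementation.
    """
    values = []

    def walk(node):
        if isinstance(node, dict):
            # Keys are discarded; only the values are visited.
            for child in node.values():
                walk(child)
        elif isinstance(node, list):
            for child in node:
                walk(child)
        elif node is not None:
            values.append(str(node))

    walk(json.loads(json_source))
    return " ".join(values)


# Hypothetical CRM export, invented for this example.
crm_export = (
    '{"lead": {"name": "Ana Souza", "company": {"name": "Acme", "size": 120}},'
    ' "tags": ["priority", "retail"]}'
)

print(json_to_text(crm_export))
```

In this sketch the two `name` keys (the lead's and the company's) become indistinguishable once the keys are gone, which is why prompts that spell out the expected fields tend to help.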

Deeper explanation

This step acts as a normalizer of technical data into natural language.

Flow

File (TXT / XML / RSS / JSON) → Step removes technical structure → Clean text is generated → Agent interprets semantically
Note:
  • The AI focuses on the content, not the structure
  • Ideal for inputs that were not originally designed for human reading

Practical examples

Prompt:
“Summarize the main news of the day and identify relevant market trends.”
Usage:
  • RSS feed from news portals
  • Agent curates the news automatically

Prompt:
“Analyze the logs and identify the main contact reasons and customer sentiment.”
Usage:
  • Chat or support logs
  • Automatic issue classification

Prompt:
“Based on the extracted data, generate a personalized prospecting email for each lead.”
Usage:
  • JSON export from CRM
  • AI turns the data into a commercial action

Prompt:
“Organize the extracted information and highlight the main indicators.”
Usage:
  • API responses
  • Technical data is turned into insights

Best practices
  • Use direct file URLs: avoid links that open web pages (HTML)
  • Combine with structured prompts: e.g., “extract name, role, and company”
  • Be careful with structure loss: nested JSON may lose logical context
  • Use the Step Name in the prompt: e.g., “Based on the data from step dados_crm…”
  • Combine with other steps: Extract → analysis → save to Sheets/Drive
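To check the first practice above before configuring the step, you can inspect the `Content-Type` header that a URL returns. The following is a minimal sketch using only the Python standard library. The helpers `is_html` and `fetch_content_type` are hypothetical names, and the check is a heuristic, since servers do not always report accurate content types.

```python
from urllib.request import Request, urlopen


def is_html(content_type: str) -> bool:
    """Return True when a Content-Type header value denotes a web page."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type in {"text/html", "application/xhtml+xml"}


def fetch_content_type(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request and return the Content-Type header ("" if absent)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


print(is_html("text/html; charset=utf-8"))
print(is_html("application/json"))
```

If `is_html(fetch_content_type(url))` is true for a given link, that link most likely opens a web page and not the file itself.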

Important notes

  • The URL must be public and direct (no login required)
  • Original hierarchical structure is lost
  • The step does not preserve data formatting or organization
  • Large files produce long extracted text, which may take up much of the agent's context window or exceed it
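For a rough sense of how much of the context window a file might occupy, you can estimate the token count from the length of the extracted text. The sketch below assumes about four characters per token, a common rule of thumb for English text and not a figure from Tess. The function names and the budget value are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count of a text.

    Assumption: about 4 characters per token. Real tokenizers vary by
    model and language, so treat the result as an order of magnitude.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_budget(text: str, token_budget: int) -> bool:
    """Return True when the estimated token count is within the budget."""
    return estimate_tokens(text) <= token_budget


extracted_text = "a" * 40_000  # stand-in for the output of the step
print(estimate_tokens(extracted_text))
print(fits_budget(extracted_text, token_budget=8_000))
```

When the estimate is well above the budget you have in mind, splitting the source file or trimming it before the step is one way to keep the agent's context manageable.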
Extract Text from TXT, XML, RSS, and JSON is the bridge between technical data and artificial intelligence. It enables transforming APIs, feeds, and structured files into usable information, unlocking analysis, automations, and content generation based on data that previously required manual processing.