Objective: Use LLM-generated scraping scripts to fetch and parse HTML within AVM’s sandbox.

Data Extraction

Delegate web scraping logic to an LLM, execute safely on AVM nodes, and obtain structured CSV/JSON without local risk.

Scenario: Bulk Scraping

Extract account balance tables from many pages of a DeFi dashboard in parallel.

Solution: Four-Step Extraction

  1. Fetch HTML
    Retrieve page content locally.
  2. Parse with LLM
    Prompt the LLM to write a BeautifulSoup script that extracts the table data as CSV.
  3. Run in AVM
    Execute parsing code with the runPython tool.
  4. Aggregate Results
    Combine CSV outputs for all URLs.

Example (TypeScript)

import { runPythonTool } from "@avm-ai/avm-vercel-ai";
import fetch from "node-fetch";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const tool = runPythonTool();

async function scrape(url: string) {
  // Step 1: fetch the raw HTML locally.
  const html = await fetch(url).then((r) => r.text());

  // Step 2: ask the LLM to write a parser. We only want the code back,
  // so no tools are passed here; the script is executed explicitly below.
  const prompt = `
Write a Python function execute(input) that uses BeautifulSoup to parse
the HTML in input["data"] and prints the table data as CSV.`;
  const { text: code } = await generateText({
    model: openai("gpt-4o"),
    prompt,
  });

  // Step 3: run the generated parser inside the AVM sandbox.
  const result = await tool.exec({ code, input: { data: html } });
  return result.output.stdout;
}
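Step 4 (aggregation) is not shown above. A minimal sketch of merging the per-URL CSV outputs, assuming every generated script emits the same header row (combineCsv is a hypothetical helper, not part of the AVM SDK):

```typescript
// Merge CSV strings from several scrape runs, keeping one header row.
// Assumes every generated script printed the same header (illustrative).
function combineCsv(outputs: string[]): string {
  const rows: string[] = [];
  for (const [i, csv] of outputs.entries()) {
    const lines = csv.trim().split("\n");
    // Keep the header only from the first output.
    rows.push(...(i === 0 ? lines : lines.slice(1)));
  }
  return rows.join("\n");
}

// Example: two scrape outputs with identical headers.
const combined = combineCsv([
  "account,balance\nalice,10",
  "account,balance\nbob,20",
]);
// combined === "account,balance\nalice,10\nbob,20"
```

Each element of the input array would be one `scrape(url)` result.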

Next Steps

  • Parallelize jobs via MCP concurrency.
  • Store results on IPFS.
  • Add retry and throttling mechanisms.
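The retry-and-throttling bullet can be sketched as a small wrapper around any scrape job; withRetry is an illustrative helper, not an AVM or AI SDK API:

```typescript
// Retry a flaky async job with exponential backoff (illustrative helper).
async function withRetry<T>(
  job: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await job();
    } catch (err) {
      lastError = err;
      // Back off between attempts: 500 ms, 1000 ms, 2000 ms, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}
```

A scrape call could then be wrapped as `withRetry(() => scrape(url))`, with the backoff doubling as a simple throttle between retries.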