Retailers used to send people to competitors' stores with clipboards. This was called a shelf check. They'd write down prices, note what was in stock, sketch the shelf layout. Weekly, usually. Monthly if the category moved slowly. The data was out of date before they left the parking lot.

That practice still exists, in a technical sense. What changed is what "the shelf" means.

The Things That Aren't Price

When people think about ecommerce data scraping, they think about price. Price is the obvious signal — the number that triggers the fastest response. But the businesses doing this at any kind of scale stopped treating price as the only signal years ago.

Review counts are often more interesting than the reviews themselves. A product that goes from 847 reviews to 1,240 reviews in six weeks is gathering velocity. It's selling, and selling fast. A competitor's SKU that gains 400 reviews between Monday and Thursday almost certainly ran a promotion. That information is on the product page. It's public. For retailers using manual monitoring, it's invisible.
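
Once the counts are scraped, the velocity calculation is a few lines. Here's a minimal sketch, assuming two snapshots of review counts keyed by ASIN; the ASINs, counts, and dates below are hypothetical:

```python
from datetime import date

# Hypothetical snapshots of competitor review counts, keyed by ASIN.
# In practice these come from scheduled scrapes of the product pages.
snapshot_old = {"B0EXAMPLE1": 847, "B0EXAMPLE2": 1102}
snapshot_new = {"B0EXAMPLE1": 1240, "B0EXAMPLE2": 1110}

days_between = (date(2024, 5, 13) - date(2024, 4, 1)).days  # six weeks

for asin, old_count in snapshot_old.items():
    gained = snapshot_new.get(asin, old_count) - old_count
    per_week = gained / (days_between / 7)
    print(f"{asin}: +{gained} reviews ({per_week:.0f}/week)")
```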

Stock status is another signal. When a competitor's top-selling item goes out of stock, something predictable happens in the category: other sellers see increased traffic, prices drift upward, the Buy Box on Amazon shifts. Retailers who track stock changes across competitor listings see these patterns in near real time. Retailers who don't track them see the effects in their own sales numbers two weeks later, without knowing what caused the spike.
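
The detection itself is simple diffing between scrape runs. A minimal sketch, assuming stock status has already been scraped into one dictionary per run (the SKU names are hypothetical):

```python
# Hypothetical stock-status snapshots from two consecutive scrape runs.
previous = {"widget-pro": "in_stock", "widget-mini": "in_stock"}
current = {"widget-pro": "out_of_stock", "widget-mini": "in_stock"}

# Flag any listing that transitioned from in stock to out of stock.
for sku, status in current.items():
    if previous.get(sku) == "in_stock" and status == "out_of_stock":
        print(f"ALERT: competitor's {sku} just went out of stock")
```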

Then there's the product description data. Category managers at larger retailers scrape competitor product copy — not to plagiarise it, but to understand how competitors are positioning similar products. What features they're leading with. What language they're using for specifications. Whether they've added new variants. It's competitive intelligence, and the source is the product listing, not a market research report.
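
Collecting that copy is ordinary page parsing. A minimal sketch using requests and BeautifulSoup, with a hypothetical URL and hypothetical CSS selectors, since every storefront's markup differs:

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical product URL; the selectors below would need
# adjusting for each storefront's actual markup.
url = "https://example.com/products/widget-pro"
soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

title = soup.select_one("h1.product-title")
bullets = [li.get_text(strip=True) for li in soup.select("ul.feature-list li")]

print(title.get_text(strip=True) if title else "no title found")
print(bullets)  # the feature list: what the competitor leads with
```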

2.5 Million Changes a Day

Amazon makes an estimated 2.5 million price changes per day across its catalog, according to analysis by Boomerang Commerce, a retail pricing firm. That works out to roughly 29 changes per second.

The businesses tracking those changes range from enterprise retail operations with dedicated pricing teams and six-figure software subscriptions, to individual sellers monitoring a handful of competing ASINs from a spreadsheet. The gap between those two ends of the market used to be vast. Enterprise retailers had the tools, the technical resources, and the data infrastructure. Smaller operations had the clipboard model, updated to include occasional manual checks of competitor listings.

No-code web scraping tools narrowed that gap considerably. The same data that required a Python developer and a hosted scraping pipeline five years ago can now be collected by someone with a browser extension and an afternoon.

What the Data Actually Gets Used For

Pricing decisions are the obvious use case. But the downstream uses of ecommerce scraping are broader than most people outside retail realise.

Assortment planning: category managers use scraped data to track which SKUs competitors are adding, discontinuing, or repositioning. New product launches often show up in competitor listings before press releases do.
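
Tracking additions and discontinuations reduces to a set difference between catalog snapshots. A minimal sketch with hypothetical SKU names:

```python
# Hypothetical SKU sets from two scrapes of a competitor's category page.
last_month = {"widget-pro", "widget-mini", "widget-max"}
this_month = {"widget-pro", "widget-mini", "widget-ultra"}

print("added:", this_month - last_month)         # {'widget-ultra'}
print("discontinued:", last_month - this_month)  # {'widget-max'}
```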

Promotional monitoring: scraping category pages on a schedule reveals promotional patterns. When a competitor consistently discounts a product every fourth weekend, that pattern shows up in the data before it shows up in anyone's intuition.
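
A cadence like that falls out of the gaps between discount events. A minimal sketch, assuming dated price observations for one SKU; the dates and prices are hypothetical:

```python
from datetime import date
from statistics import median

# Hypothetical price observations from weekend scrapes of one SKU.
observations = {
    date(2024, 1, 6): 49.99, date(2024, 1, 27): 39.99,
    date(2024, 2, 3): 49.99, date(2024, 2, 24): 39.99,
    date(2024, 3, 2): 49.99, date(2024, 3, 23): 39.99,
}

typical = median(observations.values())
discount_days = sorted(d for d, p in observations.items() if p < typical)

# Gaps between discount events reveal the cadence.
gaps = [(b - a).days for a, b in zip(discount_days, discount_days[1:])]
print(gaps)  # [28, 28] -> a discount every fourth weekend
```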

MAP compliance: brands that sell through multiple retailers use scraping to monitor Minimum Advertised Price agreements. When a retailer undercuts MAP, it tends to appear in the listing before anyone reports it.
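
The compliance check itself is a comparison against the agreed floor. A minimal sketch with hypothetical retailers and prices:

```python
MAP = 79.00  # hypothetical minimum advertised price for one SKU

# Hypothetical advertised prices scraped from each retailer's listing.
advertised = {"retailer-a": 79.00, "retailer-b": 74.50, "retailer-c": 81.99}

violations = {r: p for r, p in advertised.items() if p < MAP}
print(violations)  # {'retailer-b': 74.5}
```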

Review mining: the text of competitor reviews — not just the scores — is a source of product development intelligence. What customers complain about, what they specifically praise, what features they wish existed. Aggregated across hundreds of reviews, this is market research. It's publicly available, and it requires collection at scale to be useful.
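
Even a crude keyword tally surfaces themes once enough reviews are collected. A minimal sketch with hypothetical review texts; a real pipeline would use proper text analysis, but the principle is the same:

```python
from collections import Counter

# Hypothetical scraped review texts for a competitor product.
reviews = [
    "Battery life is terrible, barely lasts a day.",
    "Great screen, but the battery dies fast.",
    "Wish it had a usb-c port. Battery is weak too.",
]

# Tally how often each theme appears across the review set.
themes = ["battery", "screen", "usb-c", "price"]
counts = Counter(t for r in reviews for t in themes if t in r.lower())
print(counts.most_common())  # [('battery', 3), ('screen', 1), ('usb-c', 1)]
```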

The Smaller End of the Market

Most writing about ecommerce data scraping describes enterprise operations: dedicated pricing teams, real-time dashboards, automated repricing systems feeding directly into inventory software. That end of the market is real, and it accounts for much of the spending on price intelligence software.

But the majority of businesses doing competitive monitoring are not enterprise retailers. They're mid-sized ecommerce operations, Amazon sellers, D2C brands, and procurement teams at companies where the IT department has nothing to do with any of this.

For those businesses, the tool that matters is whatever actually gets used. A platform costing thousands of dollars a month doesn't get adopted by a five-person team managing 200 SKUs. A browser extension that pulls current prices from twelve competitor listings into a spreadsheet in four minutes does get used. Every Monday. Without fail.

The data collection is less sophisticated. The coverage is narrower. But it happens consistently, which matters more than sophistication.

What Changes When You Can See the Data

The shelf check had a structural problem beyond the obvious one of being slow. It measured what someone chose to measure, when they remembered to measure it. The selection bias was significant — teams checked the products they were already worried about, which were rarely the products that needed watching.

Automated collection removes that bias. Everything gets checked on the same schedule. The anomaly nobody was looking for gets flagged alongside the SKU everyone was watching. That's usually where the useful insight is: the thing that wasn't on the radar until the data put it there.

The shelf never closed. It just moved to a browser tab.