Web Scraping and Catalog Import on WooCommerce with Variations

585 products with variations, photos, and descriptions migrated to WooCommerce in record time — a custom scraping script that eliminated months of manual data entry.

About this project

Migrating 585 products with full variations from a legacy site to a brand-new WooCommerce store — without a single line of manual data entry

Catalog migration is one of those projects that looks deceptively simple from the outside and turns out to be one of the most error-prone jobs in e-commerce. A store with 585 products, each with multiple sizes, colors, photos, and rich descriptions, represents thousands of individual data points. Done by hand, it takes months of tedious copy-pasting, and every mistake silently damages the new store's SEO or customer experience. Done by script, it takes days — if the engineering is right.

This project took the second path: a custom-built Python scraping and import pipeline moved the entire catalog of payperwear.com to a fresh WooCommerce installation, with variations, categories, and images included, and with the storefront design configured for easy browsing.

The scope of the migration

The client needed the complete contents of their existing site re-created on a new WooCommerce platform. Specifically:

  • 585 products in total, across multiple categories.
  • For each product: a full set of variations (sizes, colors, sometimes material or fit), each variation tied to its own SKU and stock status.
  • Rich product descriptions including formatting, lists, and product-specific marketing copy.
  • Photos in multiple resolutions, aligned with each color variation.
  • A coherent category tree matching the structure of the source site to preserve navigation logic and SEO topical relevance.
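To make the "thousands of individual data points" concrete, the scope above can be modeled as two small Python dataclasses. This is only a sketch: the class and field names are assumptions chosen for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Variation:
    sku: str
    attributes: dict[str, str]          # e.g. {"Size": "M", "Color": "Navy"}
    in_stock: bool = True
    image_urls: list[str] = field(default_factory=list)


@dataclass
class Product:
    name: str
    description_html: str
    category_path: list[str]            # root to leaf, e.g. ["Women", "Jackets"]
    variations: list[Variation] = field(default_factory=list)

    def data_points(self) -> int:
        """Rough count of values someone would otherwise copy by hand."""
        per_variation = sum(
            2 + len(v.attributes) + len(v.image_urls)   # SKU + stock + attributes + images
            for v in self.variations
        )
        # name + description + one entry per category level + all variation values
        return 2 + len(self.category_path) + per_variation
```

Even a modest product with two variations, two attributes each, and one photo per variation already carries more than a dozen values, which is how 585 products add up to thousands.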

The technical approach

Custom scraping script

A Python-based scraper was written to walk the entire source site, extract structured data from each product page, download the associated images, and normalize the whole catalog into a clean intermediate format. Because the source site had inconsistencies between product templates, the parser was built to tolerate variation in HTML structure without skipping records.
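One way to get that tolerance is to give every field an ordered list of fallback locations and to report missing fields instead of dropping the record. The sketch below uses only Python's standard library; the CSS class names are hypothetical, since the source site's real markup is not documented here.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Hypothetical class names, one list per field, ordered by preference.
FIELD_CANDIDATES = {
    "name": ["product-title", "item-name"],
    "price": ["price", "product-price"],
}


class ClassTextCollector(HTMLParser):
    """Collects the text found inside elements, keyed by CSS class name."""

    def __init__(self):
        super().__init__()
        self._stack = []            # (tag, classes) for each open element
        self.text_by_class = {}     # class name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append((tag, classes))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating unclosed children.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        for _, classes in self._stack:
            for name in classes:
                self.text_by_class.setdefault(name, []).append(text)


def extract_fields(html, candidates=FIELD_CANDIDATES):
    """Return (record, missing): found fields plus the names of those not found."""
    collector = ClassTextCollector()
    collector.feed(html)
    record, missing = {}, []
    for field_name, class_names in candidates.items():
        for cls in class_names:
            if cls in collector.text_by_class:
                record[field_name] = " ".join(collector.text_by_class[cls])
                break
        else:
            missing.append(field_name)   # keep the record, flag the gap
    return record, missing
```

A page built on a second template, with `item-name` and `product-price` instead of `product-title` and `price`, yields the same record shape, and a page with no price still produces a record with `"price"` listed as missing so it can be reviewed rather than silently lost.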

Variation handling

WooCommerce's variation model imposes a strict order of operations. Each product needed its attributes (size, color, etc.) registered first, then the parent product created, then every variation inserted one by one with the right SKU, price, stock, and image. The script automated this multi-step sequence across all 585 products.
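The ordering constraint is easier to see as request bodies. The sketch below assumes the import goes through the WooCommerce REST API (v3), which is one common route but is not confirmed as the one used in this project. It only builds the JSON payloads and sends nothing; the function names and the input dictionary shape are illustrative.

```python
def attribute_payload(name):
    """Body for POST /wp-json/wc/v3/products/attributes (one global attribute)."""
    return {"name": name, "type": "select"}


def parent_payload(name, description, attribute_options, attribute_ids):
    """Body for POST /wp-json/wc/v3/products (the variable parent product).

    attribute_options: {"Size": ["S", "M"], ...}
    attribute_ids: {"Size": 1, ...}, the IDs returned when the attributes were created.
    """
    return {
        "name": name,
        "type": "variable",
        "description": description,
        "attributes": [
            {"id": attribute_ids[attr], "visible": True, "variation": True, "options": options}
            for attr, options in attribute_options.items()
        ],
    }


def variation_payload(variation, attribute_ids):
    """Body for POST /wp-json/wc/v3/products/<parent_id>/variations."""
    return {
        "sku": variation["sku"],
        "regular_price": str(variation["price"]),    # the API takes prices as strings
        "stock_status": "instock" if variation["in_stock"] else "outofstock",
        "image": {"src": variation["image_url"]},
        "attributes": [
            {"id": attribute_ids[attr], "option": value}
            for attr, value in variation["attributes"].items()
        ],
    }
```

The dependency chain is visible in the arguments: the parent payload cannot be built until the attribute IDs exist, and the variation endpoint cannot be addressed until the parent product has an ID.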

Image pipeline

All images were downloaded, renamed for SEO, re-uploaded to the WooCommerce media library, and attached to the correct variation. This alone would have taken weeks by hand; the script handled it in hours.
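The renaming step can be sketched as a pure function that turns a product name, a color, and a position into a descriptive file name. The naming pattern and the `.jpg` fallback for URLs without an extension are assumptions made for this example, not the project's documented convention.

```python
import re
import unicodedata
from pathlib import PurePosixPath
from urllib.parse import urlparse


def slugify(text):
    """Lowercase ASCII slug: accents stripped, other characters collapsed to '-'."""
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def seo_image_name(product_name, color, index, source_url):
    """Build a descriptive file name, keeping the source file's extension."""
    # Assumption: fall back to .jpg when the source URL carries no extension.
    ext = PurePosixPath(urlparse(source_url).path).suffix.lower() or ".jpg"
    parts = [slugify(product_name), slugify(color), str(index)]
    return "-".join(p for p in parts if p) + ext
```

With this scheme, a camera-style name such as `IMG_0042.JPG` becomes something like `veste-ete-legere-bleu-marine-2.jpg`, which tells both search engines and humans what the picture shows.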

Category and tag setup

Categories were re-created on the new site with the same hierarchy as the source, and each product was assigned to its correct nodes. Category-level SEO text and attributes were migrated alongside.
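Re-creating a hierarchy means every parent must exist before its children, because a child needs its parent's new ID. A minimal sketch of that ordering follows; the `create` callback is hypothetical and stands in for whatever call actually creates a category on the target store.

```python
def category_paths(products):
    """Collect every category path prefix once, parents before children."""
    seen = set()
    ordered = []
    for product in products:
        path = tuple(product["category_path"])
        for depth in range(1, len(path) + 1):
            prefix = path[:depth]
            if prefix not in seen:
                seen.add(prefix)
                ordered.append(prefix)
    return ordered


def create_categories(paths, create):
    """Create categories in order, handing each child its parent's new ID.

    `create(name, parent_id)` is supplied by the caller (hypothetical here);
    it creates one category on the target store and returns its ID.
    """
    ids = {(): 0}    # parent ID 0 stands for "top level", as with WordPress terms
    for path in paths:
        ids[path] = create(path[-1], ids[path[:-1]])
    return ids
```

Keying the ID map by the full path rather than by name keeps two categories with the same label under different parents (for example "Jackets" under both "Women" and "Men") from colliding.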

Storefront design configuration

Beyond the data migration itself, we configured the WooCommerce theme for smooth navigation: filters by size and color, clear product grids, responsive design, fast-loading category pages. The result is not just "the same catalog elsewhere"; it is an upgraded version of it.

The hardest technical problems solved

  • Massive data import without timeouts: the import ran in carefully sized batches so that no single request ran into PHP's execution-time or memory limits.
  • Variation deduplication: same-attribute variations coming from different source product pages were matched and merged where appropriate.
  • Category filter configuration: custom attribute taxonomies were tuned so the filter plugin displayed the right options on the right category pages.
  • Data consistency: the script includes post-import checks comparing the count of imported products and variations against the source, catching any silent drop.
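Three of these problems (batching, variation deduplication, and count checks) reduce to small, testable helpers. The sketch below is illustrative only: the dictionary keys, the merge rule (same parent SKU and same attribute set), and any batch size passed in are assumptions, not the project's actual settings.

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def merge_variations(variations):
    """Keep one variation per (parent SKU, attribute set), merging image lists."""
    merged = {}
    for v in variations:
        key = (v["parent_sku"], frozenset(v["attributes"].items()))
        if key in merged:
            for url in v["images"]:
                if url not in merged[key]["images"]:
                    merged[key]["images"].append(url)
        else:
            merged[key] = {**v, "images": list(v["images"])}   # copy, do not mutate input
    return list(merged.values())


def reconcile(source_counts, imported_counts):
    """Return {key: (expected, actual)} for every count that does not match."""
    return {
        key: (expected, imported_counts.get(key, 0))
        for key, expected in source_counts.items()
        if imported_counts.get(key, 0) != expected
    }
```

With an assumed batch size of 100, the 585 products go out as six requests (five of 100 and one of 85). An empty result from `reconcile` means every count matched; anything else names the entity that was silently dropped and by how much.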

The delivered outcome

  • A fully functional WooCommerce site, visually and functionally equivalent to the original, with a modernized UX.
  • Zero manual data entry for the client's team during the whole migration.
  • SEO-friendly URLs, alt tags, and metadata preserved or improved versus the legacy site.
  • A reusable migration framework that has since been adapted for other catalog migrations on WooCommerce, Shopify, and Magento.

Technologies used

Web Scraping, WooCommerce, WordPress, Data Migration, Custom Scripting
