Web Scraping and Catalog Import on WooCommerce with Variations
585 products with variations, photos, and descriptions migrated to WooCommerce by a custom scraping script, replacing months of manual data entry.
About this project
Migrating 585 products with full variations from a legacy site to a brand-new WooCommerce store — without a single line of manual data entry
Catalog migration is one of those projects that looks deceptively simple from the outside and turns out to be one of the most error-prone jobs in e-commerce. A store with 585 products, each with multiple sizes, colors, photos, and rich descriptions, represents thousands of individual data points. Done by hand, it takes months of tedious copy-pasting, and every mistake silently damages the new store's SEO or customer experience. Done by script, it takes days — if the engineering is right.
This project is a textbook example of that second path: a custom-built Python scraping and import pipeline that moved the entire catalog of payperwear.com to a fresh WooCommerce installation, variations included, categories included, images included, with the design finalized for optimal browsing.
The scope of the migration
The client needed the complete contents of their existing site re-created on a new WooCommerce platform. Specifically:
- 585 products in total, across multiple categories.
- For each product: a full set of variations (sizes, colors, sometimes material or fit), each variation tied to its own SKU and stock status.
- Rich product descriptions including formatting, lists, and product-specific marketing copy.
- Photos in multiple resolutions, aligned with each color variation.
- A coherent category tree matching the structure of the source site to preserve navigation logic and SEO topical relevance.
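The scope above maps onto a small record type. The following is a minimal sketch of what one catalog entry could look like in an intermediate format; the class names, field names, and the JSON serialization are illustrative assumptions, not the schema used in the actual project.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Variation:
    # Hypothetical fields, mirroring the scope list above.
    sku: str
    attributes: dict  # e.g. {"Size": "M", "Color": "Navy"}
    in_stock: bool
    image_urls: list = field(default_factory=list)


@dataclass
class Product:
    name: str
    description_html: str  # rich description, kept as HTML
    category_path: list    # e.g. ["Men", "Shirts"]
    variations: list = field(default_factory=list)


def to_json(product: Product) -> str:
    """Serialize one product, including nested variations, to JSON."""
    return json.dumps(asdict(product), ensure_ascii=False, indent=2)


shirt = Product(
    name="Linen Shirt",
    description_html="<p>Lightweight summer shirt.</p>",
    category_path=["Men", "Shirts"],
    variations=[
        Variation(sku="LS-M-NAVY",
                  attributes={"Size": "M", "Color": "Navy"},
                  in_stock=True),
    ],
)
print(to_json(shirt))
```

Keeping the scraped data in a neutral structure like this would let the scraping and import stages be run and checked separately.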
The technical approach
Custom scraping script
A Python-based scraper was written to walk the entire source site, extract structured data from each product page, download the associated images, and normalize the whole catalog into a clean intermediate format. Because the source site had inconsistencies between product templates, the parser was built to tolerate variation in HTML structure without skipping records.
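One common way to tolerate template differences is to try several candidate selectors for the same field and take the first that matches. The sketch below illustrates that idea using only the Python standard library; the class names (`product-title`, `item-name`) and the sample HTML are invented for the example, and the real scraper may well have used a different parsing library and different selectors.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked as "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextCollector(HTMLParser):
    """Collect the text found inside elements, grouped by CSS class."""

    def __init__(self):
        super().__init__()
        self._open = []          # stack of (tag, [classes]) for open elements
        self.text_by_class = {}  # class name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self._open.append((tag, classes))

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        for _, classes in self._open:
            for cls in classes:
                self.text_by_class.setdefault(cls, []).append(text)


def extract_field(html, candidate_classes):
    """Return text for the first candidate class present, else None."""
    collector = ClassTextCollector()
    collector.feed(html)
    for cls in candidate_classes:
        parts = collector.text_by_class.get(cls)
        if parts:
            return " ".join(parts)
    return None


# Hypothetical class names for two different product templates.
TITLE_CLASSES = ["product-title", "item-name"]

old_template = '<div><h1 class="product-title">Linen Shirt</h1></div>'
new_template = '<section><h2 class="item-name">Wool <em>Coat</em></h2><br></section>'

print(extract_field(old_template, TITLE_CLASSES))
print(extract_field(new_template, TITLE_CLASSES))
```

Returning `None` instead of raising lets the caller log the page for review rather than silently dropping the record.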
Variation handling
WooCommerce's variation model has its quirks. Each product needed its attributes (size, color, etc.) registered first, then the parent product created, then every variation inserted one by one with the right SKU, price, stock, and image. The script automated this multi-step sequence across all 585 products.
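The ordering described above can be sketched as two payload builders: one for the parent product, which declares the attributes and their options, and one for each variation. This example assumes the import went through the WooCommerce REST API (`/wp-json/wc/v3/products` and its `variations` sub-resource); the source does not say which import mechanism was used, and the input dictionary layout is hypothetical. No network calls are made here.

```python
def build_parent_payload(name, description_html, variations):
    """Build the body for creating a variable product.

    `variations` is a list of dicts with keys 'sku', 'price', 'in_stock',
    'attributes' (attribute name -> option) and optionally 'image_url'.
    This input layout is an assumption made for the example.
    """
    options = {}
    for variation in variations:
        for attr_name, option in variation["attributes"].items():
            seen = options.setdefault(attr_name, [])
            if option not in seen:
                seen.append(option)
    return {
        "name": name,
        "type": "variable",
        "description": description_html,
        "attributes": [
            {"name": attr_name, "visible": True,
             "variation": True, "options": values}
            for attr_name, values in options.items()
        ],
    }


def build_variation_payload(variation):
    """Build the body for one variation of an existing parent product."""
    payload = {
        "sku": variation["sku"],
        "regular_price": variation["price"],  # the API expects a string
        "stock_status": "instock" if variation["in_stock"] else "outofstock",
        "attributes": [
            {"name": attr_name, "option": option}
            for attr_name, option in variation["attributes"].items()
        ],
    }
    if variation.get("image_url"):
        payload["image"] = {"src": variation["image_url"]}
    return payload


scraped = [
    {"sku": "LS-M-NAVY", "price": "49.00", "in_stock": True,
     "attributes": {"Size": "M", "Color": "Navy"}},
    {"sku": "LS-L-NAVY", "price": "49.00", "in_stock": False,
     "attributes": {"Size": "L", "Color": "Navy"},
     "image_url": "https://example.com/ls-navy.jpg"},
]
parent = build_parent_payload("Linen Shirt", "<p>Light.</p>", scraped)
children = [build_variation_payload(v) for v in scraped]
```

In a real run the parent would be created first so that its returned ID can be used in the URL for the variation requests.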
Image pipeline
All images were downloaded, renamed for SEO, re-uploaded to the WooCommerce media library, and attached to the correct variation. This alone would have taken weeks by hand; the script handled it in hours.
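As an illustration of the renaming step, the sketch below derives a descriptive, URL-safe file name from the product name and color. The naming pattern (`product-color-index.ext`) and the function names are assumptions for the example; the project's actual naming rules are not documented here.

```python
import posixpath
import re
import unicodedata
from urllib.parse import urlparse


def slugify(text):
    """Lowercase, strip accents, and replace other characters with hyphens."""
    ascii_text = (unicodedata.normalize("NFKD", text)
                  .encode("ascii", "ignore")
                  .decode("ascii"))
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def seo_filename(product_name, color, index, source_url):
    """Build a descriptive file name, keeping the source file's extension."""
    path = urlparse(source_url).path          # drops any query string
    extension = posixpath.splitext(path)[1].lower() or ".jpg"
    return f"{slugify(product_name)}-{slugify(color)}-{index}{extension}"


print(seo_filename("Linen Shirt Été", "Navy Blue", 2,
                   "https://example.com/img/IMG_0042.JPG?v=3"))
# prints: linen-shirt-ete-navy-blue-2.jpg
```

Including the color in the file name also makes it easier to attach each image to the matching color variation later.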
Category and tag setup
Categories were re-created on the new site with the same hierarchy as the source, and each product was assigned to its correct nodes. Category-level SEO text and attributes were migrated alongside.
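Re-creating a hierarchy requires every parent category to exist before its children, since a child needs its parent's ID. A minimal sketch of that ordering step follows, assuming the scraped categories are available as breadcrumb-style strings such as `Men > Shirts`; that input format and the function name are hypothetical.

```python
def category_creation_order(breadcrumbs):
    """Return category nodes as tuples, parents always before children.

    Each node is the full path from the root, e.g. ("Men", "Shirts").
    The parent of a node is simply node[:-1].
    """
    ordered = []
    seen = set()
    for breadcrumb in breadcrumbs:
        parts = [part.strip() for part in breadcrumb.split(">") if part.strip()]
        for depth in range(1, len(parts) + 1):
            node = tuple(parts[:depth])
            if node not in seen:
                seen.add(node)
                ordered.append(node)
    return ordered


source_paths = ["Men > Shirts", "Men > Trousers", "Women", "Men > Shirts > Linen"]
for node in category_creation_order(source_paths):
    print(" > ".join(node))
```

While creating the nodes in this order, an importer could keep a map from each path tuple to the ID returned by the store and look up `node[:-1]` to find the parent ID.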
Storefront design configuration
Beyond the data migration itself, we configured the WooCommerce theme for smooth navigation: filters by size and color, clear product grids, responsive design, fast-loading category pages. The result is not just "the same catalog elsewhere"; it is an upgraded version of it.
The hardest technical problems solved
- Massive data import without timeouts: the import ran in carefully sized batches so that no single request hit PHP's execution time limit or exhausted its memory limit.
- Variation deduplication: same-attribute variations coming from different source product pages were matched and merged where appropriate.
- Category filter configuration: custom attribute taxonomies were tuned so the filter plugin displayed the right options on the right category pages.
- Data consistency: the script includes reconciliation checks that compare the count of imported products and variations against the source, catching any silent drop.
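Two of the points above, batching and count reconciliation, can be sketched in a few lines. The batch size of 50 and the shape of the count dictionaries (product key mapped to number of variations) are assumptions chosen for illustration; the source does not state the values or data layout actually used.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reconcile(expected, actual):
    """Compare variation counts per product.

    Returns {product_key: (expected_count, actual_count)} for every
    product whose counts differ, including products present on only
    one side.
    """
    problems = {}
    for key, want in expected.items():
        got = actual.get(key, 0)
        if got != want:
            problems[key] = (want, got)
    for key in actual.keys() - expected.keys():
        problems[key] = (0, actual[key])
    return problems


# 585 products split into batches of 50 (an assumed batch size).
product_ids = list(range(585))
batches = list(chunked(product_ids, 50))
print(len(batches), len(batches[-1]))  # prints: 12 35

# Hypothetical counts: one variation went missing, one product is unexpected.
source_counts = {"linen-shirt": 6, "wool-coat": 4}
imported_counts = {"linen-shirt": 6, "wool-coat": 3, "mystery-item": 2}
print(reconcile(source_counts, imported_counts))
```

An empty result from `reconcile` means every product arrived with the expected number of variations; anything else points at the exact records to re-import.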
The delivered outcome
- A fully functional WooCommerce site, visually and functionally equivalent to the original, with a modernized UX.
- Zero manual data entry for the client's team during the whole migration.
- SEO-friendly URLs, alt tags, and metadata preserved or improved versus the legacy site.
- A reusable migration framework that has since been adapted for other catalog migrations on WooCommerce, Shopify, and Magento.