Content cleanup during migration
In large government site migrations, keep only the good stuff and clean as you go
At Florida DrupalCamp 2019, Steve Wirt, an engineer and migration expert at CivicActions, shows how content cleanup can be a programmatic part of large content migrations. Steve enjoys helping government clients transform their technologies and teaching best practices for keeping websites clean and healthy.
In this session, Steve demonstrates techniques to grab only the content that is needed from legacy pages and to fix markup issues as part of the migration. Keeping just the good stuff saves agencies time and money while making sure their content isn’t stale or outdated.
The approach described is platform-agnostic, but the resources and examples are from Drupal 8.
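Because the approach is platform-agnostic, the core idea can be sketched outside Drupal. The following is a minimal illustration in Python using only the standard library, not the Migrate API or the Migration Tools module shown in the session. The container id, the tag and attribute lists, and the names `ContentCleaner` and `clean_legacy_page` are all hypothetical, and the sketch assumes reasonably well-balanced legacy markup.

```python
import html
from html.parser import HTMLParser

# Hypothetical cleanup rules, chosen for illustration only.
DROP_WITH_CONTENT = {"script", "style", "nav", "footer"}  # removed with their content
UNWRAP = {"font", "center"}                               # tag removed, inner text kept
DROP_ATTRS = {"style", "class", "align", "bgcolor"}       # presentational attributes
VOID = {"br", "hr", "img"}                                # elements with no closing tag


class ContentCleaner(HTMLParser):
    """Keep only the markup inside one container element, cleaning as it goes."""

    def __init__(self, container_id):
        super().__init__(convert_charrefs=True)
        self.container_id = container_id
        self.depth = 0       # nesting depth inside the container (0 means outside)
        self.skip_depth = 0  # nesting depth inside an element being dropped
        self.out = []

    def _emit_start(self, tag, attrs):
        kept = "".join(
            f" {name}" if value is None else f' {name}="{html.escape(value)}"'
            for name, value in attrs
            if name not in DROP_ATTRS
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            # Outside the container: only watch for the container itself.
            if tag not in VOID and dict(attrs).get("id") == self.container_id:
                self.depth = 1
            return
        if tag in VOID:
            if self.skip_depth == 0:
                self._emit_start(tag, attrs)
            return
        self.depth += 1
        if self.skip_depth or tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if tag not in UNWRAP:
            self._emit_start(tag, attrs)

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID:
            return
        self.depth -= 1
        if self.depth == 0:
            return  # closing tag of the container itself
        if self.skip_depth:
            self.skip_depth -= 1
            return
        if tag not in UNWRAP:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0 and self.skip_depth == 0:
            self.out.append(html.escape(data, quote=False))


def clean_legacy_page(page_html, container_id="content"):
    """Return the cleaned inner markup of the element with the given id."""
    parser = ContentCleaner(container_id)
    parser.feed(page_html)
    parser.close()
    return "".join(parser.out).strip()


legacy = (
    '<html><body><div id="header"><p>Site nav</p></div>'
    '<div id="content"><h1 class="big">News</h1>'
    '<p style="color:red"><font size="2">Budget &amp; plans</font></p>'
    '<script>track();</script></div>'
    '<div id="footer">Old footer</div></body></html>'
)
print(clean_legacy_page(legacy))
# <h1>News</h1><p>Budget &amp; plans</p>
```

The header, footer, and script never reach the destination, and the presentational `class`, `style`, and `<font>` markup is stripped while the text is kept. A real migration would apply rules like these per source page type, since markup often differs across a site.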
Highlights
1:45 — Types of “baggage” you find during large site migrations
2:33 — Reasons to address content cleanup during migration
4:20 — Why migrations are like moving from one house to another
7:00 — How to make the tough decisions about what content to keep
7:20 — Using Google Sheets for content audits (and using analytics data)
8:03 — Moving content via the Migrate API (hint: move only what you NEED)
10:20 — What to do when pages across the site have different markup
13:48 — Using Migration Tools module to obtain, normalize, and validate data
22:05 — Using DOM Modifiers to reduce content noise (clean up what you migrate)
25:18 — Order of operations: timing of moving and cleanup activities using DOM Operations
31:38 — When hands-on cleanup is necessary, and how to do it
Resources
- Deck: Content Cleanup During Migration
- Migrate
- Migrate Plus
- Migrate API
- Migration Tools module
- QueryPath
- AGL Association, a nonprofit helping government modernize