Resource hub

Help center, FAQ, and changelog in one place.

Catch the latest guidance on matching datasets, admin packs, and sessions. Browse the help center, skim the workflow FAQ, and review recent releases—all from the resource hub.

Help center

Start with the guide that matches your task: preparing a matching dataset, curating an admin pack, or navigating the pipeline.

Matching datasets

Data prep playbook: shape the sheet, name geography columns clearly, keep codes separate, and lock detected columns before matching.

Admin packs

Normalized GeoParquet requirements, alias hygiene, and how versioning/metadata keep packs trustworthy in crosswalk exports.

Session pipeline

From profiling through QA/export: what each stage saves, when to rerun, and how sessions stay reproducible.

Matching dataset guide

Clean, clearly labeled columns make profiling instant and keep every session reusing the same mapping. Prep the sheet first, then upload from the Matching Dataset hero CTA.

Prep steps

Make detection trivial

  • Shape: one header row, one geography concept per column, no merged cells. Delete empty helper columns and extra Excel tabs.
  • Name: spell out the level (province/regency/district/village). Keep codes in their own column; avoid mixing codes with names.
  • Anchor: latitude and longitude as two numeric WGS84 columns; quick scan for swapped values or commas instead of dots.
  • Lock: after upload, pick the correct geography columns in the Matching Dataset page and lock them so every session reuses the same mapping.

Recommended sheet spec

Upload profile

FormatCSV/TSV up to ~200 MB
ExcelBest under ~100k rows; export wide sheets
Geography1 column per level; optional code column
Coordinateslat / lon numeric, WGS84

Common fixes

  • Rename vague headers like "location" to their actual level before re-upload.
  • Split "province / district / village" cells into separate columns.
  • If detection guesses wrong, select the right columns once and lock them for all sessions.

What stays in the bundle

The crosswalk bundle keeps your cleaned headers, locked geography columns, alias rules, and QA notes. Prepping the file up front means exports stay consistent across runs.

Workflow FAQ

Actionable answers for the three flows you run most: preparing a matching dataset, curating an admin pack, and keeping sessions reproducible.

Matching datasetUpload & profiling

Get the file ready to match

Three rules to avoid re-uploads: one location concept per column, no merged headers, and separate latitude/longitude.

Headers that never confuse profiling

Keep a single header row, avoid merged cells, and trim whitespace. Split free-form 'province / district / village' cells into dedicated columns so GeoMatcher can detect the hierarchy cleanly.

File format and size sweet spot

CSV/TSV up to ~200 MB load fastest. XLSX works best under ~100k rows; if the sheet is wide, export to CSV first and remove empty columns. Keep lat/lon as two numeric columns in decimal degrees.

If column detection picks the wrong fields

On the Matching Dataset page, choose the geography columns manually and lock them before starting a session. Rename vague headers like 'location' to their actual level (province, district, etc.) for the next upload.

Open Matching Dataset
Admin packBoundary data

Publish a pack people can trust

Treat your admin pack like a product: predictable schema, clear versioning, and a handful of real-world aliases.

Must-have fields

Flat/normalized rows only: one admin unit per row with geometry, `admin_level` (starting at 1 for the top level), `code`, and `name`. Include `parent_code` when possible. Wide tables with per-level columns are rejected.

How many aliases should I keep?

Include 3-5 common spellings or abbreviations per feature (e.g., 'DKI Jakarta', 'Jakarta', 'Jakarta Raya'). Keep them in an 'aliases' column as a JSON array or comma-separated text; avoid duplicates or unrelated nicknames.

Versioning and replacing packs

Upload a new pack with an updated version string instead of overwriting. Sessions record the pack hash, so you always know which version drove a crosswalk. Note changes (e.g., alias refresh, geometry corrections) in the description.

Go to Admin Packs
Sessions & QAReproducible runs

Keep matching outcomes predictable

Think of a session as a frozen experiment: it stores the admin pack, thresholds, alias rules, and QA notes you used.

When to start a new session

Spin up a new one when you change the admin pack, tweak thresholds, or want to test fresh alias rules without losing the previous run. Keep the original session for auditability.

Rescuing unmatched rows quickly

Filter for unmatched rows, add alias rules for repeating patterns, then rerun matching so the rules apply everywhere. If codes exist in your dataset, map those first to boost scores.

Exporting and sharing

Finish QA, then download the crosswalk bundle from the Session page. It includes the crosswalk CSV, cleaned dataset, QA summary, and the admin pack hash used for that run.

View Sessions

Changelog

Latest releases and improvements to GeoMatcher.

2025.12.05 · Session pipeline and QA

December 2, 2025

Workspace
  • End-to-end session runner now keeps you in-browser: column detection, fuzzy/coordinate matching, QA checks, and crosswalk export live in one flow.
  • Session creation defaults to core geography columns, auto-selects the remaining dataset fields, and lets you tune quality thresholds plus fuzzy fallback.
  • Gallery cards track staged → in review → ready to export states with QA notes and bundle shortcuts so you can resume or clear runs confidently.

2025.11.12 · Matching dataset + admin packs

November 17, 2025

Workspace
  • Matching Dataset hub profiles uploads with delimiter checks, row/column counts, and an in-browser table preview—no server roundtrips needed.
  • Geography column selector now persists detected province/regency/district/village or lat/lon choices, ready for reuse when starting sessions.
  • Admin Pack library ships Jakarta and West Java GeoParquet downloads, plus load/remove/preview flows under the Loaded/Core/Extended tabs.