Problem
Search engines expect a sitemap at https://yoursite/sitemap.xml
listing every public URL with its last-modified timestamp. The
content already lives in the page tree; you just need an XML
view of it.
Recipe
sitemap.xml is not a page (it returns application/xml, not
HTML), so the ?Page slots on PageResolving and RouteNotFound
do not fit. The right shape is a plugin that subscribes to
RouteNotFound, recognises the path, emits the XML directly via
header() + echo + exit, and never touches the resolution slot.
The listener short-circuits Scriptor's pipeline before the 404
fires.
namespace Acme\Sitemap;

use League\Container\Container;
use Scriptor\Boot\Events\Frontend\RouteNotFound;
use Scriptor\Boot\Frontend\PageRepository;
use Scriptor\Boot\Plugin\Plugin as ScriptorPlugin;
use Scriptor\Boot\Plugin\PluginContext;

final class Plugin implements ScriptorPlugin
{
    public function __construct(private readonly Container $container) {}

    public function register(PluginContext $context): void
    {
        $context->subscribe(RouteNotFound::class, [$this, 'onUnresolved']);
    }

    public function version(): string { return '0.1.0'; }

    public function onUnresolved(RouteNotFound $event): void
    {
        $path = '/' . $event->urlSegments->path(false);
        if ($path !== '/sitemap.xml') return;

        $pages = $this->container->get(PageRepository::class)->findAll();
        $siteUrl = rtrim(self::detectSiteUrl(), '/');

        $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
        $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
        foreach ($pages as $page) {
            if (! $page->active()) continue;
            $loc = $siteUrl . self::pathFor($page);
            $xml .= sprintf(
                " <url><loc>%s</loc><lastmod>%s</lastmod></url>\n",
                htmlspecialchars($loc, ENT_XML1 | ENT_QUOTES, 'UTF-8'),
                date('c', $page->updated()),
            );
        }
        $xml .= '</urlset>' . "\n";

        header('Content-Type: application/xml; charset=utf-8');
        echo $xml;
        exit;
    }

    private static function pathFor($page): string
    {
        return $page->slug === '' ? '/' : '/' . $page->slug . '/';
    }

    private static function detectSiteUrl(): string
    {
        $scheme = (($_SERVER['HTTPS'] ?? '') === 'on') ? 'https' : 'http';
        return $scheme . '://' . ($_SERVER['HTTP_HOST'] ?? 'localhost');
    }
}
Three pieces a reader cannot infer:
- No resolution slot is filled. The listener short-circuits
with exit; the slot stays null but throw404() never runs because
the request is already over. Same pattern as the legacy-redirect
variant of the Replace 404 with a fallback handler recipe.
- pathFor() honours the empty-slug home convention. Since
Scriptor's home page is identified by slug = '' (not by id = 1),
the sitemap entry for the home page reads /, not //. Themes that
use $site->getPageUrl($page) get the same shape for free.
- detectSiteUrl() reads the live request rather than the
container-bound Site::siteUrl. The Site instance is never
constructed when the listener short-circuits; this small helper
re-derives the base URL from $_SERVER. For sites with a pinned
canonical hostname, hardcode it instead.
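The last two points can be sketched as standalone functions. This is only an illustration, not part of Scriptor: resolveSiteUrl() and pathForSlug() are hypothetical names, and the $canonical parameter stands in for wherever a site would keep its pinned hostname. The fallback branch mirrors detectSiteUrl() from the recipe.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: prefer a pinned canonical base URL and fall back
 * to the request-derived one, as detectSiteUrl() does in the recipe.
 * $server is passed in (normally $_SERVER) so the function stays testable.
 */
function resolveSiteUrl(?string $canonical, array $server): string
{
    if ($canonical !== null && $canonical !== '') {
        // Strip a trailing slash so the result concatenates cleanly with a path.
        return rtrim($canonical, '/');
    }
    $scheme = (($server['HTTPS'] ?? '') === 'on') ? 'https' : 'http';
    return $scheme . '://' . ($server['HTTP_HOST'] ?? 'localhost');
}

/**
 * Hypothetical helper: same rule as pathFor(), but taking the slug
 * directly. The empty slug (the home page) maps to "/", never "//".
 */
function pathForSlug(string $slug): string
{
    return $slug === '' ? '/' : '/' . $slug . '/';
}
```

With a pinned host, resolveSiteUrl('https://example.com/', $_SERVER) . pathForSlug('') yields https://example.com/ regardless of which hostname the crawler used to reach the site.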
Variants
Filter pages by template or pagetype
Public sitemap entries usually exclude internals (legal pages, preview-only templates). Gate the loop:
$publicTemplates = ['basic', 'longform', 'post'];
foreach ($pages as $page) {
    if (! $page->active()) continue;
    if (! in_array($page->template, $publicTemplates, strict: true)) continue;
    // ...
}
Add changefreq + priority
The sitemaps.org schema accepts <changefreq> and <priority>
when you can derive them. Most sites do not bother; when you do:
$xml .= sprintf(
    " <url><loc>%s</loc><lastmod>%s</lastmod>"
    . "<changefreq>%s</changefreq><priority>%s</priority></url>\n",
    htmlspecialchars($loc, ENT_XML1 | ENT_QUOTES, 'UTF-8'),
    date('c', $page->updated()),
    $page->template === 'post' ? 'weekly' : 'monthly',
    $page->slug === '' ? '1.0' : '0.7',
);
Search engines treat these as hints, not directives, so the sitemap stays accurate enough even when the values drift.
Cache the output
The full-tree iteration is cheap for the page counts most
Scriptor sites carry (hundreds, not hundreds of thousands), so
runtime regeneration on every request is fine. For larger sites,
write the XML to a file on the first request of the day and serve
the file from a static path; the entire listener becomes a
file-existence check plus a readfile + exit.
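A minimal sketch of the once-a-day cache, assuming the XML is cached in a file the PHP process can write to. cachedSitemap() and its parameters are hypothetical names, not Scriptor API; $build is any callable that returns the XML string (for example, the loop from the recipe moved into a closure). Unlike the static-path variant described above, this version keeps the freshness check inside the listener and returns the XML for the listener to echo.

```php
<?php
declare(strict_types=1);

/**
 * Return the sitemap XML, rebuilding it at most once per calendar day.
 * $now is injectable so the expiry rule can be tested without waiting.
 */
function cachedSitemap(string $cacheFile, callable $build, ?int $now = null): string
{
    $now ??= time();
    // Drop PHP's cached stat data so is_file()/filemtime() see the current file.
    clearstatcache(true, $cacheFile);

    // The cache is fresh when the file was last written on the same day as $now.
    $fresh = is_file($cacheFile)
        && date('Y-m-d', (int) filemtime($cacheFile)) === date('Y-m-d', $now);

    if ($fresh) {
        $cached = file_get_contents($cacheFile);
        if ($cached !== false) {
            return $cached;
        }
    }

    $xml = $build();

    // Write to a temporary file and rename it into place, so a concurrent
    // request never reads a half-written sitemap.
    $tmp = $cacheFile . '.' . bin2hex(random_bytes(4)) . '.tmp';
    if (file_put_contents($tmp, $xml) !== false) {
        rename($tmp, $cacheFile);
    }

    return $xml;
}
```

Inside onUnresolved() the call would replace the inline loop: build the closure, pass it to cachedSitemap() together with a writable cache path, then send the header, echo the result, and exit as before. If the cache file cannot be written, the function still returns freshly built XML, so the sitemap degrades to per-request regeneration rather than failing.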
See also
- RSS / Atom feed from a page tree: the same
shape (RouteNotFound + emit XML + exit) with a different schema
and a findInTimeRange() filter instead of findAll()
- Replace 404 with a fallback handler: the legacy-redirect
variant uses the same short-circuit pattern this recipe relies on
- PageRepository: findAll(), findInTimeRange(),
findActiveByParent() are the three iteration entry points
- Page: slug, active(), updated() are the three fields this
recipe reads
- RouteNotFound: the event + its urlSegments accessor
- Site: getPageUrl() already implements the empty-slug
convention pathFor() reproduces inline