Grav Sitemap Plugin

Sitemap is a Grav plugin that generates a map of your pages in XML format, which search engines can easily read and index.

Installation

Installing the Sitemap plugin can be done in one of two ways. Our GPM (Grav Package Manager) installation method enables you to quickly and easily install the plugin with a simple terminal command, while the manual method enables you to do so via a zip file.

GPM Installation (Preferred)

The simplest way to install this plugin is via the Grav Package Manager (GPM) through your system's Terminal (also called the command line). From the root of your Grav install type:

bin/gpm install sitemap

This will install the Sitemap plugin into your /user/plugins directory within Grav. Its files can be found under /your/site/grav/user/plugins/sitemap.

Manual Installation

To install this plugin, just download the zip version of this repository and unzip it under /your/site/grav/user/plugins. Then, rename the folder to sitemap. You can find these files either on GitHub or via GetGrav.org.

You should now have all the plugin files under

/your/site/grav/user/plugins/sitemap

NOTE: This plugin is a modular component for Grav which requires Grav, the Error and Problems plugins, and a theme to be installed in order to operate.

Usage

The sitemap plugin works out of the box. You can just go directly to http://yoursite.com/sitemap and you will see the generated XML.

Config Defaults

enabled: true
route: '/sitemap'
ignore_external: true
ignore_protected: true
ignore_redirect: true
ignores:
  - /blog/blog-post-to-ignore
  - /ignore-this-route
  - /ignore-children-of-this-route/.*
include_news_tags: false
standalone_sitemap_news: false
sitemap_news_path: '/sitemap-news.xml'
news_max_age_days: 2
news_enabled_paths:
  - /blog
whitelist:
html_support: false
urlset: 'http://www.sitemaps.org/schemas/sitemap/0.9'
urlnewsset: 'http://www.google.com/schemas/sitemap-news/0.9'
short_date_format: true
include_changefreq: true
changefreq: daily
include_priority: true
priority: !!float 1
additions:
  -
    location: /something-special
    lastmod: '2020-04-16'
    changefreq: hourly
    priority: 0.3
  -
    location: /something-else
    lastmod: '2020-04-17'
    changefreq: weekly
    priority: 0.2

You can ignore your own pages by providing a list of routes to ignore. You can also use a page's Frontmatter to signal that the sitemap should ignore it:

sitemap:
    ignore: true
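
As a rough illustration of how a list of ignore patterns like the one in the config defaults could be applied, the Python sketch below treats each entry as a regular expression that must match the whole route. The function name `is_ignored` and the full-match, case-sensitive behavior are assumptions made for this example; the plugin itself is written in PHP and may anchor or normalize patterns differently.

```python
import re

# Ignore patterns copied from the config defaults shown above.
IGNORES = [
    "/blog/blog-post-to-ignore",
    "/ignore-this-route",
    "/ignore-children-of-this-route/.*",
]


def is_ignored(route, ignores=IGNORES):
    """Return True if any pattern matches the entire route (assumed semantics)."""
    return any(re.fullmatch(pattern, route) for pattern in ignores)


print(is_ignored("/ignore-children-of-this-route/child"))
print(is_ignored("/blog/another-post"))
```

Under these assumptions, a pattern ending in `/.*` covers the children of a route but not the route itself, so the parent would need its own entry if it should be ignored too.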

Multi-Language Support

Sitemap v3.0 includes all-new multi-language support based on the latest Google Search SEO recommendations. It creates bi-directional hreflang entries for each available language.

This is handled automatically based on your Grav multi-language System configuration.

News Support

New in version 4.0 of the plugin is support for Google's News Sitemap Extension, which uses specific tags under a <news:news></news:news> tag to provide Google News-specific data. When enabled, the news tags are added when an item is in one of the configured news paths (/blog in the config defaults shown above) and its published date is not older than the configured max age (2 days by default, per Google's recommendations).

The output of the news tags is controlled by an overridable sitemap-extensions/news.html.twig template.

The default behavior when Include News Tags is enabled is to include the news tags directly in the primary sitemap.xml file. However, if you enable the Standalone News URLs option, news tags are not added to the primary sitemap.xml; instead, they are available at standalone paths that contain only the pages in the designated news paths.

For example, the default behavior is to enable /blog as a news path. If this path exists, you have content in subfolders of this page, and that content is newer than the defined "News Max Age" (2 days recommended by Google), then the news-specific sitemap is available via:

https://yoursite.com/blog/sitemap-news.xml

You can change the "News Path" to be something other than sitemap-news.xml if you wish.
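
The news rules described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `is_news_item` and `news_sitemap_url` are hypothetical, the check treats the news path itself as well as its subfolders as eligible, and the plugin's real PHP logic may differ in these details.

```python
from datetime import datetime, timedelta


def is_news_item(route, published, news_paths, max_age_days, now=None):
    """Assumed rule: the route sits under a news path and is recent enough."""
    now = now or datetime.now()
    under_news_path = any(
        route == path or route.startswith(path.rstrip("/") + "/")
        for path in news_paths
    )
    recent = now - published <= timedelta(days=max_age_days)
    return under_news_path and recent


def news_sitemap_url(base_url, news_path, sitemap_news_path="/sitemap-news.xml"):
    """Build the standalone news sitemap URL for one news path."""
    return base_url.rstrip("/") + news_path.rstrip("/") + sitemap_news_path


print(news_sitemap_url("https://yoursite.com", "/blog"))
```

Passing `now` explicitly keeps the age check deterministic, which is convenient when experimenting with different `news_max_age_days` values.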

Images

You can add images to the sitemap by adding an entry in the page's Frontmatter.

sitemap:
    images:
        your_image:
            loc: your-image.png
            caption: A caption for the image
            geoloc: Amsterdam, The Netherlands
            title: The title of your image
            license: A URL to the license of the image.

For more info on images in sitemaps see Google image sitemaps.

Only allow access to the .xml file

If you want your sitemap to only be accessible via sitemap.xml for example, set the route to /sitemap and add this to your .htaccess file:

Redirect 301 /sitemap /sitemap.xml

HTML Support

As of Sitemap version 3.0.1 you can enable html_support in the configuration and then when you go to /sitemap or /sitemap.html you will view an HTML version of the sitemap per the templates/sitemap.html.twig template.

You can copy and extend this Twig template in your theme to customize it for your needs.

Manually add pages to the sitemap

You can manually add URLs to the sitemap using the Admin settings, or by adding entries to your sitemap.yaml with this format:

additions:
  -
    location: /something-special
    lastmod: '2020-04-16'
    changefreq: hourly
    priority: 0.3
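
To show roughly what such an entry becomes in the generated XML, here is a small Python sketch that renders addition entries as `<url>` elements of the sitemap protocol. The function name `additions_to_urlset` and the way the site base URL is prefixed to `location` are assumptions for illustration; the plugin produces its output through its own PHP code and Twig templates.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def additions_to_urlset(additions, base_url):
    """Render a list of addition dicts as a <urlset> XML string."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for item in additions:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + item["location"]
        # Optional fields are only emitted when present in the entry.
        for key in ("lastmod", "changefreq", "priority"):
            if key in item:
                ET.SubElement(url, key).text = str(item[key])
    return ET.tostring(urlset, encoding="unicode")


additions = [
    {"location": "/something-special", "lastmod": "2020-04-16",
     "changefreq": "hourly", "priority": 0.3},
]
print(additions_to_urlset(additions, "https://yoursite.com"))
```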

Note that regex support is available in the ignores list: just append .* to a path to ignore all of its children.

Dynamically adding pages to the sitemap

If you have some dynamic content being added to your site via another plugin, or perhaps a 3rd party API, you can now add them dynamically to the sitemap with a simple event:

Make sure you are subscribed to the onSitemapProcessed event, then simply add your entry to the sitemap like this:

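    public function onSitemapProcessed(\RocketTheme\Toolbox\Event\Event $e)
    {
        // Fetch the sitemap entries collected so far.
        $sitemap = $e['sitemap'];
        // Build the absolute URL for the route (second argument true).
        $location = \Grav\Common\Utils::url('/foo-location', true);
        // Priority must be within 0.0-1.0 per the sitemap protocol.
        $sitemap['/foo'] = new \Grav\Plugin\Sitemap\SitemapEntry($location, '2020-07-02', 'weekly', '1.0');
        // Write the modified list back to the event.
        $e['sitemap'] = $sitemap;
    }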

The Utils::url() method lets us easily create the correct full URL by passing it a route plus the optional true parameter.

grav-plugin-sitemap's People

Contributors

chouchen, csixtyfour, flaviocopes, fnetx, giansi, hydraner, itsecmedia, jonata, lufog, mahagr, memurame, nhayward, pospiemi, rhukster, robertbak, rotzbua, ryanmpierson, schliflo, spamrakuen, tomone, tomzx, w00fz

grav-plugin-sitemap's Issues

Google Webmaster complains of sitemap error

The error is:

The XML Sitemap cannot be parsed because it contains one or more unbound namespace prefixes. For example, this error is generated when xhtml:link is found in a Sitemap without prior xmlns:xhtml="http://www.w3.org/1999/xhtml".

When the site is multi-language, it adds these lines to each page:

    <xhtml:link rel="alternate" hreflang="en" href="https://www.astroprint.com/astroprint-compatible-printers" />
    <xhtml:link rel="alternate" hreflang="es" href="https://www.astroprint.com/es/astroprint-compatible-printers" />

These seem to be the problem.
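
The error can be reproduced outside of Grav. The Python sketch below parses two versions of a minimal sitemap: one that uses the xhtml prefix without declaring it, and one that declares xmlns:xhtml on the root element, as the error message suggests. The example.com URLs and the helper name `parses` are placeholders for illustration, not plugin output.

```python
import xml.etree.ElementTree as ET

UNBOUND = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page</loc>
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/page" />
  </url>
</urlset>"""

# Same document, with the xhtml prefix declared on the root element.
BOUND = UNBOUND.replace(
    "<urlset ",
    '<urlset xmlns:xhtml="http://www.w3.org/1999/xhtml" ',
    1,
)


def parses(xml_text):
    """Return True if the XML is well-formed with all prefixes bound."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses(UNBOUND))
print(parses(BOUND))
```

Only the version with the namespace declaration parses, which suggests the fix belongs in the urlset opening tag of the sitemap template rather than in the individual xhtml:link lines.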

Content-type is text/html

Don't know if it's the plugin or my server-config but my sitemap is sent with header:

Content-Type: text/html;charset=UTF-8

This should be application/xml

Better look & different structure (sitemaps for taxonomys)

Is there any way to change the look of the sitemap?
Output with each entry on a new line would be enough (resolved by adding .xml at the end: /sitemap.xml).

  • Can we have a different sitemap structure and divide sitemap in smaller parts
    example (related to one website i'm building):
    sitemap index
    |-themes
    |-authors
    |-quotes (one page with x lines)

Support for SSL

The URLs generated for the sitemap do not use https when the site runs over SSL. The "force_ssl" setting has no effect.

Validation failed: Invalid input in "Sitemap Priority"

On each page, I've set the sitemap "Priority" and "Frequency". I've hit save and the changes applied. When I'm trying to edit the page in "Expert" mode, an error message comes up "Validation failed: Invalid input in "Sitemap Priority", but it works OK with the "Normal" mode.

Ignores - question

I'm unclear how the ignores works... I have some folders of content named _09.defunct_pages - non-routable. Sitemap still includes it, so I have tried

ignores:
  - /_09.defunct_pages
  - /_09.defunct_pages/*

and other combinations to no avail...

Ideas?

Thanks!

Different behaviour/display in different system for sitemap output

I am using Grav on my local Windows machine and on a Linux server using XAMPP.
I have installed the sitemap plugin on both systems and kept the two systems in sync.
I have made two small changes to the plugin:

  1. Commented the sorting line in the sitemap.php (ksort($routes); --> line 64)
  2. Changed the line 75 in sitemap.php from '$entry->location = $page->canonical();' ---> '$entry->location = $this->grav['base_url_relative'] . $page->routecanonical();'

The output of these changes is completely different on the two systems.
Can you please help me with this? My requirement depends on it, and I need the entries to be in the proper order.

Thanks in advance

Regards,
Kumaravel

Multi-lang sitemap

Having a multi-lang site I would expect to see entries for each supported language (en, de) but only see en.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
                <loc>.../en</loc>
                <lastmod>2015-07-29</lastmod>
                <changefreq>monthly</changefreq>
                <priority>1.0</priority>
        </url>
        <url>
                <loc>.../en</loc>
                <lastmod>2015-07-29</lastmod>
                <changefreq>monthly</changefreq>
                <priority>1.0</priority>
        </url>
        <url>
                <loc>.../en/...</loc>
                <lastmod>2015-07-29</lastmod>
        </url>
        <url>

sitemap URL and stylesheet XSL support

Just installed the sitemap and it was accessible using URL http://example.com/sitemap and http://example.com/sitemap.xml

Is there a way to limit the URL? I mean just allow to be accessible using the second URL (sitemap.xml)

I don't want someone to visit the URL and see a page that looks like its styling was forgotten.

I also tried to create an XSL stylesheet and attach it to the sitemap.xml.twig that I copied into my theme folder, but it does not work unless I strip out the xmlns attribute in the urlset tag. What is the right way of doing this?

Thanks in advance.

ignore paths not saved in admin config panel

When I add new paths to the ignores on the admin config panel for the plugin and press 'save', the new path is not saved to the list. I do get a "Successfully saved" notification.

an example of a path I've tried to add:
/chapters/3-70000-obeti

Adding the paths manually in user/config/plugins/sitemap/sitemap.yaml works.

I'm using OSX Chrome and running latest Grav with Sitemap v1.7.0

Possible conflict with a particular config on Related Pages plugin

Hi,
It seems like sitemap can not be generated when related pages plugin is set with @self.siblings whereas it works with @self or @taxonomy.
I got the message Call to a member function children() on null.

Here is my config:
Related pages v1.1.3
Sitemap v1.6.2
Grav v1.1.0 - Admin v1.1.1
(Had the same problem with Grav v1.0.10)

Page.php

                            break;
                        case 'all':
                            $results = $this->children();
                            break;
                        case 'parent':
                            $collection = new Collection();
                            $results = $collection->addPage($this->parent());
                            break;
                        case 'siblings':
                            $results = $this->parent()->children()->remove($this->path());
                            break;
                        case 'descendants':
                            $results = $pages->all($this)->remove($this->path())->nonModular();
                            break;
                    }
                }

                $results = $results->published();
                break;

Here is the yaml just in case:

enabled: true
limit: 3
show_score: false
score_threshold: 20
filter:
  items: '@self.siblings'
  order:
    by: date
    dir: desc
page_in_filter: false
explicit_pages:
  process: true
  score: 100
taxonomy_match:
  taxonomy: tag
  taxonomy_taxonomy:
    process: true
    score_scale:
      1: '50'
      2: '75'
      3: '100'
  taxonomy_content:
    process: true
    score_scale:
      1: '20'
      2: '30'
      3: '45'
      4: '60'
      5: '70'
      6: '80'
      7: '90'
      8: '100'
content_match:
  process: true

And obviously the yaml for sitemap:

enabled: true
route: '/sitemap'
ignores:
  - /blog/blog-post-to-ignore
  - /ignore-this-route

Changefrequency float validation error

Setting the changefrequency to 1 causes an error:

"cannot resolve a node with !<tag:yaml.org,2002:float> explicit tag at line ..., column ...: priority: !!float 1"

sitemap:
    changefreq: yearly
    priority: !!float 1

array_key_exists exception on 1.7-rc10 with Gantry latest

Installed Sitemap plugin, enabled, browsed to '/sitemap', got the array_key_exists(): The first argument should be either a string or an integer error on line 78 of sitemap.php:

$lang_available = (empty($page_languages) || array_key_exists($current_lang, $page_languages));

(Auto)Ignore external Links

Hi,

would be great to (auto)exclude external links as it may be a problem with searchengines (will become invalid) - with external links I mean www.mysite.com is the main site and www.google.com is an external link / URL

https://productforums.google.com/forum/#!topic/webmasters/zDCMIOHZBgU;context-place=topicsearchin/webmasters/authorid$3AAPn2wQdYSQNTGlByz1-QOv-Py6U_jV9D7JF1wTOnWzTqSWU3P9A5ECifpzHnTh6e6krJOBLNEOv-%7Csort:date%7Cspell:false

Or even to have a checkbox at the pages level.

Wrong URL used in xhtml:link rel="alternate"

For a page with a page-specific "default route" set, the original address is used in xhtml:link instead of the correct route:

    <loc>site/special/collection</loc>
    <xhtml:link rel="alternate" hreflang="ru" href="site/tech" />

The value in <loc> is correct, but the one in <xhtml:link> is not (/tech should always be /special/collection).

Multilanguage sets unpublished pages as alternates

So I have a lot of multilanguage pages, but for quite a lot of them the parent language is set to published: false. What I do see in my sitemap.xml is that these pages are still included there (and generate a 404). What I expect is that pages which are not published don't end up as alternates in the sitemap.xml.

Current situation where the EN version is set to published: false.

<url>
    <loc>http://www.md.dev/nl-nl/blog/blog-post-in-dutch</loc>
    <xhtml:link rel="alternate" hreflang="en" href="http://www.md.dev/blog/blog-post-in-dutch" />
    <xhtml:link rel="alternate" hreflang="nl-nl" href="http://www.md.dev/nl-nl/blog/blog-post-in-dutch" />
    <lastmod>2017-05-03</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>

What I expect to happen:

<url>
    <loc>http://www.md.dev/nl-nl/blog/blog-post-in-dutch</loc>
    <xhtml:link rel="alternate" hreflang="nl-nl" href="http://www.md.dev/nl-nl/blog/blog-post-in-dutch" />
    <lastmod>2017-05-03</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>

Also I'm not 100% sure if it should set rel="alternate" if there is only one URL (and I guess no alternates). But I do expect the unpublished page not to be there.

add_header X-Robots-Tag "noindex"

I have had the content and url of sitemap show up on Google. Is there a way to make the header
X-Robots-Tag "noindex" show up when calling /sitemap?

Wildcard ignores

It would be useful to be able to do wildcard or regex ignores like:

/boring_chapter/*
*/boring/*

instead of

/boring_chapter/page1
/boring_chapter/page2
/boring_chapter/page3
/chapter1/boring/page1
/chapter1/boring/page2
/chapter2/boring/page1
/chapter2/boring/page2

Extra spaces

I've got error:

>   <?xml version="1.0" encoding="UTF-8"?>
> --^

Does anyone have an idea where these two spaces may come from?

Sitemap missing url elements of page translations

According to sitemap specification
https://support.google.com/webmasters/answer/189077?hl=en
If page X links to page Y, page Y must link back to page X. If this is not the case for all pages that use hreflang annotations, those annotations may be ignored or not interpreted correctly.

It means that sitemap must generate URL elements for all pages, including all translations.
As it is in example (on the same page).

But currently it generates only all pages once with translations added as alternate links.
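
The requirement quoted above can be illustrated with a short Python sketch: every translation gets its own url entry, and each entry lists all language versions, including itself, as alternates. The function name `hreflang_entries`, the dict layout, and the example.com URLs are assumptions for this example only and do not reflect the plugin's internal data structures.

```python
def hreflang_entries(translations):
    """translations maps language code -> absolute URL of that translation.

    Returns one entry per language; every entry lists all languages
    (including itself) as alternates, so links are bi-directional.
    """
    alternates = [
        {"hreflang": lang, "href": url} for lang, url in translations.items()
    ]
    return [{"loc": url, "alternates": alternates} for url in translations.values()]


entries = hreflang_entries({
    "en": "https://example.com/page",
    "es": "https://example.com/es/page",
})
for entry in entries:
    print(entry["loc"], [a["hreflang"] for a in entry["alternates"]])
```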

Attaching file with fix which works on multilingual site
sitemap.php.zip

Basically, I have added another loop to create url entries for all translations too.

See:

            $entry_translated = $page->translatedLanguages();
            foreach($entry_translated as $trans_lang => $trans_page_route) {
            
                $entry = new SitemapEntry();
                //$entry->location = $page->canonical();
                $entry_route = $this->grav['language']->getLanguageURLPrefix($trans_lang).'/'.$trans_page_route;
                $entry->location = $rootUrl.$entry_route;

And changed url generation for alternate links, because instead of page url it was using the route which may be different for pages if slug is entered (which is not the same as folder)

                    foreach($entry_translated as $lang => $page_route) {
                        //$page_route = $page->rawRoute();
                        if ($page->home()) {
                            $page_route = '';
                        }

                        $entry->translated[$lang] = '/'.$page_route;

Ignore not visible pages

Hi, and first of all, thanks for this very useful plugin :)

I always add this behavior to the plugin : Ignore pages which are not visible.
It's very useful to me because when you hide a page in Grav, this is basically to be able to edit it and preview it from the admin. What do you think about it ?

To enable this behavior, I just added this line :
if ( isset($header->visible) && !$header->visible ) $page_ignored = true;
In sitemap.php after line 76.

It may be more useful to add an option to the blueprint.

Thanks

HTML tags are added to sitemap.xml

I don't know in which version this issue started; but there are some wrong tags in sitemap.xml; it contains 'html' and 'body' at the start and end

<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="/user/plugins/sitemap/sitemap.xsl"?><html><body><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">

Seems like not working...

After installing, on a single-language, non-English site I see:
http://mysite.name/sitemap

This page contains the following errors:
error on line 1 at column 8: XML declaration allowed only at the start of the document
Below is a rendering of the page up to the first error.

and there is nothing below...

Add checkbox to ignore page on the page Options

The same way that you can specify frequency and priority on a page level, it would be useful to have a checkbox also to ignore that page from the sitemap.

This is useful because currently there's no way to test a page that's not published. Sometimes when building a page, one has to publish it (but keep it not visible), and it's not desirable for the page to show up in the sitemap.

While it's already possible to add an ignore on the plugin config itself, I think a page level checkbox would be more convenient.

Default route

The plugin probably should use a page's default route, for example / not /home

Version 1.9.3 appears to not work at all

After updating to 1.9.3, going to /sitemap or /sitemap.xml on my site yields a 404. Tried with a clean install and got the same result.

The debugger doesn't seem to show on 404 responses, so I'm not sure where else to start looking for issues.

Exclude tracy code?

Not really a massive issue as tracy would be off in production but the code gets outputted at the bottom of the sitemap page

Set weights on paths

Hi,

For wordpress, yoast's SEO plugin will set weights on different pages in the sitemap according to some automatic rules. That would be good.

In addition, setting weights manually would also be useful. Random example: you make a website selling something so you want the front page and the key features pages to have 0.9 to 1.0 weight and things like blog posts to have < 0.5 weight so google lists things you want people to click on higher in results pages even if it is just for the search "site:mysite.com".

Possibility to exclude pages

I'd like to have a possibility to exclude pages from the sitemap, e.g. with a config option in the page header (sitemap: false or similar).
Thanks! :-)

External URLs in sitemap

Follow-up for #47

Some subset of external links from site, referenced in main page content, may appear in sitemap (with default setting of plugin).
Only one possible correlation is reliably discovered in addition to old (still open) issue from Jul 2017:

  • Sites on a clean Grav core don't have external URLs in the sitemap, while a site with a Gantry-based theme got some external URLs in the sitemap

Full list of my externals and status of inclusion:

twitter.com/rockettheme
facebook.com/rockettheme
rockettheme.com/product-updates?rss
rockettheme.com/docs/grav/themes/hadron YES
rockettheme.com/forum/grav-theme-hadron YES
docs.gantry.org/gantry5/particles/logo
rockettheme.com/docs/grav/themes/hadron/demo.md
rockettheme.com/grav/themes/hadron YES
chartjs.org/
github.com/nnnick/Chart.js
rockettheme.com/docs/joomla/basic/responsive_support_classes.md
twitter.com/davegandy
learn.getgrav.org/basics/installation
opensource.org/licenses/mit-license.html
fontawesome.io/icons/
docs.gantry.org/gantry5/basics/installation
rockettheme.com/docs/grav/start/rocketlauncher.md
rockettheme.com
w3schools.com/html/html5_canvas.asp
scripts.sil.org/OFL
rockettheme.com/docs/grav/themes/hadron/comingsoon.md

Add link rel="sitemap" by default to head of pages

This is more of a theme question, but it would be an enhancement "to be just perfect" if the plugin could do it automatically:
<link rel="sitemap" type="application/xml" title="Sitemap" href="<site>/sitemap.xml">

Correct URL if using slug for pages

First of all thank you for the great plugin! It was super easy to install and it is working.

My website uses two languages (hu and en). hu is the default, and I'm using slugs for the en pages. I now see both languages in the sitemap XML, but the URLs of the en pages are not the ones with slugs.

Here is the sitemap: http://learn.co3app.com/sitemap

How can I create the sitemap with the correct URLs (containing the slug) for the en pages?

Any help would be appreciated!

Redirecting sitemap to home page

When I load sitemap.xml on my website, it redirects to /, the root. When I try to load it again in a browser, it works, but Google Webmaster is having problems locating the xml as it gets redirected to the home page. How can I fix this?

Call to undefined method Grav\Common\Page\Page::canonical()

After enabling the plugin I get an error on the /sitemap page. I think the name of the method has changed in the most recent version of grav. Now you need to call routeCanonical instead. For backwards compatibility you might want to check if canonical or routeCanonical exists.

Cheers, Chris

Sitemap plugin shall honor grav page header flag visible

Hi,

the sitemap plugin lists also invisible pages. Pages with the header "visible: false" probably should not be listed. They might be ignored by sitemap plugin configuration though.

Fix seems rather simple.

Best regards,
Mario

Plugin Broken on RC4

Hi

After installing RC4 today sitemap.xml loads as

Whoops \ Exception \ ErrorException (E_WARNING)
in_array() expects parameter 2 to be array, null given

Open: /var/www/houseofbhangra.com/system/src/Grav/Common/Grav.php

    /** @var Config $config */
    $config = $this['config'];

    $uri_extension = $uri->extension();

    // Only allow whitelisted types to fallback
    if (!in_array($uri_extension, $config->get('system.pages.fallback_types'))) {
        return;
    }

I was previously on one of the 0.9 versions..
