This project forked from adamsalter/sitemap_generator

SitemapGenerator is a framework-agnostic XML Sitemap generator written in Ruby with automatic Rails integration. It supports Video, News, Image, Mobile, PageMap and Alternate Links sitemap extensions and includes Rake tasks for managing your sitemaps, as well as many other great features.

License: MIT License

SitemapGenerator

SitemapGenerator is the easiest way to generate Sitemaps in Ruby. Rails integration provides access to the Rails route helpers within your sitemap config file and automatically makes the rake tasks available to you. Or if you prefer to use another framework, you can! You can use the rake tasks provided or run your sitemap configs as plain ruby scripts.

Sitemaps adhere to the Sitemap 0.9 protocol specification.

Features

  • Framework agnostic
  • Supports News sitemaps, Video sitemaps, Image sitemaps, Mobile sitemaps, PageMap sitemaps and Alternate Links
  • Supports read-only filesystems like Heroku via uploading to a remote host like Amazon S3
  • Compatible with all versions of Rails and Ruby
  • Adheres to the Sitemap 0.9 protocol
  • Handles millions of links
  • Customizable sitemap compression
  • Notifies search engines (Google) of new sitemaps
  • Ensures your old sitemaps stay in place if the new sitemap fails to generate
  • Gives you complete control over your sitemap contents and naming scheme
  • Intelligent sitemap indexing

Show Me

This is a simple standalone example. For Rails installation see the Rails instructions in the Install section.

Install:

gem install sitemap_generator

Create sitemap.rb:

require 'rubygems'
require 'sitemap_generator'

SitemapGenerator::Sitemap.default_host = 'http://example.com'
SitemapGenerator::Sitemap.create do
  add '/home', :changefreq => 'daily', :priority => 0.9
  add '/contact_us', :changefreq => 'weekly'
end
SitemapGenerator::Sitemap.ping_search_engines # Not needed if you use the rake tasks

Run it:

ruby sitemap.rb

Output:

In /Users/karl/projects/sitemap_generator-test/public/
+ sitemap.xml.gz                                           3 links /  364 Bytes
Sitemap stats: 3 links / 1 sitemaps / 0m00s

Successful ping of Google

Contribute

Does your website use SitemapGenerator to generate Sitemaps? Where would you be without Sitemaps? Probably still knocking rocks together. Consider donating to the project to keep it up-to-date and open source.

Foreword

Adam Salter first created SitemapGenerator while we were working together in Sydney, Australia. Unfortunately, he passed away in 2009. Since then I have taken over development of SitemapGenerator.

Those who knew him know what an amazing guy he was, and what an excellent Rails programmer he was. His passing is a great loss to the Rails community.

The canonical repository is: http://github.com/kjvarga/sitemap_generator

Installation

Ruby

gem install 'sitemap_generator'

To use the rake tasks add the following to your Rakefile:

require 'sitemap_generator/tasks'

The Rake tasks expect your sitemap to be at config/sitemap.rb. If yours is somewhere else, pass its path in the CONFIG_FILE option: rake sitemap:refresh CONFIG_FILE="path/to/sitemap.rb"

Rails

SitemapGenerator works with all versions of Rails and has been tested in Rails 2, 3 and 4.

Add the gem to your Gemfile:

gem 'sitemap_generator'

Alternatively, if you are not using a Gemfile add the gem to your config/application.rb file config block:

config.gem 'sitemap_generator'

Note: SitemapGenerator automatically loads its Rake tasks when used with Rails. You do not need to require the sitemap_generator/tasks file.

Getting Started

Preventing Output

To disable all non-essential output you can pass the -s option to Rake, for example rake -s sitemap:refresh, or set the environment variable VERBOSE=false when calling as a Ruby script.

To disable output in-code use the following:

SitemapGenerator.verbose = false

Rake Tasks

  • rake sitemap:install will create a config/sitemap.rb file which is your sitemap configuration and contains everything needed to build your sitemap. See Sitemap Configuration below for more information about how to define your sitemap.

  • rake sitemap:refresh will create or rebuild your sitemap files as needed. Sitemaps are generated into the public/ folder and by default are named sitemap.xml.gz, sitemap1.xml.gz, sitemap2.xml.gz, etc. As you can see, they are automatically GZip compressed for you. In this case, sitemap.xml.gz is your sitemap "index" file.

    rake sitemap:refresh will output information about each sitemap that is written including its location, how many links it contains, and the size of the file.

Pinging Search Engines

Using rake sitemap:refresh will notify Google to let them know that a new sitemap is available. To generate new sitemaps without notifying search engines, use rake sitemap:refresh:no_ping.

If you want to customize the hash of search engines you can access it at:

SitemapGenerator::Sitemap.search_engines

Usually you would be adding a new search engine to ping. In this case you can modify the search_engines hash directly. This ensures that when SitemapGenerator::Sitemap.ping_search_engines is called, your new search engine will be included.

If you are calling ping_search_engines manually, then you can pass your new search engine directly in the call, as in the following example:

SitemapGenerator::Sitemap.ping_search_engines(newengine: 'http://newengine.com/ping?url=%s')

The key gives the name of the search engine, as a string or symbol, and the value is the full URL to ping, with a string interpolation that will be replaced by the CGI escaped sitemap index URL. If you have any literal percent characters in your URL you need to escape them with %%.
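
To make the interpolation concrete, here is a minimal sketch of how such a ping URL could be assembled with Ruby's standard library alone. The helper name sketch_ping_url is hypothetical and not part of SitemapGenerator, and the gem's internals may differ; the sketch only illustrates the %s placeholder, the CGI escaping and the %% rule described above.

```ruby
require 'cgi'

# Hypothetical helper, not part of SitemapGenerator: fills the %s placeholder
# in a ping URL template with the CGI-escaped sitemap index URL.
def sketch_ping_url(template, sitemap_index_url)
  format(template, CGI.escape(sitemap_index_url))
end

index_url = 'http://example.com/sitemap.xml.gz'

# The placeholder is replaced by the escaped index URL.
puts sketch_ping_url('http://newengine.com/ping?url=%s', index_url)

# A literal percent sign in the template must be written as %%.
puts sketch_ping_url('http://newengine.com/ping?ratio=50%%25&url=%s', index_url)
```

Forgetting to double a literal percent sign would make format treat it as the start of another placeholder, which is why the escaping rule matters.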

If you are calling SitemapGenerator::Sitemap.ping_search_engines from outside of your sitemap config file, then you will need to set SitemapGenerator::Sitemap.default_host and any other options that you set in your sitemap config which affect the location of the sitemap index file. For example:

SitemapGenerator::Sitemap.default_host = 'http://example.com'
SitemapGenerator::Sitemap.ping_search_engines

Alternatively, you can pass in the full URL to your sitemap index, in which case we would have just the following:

SitemapGenerator::Sitemap.ping_search_engines('http://example.com/sitemap.xml.gz')

Crontab

To keep your sitemaps up-to-date, set up a cron job. Make sure to pass the -s option to silence rake. That way you will only get email if the sitemap build fails.

If you're using Whenever, your schedule would look something like this:

# config/schedule.rb
every 1.day, :at => '5:00 am' do
  rake "-s sitemap:refresh"
end

Robots.txt

You should add the URL of the sitemap index file to public/robots.txt to help search engines find your sitemaps. The URL should be the complete URL to the sitemap index. For example:

Sitemap: http://www.example.com/sitemap.xml.gz

Ruby Modules

If you need to include a module (e.g. a rails helper), you must include it in the sitemap interpreter class. The part of your sitemap configuration that defines your sitemaps is run within an instance of the SitemapGenerator::Interpreter:

SitemapGenerator::Interpreter.send :include, RoutingHelper
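
Why the include has to target the interpreter class can be seen in a small, gem-free sketch. SketchInterpreter and SketchRoutingHelper are hypothetical stand-ins, and the use of instance_eval is an assumption about how such an interpreter would typically evaluate the block, not a statement about the gem's code.

```ruby
# Simplified stand-in for SitemapGenerator::Interpreter (an assumption, not
# the gem's actual class): it evaluates a block against its own instance.
class SketchInterpreter
  attr_reader :links

  def initialize
    @links = []
  end

  def add(path)
    @links << path
  end

  def run(&block)
    instance_eval(&block)
    self
  end
end

# Hypothetical helper module, standing in for something like RoutingHelper.
module SketchRoutingHelper
  def content_path(id)
    "/content/#{id}"
  end
end

# Mirrors the send :include call shown in this section.
SketchInterpreter.send :include, SketchRoutingHelper

interpreter = SketchInterpreter.new.run do
  add content_path(7) # resolves because self is the interpreter instance
end

puts interpreter.links.inspect
```

Inside the block, self is the interpreter instance, so only methods defined on or included into that class are reachable without a receiver.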

Deployments & Capistrano

To include the capistrano tasks just add the following to your Capfile:

require 'capistrano/sitemap_generator'

Configurable options:

set :sitemap_roles, :web # default

Available capistrano tasks:

sitemap:create   #Create sitemaps without pinging search engines
sitemap:refresh  #Create sitemaps and ping search engines
sitemap:clean    #Clean up sitemaps in the sitemap path

Generate sitemaps into a directory which is shared by all deployments.

You can set your sitemaps path to your shared directory using the sitemaps_path option. For example if we have a directory public/shared/ that is shared by all deployments we can have our sitemaps generated into that directory by setting:

SitemapGenerator::Sitemap.sitemaps_path = 'shared/'

Sitemaps with no Index File

The sitemap index file is created for you on-demand, meaning that if you have a large site with more than one sitemap file, you will have a sitemap index file to reference those sitemap files. If however you have a small site with only one sitemap file, you don't require an index and so no index will be created. In both cases the index and sitemap file's name, respectively, is sitemap.xml.gz.

You may want to always create an index, even if you only have a small site. Or you may never want to create an index. For these cases, you can use the create_index option to control index creation. You can read about this option in the Sitemap Options section below.

To always create an index:

SitemapGenerator::Sitemap.create_index = true

To never create an index:

SitemapGenerator::Sitemap.create_index = false

Your sitemaps will still be called sitemap.xml.gz, sitemap1.xml.gz, sitemap2.xml.gz, etc.

And the default "intelligent" behaviour:

SitemapGenerator::Sitemap.create_index = :auto
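
The three settings can be summarised in a short sketch that lists the file names one would expect for each case. sketch_file_names is a hypothetical helper written for this README, not part of the gem; it only encodes the naming behaviour described in this section.

```ruby
# Hypothetical helper, not part of SitemapGenerator: lists the file names one
# would expect for a given create_index setting and number of sitemap files.
def sketch_file_names(create_index, sitemap_count, base: 'sitemap')
  with_index =
    case create_index
    when true  then true
    when false then false
    when :auto then sitemap_count > 1
    else raise ArgumentError, "unsupported create_index: #{create_index.inspect}"
    end

  if with_index
    # <base>.xml.gz is the index; the sitemaps are numbered from 1.
    ["#{base}.xml.gz"] + (1..sitemap_count).map { |n| "#{base}#{n}.xml.gz" }
  else
    # No index: the first sitemap takes the unnumbered name.
    (0...sitemap_count).map { |n| "#{base}#{n.zero? ? '' : n}.xml.gz" }
  end
end

p sketch_file_names(:auto, 1)
p sketch_file_names(true, 1)
p sketch_file_names(false, 3)
```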

Upload Sitemaps to a Remote Host using Adapters

This section needs better documentation. Please consider contributing.

Sometimes it is desirable to host your sitemap files on a remote server and point robots and search engines to the remote files, for example if you are using a host like Heroku, which doesn't allow writing to the local filesystem. You still require some write access, because the sitemap files need to be written out before uploading, so a host will generally give you write access to a temporary directory. On Heroku this is tmp/ within your application directory.

Supported Adapters

SitemapGenerator::FileAdapter

Standard adapter, writes out to a file.

SitemapGenerator::FogAdapter

Uses Fog::Storage to upload to any service supported by Fog.

You must require 'fog' in your sitemap config before using this adapter, or require another library that defines Fog::Storage.

SitemapGenerator::S3Adapter

Uses Fog::Storage to upload to Amazon S3 storage.

You must require 'fog-aws' in your sitemap config before using this adapter.

An example of using this adapter in your sitemap configuration:

SitemapGenerator::Sitemap.adapter = SitemapGenerator::S3Adapter.new(options)

Where options is a Hash with any of the following keys:

  • aws_access_key_id [String] Your AWS access key id
  • aws_secret_access_key [String] Your AWS secret access key
  • fog_provider [String]
  • fog_directory [String]
  • fog_region [String]
  • fog_path_style [String]
  • fog_storage_options [Hash] Other options to pass to Fog::Storage
  • fog_public [Boolean] Whether the file is publicly accessible

Alternatively you can use an environment variable to configure each option (except fog_storage_options). The environment variables have the same names as the options but in upper case, e.g. FOG_PATH_STYLE.
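
As a rough illustration of the naming rule, the sketch below resolves an option from a Hash and falls back to the matching environment variable. sketch_option is a hypothetical helper, and the precedence it uses (an explicit option wins over the environment) is an assumption of the sketch rather than documented behaviour.

```ruby
# Hypothetical helper, not part of SitemapGenerator: looks an option up in the
# options Hash first and falls back to the upper-cased environment variable.
# The precedence (explicit option wins) is an assumption of this sketch.
def sketch_option(options, name, env = ENV)
  options.fetch(name) { env[name.to_s.upcase] }
end

# A plain Hash stands in for ENV so the example does not touch the real environment.
fake_env = { 'FOG_PATH_STYLE' => 'true', 'FOG_REGION' => 'eu-west-1' }

p sketch_option({}, :fog_path_style, fake_env)
p sketch_option({ fog_region: 'us-east-1' }, :fog_region, fake_env)
```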

SitemapGenerator::AwsSdkAdapter

Uses Aws::S3::Resource to upload to Amazon S3 storage. Includes automatic detection of your AWS credentials and region.

You must require 'aws-sdk-s3' in your sitemap config before using this adapter, or require another library that defines Aws::S3::Resource and Aws::Credentials.

An example of using this adapter in your sitemap configuration:

SitemapGenerator::Sitemap.adapter = SitemapGenerator::AwsSdkAdapter.new('s3_bucket',
  acl: 'public-read', # Optional. This is the default.
  cache_control: 'private, max-age=0, no-cache', # Optional. This is the default.
  access_key_id: 'AKIAI3SW5CRAZBL4WSTA',
  secret_access_key: 'asdfadsfdsafsadf',
  region: 'us-east-1',
  endpoint: 'https://sfo2.digitaloceanspaces.com'
)

Where the first argument is the S3 bucket name, and the rest are keyword argument options. Options :acl and :cache_control configure access and caching of the uploaded files; all other options are passed directly to the AWS client.

See the SitemapGenerator::AwsSdkAdapter docs, and https://docs.aws.amazon.com/sdk-for-ruby/v2/api/Aws/S3/Client.html#initialize-instance_method for the full list of supported options.

SitemapGenerator::WaveAdapter

Uses CarrierWave::Uploader::Base to upload to any service supported by CarrierWave, for example, Amazon S3, Rackspace Cloud Files, and MongoDB's GridFS.

You must require 'carrierwave' in your sitemap config before using this adapter, or require another library that defines CarrierWave::Uploader::Base.

Some documentation exists on the wiki page.

SitemapGenerator::GoogleStorageAdapter

Uses Google::Cloud::Storage to upload to Google Cloud storage.

You must require 'google/cloud/storage' in your sitemap config before using this adapter.

An example of using this adapter in your sitemap configuration with options:

SitemapGenerator::Sitemap.adapter = SitemapGenerator::GoogleStorageAdapter.new(
  acl: 'public', # Optional.  This is the default value.
  bucket: 'name_of_bucket',
  credentials: 'path/to/keyfile.json',
  project_id: 'google_account_project_id',
)

In line with the Google authentication options, the adapter can also pick up credentials from environment variables. All supported environment variables can be used, for example GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_CREDENTIALS. An example of using this adapter with the environment variables is:

SitemapGenerator::Sitemap.adapter = SitemapGenerator::GoogleStorageAdapter.new(
  bucket: 'name_of_bucket'
)

All options other than the :bucket and :acl options are passed to the Google::Cloud::Storage.new initializer giving you maximum configurability. See the Google Cloud Storage initializer for supported options.

An Example of Using an Adapter

  1. Please see this wiki page for more information about setting up SitemapGenerator to upload to a remote host.

  2. This example uses the CarrierWave adapter. It shows some common settings that are used when the hostname hosting the sitemaps differs from the hostname of the sitemap links.

    # Your website's host name
    SitemapGenerator::Sitemap.default_host = "http://www.example.com"
    
    # The remote host where your sitemaps will be hosted
    SitemapGenerator::Sitemap.sitemaps_host = "http://s3.amazonaws.com/sitemap-generator/"
    
    # The directory to write sitemaps to locally
    SitemapGenerator::Sitemap.public_path = 'tmp/'
    
    # Set this to a directory/path if you don't want to upload to the root of your `sitemaps_host`
    SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
    
    # The adapter to perform the upload of sitemap files.
    SitemapGenerator::Sitemap.adapter = SitemapGenerator::WaveAdapter.new

  3. Update your robots.txt file to point robots to the remote sitemap index file, e.g.:

    Sitemap: http://s3.amazonaws.com/sitemap-generator/sitemaps/sitemap.xml.gz
    

    You generate your sitemaps as usual using rake sitemap:refresh.

    Note that SitemapGenerator will automatically turn off include_index in this case because the sitemaps_host does not match the default_host. The link to the sitemap index file that would otherwise be included would point to a different host than the rest of the links in the sitemap, something that the sitemap rules forbid.

  4. Verify to Google that you own the S3 URL

    In order for Google to use your sitemap, you need to prove you own the S3 bucket through Google Webmaster Tools. In the example above, you would add the site http://s3.amazonaws.com/sitemap-generator/sitemaps. Once you have verified that you own the directory, add your sitemap index to the list of sitemaps for the site.
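
How the settings in this example fit together can be checked with a few lines of plain Ruby. This is only a sketch of the path arithmetic, using the values from the configuration above; it does not call SitemapGenerator.

```ruby
require 'uri'

# Values taken from the example configuration in this section.
public_path   = 'tmp/'
sitemaps_path = 'sitemaps/'
sitemaps_host = 'http://s3.amazonaws.com/sitemap-generator/'
filename      = 'sitemap.xml.gz'

# Where the file is written locally before the adapter uploads it.
local_file = File.join(public_path, sitemaps_path, filename)

# The public URL, which is what belongs in robots.txt.
remote_url = URI.join(sitemaps_host, sitemaps_path, filename).to_s

puts local_file
puts remote_url
```

Note that public_path affects only the local file location, while sitemaps_path appears in both the local path and the public URL.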

Generating Multiple Sitemaps

Each call to create creates a new sitemap index and associated sitemaps. You can call create as many times as you want within your sitemap configuration.

You must remember to use a different filename or location for each set of sitemaps, otherwise they will overwrite each other. You can use the filename, namer and sitemaps_path options for this.

In the following example we generate three sitemaps each in its own subdirectory:

%w(google bing apple).each do |subdomain|
  SitemapGenerator::Sitemap.default_host = "https://#{subdomain}.mysite.com"
  SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/#{subdomain}"
  SitemapGenerator::Sitemap.create do
    add '/home'
  end
end

Outputs:

+ sitemaps/google/sitemap1.xml.gz             2 links /  822 Bytes /  328 Bytes gzipped
+ sitemaps/google/sitemap.xml.gz           1 sitemaps /  389 Bytes /  217 Bytes gzipped
Sitemap stats: 2 links / 1 sitemaps / 0m00s
+ sitemaps/bing/sitemap1.xml.gz               2 links /  820 Bytes /  330 Bytes gzipped
+ sitemaps/bing/sitemap.xml.gz             1 sitemaps /  388 Bytes /  217 Bytes gzipped
Sitemap stats: 2 links / 1 sitemaps / 0m00s
+ sitemaps/apple/sitemap1.xml.gz              2 links /  820 Bytes /  330 Bytes gzipped
+ sitemaps/apple/sitemap.xml.gz            1 sitemaps /  388 Bytes /  214 Bytes gzipped
Sitemap stats: 2 links / 1 sitemaps / 0m00s

If you don't want to have to generate all the sitemaps at once, or you want to refresh some more often than others, you can split them up into their own configuration files. Using the above example we would have:

# config/google_sitemap.rb
SitemapGenerator::Sitemap.default_host = "https://google.mysite.com"
SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/google"
SitemapGenerator::Sitemap.create do
  add '/home'
end

# config/apple_sitemap.rb
SitemapGenerator::Sitemap.default_host = "https://apple.mysite.com"
SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/apple"
SitemapGenerator::Sitemap.create do
  add '/home'
end

# config/bing_sitemap.rb
SitemapGenerator::Sitemap.default_host = "https://bing.mysite.com"
SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/bing"
SitemapGenerator::Sitemap.create do
  add '/home'
end

To generate each one specify the configuration file to run by passing the CONFIG_FILE option to rake sitemap:refresh, e.g.:

rake sitemap:refresh CONFIG_FILE="config/google_sitemap.rb"
rake sitemap:refresh CONFIG_FILE="config/apple_sitemap.rb"
rake sitemap:refresh CONFIG_FILE="config/bing_sitemap.rb"

Sitemap Configuration

A sitemap configuration file contains all the information needed to generate your sitemaps. By default SitemapGenerator looks for a configuration file in config/sitemap.rb - relative to your application root or the current working directory. (Run rake sitemap:install to have this file generated for you if you have not done so already.)

If you want to use a non-standard configuration file, or have multiple configuration files, you can specify which one to run by passing the CONFIG_FILE option like so:

rake sitemap:refresh CONFIG_FILE="config/geo_sitemap.rb"

A Simple Example

So what does a sitemap configuration look like? Let's take a look at a simple example:

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add '/welcome'
end

A few things to note:

  • SitemapGenerator::Sitemap is a lazy-initialized sitemap object provided for your convenience.
  • Every sitemap must set default_host. This is the hostname that is used when building links to add to the sitemap (and all links in a sitemap must belong to the same host).
  • The create method takes a block with calls to add to add links to the sitemap.
  • The sitemaps are written to the public/ directory in the directory from which the script is run. You can specify a custom location using the public_path or sitemaps_path option.

Now let's see what is output when we run this configuration with rake sitemap:refresh:no_ping:

In /Users/karl/projects/sitemap_generator-test/public/
+ sitemap.xml.gz                                           2 links /  347 Bytes
Sitemap stats: 2 links / 1 sitemaps / 0m00s

Weird! The sitemap has two links, even though we only added one! This is because SitemapGenerator adds the root URL / for you by default. You can change the default behaviour by setting the include_root or include_index option.

Now let's take a look at the file that was created. After uncompressing and XML-tidying the contents we have:

  • public/sitemap.xml.gz
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2011-05-21T00:03:38+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.example.com/welcome</loc>
    <lastmod>2011-05-21T00:03:38+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

The sitemaps conform to the Sitemap 0.9 protocol. Notice the value for priority and changefreq on the root link, the one that was added for us? The values tell us that this link is the highest priority and should be checked regularly because it is constantly changing. You can specify your own values for these options in your call to add.

In this example no sitemap index was created because we have so few links, so none was needed. If we run the same example above and set create_index = true we can take a look at what an index file looks like:

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create_index = true
SitemapGenerator::Sitemap.create do
  add '/welcome'
end

And the output:

In /Users/karl/projects/sitemap_generator-test/public/
+ sitemap1.xml.gz                                          2 links /  347 Bytes
+ sitemap.xml.gz                                        1 sitemaps /  228 Bytes
Sitemap stats: 2 links / 1 sitemaps / 0m00s

Now if we look at the uncompressed and formatted contents of sitemap.xml.gz we can see that it is a sitemap index and sitemap1.xml.gz is a sitemap:

  • public/sitemap.xml.gz
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml.gz</loc>
    <lastmod>2013-05-01T18:10:26-07:00</lastmod>
  </sitemap>
</sitemapindex>

Adding Links

You call add in the block passed to create to add a path to your sitemap. add takes a string path and optional hash of options, generates the URL and adds it to the sitemap. You only need to pass a path because the URL will be built for us using the default_host we specified. However, if we want to use a different host for a particular link, we can pass the :host option to add.

Let's see another example:

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add '/contact_us'
  Content.find_each do |content|
    add content_path(content), :lastmod => content.updated_at
  end
end

In this example first we add the /contact_us page to the sitemap and then we iterate through the Content model's records adding each one to the sitemap using the content_path helper method to generate the path for each record.

The Rails URL/path helper methods are automatically made available to us in the create block. This keeps the logic for building our paths out of the sitemap config and in the Rails application where it should be. You use those methods just like you would in your application's view files.

In the example above we pass a lastmod (last modified) option with the value of the record's updated_at attribute so that search engines know to only re-index the page when the record changes.

Looking at the output from running this sitemap, we see that we have a few more links than before:

+ sitemap.xml.gz                   12 links /     2.3 KB /  365 Bytes gzipped
Sitemap stats: 12 links / 1 sitemaps / 0m00s

From this example we can see that:

  • The create block can contain Ruby code
  • The Rails URL/path helper methods are made available to us, and
  • The basic syntax for adding paths to the sitemap using add

You can read more about add in the XML Specification.

Supported Options to add

For other options be sure to check out the Sitemap Extensions section below.

  • changefreq - Default: 'weekly' (String).

    Indicates how often the content of the page changes. One of 'always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly' or 'never'. Example:

    add '/contact_us', :changefreq => 'monthly'

  • lastmod - Default: Time.now (Integer, Time, Date, DateTime, String).

    The date and time of last modification. Example:

    add content_path(content), :lastmod => content.updated_at

  • host - Default: default_host (String).

    Host to use when building the URL. It's not technically valid to specify a different host for a link in a sitemap according to the spec, but this facility exists in case you have a need. Example:

    add '/login', :host => 'https://securehost.com'

  • priority - Default: 0.5 (Float).

    The priority of the URL relative to other URLs on a scale from 0 to 1. Example:

    add '/about', :priority => 0.75

  • expires - Optional.

    When the link expires. Example (2.weeks comes from ActiveSupport):

    add '/about', :expires => Time.now + 2.weeks
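
To show how these defaults combine with the options you pass, here is a minimal standalone sketch that merges them and renders one url element. sketch_url_entry is a hypothetical helper, not the gem's implementation; it assumes lastmod is a Time, ignores expires, and skips XML escaping for brevity.

```ruby
require 'time'
require 'uri'

VALID_CHANGEFREQ = %w[always hourly daily weekly monthly yearly never].freeze

# Hypothetical helper, not part of SitemapGenerator: applies the documented
# defaults, lets per-link options override them, and renders a <url> element.
def sketch_url_entry(path, default_host:, now: Time.now, **options)
  opts = { changefreq: 'weekly', lastmod: now, priority: 0.5, host: default_host }.merge(options)
  unless VALID_CHANGEFREQ.include?(opts[:changefreq])
    raise ArgumentError, "unknown changefreq: #{opts[:changefreq]}"
  end

  # No XML escaping here; a real generator would need to escape the values.
  <<~XML
    <url>
      <loc>#{URI.join(opts[:host], path)}</loc>
      <lastmod>#{opts[:lastmod].iso8601}</lastmod>
      <changefreq>#{opts[:changefreq]}</changefreq>
      <priority>#{opts[:priority]}</priority>
    </url>
  XML
end

puts sketch_url_entry('/welcome', default_host: 'http://www.example.com')
puts sketch_url_entry('/login', default_host: 'http://www.example.com',
                                host: 'https://securehost.com', priority: 0.75)
```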

Adding Links to the Sitemap Index

Sometimes you may need to manually add some links to the sitemap index file. For example if you are generating your sitemaps incrementally you may want to create a sitemap index which includes the files which have already been generated. To achieve this you can use the add_to_index method which works exactly the same as the add method described above.

It supports the same options as add, namely:

  • changefreq

  • lastmod

  • host

    The value for host defaults to whatever you have set as your sitemaps_host. Remember that the sitemaps_host is the host where your sitemaps reside. If your sitemaps are on the same host as your default_host, then the value for default_host is used. Example:

    add_to_index '/mysitemap1.xml.gz', :host => 'http://sitemaphostingserver.com'

  • priority

An example:

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add_to_index '/mysitemap1.xml.gz'
  add_to_index '/mysitemap2.xml.gz'
  # ...
end

When you add links in this way, an index is always created, unless you've explicitly set create_index to false.

Accessing the LinkSet instance

Sometimes you need to mess with the internals to do custom stuff. If you need access to the LinkSet instance from within create() you can use the sitemap method to do so.

In this example, say we have already pre-generated three sitemap files: sitemap1.xml.gz, sitemap2.xml.gz, sitemap3.xml.gz. Now we want to start the sitemap generation at sitemap4.xml.gz and create a bunch of new sitemaps. There are a few ways we can do this, but this is an easy way:

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.namer = SitemapGenerator::SimpleNamer.new(:sitemap, :start => 4)
SitemapGenerator::Sitemap.create do
  (1..3).each do |i|
    add_to_index "sitemap#{i}.xml.gz"
  end
  add '/home'
  add '/another'
end

The output looks something like this:

In /Users/karl/projects/sitemap_generator-test/public/
+ sitemap4.xml.gz                                          3 links /  355 Bytes
+ sitemap.xml.gz                                        4 sitemaps /  242 Bytes
Sitemap stats: 3 links / 4 sitemaps / 0m00s

Speeding Things Up

For large ActiveRecord collections with thousands of records it is advisable to iterate through them in batches to avoid loading all records into memory at once. For this reason in the example above we use Content.find_each which is a batched iterator available since Rails 2.3.2, rather than Content.all.
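
The idea behind batched iteration can be sketched without Rails. The code below is a simplified, offset-based illustration with a hypothetical sketch_fetch method standing in for a database query; ActiveRecord's find_each batches by primary key instead, so treat this as a picture of the memory behaviour rather than of its implementation.

```ruby
# Hypothetical stand-in for a database query: returns at most `limit` paths
# starting at `offset`, out of `total` imaginary records.
def sketch_fetch(offset, limit, total)
  last = [offset + limit, total].min
  (offset...last).map { |i| "/content/#{i + 1}" }
end

# Yields every record while holding only one batch in memory at a time.
def sketch_each_in_batches(total, batch_size: 1_000)
  offset = 0
  loop do
    batch = sketch_fetch(offset, batch_size, total)
    break if batch.empty?

    batch.each { |path| yield path }
    offset += batch_size
  end
end

count = 0
sketch_each_in_batches(2_500) { |_path| count += 1 }
puts count
```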

Customizing your Sitemaps

SitemapGenerator supports a number of options which allow you to control every aspect of your sitemap generation. How they are named, where they are stored, the contents of the links and the location that the sitemaps will be hosted from can all be set.

The options can be set in the following ways.

On SitemapGenerator::Sitemap:

SitemapGenerator::Sitemap.default_host = 'http://example.com'
SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'

These options will apply to all sitemaps. This is how you set most options.

Passed as options in the call to create:

SitemapGenerator::Sitemap.create(
    :default_host => 'http://example.com',
    :sitemaps_path => 'sitemaps/') do
  add '/home'
end

This is useful if you are setting a lot of options.

Finally, passed as options in a call to group:

SitemapGenerator::Sitemap.create(:default_host => 'http://example.com') do
  group(:filename => :somegroup, :sitemaps_path => 'sitemaps/') do
    add '/home'
  end
end

The options passed to group only apply to the links and sitemaps generated in the group. Sitemap Groups are useful to group links into specific sitemaps, or to set options that you only want to apply to the links in that group.

Sitemap Options

The following options are supported.

  • :create_index - Supported values: true, false, :auto. Default: :auto. Whether to create a sitemap index file. If true an index file is always created regardless of how many sitemap files are generated. If false an index file is never created. If :auto an index file is created only when you have more than one sitemap file (i.e. you have added more than 50,000 - SitemapGenerator::MAX_SITEMAP_LINKS - links).

  • :default_host - String. Required. Host including protocol to use when building a link to add to your sitemap. For example http://example.com. Calling add '/home' would then generate the URL http://example.com/home and add that to the sitemap. You can pass a :host option in your call to add to override this value on a per-link basis. For example calling add '/home', :host => 'https://example.com' would generate the URL https://example.com/home, for that link only.

  • :filename - Symbol. The base name for the files that will be generated. The default value is :sitemap. This yields files with names like sitemap.xml.gz, sitemap1.xml.gz, sitemap2.xml.gz, sitemap3.xml.gz etc. If we now set the value to :geo the files would be named geo.xml.gz, geo1.xml.gz, geo2.xml.gz, geo3.xml.gz etc.

  • :include_index - Boolean. Whether to add a link pointing to the sitemap index to the current sitemap. This points search engines to your Sitemap Index to include it in the indexing of your site. 2012-07: This is now turned off by default because Google may complain about there being 'Nested Sitemap indexes'. Default is false. Turned off when sitemaps_host is set or within a group() block.

  • :include_root - Boolean. Whether to add the root url i.e. '/' to the current sitemap. Default is true. Turned off within a group() block.

  • :public_path - String. A full or relative path to the public directory or the directory you want to write sitemaps into. Defaults to public/ under your application root or relative to the current working directory.

  • :sitemaps_host - String. Host including protocol to use when generating a link to a sitemap file, i.e. the hostname of the server where the sitemaps are hosted. The value will differ from the hostname in your sitemap links. For example: 'http://amazon.aws.com/'. Note that include_index is automatically turned off when the sitemaps_host does not match default_host, because the link to the sitemap index file that would otherwise be added would point to a different host than the rest of the links in the sitemap, which the sitemap rules forbid.

  • :namer - A SitemapGenerator::SimpleNamer instance for generating sitemap names. You can read about Sitemap Namers by reading the API docs. Allows you to set the name, extension and number sequence for sitemap files, as well as modify the name of the first file in the sequence, which is often the index file. A simple example if we want to generate files like 'newname.xml.gz', 'newname1.xml.gz', etc is SitemapGenerator::SimpleNamer.new(:newname).

  • :sitemaps_path - String. A relative path giving a directory under your public_path at which to write sitemaps. The difference between the two options is that the sitemaps_path is used when generating a link to a sitemap file. For example, if we set SitemapGenerator::Sitemap.sitemaps_path = 'en/' and use the default public_path sitemaps will be written to public/en/. The URL to the sitemap index would then be http://example.com/en/sitemap.xml.gz.

  • :verbose - Boolean. Whether to output a sitemap summary describing the sitemap files and giving statistics about your sitemap. Default is false. When using the Rake tasks verbose will be true unless you pass the -s option.

  • :adapter - Instance. The default adapter is a SitemapGenerator::FileAdapter which simply writes files to the filesystem. You can use a SitemapGenerator::WaveAdapter for uploading sitemaps to remote servers - useful for read-only hosts such as Heroku. Or you can provide an instance of your own class to provide custom behavior. Your class must define a write method which takes a SitemapGenerator::Location and raw XML data.

  • :compress - Specifies which files to compress with gzip. Default is true. Accepted values:

    • true - Boolean; compress all files.
    • false - Boolean; Do not compress any files.
    • :all_but_first - Symbol; leave the first file uncompressed but compress all remaining files.

    The compression setting applies to groups too, so :all_but_first has the same effect within a group (the first file in the group will not be compressed, the rest will). If you require different behaviour for your groups, pass in a :compress option e.g. group(:compress => false) { add('/link') }

  • :max_sitemap_links - Integer. The maximum number of links to put in each sitemap. Default is SitemapGenerator::MAX_SITEMAP_LINKS, or 50,000.
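
The :adapter contract described in the list above is small enough to sketch. The following is a minimal illustration that assumes only what the option description states: an adapter is any object with a write method that receives a location and the raw XML data. MemoryAdapter and FakeLocation are hypothetical names invented for this sketch; FakeLocation stands in for SitemapGenerator::Location so that the snippet runs without the gem.

```ruby
# Hypothetical stand-in for SitemapGenerator::Location, so this sketch runs
# without the gem. The real class exposes more than a path.
FakeLocation = Struct.new(:path)

# Hypothetical adapter that keeps sitemap data in memory instead of writing
# files. The only requirement taken from the README is the write method.
class MemoryAdapter
  attr_reader :files

  def initialize
    @files = {}
  end

  # location: where the sitemap would be written; raw_data: the XML string.
  def write(location, raw_data)
    @files[location.path] = raw_data
  end
end

adapter = MemoryAdapter.new
adapter.write(FakeLocation.new('public/sitemap.xml'), '<urlset></urlset>')
puts adapter.files.keys.inspect
```

With the gem loaded, an object like this would presumably be assigned with SitemapGenerator::Sitemap.adapter = MemoryAdapter.new.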

Sitemap Groups

Sitemap Groups is a powerful feature that is also very simple to use.

  • All options are supported except for public_path. You cannot change the public path.
  • Groups inherit the options set on the default sitemap.
  • include_index and include_root are false by default in a group.
  • The sitemap index file is shared by all groups.
  • Groups can handle any number of links.
  • Group sitemaps are finalized (written out) as they get full and at the end of each group.
  • It's a good idea to name your groups.

A Groups Example

When you create a new group you pass options which will apply only to that group. You pass a block to group. Inside your block you call add to add links to the group.

Let's see an example that demonstrates a few interesting things about groups:

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add '/rss'

  group(:sitemaps_path => 'en/', :filename => :english) do
    add '/home'
  end

  group(:sitemaps_path => 'fr/', :filename => :french) do
    add '/maison'
  end
end

And the output from running the above:

In /Users/karl/projects/sitemap_generator-test/public/
+ en/english.xml.gz                                        1 links /  328 Bytes
+ fr/french.xml.gz                                         1 links /  329 Bytes
+ sitemap1.xml.gz                                          2 links /  346 Bytes
+ sitemap.xml.gz                                        3 sitemaps /  252 Bytes
Sitemap stats: 4 links / 3 sitemaps / 0m00s

So we have two sitemaps with one link each and one sitemap with two links. The sitemaps from the groups are easy to spot by their filenames. They are english.xml.gz and french.xml.gz. They contain only one link each because include_index and include_root are set to false by default in a group.

On the other hand, the default sitemap which we added /rss to has two links. The root url was added to it when we added /rss. If we hadn't added that link sitemap1.xml.gz would not have been created. So when we are using groups, the default sitemap will only be created if we add links to it.

The sitemap index file is shared by all groups. You can change its filename by setting SitemapGenerator::Sitemap.filename or by passing the :filename option to create.

The options you use when creating your groups will determine which and how many sitemaps are created. Groups will inherit the default sitemap when possible, and will continue the normal series. However a group will often specify an option which requires the links in that group to be in their own files. In this case, if the default sitemap were being used it would be finalized before starting the next sitemap in the series.

If you have changed your sitemap's physical location in a group, then the default sitemap will not be used and it will be unaffected by the group. Group sitemaps are finalized as they get full and at the end of each group.

Using group without a block

In some circumstances you may need to conditionally add records to a group or perform some other more complicated logic. In these cases you can instantiate a group instance, add links to it and finalize it manually.

When called with a block, any partial sitemaps are automatically written out for you when the block terminates. Because this does not happen when instantiating manually, you must call finalize! on your group to ensure that it is written out and gets included in the sitemap index file. Note that group sitemaps will still automatically be finalized (written out) as they become full; calling finalize! is to handle the case when a sitemap is not full.

An example:

SitemapGenerator::Sitemap.verbose = true
SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  odds = group(:filename => :odds)
  evens = group(:filename => :evens)

  (1..20).each do |i|
    if (i % 2) == 0
      evens.add i.to_s
    else
      odds.add i.to_s
    end
  end

  odds.finalize!
  evens.finalize!
end

And the output from running the above:

In '/Users/kvarga/Projects/sitemap_generator-test/public/':
+ odds.xml.gz                                             10 links /  371 Bytes
+ evens.xml.gz                                            10 links /  371 Bytes
+ sitemap.xml.gz                                        2 sitemaps /  240 Bytes
Sitemap stats: 20 links / 2 sitemaps / 0m00s
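
The finalization rule can be modelled in a few lines of plain Ruby. This is a toy model, not the gem's implementation: ToyGroup, its max_links parameter and the written array are all invented for illustration. It shows why a full sitemap is written out on its own while a partially filled one waits for finalize!.

```ruby
# Toy model of a group: NOT the gem's code, only the finalization rule.
class ToyGroup
  attr_reader :written

  def initialize(max_links)
    @max_links = max_links
    @current = []   # links in the sitemap currently being filled
    @written = []   # sitemaps that have been "written out"
  end

  def add(link)
    @current << link
    flush if @current.size == @max_links   # full sitemaps finalize themselves
    self
  end

  # Writes out a partially filled sitemap, if there is one.
  def finalize!
    flush unless @current.empty?
    self
  end

  private

  def flush
    @written << @current
    @current = []
  end
end

group = ToyGroup.new(2)
%w[/1 /3 /5].each { |link| group.add(link) }
puts group.written.size   # only the full sitemap; '/5' is still pending
group.finalize!
puts group.written.size   # the partial sitemap has now been written too
```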

Sitemap Extensions

News Sitemaps

A news item can be added to a sitemap URL by passing a :news hash to add. The hash must contain tags defined by the News Sitemap specification.

Example

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add('/index.html', :news => {
      :publication_name => "Example",
      :publication_language => "en",
      :title => "My Article",
      :keywords => "my article, articles about myself",
      :stock_tickers => "SAO:PETR3",
      :publication_date => "2011-08-22",
      :access => "Subscription",
      :genres => "PressRelease"
  })
end

Supported options

  • :news - Hash
    • :publication_name
    • :publication_language
    • :publication_date
    • :genres
    • :access
    • :title
    • :keywords
    • :stock_tickers

Image Sitemaps

Images can be added to a sitemap URL by passing an :images array to add. Each item in the array must be a Hash containing tags defined by the Image Sitemap specification.

Example

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add('/index.html', :images => [{
    :loc => 'http://www.example.com/image.png',
    :title => 'Image' }])
end

Supported options

  • :images - Array of hashes
    • :loc Required, location of the image
    • :caption
    • :geo_location
    • :title
    • :license

Video Sitemaps

A video can be added to a sitemap URL by passing a :video Hash to add(). The Hash can contain tags defined by the Video Sitemap specification.

To add more than one video to a url, pass an array of video hashes using the :videos option.

Example

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add('/index.html', :video => {
    :thumbnail_loc => 'http://www.example.com/video1_thumbnail.png',
    :title => 'Title',
    :description => 'Description',
    :content_loc => 'http://www.example.com/cool_video.mpg',
    :tags => %w[one two three],
    :category => 'Category'
  })
end

Supported options

  • :video or :videos - Hash or array of hashes, respectively
    • :thumbnail_loc - Required. String, URL of the thumbnail image.
    • :title - Required. String, title of the video.
    • :description - Required. String, description of the video.
    • :content_loc - Depends. String, URL. One of content_loc or player_loc must be present.
    • :player_loc - Depends. String, URL. One of content_loc or player_loc must be present.
    • :allow_embed - Boolean, attribute of player_loc.
    • :autoplay - Boolean, default true. Attribute of player_loc.
    • :duration - Recommended. Integer or string. Duration in seconds.
    • :expiration_date - Recommended when applicable. The date after which the video will no longer be available.
    • :rating - Optional
    • :view_count - Optional. Integer or string.
    • :publication_date - Optional
    • :tags - Optional. Array of string tags.
    • :tag - Optional. String, single tag.
    • :category - Optional
    • :family_friendly - Optional. Boolean
    • :gallery_loc - Optional. String, URL.
    • :gallery_title - Optional. Title attribute of the gallery location element
    • :uploader - Optional.
    • :uploader_info - Optional. Info attribute of uploader element
    • :price - Optional. Only one price supported at this time
      • :price_currency - Required. In ISO_4217 format.
      • :price_type - Optional. rent or own
      • :price_resolution - Optional. HD or SD
    • :live - Optional. Boolean.
    • :requires_subscription - Optional. Boolean.
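
The requirements in the list above can be checked before a hash is passed to add. The helper below is a hypothetical sketch, not part of the gem: video_problems and REQUIRED_VIDEO_KEYS are invented names, and the rules are taken only from the Required and Depends markers in the list.

```ruby
# Keys the list above marks as Required.
REQUIRED_VIDEO_KEYS = [:thumbnail_loc, :title, :description].freeze

# Returns a list of problems with a :video hash; empty when none are found.
# A hypothetical helper, not part of the gem.
def video_problems(video)
  problems = REQUIRED_VIDEO_KEYS.reject { |key| video.key?(key) }
                                .map { |key| "missing #{key}" }
  unless video.key?(:content_loc) || video.key?(:player_loc)
    problems << 'need one of content_loc or player_loc'
  end
  problems
end

video = {
  :thumbnail_loc => 'http://www.example.com/video1_thumbnail.png',
  :title         => 'Title',
  :description   => 'Description'
}
puts video_problems(video).inspect
```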

PageMap Sitemaps

Pagemaps can be added by passing a :pagemap hash to add. The hash must contain a :dataobjects key with an array of dataobject hashes. Each dataobject hash contains a :type and :id, and an optional array of :attributes. Each attribute hash can contain two keys: :name and :value, with string values. For more information consult the official documentation on PageMaps.

Supported options

  • :pagemap - Hash
    • :dataobjects - Required, array of hashes
      • :type - Required, string, type of the object
      • :id - String, ID of the object
      • :attributes - Array of hashes
        • :name - Required, string, name of the attribute.
        • :value - String, value of the attribute.

Example:

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add('/blog/post', :pagemap => {
    :dataobjects => [{
      :type => 'document',
      :id   => 'hibachi',
      :attributes => [
        { :name => 'name',   :value => 'Dragon' },
        { :name => 'review', :value => '3.5' },
      ]
    }]
  })
end
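
When the attributes already live in a plain hash, the nested :pagemap structure can be built with a small helper. This is a sketch under stated assumptions: pagemap_for is a hypothetical name, it produces a single dataobject, and it converts every attribute value to a string because the description above says attribute values are strings.

```ruby
# Hypothetical helper: turns { name => value } pairs into the :pagemap
# structure described above. Names and values are converted to strings.
def pagemap_for(type, id, attributes)
  {
    :dataobjects => [{
      :type => type,
      :id   => id,
      :attributes => attributes.map { |name, value|
        { :name => name.to_s, :value => value.to_s }
      }
    }]
  }
end

pagemap = pagemap_for('document', 'hibachi', { :name => 'Dragon', :review => 3.5 })
puts pagemap.inspect
```

The result could then be passed as add('/blog/post', :pagemap => pagemap) inside a create block.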

Alternate Links

A useful feature for internationalization is to specify alternate links for a url.

Alternate links can be added by passing an :alternate Hash to add. You can pass more than one alternate link by passing an array of hashes using the :alternates option.

Check out the Google specification here.

Example

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add('/index.html', :alternate => {
    :href => 'http://www.example.de/index.html',
    :lang => 'de',
    :nofollow => true
  })
end

Supported options

  • :alternate/:alternates - Hash or array of hashes, respectively

Alternates Example

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
 add('/index.html', :alternates => [
        {
            :href => 'http://www.example.de/index.html',
            :lang => 'de',
            :nofollow => true
        },
        {
            :href => 'http://www.example.es/index.html',
            :lang => 'es',
            :nofollow => true
        }
    ])
end
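
For sites with many languages, the :alternates array can be generated rather than written by hand. The helper below is a hypothetical sketch: alternates_for and the hosts map are invented for illustration, and only the :href and :lang keys from the examples above are set.

```ruby
# Hypothetical helper: builds an :alternates array for one path from a
# { language => host } map. Only :href and :lang are set here.
def alternates_for(path, hosts_by_lang)
  hosts_by_lang.map do |lang, host|
    { :href => "#{host}#{path}", :lang => lang.to_s }
  end
end

hosts = { :de => 'http://www.example.de', :es => 'http://www.example.es' }
alternates = alternates_for('/index.html', hosts)
puts alternates.inspect
```

The result could then be passed as add('/index.html', :alternates => alternates) inside a create block.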

Mobile Sitemaps

Mobile sitemaps include a specific <mobile:mobile/> tag.

Check out the Google specification here.

Example

SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.create do
  add('/index.html', :mobile => true)
end

Supported options

  • :mobile - Presence of this option will turn on the mobile flag regardless of value.

Compatibility

Compatible with all versions of Rails and Ruby. Tested up to Ruby 3.1 and Rails 7.0. Ruby 1.9.3 support was dropped in Version 6.0.0.

Licence

Released under the MIT License. See the MIT-LICENSE file.

MIT. See the LICENSE.md file.

Copyright (c) Karl Varga released under the MIT license

sitemap_generator's People

Contributors

adamsalter, amatsuda, apsoto, benmorganio, brchristian, charlespeach, coryosborn, dlackty, dominikgrygiel, ehoch, envek, jasoncodes, kjvarga, manuelmeurer, markprovan, rab, rarce, raviy06, recurser, richievos, rjhancock, sagarjunnarkar, samuelpismel, scotterc, sealocal, takatoshi-maeda, tiagoamaro, tomk32, xymbol, yxmatic

sitemap_generator's Issues

rake aborts on any task, when sitemap_generator is included in rakefile

I traced a problem with rake aborting (rake aborted! undefined method `text_area' for class `ActionView::Base') down to the point that when I uncomment the Rakefile entry for sitemap_generator it works just fine... My guess is that things are being pulled in in the wrong order. Can you make sense of that?

Usage of i18n_routing gem with sitemap generator

I need to generate sitemap with localized urls. I'm using i18n_routing gem to do translations.
But sitemap generator doesn't know anything about translated paths.

It neither generates routes in my language by default (pl)

Article.find_each do |article|
  sitemap.add article_path(article), :lastmod => article.updated_at
end

nor generates them when I use helpers provided by i18n_routing gem:

Article.find_each do |article|
  sitemap.add pl_article_path(article), :lastmod => article.updated_at
end

In the second case I'm getting 'no method error'.

Is it possible to use this generator with i18n_routing?

Only One Sitemap created and its for localhost:3000?

Hey all,

I'm having this issue: #45 where only one sitemap is created, even though the terminal output shows more.

I'm also only getting the index file to create with localhost:3000 as the address. I've removed and regenerated numerous times from local and for Heroku and I can't for the life of me get it to work.

Any ideas what might be causing these issues?

Thanks!

Geoff

Add link to other pre-generated sitemaps

Hi, it has been a pleasure to use this nice gem.

I had this issue tonight where I have two sitemaps I need to include in my sitemap_index.xml.gz file, but only one of which is generated by sitemap_generator. Is there a way to just "link" to a second sitemap? I couldn't figure it out and didn't see this issue discussed in this issues database so I am posting here. I tried using an empty group but nothing seems to get generated since I didn't add any links to the group.

The reason why I do this is because my main site is ruby/rails at website.com and I have my wordpress php blog at website.com/blog. The wordpress blog sitemap can't be created by this gem. So I've given up and allow a wordpress plugin to create the blog sitemap and I just want to link to it in the sitemap_index.xml.gz along with the link to the main site's sitemap.

One workaround would be for me to put my blog at blog.website.com, but my entire site is SSL and (correct me if I'm wrong), but I didn't want to buy 2 certificates, one for website.com and one for blog.website.com. Besides, I prefer it at /blog anyway.

My actual current workaround is to manually create a sitemap_index.xml file. Then after a capistrano deploy, I have a task to remove the sitemap_generator-created sitemap_index.xml.gz file and then I gzip my sitemap_index.xml file to take its place. There's a slight race condition in this case as the refresh has already sent out the pings to alert search engine robots and then I do the rm and cp/gzip for my real index file but whatever.

Thoughts? Could we add a link_only attribute for a new group or have an add_index_file method?

subdomain routing support?

Chris writes:

I've been using sitemap_generator for a while now, and I like it.
I recently changed my Rails app from a traditional subdirectory routing architecture to a subdomain-based routing (like Basecamp and others. See 37-Signals: How to do Basecamp-style subdomains in Rails)
I was wondering if you would consider adding support for generating sitemaps for subdomain-based routing. I realize that subdomains kind of defeat the purpose of sitemaps.

Thoughts?

Not able to load the sitemap into the AWS root

Hi all,

I would like to upload the sitemap_index and the sitemap into the root of the AWS bucket (as opposed to a subdirectory -- see the wiki example).

I tried to set the sitemaps_path to "" or "/" but in the first case a subdirectory named "." is created on S3, in the second case a directory without name is created.

Is there a way I can use to save the files to the root directory of the bucket?

Thanks in advance.

Paul

Sitemaps with less than 50,000 urls still use sitemap index

When I create a sitemap with less than 50,000 urls, it should not use a sitemap index file. I personally run multiple sitemaps for my sites, including a small news-only one, and the idea of a sitemap index file for a sitemap with 20 urls seems silly. Other than that I love this project and think it will be perfect.

Could anyone spearhead fixing? Willing to donate for this!

something wrong with installation (stack level too deep)

Hi!

Thanks for great gem!!!

Please help me get this working

ruby 1.8.7 (2011-06-30 patchlevel 352)
Rails 2.3.5
sitemap_generator (2.1.6)

I tried installation from gem and plug-in
All of them failed to refresh with error

stack level too deep
(__DELEGATE__):2:in `set_options'
(__DELEGATE__):2:in `send'
(__DELEGATE__):2:in `set_options'
(__DELEGATE__):2:in `send'
(__DELEGATE__):2:in `set_options'
(__DELEGATE__):2:in `send'
(__DELEGATE__):2:in `set_options'
(__DELEGATE__):2:in `send'
(__DELEGATE__):2:in `set_options'
(__DELEGATE__):2:in `send'

Do you know how to deal with it?
Thanks!

Allow an extra step before pinging search engines

During our sitemap building, we'd like to RSync our sitemap out to our servers before pinging the search engines.

What do you think the best way to do that is? I'd rather a built-in solution, rather than forking our own copy and modifying it to include our step.

Exception handling

We were using crontab, and at one point one of our path methods was removed. We forgot to change sitemap.rb, and it was silently failing.

When I found that and fixed the sitemap, I added a simple exception catcher for it.

SitemapGenerator::Sitemap.create do
  begin
    # ... code ...
  rescue Exception => e
    puts e.message
    HoptoadNotifier.notify(e, :component => 'Sitemap Generator')
  end
end

But I was wondering if there is a better way to do this already integrated in the gem. And if there isn't, can I add an extension point for exception handlers?
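
One possible shape for such a handler, sketched in plain Ruby without the gem: report the error and then re-raise it, so that cron sees a failing task instead of a silent one. report_and_reraise and the notifier lambda are hypothetical names; in the setup above the notifier would wrap the HoptoadNotifier call.

```ruby
# Hypothetical wrapper: runs the block, hands any StandardError to the
# notifier, then re-raises so the calling task still fails visibly.
def report_and_reraise(notifier)
  yield
rescue StandardError => e
  notifier.call(e)
  raise
end

reported = []
begin
  report_and_reraise(->(e) { reported << e.message }) do
    raise ArgumentError, 'missing route helper'
  end
rescue ArgumentError
  puts "reported: #{reported.inspect}"
end
```

StandardError is rescued here rather than Exception so that signals and exit requests are not swallowed.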

Support for Heroku?

It would be great if you could support read only file systems like Heroku.

Sitemap index only includes first sitemap

Having an issue where my sitemap index generates only one item in the XML. The stats however correctly output the 6 elements.

Here's the task output:

  • sitemaps/com/sitemap1.xml.gz 7392 links / 10 MB / 595 KB gzipped
  • sitemaps/com/sitemap2.xml.gz 4117 links / 10 MB / 565 KB gzipped
  • sitemaps/com/sitemap3.xml.gz 3916 links / 10 MB / 545 KB gzipped
  • sitemaps/com/sitemap4.xml.gz 31588 links / 10 MB / 697 KB gzipped
  • sitemaps/com/sitemap5.xml.gz 50000 links / 7.51 MB / 692 KB gzipped
  • sitemaps/com/sitemap6.xml.gz 21965 links / 3.32 MB / 294 KB gzipped
  • sitemaps/com/sitemap_index.xml.gz 6 sitemaps / 848 Bytes / 238 Bytes gzipped

And the content of the generated index:

<?xml version="1.0" encoding="UTF-8"?><sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><sitemap><loc>http://roomorama.s3.amazonaws.com/sitemaps/com/sitemap1.xml.gz</loc></sitemap></sitemapindex>

Issue with Rails 3.0.3

I'm having trouble using this generator.
Here's my config:
SitemapGenerator::Sitemap.default_host = 'http://www.mysite.com'
SitemapGenerator::Sitemap.add_links do |sitemap|
  sitemap.add restaurants_path
  Restaurant.find_each do |r|
    sitemap.add restaurant_path(r), :lastmod => r.updated_at
  end
end

When I execute 'rake sitemap:refresh' I get:
rake aborted!
undefined method `>=' for nil:NilClass

If I remove the 'find_each' block, leaving just the first line (sitemap.add restaurants_path) then it works.
Also, I can use restaurant_path with no problems on my website.

Here's the full trace:

** Invoke sitemap:refresh:no_ping (first_time)
** Invoke sitemap:create (first_time)
** Invoke sitemap:require_environment (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute sitemap:require_environment
** Execute sitemap:create
rake aborted!
undefined method `>=' for nil:NilClass
/usr/lib/ruby/gems/1.8/gems/activesupport-3.0.1/lib/active_support/whiny_nil.rb:48:in `method_missing'
/usr/lib/ruby/1.8/rational.rb:526:in `**'
/usr/lib/ruby/gems/1.8/gems/actionpack-3.0.1/lib/action_view/helpers/number_helper.rb:269:in `number_with_precision'
/usr/lib/ruby/gems/1.8/gems/actionpack-3.0.1/lib/action_view/helpers/number_helper.rb:347:in `number_to_human_size'
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-1.3.7/lib/sitemap_generator/builder/sitemap_file.rb:141:in `summary'
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-1.3.7/lib/sitemap_generator/link_set.rb:33:in `create'
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-1.3.7/tasks/sitemap_generator_tasks.rake:41
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:636:in `call'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:636:in `execute'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:631:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:631:in `execute'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:597:in `invoke_with_call_chain'
/usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:590:in `invoke_with_call_chain'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:607:in `invoke_prerequisites'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:604:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:604:in `invoke_prerequisites'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:596:in `invoke_with_call_chain'
/usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:590:in `invoke_with_call_chain'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:583:in `invoke'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2051:in `invoke_task'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2023:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2001:in `run'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:1998:in `run'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/bin/rake:31
/usr/bin/rake:19:in `load'
/usr/bin/rake:19

Thank you and keep up the good work :)

Sitemaps are created twice

I am having an issue with v2.0.0. My sitemaps are always created twice.

# config/sitemap.rb
SitemapGenerator::Sitemap.default_host = 'http://google.com'

SitemapGenerator::Sitemap.create do
end

When I execute rake sitemap:create I get the following output:

(in /Users/me/app)
+ sitemap1.xml.gz                   2 links /  762 Bytes /  308 Bytes gzipped
+ sitemap_index.xml.gz           1 sitemaps /  359 Bytes /  198 Bytes gzipped
Sitemap stats: 2 links / 1 sitemaps / 0m00s
+ sitemap2.xml.gz                   0 links /  464 Bytes /  215 Bytes gzipped
+ sitemap_index.xml.gz           1 sitemaps /  359 Bytes /  198 Bytes gzipped
Sitemap stats: 0 links / 1 sitemaps / 0m00s

The generated files contain ungzipped:

public/sitemap_index.xml

<?xml version="1.0" encoding="UTF-8"?><sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><sitemap><loc>http://google.com/sitemap2.xml.gz</loc></sitemap></sitemapindex>

public/sitemap1.xml

<?xml version="1.0" encoding="UTF-8"?><urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" xmlns:geo="http://www.google.com/geo/schemas/sitemap/1.0"><url><loc>http://google.com/</loc><lastmod>2011-05-31T20:52:48+00:00</lastmod><changefreq>always</changefreq><priority>1.0</priority></url><url><loc>http://google.com/sitemap_index.xml.gz</loc><lastmod>2011-05-31T20:52:48+00:00</lastmod><changefreq>always</changefreq><priority>1.0</priority></url></urlset>

public/sitemap2.xml

<?xml version="1.0" encoding="UTF-8"?><urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" xmlns:geo="http://www.google.com/geo/schemas/sitemap/1.0"></urlset>

When I added some real paths to sitemap.rb, in general sitemap1.xml included all specified paths plus the root path plus the sitemap path, whereas sitemap2.xml contained only the specified paths.

no block given

root@altrove:/rails/lauranovara# rake sitemap:refresh
rake aborted!
no block given

(See full trace by running task with --trace)
root@altrove:/rails/lauranovara#

my config/sitemap.rb looks like :

SitemapGenerator::Sitemap.default_host = "http://lauranovara.com/"
SitemapGenerator::Sitemap.add_links do |sitemap|
  sitemap.add '/galleries/jewels'
  sitemap.add '/galleries/people'
  sitemap.add '/galleries/food'
  sitemap.add '/galleries/funny'
  sitemap.add '/pages/about'
end

the galleries controller looks like :
class GalleriesController < HomeController
  inherit_resources
  actions :index, :show
  respond_to :html, :xml

  caches_page :show
  caches_action :index

  # set the initial background image

  def index
    index! do |format|
      format.html do
        redirect_to gallery_path(@config.galleries.first)
        return
      end
      format.xml { render :template => '/galleries', :layout => false }
    end
  end

  def show
    show! do
      @page_title = "#{@gallery.title} Gallery"
      @page_keywords = @gallery.keywords.blank? ? "#{@gallery.title.downcase}, gallery, photography, portraits, professional" : @gallery.keywords
      @page_description = @gallery.description
      @galleries = @config.galleries.find :all
      render :template => '/gallery'
      return
    end
  end

  private #-------
  # Defining the collection explicitly for ordering
  def collection
    @galleries ||= @config.galleries.find :all
  end
end

Sitemap namespaces

Hi dear sitemap generators,

For some reason, we might want to create distinct sitemaps according to their content type, or simply because of some organizational constraints. It would then be great to have the possibility to define namespaces, each one of them including its own set of urls.

Something like that in the sitemap.rb:

SitemapGenerator::Sitemap.namespace(:posts) do 
  SitemapGenerator::Sitemap.add_links do |sitemap|
    sitemap.add(...)
  end
end

SitemapGenerator::Sitemap.namespace(:videos) do 
  SitemapGenerator::Sitemap.add_links do |sitemap|
    sitemap.add(...)
  end
end

would generate sitemaps files like:

sitemap_index.xml
sitemap_posts1.xml
sitemap_videos1.xml
...

Thanks for your feedback :)

undefined method "verbose" when refreshing sitemap

[riding@host equine]$ cd /home/riding/railsapps/equine && /usr/bin/rake -t sitemap:refresh RAILS_ENV=production
(in /home/riding/railsapps/equine)
** Invoke sitemap:refresh (first_time)
** Invoke sitemap:create (first_time)
** Invoke environment (first_time)
** Execute environment
** Invoke sitemap:clean (first_time)
** Invoke environment
** Execute sitemap:clean
** Execute sitemap:create

  • /home/riding/railsapps/equine/public/sitemap1.xml.gz
  • /home/riding/railsapps/equine/public/sitemap_index.xml.gz
Sitemap stats: 19,785 links, 0m12s
rake aborted!
undefined method `verbose=' for #<SitemapGenerator::LinkSet:0xb78a859c>
/home/riding/railsapps/equine/vendor/plugins/sitemap_generator/tasks/sitemap_generator_tasks.rake:36
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:636:in `call'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:636:in `execute'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:631:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:631:in `execute'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:597:in `invoke_with_call_chain'
/usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:590:in `invoke_with_call_chain'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:607:in `invoke_prerequisites'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:604:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:604:in `invoke_prerequisites'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:596:in `invoke_with_call_chain'
/usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:590:in `invoke_with_call_chain'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:583:in `invoke'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2051:in `invoke_task'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2023:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2001:in `run'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:1998:in `run'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/bin/rake:31
/usr/bin/rake:16:in `load'
/usr/bin/rake:16

No such file or directory

Running a Rails 2.2.2 App.

If you see it having trouble finding the sitemap.rb file like this:

No such file or directory - /Users/me/code/MyRailsAppconfig/sitemap.rb

You can fix this by adding a / to the path built on line 24 of lib/sitemap_generator/interpreter.rb.
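
The missing separator in the error path above is what plain string concatenation produces. A minimal sketch, assuming the path is assembled from a Rails root and a relative config path (the variable names are illustrative and not taken from the gem's code):

```ruby
# Hypothetical values chosen to mirror the error message above.
rails_root  = '/Users/me/code/MyRailsApp'
config_path = 'config/sitemap.rb'

# Plain concatenation drops the separator between the two parts.
broken = rails_root + config_path

# File.join inserts exactly one separator, whether or not the first part
# already ends with one.
fixed = File.join(rails_root, config_path)

puts broken
puts fixed
```

Using File.join avoids having to remember whether the root ends in a slash.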

Rails 2.3.5 with bundler / email executed from rake errors

I am running Rails 2.3.5 with bundler 0.9.26. If we don't include the code for the Rakefile we don't get access to the sitemap rake tasks. Having the code in the Rakefile causes all emails fired from rake tasks to throw this error:

undefined method `url_for' for #<ActionView::Base:0x104d86eb8>

Don't modify options object given to #add

This line: https://github.com/kjvarga/sitemap_generator/blob/master/lib/sitemap_generator/link_set.rb#L130 changes the given options instead of creating a new object.

The problem (other than violating the principle to never modify objects that you didn't create) is that, when I have code like this:

default_site_options = { :changefreq => 'daily', :priority => 0.5, :lastmod => nil }.freeze

SitemapGenerator::Sitemap.create do
  add some_url, default_site_options

it returns my default_site_options object as:

{:news=>{}, :lastmod=>nil, :priority=>0.5, :videos=>[], :images=>[], :host=>"http://www.sofatutor.com", :changefreq=>"daily"}

When I then continue adding videos:

  Video.all.each do |video|
    add video.video_url, default_site_options.merge(:video => video.sitemap_tags)
  end

the videos array fills up with more and more videos:

{:news=>{}, :lastmod=>nil, :priority=>0.5, :videos=>[{:gallery_loc=>"http://localhost:3000/mathematik/verschiedenes/wichtige-konstanten", :category=>"Mathematik", :publication_date=>"2009-03-09T02:47:49+01:00", :price=>nil, :uploader=>"Konstantin E", :title=>"Irrationalität der Eulerschen Zahl e", :expiration_date=>nil, :description=>"Hier wird der Beweis dafür geführt, dass die Eulersche Zahl e tatsächlich irrational, also nicht als Bruch schreibbar ist. Dieses Videos erklärt eulersche zahl e, rational, irrational, beweis und reihendarstellung. Nachhilfe in Mathematik mit Lernvideos und Testfragen findest du auf http://localhost:3000/mathematik.", :tags=>["Video", "Nachhilfe", "Lernen", "Vorbereitung", "Klausuren", "Tests", "Lernvideo", "Üben", "Lernen", "Erklärung", "Anleitung", "eulersche zahl e", "rational", "irrational", "beweis", "reihendarstellung"], :duration=>203, :content_loc=>nil, :family_friendly=>true}], :images=>[], :host=>"http://www.sofatutor.com", :changefreq=>"daily"}

…which means I get a wrong sitemap (and a quite large one, too.)
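
The difference between filling defaults into the caller's hash and building a new one can be shown in isolation. This is only a sketch: `add_in_place`, `add_with_copy` and `DEFAULTS` are hypothetical names, and the gem's real defaults and method bodies may differ.

```ruby
# Hypothetical defaults; the gem's real defaults may differ.
DEFAULTS = { :priority => 0.5, :changefreq => 'weekly' }.freeze

# Mutating version: fills missing keys directly into the caller's hash.
def add_in_place(path, options = {})
  DEFAULTS.each { |key, value| options[key] = value unless options.key?(key) }
  options
end

# Non-mutating version: builds a new hash and leaves the argument untouched.
def add_with_copy(path, options = {})
  DEFAULTS.merge(options)
end

shared = { :changefreq => 'daily' }
add_in_place('/home', shared)
puts shared.inspect   # the caller's hash has gained a :priority key

frozen = { :changefreq => 'daily' }.freeze
result = add_with_copy('/home', frozen)
puts frozen.inspect   # unchanged, and no error despite being frozen
puts result.inspect
```

With the copying version, a shared or frozen options hash can be passed to every call without values leaking from one call into the next.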

const_missing with config.threadsafe!

In production I have config.threadsafe! enabled.
This leads to models not being accessible in SitemapGenerator:

SitemapGenerator::Sitemap.add_links do |sitemap|
  Monster.where(:weight => weight).find_each do |m|
    sitemap.add m.path.to_s, :changefreq => 'monthly', :lastmod => m.updated_at, :priority => priority
  end
end

produces

bundle exec rake --trace sitemap:refresh CONFIG_FILE="config/sitemap_monsters.rb" 
** Invoke sitemap:refresh (first_time)
** Invoke sitemap:create (first_time)
** Invoke sitemap:require_environment (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute sitemap:require_environment
** Execute sitemap:create
rake aborted!
uninitialized constant Monster
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/ext/module.rb:36:in `const_missing'
config/sitemap_monsters.rb:26:in `block in run'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/sitemap_generator-2.0.1/lib/sitemap_generator/interpreter.rb:47:in `eval'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/sitemap_generator-2.0.1/lib/sitemap_generator/link_set.rb:33:in `create'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/sitemap_generator-2.0.1/lib/sitemap_generator/link_set.rb:43:in `add_links'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/sitemap_generator-2.0.1/lib/sitemap_generator.rb:27:in `method_missing'
config/sitemap_monsters.rb:8:in `run'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/sitemap_generator-2.0.1/lib/sitemap_generator/interpreter.rb:65:in `instance_eval'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/sitemap_generator-2.0.1/lib/sitemap_generator/interpreter.rb:65:in `run'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/sitemap_generator-2.0.1/tasks/sitemap_generator_tasks.rake:41:in `block (2 levels) in <top (required)>'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:205:in `call'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:205:in `block in execute'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:200:in `each'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:200:in `execute'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:158:in `block in invoke_with_call_chain'
/usr/local/rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:151:in `invoke_with_call_chain'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:176:in `block in invoke_prerequisites'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:174:in `each'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:174:in `invoke_prerequisites'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:157:in `block in invoke_with_call_chain'
/usr/local/rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:151:in `invoke_with_call_chain'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/task.rb:144:in `invoke'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:112:in `invoke_task'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:90:in `block (2 levels) in top_level'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:90:in `each'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:90:in `block in top_level'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:129:in `standard_exception_handling'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:84:in `top_level'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:62:in `block in run'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:129:in `standard_exception_handling'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/lib/rake/application.rb:59:in `run'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/gems/rake-0.9.2/bin/rake:32:in `<top (required)>'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/bin/rake:19:in `load'
/srv/monsterzoo/shared/bundle/ruby/1.9.1/bin/rake:19:in `<main>'
Tasks: TOP => sitemap:refresh => sitemap:create

The only workaround is to disable threadsafety for rake tasks in my environment:

config.threadsafe! unless $rails_rake_task

Ping search engines require login

Hello,

Thank you very much for the gem. I use it and like it.

I see that some search engines require a login (authorization) before they accept a ping. For example, http://webmaster.yandex.ru/wmconsole/sitemap_list.xml?host=http://www.example.com%2Fsitemap_index.xml.gz requires authorization. How can I handle this situation from the sitemap generator? Are you planning to add credentials support to handle this case?

Also, it would be great to be able to ping search engines from behind a proxy, including an NTLM proxy (gem "ruby-ntlm", ">= 0.0.1").

I am ready to participate and test.

Sincerely yours,
Artem Rufanov.

P.S.
Have a good day!

rake sitemap:refresh "rake aborted! wrong number of arguments (1 for 0)"

Hi, I've installed the gem via git like this:
gem "sitemap_generator", :git => "git://github.com/kjvarga/sitemap_generator.git"

then I've run:
rake sitemap:install

I've edited my sitemap.rb which is like this:

# Set the host name for URL creation
SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.sitemaps_path = 'system/sitemaps'

SitemapGenerator::Sitemap.create do  
  add '/'
end

Then run:
rake sitemap:refresh:no_ping --trace

The rake task is failing. It looks like I have the wrong XML library or something... I'm very new to Ruby, sorry :S. This is the stack trace output:

shevany:entretenerse mauroasprea$ rake sitemap:refresh:no_ping --trace
** Invoke sitemap:refresh:no_ping (first_time)
** Invoke sitemap:create (first_time)
** Invoke sitemap:require_environment (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute sitemap:require_environment
** Execute sitemap:create
rake aborted!
wrong number of arguments (1 for 0)
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/builder-3.0.0/lib/builder/xmlbase.rb:135:in `to_xs'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/builder-3.0.0/lib/builder/xmlbase.rb:135:in `_escape'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/builder-3.0.0/lib/builder/xmlbase.rb:88:in `text!'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/builder-3.0.0/lib/builder/xmlbase.rb:76:in `method_missing'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/builder/sitemap_url.rb:61:in `to_xml'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/builder-3.0.0/lib/builder/xmlbase.rb:155:in `call'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/builder-3.0.0/lib/builder/xmlbase.rb:155:in `_nested_structures'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/builder-3.0.0/lib/builder/xmlbase.rb:63:in `method_missing'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/builder/sitemap_url.rb:60:in `to_xml'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/builder/sitemap_file.rb:82:in `add'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/link_set.rb:322:in `add_default_links'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/link_set.rb:118:in `add'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/interpreter.rb:31:in `add'
/Users/mauroasprea/entretenerse/config/sitemap.rb:6:in `run'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/interpreter.rb:49:in `instance_eval'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/interpreter.rb:49:in `eval'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator/link_set.rb:36:in `create'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator.rb:31:in `send'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/lib/sitemap_generator.rb:31:in `method_missing'
/Users/mauroasprea/entretenerse/config/sitemap.rb:5:in `run'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bundler/gems/sitemap_generator-386551f3221e/tasks/sitemap_generator_tasks.rake:41
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:205:in `call'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:205:in `execute'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:200:in `each'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:200:in `execute'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:158:in `invoke_with_call_chain'
/Users/mauroasprea/.rvm/rubies/ruby-1.8.7-p352/lib/ruby/1.8/monitor.rb:242:in `synchronize'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:151:in `invoke_with_call_chain'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:176:in `invoke_prerequisites'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:174:in `each'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:174:in `invoke_prerequisites'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:157:in `invoke_with_call_chain'
/Users/mauroasprea/.rvm/rubies/ruby-1.8.7-p352/lib/ruby/1.8/monitor.rb:242:in `synchronize'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:151:in `invoke_with_call_chain'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/task.rb:144:in `invoke'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:116:in `invoke_task'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:94:in `top_level'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:94:in `each'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:94:in `top_level'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:133:in `standard_exception_handling'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:88:in `top_level'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:66:in `run'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:133:in `standard_exception_handling'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/lib/rake/application.rb:63:in `run'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/gems/rake-0.9.2.2/bin/rake:33
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bin/rake:19:in `load'
/Users/mauroasprea/.rvm/gems/ruby-1.8.7-p352@entretenersetesting/bin/rake:19
Tasks: TOP => sitemap:refresh:no_ping => sitemap:create

undefined method `link_to'

Is there a way to define non-resourceful paths? I only see examples for resourceful paths.

However, I would like to work with paths like these:

link_to location_item.name, :controller => '/location_items', :action => 'display', :location_name => name_to_url(location.name), :location_item_name => name_to_url(location_item.name), :id => location_item.id

Unable to specify sitemaps_path

I specify this at the top of config/sitemap.rb:
SitemapGenerator::Sitemap.sitemaps_path = "sitemap"

but self.sitemaps_path is nil anyway.

uninitialized constant SitemapGenerator

Even in a brand new rails application, SitemapGenerator fails to run any rake tasks. Maybe I'm doing something wrong but:

$ rails sitemap
$ cd sitemap
$ script/plugin install git://github.com/kjvarga/sitemap_generator.git
$ rake --trace sitemap:install

And I get the uninitialized constant error found here:
http://pastie.org/1048786

This is a bare rails app with nothing in it, Rails version 2.3.2

undefined local variable or method `video' for #<SitemapGenerator

Hi,

I've used this sitemap generator before, however I just followed the instructions to create a video sitemap and it gives me the error:
undefined local variable or method `video' for #<SitemapGenerator

The error seems to be coming from the add line when I declare a video hash
add(post_path, :video => {:bla => something, :blah => something else})

Is this a known issue or is there maybe something I have done wrong?

Thanks in advance.

Multiple videos support

Hi,

The following monkey-patch adds support for multiple videos.
Instead of:
add path, :video => video1_hash
use:
add path, :videos => [video1_hash, video2_hash, video3_hash]

# Paste this into config/initializers/sitemap_multiple_videos.rb

module SitemapGenerator
  module Builder
    class SitemapUrl < Hash

      # Call with:
      #   sitemap - a Sitemap instance, or
      #   path, options - a path for the URL and options hash
      def initialize(path, options={})
        if sitemap = path.is_a?(SitemapGenerator::Builder::SitemapFile) && path
          options.reverse_merge!(:host => sitemap.location.host, :lastmod => sitemap.lastmod)
          path = sitemap.location.path_in_public
        end

        SitemapGenerator::Utilities.assert_valid_keys(options, :priority, :changefreq, :lastmod, :host, :images, :videos, :geo, :news)
        options.reverse_merge!(:priority => 0.5, :changefreq => 'weekly', :lastmod => Time.now, :images => [], :news => {})
        self.merge!(
          :path => path,
          :priority => options[:priority],
          :changefreq => options[:changefreq],
          :lastmod => options[:lastmod],
          :host => options[:host],
          :loc => URI.join(options[:host], path).to_s,
          :images => prepare_images(options[:images], options[:host]),
          :news => prepare_news(options[:news]),
          :videos => options[:videos],
          :geo => options[:geo]
        )
      end

      # Return the URL as XML
      def to_xml(builder=nil)
        builder = ::Builder::XmlMarkup.new if builder.nil?
        builder.url do
          builder.loc        self[:loc]
          builder.lastmod    w3c_date(self[:lastmod])   if self[:lastmod]
          builder.changefreq self[:changefreq]          if self[:changefreq]
          builder.priority   self[:priority]            if self[:priority]

          unless self[:news].blank?
            news_data = self[:news]
            builder.news :news do
              builder.news :publication do
                builder.news :name, news_data[:publication_name] if news_data[:publication_name]
                builder.news :language, news_data[:publication_language] if news_data[:publication_language]
              end

              builder.news :access, news_data[:access] if news_data[:access]
              builder.news :genres, news_data[:genres] if news_data[:genres]
              builder.news :publication_date, news_data[:publication_date] if news_data[:publication_date]
              builder.news :title, news_data[:title] if news_data[:title]
              builder.news :keywords, news_data[:keywords] if news_data[:keywords]
              builder.news :stock_tickers, news_data[:stock_tickers] if news_data[:stock_tickers]
            end
          end


          unless self[:images].blank?
            self[:images].each do |image|
              builder.image :image do
                builder.image :loc, image[:loc]
                builder.image :caption, image[:caption]             if image[:caption]
                builder.image :geo_location, image[:geo_location]   if image[:geo_location]
                builder.image :title, image[:title]                 if image[:title]
                builder.image :license, image[:license]             if image[:license]
              end
            end
          end

          unless self[:videos].blank?
            self[:videos].each do |video|
              builder.video :video do
                builder.video :thumbnail_loc, video[:thumbnail_loc]
                builder.video :title, video[:title]
                builder.video :description, video[:description]
                builder.video :content_loc, video[:content_loc]           if video[:content_loc]
                if video[:player_loc]
                  builder.video :player_loc, video[:player_loc], :allow_embed => (video[:allow_embed] ? 'yes' : 'no'), :autoplay => video[:autoplay]
                end

                builder.video :rating, video[:rating]                     if video[:rating]
                builder.video :view_count, video[:view_count]             if video[:view_count]
                builder.video :publication_date, video[:publication_date] if video[:publication_date]
                builder.video :expiration_date, video[:expiration_date]   if video[:expiration_date]
                # Check for the key rather than a truthy value so an explicit false still emits 'no'
                builder.video :family_friendly, (video[:family_friendly] ? 'yes' : 'no')  if video.key?(:family_friendly)
                builder.video :duration, video[:duration]                 if video[:duration]
                video[:tags].each {|tag| builder.video :tag, tag }        if video[:tags]
                builder.video :tag, video[:tag]                           if video[:tag]
                builder.video :category, video[:category]                 if video[:category]
                builder.video :gallery_loc, video[:gallery_loc]           if video[:gallery_loc]

                if video[:uploader]
                  builder.video :uploader, video[:uploader], video[:uploader_info] ? { :info => video[:uploader_info] } : {}
                end
              end
            end
          end

          unless self[:geo].blank?
            geo = self[:geo]
            builder.geo :geo do
              builder.geo :format, geo[:format] if geo[:format]
            end
          end
        end
        builder << '' # Force to string
      end
    end
  end
end
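
A lighter-weight way to accept both option styles is to normalize them into one list before the XML is built. The sketch below assumes a hypothetical helper, `collect_videos`, which is not part of the gem:

```ruby
# Hypothetical helper: merge a single :video hash and a :videos array into
# one list, so callers can use either form (or both).
def collect_videos(options)
  videos = []
  videos << options[:video] if options[:video]
  videos.concat(options[:videos]) if options[:videos]
  videos
end

single = collect_videos(:video => { :title => 'One' })
many   = collect_videos(:videos => [{ :title => 'One' }, { :title => 'Two' }])

puts single.length
puts many.length
```

The rendering loop can then iterate over the returned list regardless of which key the caller supplied.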

can't convert nil into String - rails 3.0.4

I'm getting the error "can't convert nil into String".

Full trace:

can't convert nil into String
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/sitemap_generator-1.3.1/lib/sitemap_generator/link_set.rb:60:in `join'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/sitemap_generator-1.3.1/lib/sitemap_generator/link_set.rb:60:in `initialize'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/sitemap_generator-1.3.1/lib/sitemap_generator.rb:21:in `new'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/sitemap_generator-1.3.1/lib/sitemap_generator.rb:21:in `block in <module:SitemapGenerator>'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/activesupport-3.0.4/lib/active_support/core_ext/kernel/reporting.rb:11:in `block in silence_warnings'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/activesupport-3.0.4/lib/active_support/core_ext/kernel/reporting.rb:22:in `with_warnings'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/activesupport-3.0.4/lib/active_support/core_ext/kernel/reporting.rb:11:in `silence_warnings'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/sitemap_generator-1.3.1/lib/sitemap_generator.rb:14:in `<module:SitemapGenerator>'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/sitemap_generator-1.3.1/lib/sitemap_generator.rb:9:in `<top (required)>'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/bundler-1.0.7/lib/bundler/runtime.rb:64:in `require'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/bundler-1.0.7/lib/bundler/runtime.rb:64:in `block (2 levels) in require'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/bundler-1.0.7/lib/bundler/runtime.rb:62:in `each'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/bundler-1.0.7/lib/bundler/runtime.rb:62:in `block in require'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/bundler-1.0.7/lib/bundler/runtime.rb:51:in `each'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/bundler-1.0.7/lib/bundler/runtime.rb:51:in `require'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/bundler-1.0.7/lib/bundler.rb:112:in `require'
/Users/omar/Proyectos/AlfonsoParra/aparra/config/application.rb:7:in `<top (required)>'
<internal:lib/rubygems/custom_require>:29:in `require'
<internal:lib/rubygems/custom_require>:29:in `require'
/Users/omar/Proyectos/AlfonsoParra/aparra/Rakefile:4:in `<top (required)>'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/lib/rake.rb:2383:in `load'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/lib/rake.rb:2383:in `raw_load_rakefile'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/lib/rake.rb:2017:in `block in load_rakefile'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/lib/rake.rb:2016:in `load_rakefile'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/lib/rake.rb:2000:in `block in run'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/lib/rake.rb:1998:in `run'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/gems/rake-0.8.7/bin/rake:31:in `<top (required)>'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/bin/rake:19:in `load'
/Users/omar/.rvm/gems/ruby-1.9.2-p0/bin/rake:19:in `<main>'

Sitemap file generation should use binary mode

Hello,

Sitemap generator should open files in binary mode when generating them; otherwise, under Windows, the file is written in text mode.

A simple change from 'w' to 'wb' fixes the problem.
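
A minimal sketch of the difference, using only the standard library. The file name and XML content are placeholders, and this is not the gem's actual writer code:

```ruby
require 'zlib'
require 'tmpdir'

xml  = "<urlset><url><loc>http://example.com/</loc></url></urlset>\n"
path = File.join(Dir.tmpdir, 'sitemap_binary_mode_example.xml.gz')

# 'wb' disables newline translation, so the gzip bytes reach the disk
# unchanged. With plain 'w', Windows rewrites "\n" bytes as "\r\n", which
# corrupts compressed data.
File.open(path, 'wb') do |file|
  gz = Zlib::GzipWriter.new(file)
  gz.write(xml)
  gz.finish   # write the gzip trailer; File.open closes the file itself
end

# Reading the file back should return the original XML.
restored = Zlib::GzipReader.open(path) { |gz| gz.read }
puts restored == xml

File.delete(path)
```

On Unix-like systems 'w' and 'wb' behave the same for newlines, which is why the problem only shows up on Windows.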

Thank you.

clash with less

There seems to be a clash with less 1.2.20 over the 'verbose' method. If I run any rake task that references verbose, I get the following error:

no block given
from /opt/local/lib/ruby/gems/1.8/gems/less-1.2.20/lib/less/ext.rb:28:in `verbose'

Sitemap too big! The uncompressed size exceeds 10Mb

Sitemap generation has been working for weeks. Suddenly today:

...
+ /data/mysite/releases/20100526223320/public/sitemap25.xml.gz
+ /data/mysite/releases/20100526223320/public/sitemap26.xml.gz
+ /data/mysite/releases/20100526223320/public/sitemap27.xml.gz
+ /data/mysite/releases/20100526223320/public/sitemap28.xml.gz
** Sitemap too big! The uncompressed size exceeds 10Mb
+ /data/mysite/releases/20100526223320/public/sitemap29.xml.gz
+ /data/mysite/releases/20100526223320/public/sitemap30.xml.gz
+ /data/mysite/releases/20100526223320/public/sitemap31.xml.gz
+ /data/mysite/releases/20100526223320/public/sitemap32.xml.gz
+ /data/mysite/releases/20100526223320/public/sitemap_index.xml.gz
Sitemap stats: 1,554,832 links, 584m21s

I'm not sure why that would happen, since splitting the sitemaps into different files is up to sitemap_generator. The maximum length of any URL in the sitemap28.xml file is 308 characters.
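
For reference, the decision to start a new file can be sketched as below. The 10 MB figure comes from the warning above and the 50,000-link cap from the Sitemap 0.9 protocol; the method name `fits?` and its structure are assumptions, not the gem's implementation, and the sketch ignores the bytes taken by the enclosing urlset element.

```ruby
MAX_LINKS = 50_000             # per-file link cap in the Sitemap 0.9 protocol
MAX_BYTES = 10 * 1024 * 1024   # 10 MB uncompressed, as in the warning above

# Hypothetical check: can this entry be appended without breaking a limit?
# Sizes are counted in bytes, not characters.
def fits?(current_bytes, current_links, entry_xml)
  current_links < MAX_LINKS &&
    current_bytes + entry_xml.bytesize <= MAX_BYTES
end

entry = '<url><loc>http://example.com/page</loc></url>'

puts fits?(0, 0, entry)                 # empty file
puts fits?(MAX_BYTES - 10, 100, entry)  # almost full by size
puts fits?(0, MAX_LINKS, entry)         # full by link count
```

If a check like this runs only after an entry has been written, a single large entry can push a file over the limit, which would match a warning appearing for just one file in a run.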

generation runs twice

I've installed the gem and plugin according to the README. I'm generating eleven sitemaps without error; however, they are being generated twice.

Can't modify frozen object when running rake sitemap:refresh

I have moved from using the gem to installing the plugin. Now I'm getting this error. Thank you for your help.

javier@javier-laptop:~/Projects/tareas_site$ rake sitemap:refresh --trace
(in /home/javier/Documents/Projects/Freelancing/OasicTech/tareas_site)
** Invoke sitemap:refresh (first_time)
** Invoke sitemap:create (first_time)
** Invoke environment (first_time)
** Execute environment

** Execute sitemap:create

  • /sitemap1.xml.gz 16 links / 2.7 KB / 422 Bytes gzipped

Sitemap stats: 16 links / 1 files / 0m00s
rake aborted!
can't modify frozen object
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-0.3.3/lib/sitemap_generator/builder/sitemap_file.rb:76:in `filesize='
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-0.3.3/lib/sitemap_generator/builder/sitemap_file.rb:76:in `<<'
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-0.3.3/lib/sitemap_generator/link_set.rb:93:in `new_sitemap'
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-0.3.3/lib/sitemap_generator/link_set.rb:110:in `finalize!'
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-0.3.3/lib/sitemap_generator/link_set.rb:25:in `create'
/usr/lib/ruby/gems/1.8/gems/sitemap_generator-0.3.3/tasks/sitemap_generator_tasks.rake:28
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:636:in `call'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:636:in `execute'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:631:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:631:in `execute'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:597:in `invoke_with_call_chain'
/usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:590:in `invoke_with_call_chain'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:607:in `invoke_prerequisites'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:604:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:604:in `invoke_prerequisites'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:596:in `invoke_with_call_chain'
/usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:590:in `invoke_with_call_chain'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:583:in `invoke'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2051:in `invoke_task'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `each'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2029:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2023:in `top_level'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2001:in `run'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake.rb:1998:in `run'
/usr/lib/ruby/gems/1.8/gems/rake-0.8.7/bin/rake:31
/usr/bin/rake:19:in `load'
/usr/bin/rake:19

ruby 1.8.6 does not have bytesize

I had to change bytesize to .length to make this work in 1.8.6. I think the changes to how strings are handled came in Ruby 1.9.

I'm not sure whether there is an elegant fix that makes this plugin compatible across all Ruby versions...
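
One possible compatibility shim is to test for the method at runtime. The helper name `byte_length` is hypothetical. On Ruby 1.8.6 strings are plain byte sequences, so `length` already returns the byte count there and the fallback gives the same quantity:

```ruby
# Hypothetical helper: prefer String#bytesize, fall back to String#length
# on interpreters that lack it.
def byte_length(string)
  string.respond_to?(:bytesize) ? string.bytesize : string.length
end

puts byte_length('sitemap')
puts byte_length("caf\u00e9")   # multi-byte character: bytes exceed characters
```

Calling the helper everywhere the size is needed keeps the version check in one place.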

Add ping servers

Hi, Karl!

Thank you for the awesome gem. I am using it for my project and have one question :). Is it possible to add other ping servers, in addition to the basic ones, that will be notified after the refresh task?

Thanks,
Vlad

Yahoo Sitemap URL

The Yahoo sitemap ping URL might need to be updated from:

"http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=#{sitemap_index_url}&appid=#{yahoo_app_id}",

to

"http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=#{yahoo_app_id}&url=#{sitemap_index_url}"

The "ping" path now returns "The service has been shut down. For further details, please see the Deprecated Services blog post http://developer.yahoo.com/blogs/ydn/posts/2010/08/api_updates_and_changes"
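
Whichever endpoint is used, the sitemap address should be percent-encoded so that its own `:` and `/` characters survive as a query value. A sketch with placeholder values (the app id and host below are made up, and the endpoint is the one proposed above, which may itself no longer be available):

```ruby
require 'uri'

sitemap_index_url = 'http://www.example.com/sitemap_index.xml.gz'  # placeholder
yahoo_app_id      = 'my_app_id'                                    # placeholder

# URI.encode_www_form escapes each value and joins the pairs with '&'.
query    = URI.encode_www_form('appid' => yahoo_app_id, 'url' => sitemap_index_url)
ping_url = "http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?#{query}"

puts ping_url
```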

Where is the value of default_host coming from

I have put the following config:
SitemapGenerator::Sitemap.default_host = "http://cheapr.me"

When I generate the sitemap, the "loc" comes up as:
http://cheapr.in/
See http://cheapr.me/sitemap.xml
I grepped, but "cheapr.in" appears nowhere in any of the files in my project (I confess that I had originally planned to host my site at cheapr.in). So unless there is some spooky mind-reading going on, how did the loc change to cheapr.in?

Where is this domain picked up from (and why is my default_host ignored)?

Pass information to block

Dear gem owner!

Thank you for the nice gem. I am using it in a few projects and am happy with it, but I ran into a problem on one project. I need to pass custom information into the block that generates the sitemap. My application contains a few independent websites and should create an independent sitemap for each one. Here is a little code that illustrates how I do it:
Website.find(:all).each do |website|
  @website_id = website.id
  @website = website

  puts 'test1 - @website_id ' + @website_id.inspect + " context: " + self.inspect
  SitemapGenerator::Sitemap.default_host = @website.domain
  SitemapGenerator::Sitemap.public_path = "tmp/"
  SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/#{@website.code}"
  SitemapGenerator::Sitemap.adapter = SitemapGenerator::WaveAdapter.new
  SitemapGenerator::Sitemap.create do
    puts 'test2 - @website_id ' + @website_id.inspect + " context: " + self.inspect
    add "/home"
    add "/welcome"
  end
end

and here is the output:
test1 - @website_id 1 context: #<struct ExportSitemapJob job_id={:admin_job_id=>22, :website_id=>3}, website_id=nil>
test2 - @website_id nil context: #<SitemapGenerator::Interpreter:0x3605fe8 @linkset=#<SitemapGenerator::LinkSet:0x350ff58 @include_root=true, @include_index=true, @filename=:sitemap, @verbose=false, @default_host="www.example.org", @public_path=#Pathname:XXX/tmp/, @sitemaps_path="sitemaps/wellness", @adapter=, @sitemaps_namer=sitemap1.xml.gz, @added_default_links=false, @interpreter=#<SitemapGenerator::Interpreter:0x3605fe8 ...>, @sitemap=nil, @sitemap_index_namer=sitemap_index.xml.gz, @sitemap_index=nil>>
which shows that the context inside the block is different. I also tried code like this:

SitemapGenerator::Sitemap.create :yield_sitemap => true do |interpreter|
  interpreter.add "/home"
  interpreter.add "/welcome"
end
but got this result: undefined method `yield_sitemap=' for #
Can I pass some information to the block? I think this would be useful for some tasks, and it would be great to have an example in the documentation that illustrates how to do this.

This is a question or a request for documentation rather than a bug, but I wasn't able to find such a category in the GitHub issue tracker.

Sincerely yours,
Artem Rufanov.
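
The output above shows the block running with a SitemapGenerator::Interpreter as self, which is why @website_id is nil there: instance variables are looked up on self. Local variables, by contrast, are captured by the block as a closure. The sketch below illustrates that general Ruby behaviour without using the gem; MiniInterpreter and ExportJob are hypothetical stand-ins, and the sketch assumes the block is evaluated with instance_eval.

```ruby
# Hypothetical stand-in for the object that evaluates the sitemap block.
class MiniInterpreter
  attr_reader :links

  def initialize
    @links = []
  end

  def add(path)
    @links << path
  end

  def run(&block)
    instance_eval(&block) # inside the block, self is this MiniInterpreter
    self
  end
end

# Hypothetical job object that owns the @website_id instance variable.
class ExportJob
  def initialize(website_id)
    @website_id = website_id
  end

  def export
    website_id = @website_id # copy into a local so the block can capture it
    interpreter = MiniInterpreter.new.run do
      # @website_id would be nil here because self is no longer the ExportJob;
      # the local variable website_id is still visible through the closure.
      add "/websites/#{website_id}/home"
      add "/websites/#{website_id}/welcome"
    end
    interpreter.links
  end
end

links = ExportJob.new(3).export
puts links.inspect
```

Copying the values into local variables before calling create is therefore one way to make them available inside the block.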

Allowing the inclusion of old files for incremental generation of sitemap

@tulios writes:

In my current project we have a huge number of links and videos to include in the sitemap. The big
problem is that we add a lot of links and videos every day, so we decided to build the sitemap daily.

It is possible to configure the start index of the namer; however, the sitemap index is then generated without the old files. I know this
is very specific to my project, but I am contributing it back in case someone has the same problem.

The usage is:

...
group(sitemaps_namer: my_namer, include_all_files: true) do
...
end

We pass an :include_all_files option, and the generation of the sitemap index will take into account the numbers before the start value.

...
my_namer = SitemapGenerator::SitemapNamer.new("my-sitemap", start: 3)
...
group(sitemaps_namer: my_namer, include_all_files: true) do
...
end

In the index we will have
...

http://site/my-sitemap1.xml.gz

http://site/my-sitemap2.xml.gz

http://site/my-sitemap3.xml.gz
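
The intended index contents can be sketched in plain Ruby. The index_entries helper below is hypothetical and is not part of the gem; it only illustrates listing every file from 1 up to the last file written, rather than starting at the namer's start value.

```ruby
# Hypothetical helper: build the index entries for every sitemap file from 1
# up to the last file written in this run, so earlier files stay in the index.
def index_entries(host, base, last_index)
  (1..last_index).map { |n| "#{host}/#{base}#{n}.xml.gz" }
end

# With a namer starting at 3 and a single file written in this run,
# the last index is 3 and files 1 and 2 are the old ones.
entries = index_entries("http://site", "my-sitemap", 3)
puts entries
```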

Sitemap for multiple domain

I have a Rails 2.3.2 application which handles around 5 domains. I just can't figure out how to use the gem well for this. I need a rake task that generates the sitemap for one domain and puts it in a separate folder. I also need a rake task that generates the sitemaps for all domains and puts each one in its own folder (one per domain).

Any idea?
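
One possible shape for this, sketched in plain Ruby without loading the gem: compute a separate host and output folder per domain, then apply them one domain at a time. The domain list and the sitemap_settings helper are hypothetical; default_host and sitemaps_path are the setting names that appear in the other reports on this page.

```ruby
# Hypothetical domain list; a real app would read this from its own configuration.
DOMAINS = %w[example.com example.org example.net]

# Build one settings hash per domain: its own host and its own output folder.
def sitemap_settings(domains)
  domains.map do |domain|
    { default_host: "http://#{domain}", sitemaps_path: "sitemaps/#{domain}" }
  end
end

settings = sitemap_settings(DOMAINS)
settings.each do |s|
  # In a rake task, these values would be assigned to the generator's
  # default_host and sitemaps_path before generating that domain's sitemap.
  puts "#{s[:default_host]} -> #{s[:sitemaps_path]}"
end
```

A single-domain task could call the same helper with a one-element list.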

Broken gemspec

Is README.md.orig supposed to be in there? I'm getting this error:

sitemap_generator at /...snip.../sitemap_generator-7b26f576c155 did not have a valid gemspec.
This prevents bundler from installing bins or native extensions, but that may not affect its functionality.
The validation message from Rubygems was:
["README.md.orig"] are not files
