MediaWiki talk:Google Sitemaps
From ThinkLemon
Google Sitemaps script for MediaWiki 1.5.x
Excellent sitemaps script. I have attempted to edit it to support the new DB structure of MediaWiki 1.5.x. Could you put it at the bottom of, or probably more appropriately at the top of, the original v0.1, and credit the http://www.free-wiki-hosting.com service? Thank you very much ;-)
Of course, this is subject to people beta testing it to check for bugs. Email sales@duckcomputing.com to report a bug or ask any questions.
<?php
# -----------------------------------------------------
# v0.2.1? - Now supports 1.5.x (not backwards compatible)
# Written by http://www.thinklemon.com
# Updated to Mediawiki 1.5.x by http://www.free-wiki-hosting.com
# - This sub-version adds URL encoding for foreign characters.
# -----------------------------------------------------
# -----------------------------------------------------
# Includes
# ...
# Need to include/require some Mediawiki stuff especially LocalSettings.php for definitions.
# -----------------------------------------------------
define( 'MEDIAWIKI', true );
require_once( './LocalSettings.php' );
require_once( 'includes/GlobalFunctions.php' );
# -----------------------------------------------------
# Send XML header, tell agents this is XML.
# -----------------------------------------------------
header("Content-Type: application/xml; charset=UTF-8");
# -----------------------------------------------------
# Send xml-prolog
# -----------------------------------------------------
echo '<'.'?xml version="1.0" encoding="utf-8" ?'.">\n";
# -----------------------------------------------------
# Start connection
# -----------------------------------------------------
$connWikiDB = mysql_pconnect($wgDBserver, $wgDBuser, $wgDBpassword) or trigger_error(mysql_error(),E_USER_ERROR);
mysql_select_db($wgDBname, $connWikiDB);
# -----------------------------------------------------
# Build query
# -----------------------------------------------------
# Note the trailing space after each fragment, so that the keywords do not run together.
$query_rsPages = "SELECT page_namespace AS `cur_namespace`, page_title AS `cur_title`, ".
    "page_touched AS `cur_timestamp` ".
    "FROM `".$wgDBprefix."page` ".
    "WHERE (page_is_redirect = 0 AND page_namespace IN (0, 1, 2, 3, 4, 5, 6, 7, 12, 13, 14, 15)) ".
    "ORDER BY page_touched DESC";
# -----------------------------------------------------
# Fetch the data from the DB
# -----------------------------------------------------
$rsPages = mysql_query($query_rsPages, $connWikiDB) or die(mysql_error());
# Fetch the array of pages
$row_rsPages = mysql_fetch_assoc($rsPages);
$totalRows_rsPages = mysql_num_rows($rsPages);
# -----------------------------------------------------
# Start output
# -----------------------------------------------------
?>
<!-- MediaWiki - Google Sitemaps - Alpha -->
<!-- <?php echo $totalRows_rsPages ?> wikipages found. -->
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<?php
do {
    # -----------------------------------------------------
    # Determine the pagetitle using namespace:page_name
    # -----------------------------------------------------
    $row_rsPages['cur_title'] = urlencode($row_rsPages['cur_title']);
    switch ($row_rsPages['cur_namespace']) {
        case 1:
            $sPageName = "Talk:".$row_rsPages['cur_title'];
            break;
        case 2:
            $sPageName = "User:".$row_rsPages['cur_title'];
            break;
        case 3:
            $sPageName = "User_Talk:".$row_rsPages['cur_title'];
            break;
        case 4:
            $sPageName = $wgSitename.':'.$row_rsPages['cur_title'];
            break;
        case 5:
            $sPageName = $wgSitename."_talk:".$row_rsPages['cur_title'];
            break;
        case 6:
            $sPageName = "Image:".$row_rsPages['cur_title'];
            break;
        case 7:
            $sPageName = "Image_talk:".$row_rsPages['cur_title'];
            break;
        case 12:
            $sPageName = "Help:".$row_rsPages['cur_title'];
            break;
        case 13:
            $sPageName = "Help_talk:".$row_rsPages['cur_title'];
            break;
        case 14:
            $sPageName = "Category:".$row_rsPages['cur_title'];
            break;
        case 15:
            $sPageName = "Category_Talk:".$row_rsPages['cur_title'];
            break;
        default:
            $sPageName = $row_rsPages['cur_title'];
    }
# -----------------------------------------------------
# Start output
# -----------------------------------------------------
?>
<url>
<loc>http://<?php echo $_SERVER["HTTP_HOST"].eregi_replace('\$1',$sPageName,$wgArticlePath);?></loc>
<lastmod><?php echo fnTimestampToIso($row_rsPages['cur_timestamp']); ?></lastmod>
<changefreq>weekly</changefreq>
<priority><?php if(eregi('Help:',$sPageName)){echo '0.1';}else{echo '0.7';} ?></priority>
</url>
<?php } while ($row_rsPages = mysql_fetch_assoc($rsPages)); ?>
</urlset>
<?php
# -----------------------------------------------------
# Clear Connection
# -----------------------------------------------------
mysql_free_result($rsPages);
# -----------------------------------------------------
# General functions
# -----------------------------------------------------
function fnTimestampToIso($ts) {
# $ts is a MediaWiki Timestamp (TS_MW)
# ISO-standard timestamp (YYYY-MM-DDTHH:MM:SS+00:00)
return gmdate( 'Y-m-d\TH:i:s\+00:00', wfTimestamp( TS_UNIX, $ts ) );
}
?>
- Hi Mark, it seems we were working on the same thing at the same time. I just finished upgrading my installation today and put up a new Sitemaps script around the same time as you did. As I'm not aware of the complete inner workings of MediaWiki, I was trying to figure out how to use the MediaWiki functions (for namespaces, for example). But as you may have noticed from the new script, I haven't gotten very far: it's a plain update. Anyway, I've upgraded the script for 1.5.x much along the same lines as you did. It still needs some refinement, and I'm working on that. I see you've made some alterations I did not, which I may want to incorporate... --Caspar 00:50, 5 December 2005 (CET)
- Done an upgrade, incorporating some stuff of yours. --Caspar 01:03, 1 February 2006 (CET)
Errors
My first attempt at running your code produced this error:
XML Parsing Error: no element found
Location: http://ukcider.co.uk/wiki/sitemap.xml.php
Line Number 10, Column 1:
^
- Hi Andy (?), looks like you are running Firefox? :-) If you take a look at the Page Source (CTRL+U), you'll see that the script is fetching your pages from the database, but it breaks on an undefined function: wftimestamp(). This function is part of the MediaWiki installation and can be found in GlobalFunctions.php. This puzzled me for a bit, until I realised it is probably missing from your installation because you are running an older version of MediaWiki (v1.3.5). Mine is running 1.4.4, which also isn't the latest version... I think upgrading your MediaWiki installation should solve the problem. Upgrading should be smooth when you follow the instructions, but please make a backup first! --Caspar 15:25, 14 Oct 2005 (CEST)
- Hi Caspar, thanks for your help. I'm putting off upgrading for a little while, but I'll come back and try the sitemap generator again afterwards. Meanwhile, I'm still having trouble with Googlebot using up ever-increasing bandwidth on a nightly basis, and the sitemap and robots.txt don't seem to be helping yet. I have another MediaWiki installation to start up soon, so I'll definitely use the latest version for that. --Andy Roberts 06:46, 16 Oct 2005 (CEST)
- Is Google indexing too much of your wiki? If so, you may want to look at '<a href="..." rel="nofollow">' for some links, like the 'edit' and/or 'history' tabs (see this page's source). This will hold the Googlebot back from indexing these not-so-useful pages. You'll have to edit your 'skin' and selectively make Google NOT index some parts. If you are comfortable with editing a skin, you may even want to exclude some pages with '<meta name="robots" content="noindex,nofollow">' in the head section. Sitemaps is just a way of telling Google what content is new. It won't stop it from indexing unwanted stuff. --213.10.135.44 00:21, 17 Oct 2005 (CEST)
- Yes, Google is starting to index too much, a bit more every night. Thanks for clarifying the purpose of sitemaps and for the skin editing tip. I'm hoping some improvements to my robots.txt file will sort it out now, particularly if I can persuade it to stop issuing requests for all the different combinations of recent changes! http://ukcider.co.uk/robots.txt On the other front, I've successfully installed 1.5, so I'll have a go at the sitemaps script for that one once there are a few pages up. --Andy Roberts 11:23, 18 Oct 2005 (CEST)
- Just to confirm that after upgrading to 1.5, the google sitemaps generating script worked fine. Cheers.--Andy Roberts 23:54, 25 Oct 2005 (CEST)
- That's great news. Now there's nothing holding *me* back from upgrading to 1.5. ;-) One more thing: after submitting your sitemap to Google you may want to 'verify' your sitemap (look in your sitemap panel at Google). After verification you'll get info on whether they were successful at retrieving your pages or not. --Caspar 03:08, 26 Oct 2005 (CEST)
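For anyone stuck on an older MediaWiki where wfTimestamp() is missing, the conversion the script needs can be sketched in plain PHP. This is only an illustration, assuming the 14-digit MediaWiki timestamp format (YYYYMMDDHHMMSS, in UTC) and a PHP version that has the DateTime class; fnMwTimestampToIso() is a made-up name, not part of MediaWiki or of the original script.

```php
<?php
# Hypothetical stand-in for fnTimestampToIso() that does not need wfTimestamp().
# Assumes $ts is a 14-digit MediaWiki timestamp (YYYYMMDDHHMMSS) in UTC.
function fnMwTimestampToIso($ts) {
    $dt = DateTime::createFromFormat('YmdHis', $ts, new DateTimeZone('UTC'));
    if ($dt === false) {
        return '';   # not a 14-digit timestamp; the caller decides what to do
    }
    # 'P' prints the UTC offset as +00:00, as the sitemap format expects.
    return $dt->format('Y-m-d\TH:i:sP');
}

echo fnMwTimestampToIso('20051025235400'), "\n";
```

Upgrading MediaWiki, as suggested above, is still the cleaner fix; this only shows what the helper has to produce.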
Andy
I have a "clean" install of 1.5.2 on my server. 1.5 does not have the "cur" table, hence the script fails (I think you might have inherited the table from a previous version). Could you please have a look?
A second question: if I have a robots.txt file with
User-agent: *
Disallow: /wiki/
and then submit a Google sitemap with the actual URLs for the wiki pages, will they be indexed or not? TIA.
Nd.
- On the second question, you get an error report in Google Sitemaps which tells you "access restricted by robots.txt". The problem I'm having with the Googlebot is that it seems to take a lot longer to turn around than I thought. Currently it's still listing a load of recent changes pages for my site, although I've had disallow /wiki/ for a couple of weeks now. Once it settles down I'm going to change it to disallow /wiki/index.php? and similar, in the hope that it will index only the content pages. --Andy Roberts 10:55, 24 Nov 2005 (CET)
- I see that the 1.5 incarnation of MediaWiki does not have a 'cur' table. I'll upgrade ASAP and figure out where they put the 'current pages'-table. It seems they totally overhauled the database structure (scheme). If you notice any outage here on ThinkLemon... I'm working on it. :-) --Caspar 01:00, 25 Nov 2005 (CET)
- I think I'll start an article about Search Engine Indexing Strategies :-) --Caspar 01:37, 25 Nov 2005 (CET)
- Well, the upgrade didn't go as smoothly as planned, but I think I'm up and running again. The current pages are indeed in a different database table. I'm experimenting with it. If I succeed, I'll update the article on Google Sitemaps. --Caspar 19:04, 4 December 2005 (CET)
One quick thing in case others get the same error. Someone had used an ampersand in a title and it made the entire thing stall out. I had to go in and move the page to one with a different title with a hyphen in it and delete the original. Just in case anyone runs into the same issue. Other than that the sitemap works great. Thanks! -- Greenguy
- Note-to-self: Do XMLEncoding of pagetitles. Thanks for the tip! Will do some script upgrading soon... --Caspar 13:20, 17 January 2006 (CET)
- @Greenguy: I think I've fixed the & stuff in the pagetitles. So you may want to look at the new version (v0.3) --Caspar 01:07, 1 February 2006 (CET)
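To illustrate the ampersand problem: a raw & inside a <loc> value makes the whole sitemap ill-formed XML, so the title has to be escaped before it is written out. A minimal sketch, assuming a made-up page title and the example.com host; fnXmlEncode() is copied from the v0.3 script further down this page.

```php
<?php
# Same helper as in the v0.3 script: strip control characters, then
# escape the XML special characters (&, <, > and quotes).
function fnXmlEncode($string) {
    $string = str_replace("\r\n", "\n", $string);
    $string = preg_replace('/[\x00-\x08\x0b\x0c\x0e-\x1f]/', '', $string);
    return htmlspecialchars($string);
}

# Hypothetical page title containing an ampersand.
$sPageName = 'Cider_&_Perry';
echo '<loc>' . fnXmlEncode('http://www.example.com/wiki/' . $sPageName) . "</loc>\n";
```

With the escaping in place the ampersand is emitted as &amp;amp; and the page no longer has to be renamed.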
Google Adsense
How could I put my Google AdSense code on every wiki page, like ThinkLemon does? 141.53.194.251 16:22, 25 Nov 2005 (CET)
- Actually the method below breaks the AdSense Ts&Cs since it will still show some pages that don't have content on them (and won't work on sites with only one language). This is a nice method, as can be seen by the ads displayed there, and it can also block one user (i.e. the AdSense "owner") so they don't accidentally click on their own ads! (Also against the Ts&Cs of course...)
- That's an easy one (I think). If you're using the monobook skin, go to your wiki/skin dir. Open up monobook.php. Find:
<div class="portlet" id="p-tb">
<h5><?php $this->msg('toolbox') ?></h5>
... <-- Meaning there's a lot more code here. I removed it for simplification!!!
</div>
<?php if( $this->data['language_urls'] ) { ?><div id="p-lang" class="portlet">
- In between the toolbox and language portlet paste the following:
<div class="portlet" id="p-ad">
<h5>related</h5>
<div class="pBody">
<!-- Insert your Adsense script here -->
</div>
</div>
- If everything went well, you should have a new 'box' on your left hand side with your adsense.
Thanks
Just found this and added it to the site I just put up. I really appreciate the script! I also fudged my apache.conf so that it processes .xml files as PHP, so that I could have the lovely URL www.balisongtimes.com/sitemap.xml :).
Thanks again!
German pagetitles
Great script, thanks!
I translated the pagetitles for a German wiki:
# -----------------------------------------------------
# 1. Determine the pagetitle using namespace:page_name
# 2. Set priority of the namespace
# -----------------------------------------------------
$nPriority = 0;
switch ($row_rsPages['page_namespace']) {
    case 1:
        $sPageName = "Diskussion:".$row_rsPages['page_title'];
        $nPriority = 0.9;
        break;
    case 2:
        $sPageName = "Benutzer:".$row_rsPages['page_title'];
        $nPriority = 0.7;
        break;
    case 3:
        $sPageName = "Benutzer_Diskussion:".$row_rsPages['page_title'];
        $nPriority = 0.6;
        break;
    case 4:
        $sPageName = $wgMetaNamespace.":".$row_rsPages['page_title'];
        $nPriority = 0.9;
        break;
    case 5:
        $sPageName = $wgMetaNamespace."_Diskussion:".$row_rsPages['page_title'];
        $nPriority = 0.8;
        break;
    case 6:
        $sPageName = "Bild:".$row_rsPages['page_title'];
        $nPriority = 0.5;
        break;
    case 7:
        $sPageName = "Bild_Diskussion:".$row_rsPages['page_title'];
        $nPriority = 0.4;
        break;
    case 8:
        $sPageName = "MediaWiki:".$row_rsPages['page_title'];
        $nPriority = 0.4;
        break;
    case 9:
        $sPageName = "MediaWiki_Diskussion:".$row_rsPages['page_title'];
        $nPriority = 0.3;
        break;
    case 10:
        $sPageName = "Vorlage:".$row_rsPages['page_title'];
        $nPriority = 0.3;
        break;
    case 11:
        $sPageName = "Vorlage_Diskussion:".$row_rsPages['page_title'];
        $nPriority = 0.2;
        break;
    case 12:
        $sPageName = "Hilfe:".$row_rsPages['page_title'];
        $nPriority = 0.1;
        break;
    case 13:
        $sPageName = "Hilfe_Diskussion:".$row_rsPages['page_title'];
        $nPriority = 0.1;
        break;
    case 14:
        $sPageName = "Kategorie:".$row_rsPages['page_title'];
        $nPriority = 0.6;
        break;
    case 15:
        $sPageName = "Kategorie_Diskussion:".$row_rsPages['page_title'];
        $nPriority = 0.5;
        break;
    default:
        $sPageName = $row_rsPages['page_title'];
        $nPriority = 1;
}
84.137.193.13 01:21, 9 September 2006 (CEST) (Heiner Otterstedt)
- Thank you Heiner! For people who have installed a localized version of MediaWiki, you've shown how to change things around. Nice. --Caspar 00:44, 26 September 2006 (CEST)
Error: I get the following
This XML file does not appear to have any style information associated with it. The document tree is shown below.
I get this from your site and from mine. Is this a browser problem or some other thing?
- Hi there whats-your-name, not to worry! It's a Firefox thing. What you are viewing in your browser is XML, and XML isn't really meant to be viewed in a browser, other than for checking that the output is correct. If you're getting this message it means everything's OK. It's Firefox's way of telling you that it can't translate the XML into something meaningful for humans (through an XSL or CSS stylesheet).
- If this is what you are viewing in Firefox, try and look at it in Internet Explorer and you'll get something equally confusing. :-) Anyway, just sign up for a Google Sitemaps account and submit your sitemap. Google will tell you if it's wrong or not, and more. --Caspar 00:42, 26 September 2006 (CEST)
Problem with Google Sitemaps
The Google Sitemaps tool tells me that the sitemap has an error. When I open the sitemap in Firefox 2 I get the following: "XML declaration not at the beginning of external Entity" (in German: "XML-Deklaration nicht am Beginn von externer Entität"), error in row 13, position 1.
My Sitemap: http://www.northwoodcycling.com/wiki/sitetest.xml
kind regards Matthias
- Hi Matthias, you've got some extra white-space/lines before the xml-declaration (<?xml ... ?>). Try and remove that and it should validate as XML again. Maybe you've inserted some stuff before the actual script. Firefox (and Google) are really picky when XML doesn't start with the declaration right on the first line. --Caspar 18:00, 30 January 2007 (CET)
- Thank you Caspar. But I use the original, unmodified PHP script; I really changed nothing. Maybe it depends on the PHP on the server? http://www.northwoodcycling.com/wiki/info.php. I googled that error and found several issues with web CMS systems like Joomla together with RSS feeds, but they were all in German and did not really help me. Could it be that the PHP interpreter adds some kind of (XML or XHTML) declaration before the start of the script? Thank you, Matthias
- It looks like something (like a CMS) is doing stuff before the Sitemaps-script runs. From a quick look I cannot see what that could be in your case. Maybe it could be a setting in your PHP-installation. Or ... --Caspar 13:27, 13 February 2007 (CET)
- This is also happening in MediaWiki 1.10.0. A newline is being inserted at the top, before the XML declaration. --Neurophyre 23:00, 28 August 2007 (CEST)
- Update: I've isolated this problem to the Bad Behavior antispam plugin. I've disabled it until I can determine if there's a bugfix. --Neurophyre 23:06, 28 August 2007 (CEST)
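When a plugin or an included file emits whitespace before the script's own output, one possible workaround is to buffer that output and throw it away just before the XML declaration is sent. A rough sketch, not tested against Bad Behavior or any particular MediaWiki version; the stray newline is simulated here, and fnStartCleanXml() is a hypothetical name.

```php
<?php
# Start buffering before anything else is included, so stray output
# from included files lands in the buffer instead of going to the client.
ob_start();

echo "\n";   # simulates the newline an include or plugin might emit

# Hypothetical helper: discard whatever has been buffered so far and
# return the XML declaration, which must be the first bytes of the response.
function fnStartCleanXml() {
    if (ob_get_level() > 0) {
        ob_end_clean();
    }
    return '<' . '?xml version="1.0" encoding="utf-8" ?' . ">\n";
}

$sProlog = fnStartCleanXml();
echo $sProlog;
```

This only helps if ob_start() runs before the offending code does; output sent before the script itself starts cannot be taken back this way.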
Same URL
I am trying to use this with MW 1.9.2 and I just get a whole bunch of URLs that are exactly the same.
My file - http://travwiki.byethost33.com/wiki/sitemap.xml.php
--24.183.108.96 22:38, 8 February 2007 (CET)
- That's odd. But I think the MediaWiki people changed something in the database (again). It looks like the script cannot gather the page names and just returns your domain name instead. I haven't gotten around to upgrading MediaWiki, so I can't tell exactly what's going wrong. If you're so inclined, you could update the script yourself, or wait until I upgrade (not so soon). --Caspar 13:21, 13 February 2007 (CET)
I'm running MW 1.9.2 and I was able to get the sitemap working without a problem. --72.145.132.82 19:00, 10 March 2007 (CET)
- I've got the same problem on MW 1.10.0 (on 3 different sites...). --84.154.220.120 10:11, 14 May 2007 (CEST)
I've got the problem solved. New versions of MediaWiki don't add these lines to LocalSettings.php:
$wgScript = "$wgScriptPath/index.php";
$wgRedirectScript = "$wgScriptPath/redirect.php";
$wgArticlePath = "$wgScript/$1";
$wgStylePath = "$wgScriptPath/skins";
$wgStyleDirectory = "$IP/skins";
$wgLogo = "$wgStylePath/common/images/wiki.png";
$wgUploadPath = "$wgScriptPath/images";
$wgUploadDirectory = "$IP/images";
As soon as you add these settings the sitemap will work again. I don't know if you need all of them, but I got it working on FlyerWiki.net with MW 1.10.0. I hope this will solve the problems with MW 1.9.2. --84.154.220.120 15:05, 14 May 2007 (CEST)
- I am using MediaWiki 1.11.10 and when I used the sitemap script all I got was a list of folder references without the documents. I added $wgUsePathInfo= "true"; $wgScript= "$wgScriptPath/index.php"; $wgArticlePath="$wgScript/$1"; to the local settings file, tried again, and the sitemap was successfully generated.
Keveen2 22:57, 4 October 2007 (CEST)
This is designed to work only with short/pretty urls. In order to get this to work you should follow this link. This is the only web page I have found that works for my site.
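The "same URL" symptom is consistent with $wgArticlePath being empty: the script builds each address by replacing the $1 placeholder in that setting, so with nothing to replace, every entry collapses to the bare host. A small sketch of that substitution, using str_replace() instead of the script's eregi_replace() (an assumption made so that it runs on current PHP); the host, the paths and the name fnBuildLoc() are made-up examples.

```php
<?php
# Hypothetical helper mirroring what the script does for each <loc>:
# substitute the page name for the $1 placeholder in $wgArticlePath.
function fnBuildLoc($sHost, $sArticlePath, $sPageName) {
    return 'http://' . $sHost . str_replace('$1', $sPageName, $sArticlePath);
}

# With $wgArticlePath set, each page gets its own URL.
echo fnBuildLoc('www.example.com', '/wiki/index.php/$1', 'Main_Page'), "\n";
echo fnBuildLoc('www.example.com', '/wiki/index.php/$1', 'Help:Contents'), "\n";

# With $wgArticlePath empty, every page yields the same URL.
echo fnBuildLoc('www.example.com', '', 'Main_Page'), "\n";
echo fnBuildLoc('www.example.com', '', 'Help:Contents'), "\n";
```

That would explain why defining $wgScript and $wgArticlePath in LocalSettings.php, as described above, brings the page names back.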
PHP Notice: Use of undefined constant ... in apache error.log
Please add
require_once( './w/includes/Defines.php' );
before line 23:
require_once( './LocalSettings.php' );
Some constants are probably used inside LocalSettings.php. --147.229.5.124 16:37, 3 April 2007 (CEST)
- I don't see errors in my log, but that could be attributed to many different factors (MW version, PHP settings, ...). I must mention, though, that the ./w/ part in your addition may depend on where you've put the script and where your installation of MediaWiki resides. If it's in your wiki dir, I guess require_once( './includes/Defines.php' ); would do it for you. One day I'll do an upgrade... ;-) *extension anyone?* Thanks for mentioning it, though! --Caspar 00:36, 5 April 2007 (CEST)
Unicode Page Title Garbled
Page Title ??????
<loc>http://www.example.com/wiki/??????</loc>
Insert at line 60:
mysql_query("SET NAMES utf8", $connWikiDB) or die(mysql_error());
Thanks. --222.225.40.100 17:03, 5 June 2007 (CEST)
- Alternatively change line 161 to:
<loc><?php echo fnXmlEncode( "http://" . $wgServerName . eregi_replace('\$1',utf8_encode($sPageName),$wgArticlePath) ) ?></loc>
- This adds utf8_encode() to ensure the pagename output is UTF-8 encoded. Graeme 03:14, 28 April 2008 (CEST)
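Which of the two fixes applies depends on what bytes actually come back from the database. utf8_encode() treats its input as ISO-8859-1, so running it over titles that are already UTF-8 would encode them twice. A cautious sketch that converts only when the bytes are not valid UTF-8; fnEnsureUtf8() is a hypothetical helper, the ISO-8859-1 fallback is an assumption, and iconv() is used here in place of utf8_encode().

```php
<?php
# Hypothetical helper: return $sTitle as UTF-8. If the bytes are already
# valid UTF-8 they are left alone; otherwise they are assumed to be
# ISO-8859-1 (an assumption that will not hold for every database).
function fnEnsureUtf8($sTitle) {
    # An empty pattern with the /u modifier only matches valid UTF-8 input.
    if (preg_match('//u', $sTitle) === 1) {
        return $sTitle;
    }
    return iconv('ISO-8859-1', 'UTF-8', $sTitle);
}

echo fnEnsureUtf8("Caf\xC3\xA9"), "\n";   # already UTF-8, unchanged
echo fnEnsureUtf8("Caf\xE9"), "\n";       # Latin-1 bytes, converted
```

If the titles arrive as literal question marks, the characters were already lost on the database connection, and only the SET NAMES approach can help.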
Fully automated sitemap script
I modified your script so that it generates the namespace names automatically, using the language given in LocalSettings.php. I also had to add some caching-related things to make this work, but since I have no idea how the caching system in MediaWiki operates, this may not be the best solution.
You can customize the $aPriorities array to set priorities per namespace.
<?php
# -----------------------------------------------------
# MediaWiki - Google Sitemaps generation. v0.3
#
# A page that'll generate valid Google Sitemaps code
# from the current MediaWiki installation.
# v0.3: Small changes to fix others situations
# v0.2: Updated for MediaWiki 1.5.x
# v0.1: First attempt for MediaWiki 1.4.x
#
# See http://www.thinklemon.com/wiki/MediaWiki:Google_Sitemaps
#
# TODO: Further refinements like caching...
# -----------------------------------------------------
# -----------------------------------------------------
# Includes
# Need to include/require some Mediawiki stuff
# especially LocalSettings.php for definitions.
# -----------------------------------------------------
define( 'MEDIAWIKI', true );
$mediawiki = "/www/htdocs/w009b8fa/mediawiki-1.12.0";
require_once( "$mediawiki/LocalSettings.php" );
require_once( "$mediawiki/includes/GlobalFunctions.php" );
require_once( "$mediawiki/includes/Defines.php" );
require_once( "$mediawiki/includes/Namespace.php" );
require_once( "$mediawiki/languages/Language.php" );
require_once( "$mediawiki/includes/ProfilerSimple.php");
require_once( "$mediawiki/includes/ObjectCache.php");
# -----------------------------------------------------
# Init the cache and the profiler
# -----------------------------------------------------
$wgProfiler = new ProfilerSimple(); // dont know what the profiler does ;-)
$wgMemc =& wfGetMainCache();
$messageMemc =& wfGetMessageCacheStorage();
$parserMemc =& wfGetParserCacheStorage();
# -----------------------------------------------------
# Create a Language object for the current language
# and get the namespaces names
# -----------------------------------------------------
$lang = Language::factory( $wgLanguageCode );
$namespaces = $lang->getNamespaces();
# -----------------------------------------------------
# Send XML header, tell agents this is XML.
# -----------------------------------------------------
header("Content-Type: application/xml; charset=UTF-8");
# -----------------------------------------------------
# Send xml-prolog
# -----------------------------------------------------
echo '<'.'?xml version="1.0" encoding="utf-8" ?'.">\n";
# -----------------------------------------------------
# Start connection
# -----------------------------------------------------
$connWikiDB = mysql_pconnect($wgDBserver, $wgDBuser, $wgDBpassword)
or trigger_error(mysql_error(),E_USER_ERROR);
mysql_select_db($wgDBname, $connWikiDB);
# -----------------------------------------------------
# Build query
# Skipping redirects and MediaWiki namespace
# -----------------------------------------------------
$query_rsPages = "SELECT page_namespace, page_title, page_touched ".
"FROM ".$wgDBprefix."page ".
"WHERE (page_is_redirect = 0 AND page_namespace NOT IN (8, 9)) ".
"ORDER BY page_touched DESC";
# -----------------------------------------------------
# Fetch the data from the DB
# -----------------------------------------------------
$rsPages = mysql_query($query_rsPages, $connWikiDB) or die(mysql_error());
# Fetch the array of pages
$row_rsPages = mysql_fetch_assoc($rsPages);
$totalRows_rsPages = mysql_num_rows($rsPages);
# -----------------------------------------------------
# Start output
# -----------------------------------------------------
?>
<!-- MediaWiki - Google Sitemaps - v0.3 -->
<!-- <?php echo $totalRows_rsPages ?> wikipages found. -->
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<?php
// Find Project Namespace
if ($wgMetaNamespace === FALSE)
    $wgMetaNamespace = str_replace( ' ', '_', $wgSitename );
do {
    # -----------------------------------------------------
    # 1. Determine the pagetitle using namespace:page_name
    # 2. Set priority of the namespace
    # -----------------------------------------------------
    $aPriorities = array(
        "Category" => 1.0,
        "" => 0.9 );
    $nPriority = 0;
    $sPageNameSpace = $namespaces[ $row_rsPages['page_namespace'] ];
    $sPageName = ($sPageNameSpace === "" ? "" : $sPageNameSpace.":") . $row_rsPages['page_title'];
    if ( isset($aPriorities[$sPageNameSpace]) )
        $nPriority = $aPriorities[$sPageNameSpace];
# -----------------------------------------------------
# Start output
# -----------------------------------------------------
?>
<url>
<loc><?php echo fnXmlEncode( "http://" . $wgServerName . eregi_replace('\$1',$sPageName,$wgArticlePath) ) ?></loc>
<lastmod><?php echo fnTimestampToIso($row_rsPages['page_touched']); ?></lastmod>
<changefreq>weekly</changefreq>
<priority><?php echo $nPriority ?></priority>
</url>
<?php } while ($row_rsPages = mysql_fetch_assoc($rsPages)); ?>
</urlset>
<?php
# -----------------------------------------------------
# Clear Connection
# -----------------------------------------------------
mysql_free_result($rsPages);
# -----------------------------------------------------
# General functions
# -----------------------------------------------------
// Convert timestamp to ISO format
function fnTimestampToIso($ts) {
# $ts is a MediaWiki Timestamp (TS_MW)
# ISO-standard timestamp (YYYY-MM-DDTHH:MM:SS+00:00)
return gmdate( 'Y-m-d\TH:i:s\+00:00', wfTimestamp( TS_UNIX, $ts ) );
}
// Convert string to XML safe encoding
function fnXmlEncode( $string ) {
$string = str_replace( "\r\n", "\n", $string );
$string = preg_replace( '/[\x00-\x08\x0b\x0c\x0e-\x1f]/', '', $string );
return htmlspecialchars( $string );
}
?>
-- Christoph
Reduce output
Thanks a lot for that awesome extension! As my sitemap is getting too big, I am wondering whether it is possible to reduce the number of pages listed in the sitemap. Thanks a lot in advance.
- No one ?
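One way to shrink the sitemap might be to list only selected namespaces and to cap the number of rows in the query. The following is a sketch only, not part of any released version of the script: fnBuildPageQuery() is a hypothetical helper, the namespace numbers and the limit are example values, and the table and column names follow the 1.5.x scripts on this page.

```php
<?php
# Hypothetical helper: build the page query for a chosen set of
# namespaces, capped at $nLimit rows.
function fnBuildPageQuery($sPrefix, array $aNamespaces, $nLimit) {
    # intval() keeps anything but plain integers out of the SQL text.
    $sNamespaces = implode(', ', array_map('intval', $aNamespaces));
    return "SELECT page_namespace, page_title, page_touched " .
           "FROM " . $sPrefix . "page " .
           "WHERE (page_is_redirect = 0 AND page_namespace IN (" . $sNamespaces . ")) " .
           "ORDER BY page_touched DESC " .
           "LIMIT " . intval($nLimit);
}

# Example: main and Category namespaces only, the 1000 most recently touched pages.
echo fnBuildPageQuery('', array(0, 14), 1000), "\n";
```

The resulting string would take the place of $query_rsPages in the script; because the query orders by page_touched, the cap keeps the most recently touched pages.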
