ThinkLemon

MediaWiki:Google Sitemaps

!!! The wiki no longer exists. This page remains for archival purposes only.

MediaWiki currently has no official way to generate [[Google Sitemaps|Sitemaps]] automatically, so the ThinkLemon wiki uses a custom script to build a Sitemap file from the wiki database.

== What it does ==

This [[PHP]] script reads the database settings from the MediaWiki installation and fetches the page titles, namespaces and timestamps from the database. It then outputs the page collection in the required [[XML]] format.

See [http://www.thinklemon.com/wiki/sitemap.xml.php sitemap.xml.php] for the results of this wiki.
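As a rough illustration of the output step, the sketch below renders one Sitemap entry for one page. It is not taken from the script further down: the function name <code>renderSitemapEntry</code> and the example.com URL are hypothetical, and the element names assume the standard Sitemap format (<code>url</code>, <code>loc</code>, <code>lastmod</code>, <code>changefreq</code>, <code>priority</code>).

```php
<?php
// Hypothetical helper (not part of the original script): builds the XML
// for a single page entry in a Sitemap.
function renderSitemapEntry($loc, $lastmod, $changefreq, $priority)
{
    // Escape characters that are special in XML (&, <, >, quotes).
    $safeLoc = htmlspecialchars($loc, ENT_QUOTES, 'UTF-8');

    // Format the priority with one decimal and a "." separator,
    // independent of the server locale.
    $safePriority = number_format($priority, 1, '.', '');

    return "<url>\n"
        . "  <loc>" . $safeLoc . "</loc>\n"
        . "  <lastmod>" . $lastmod . "</lastmod>\n"
        . "  <changefreq>" . $changefreq . "</changefreq>\n"
        . "  <priority>" . $safePriority . "</priority>\n"
        . "</url>\n";
}

// Example call with made-up values.
echo renderSitemapEntry(
    'http://www.example.com/wiki/index.php?title=Main_Page&action=view',
    '2005-11-25T14:30:00+00:00',
    'weekly',
    1.0
);
```

The ampersand in the example URL comes out as <code>&amp;amp;</code> in the XML source, which is why the real script also passes every URL through an encoding function.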

'''Update''': This version (v0.3) has been updated to work with MediaWiki 1.5.x installations. It does NOT work with 1.3.x and 1.4.x versions of MediaWiki, nor is it guaranteed to work with future versions.

See the [http://www.thinklemon.com/wiki/index.php?title=MediaWiki:Google_Sitemaps&oldid=1252 November 25, 2005 version of this page] if you are running a 1.4.x version of MediaWiki.

== Instructions ==

# Copy-paste the source code from below to an empty text file.

# Save the file as sitemap.xml.php.

# Upload the script to the directory containing your MediaWiki installation.

# Test the script by calling it from a browser.

Once you are certain the output is correct, submit the sitemap to [[Google Sitemaps]].

Note that Google is picky about the placement of the script: it has to be in the same path as the pages to be indexed.
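The placement rule can be sketched as a simple prefix check: a Sitemap only covers URLs that start with the directory the Sitemap file itself lives in. The function name <code>sitemapCovers</code> and the example.com URLs are hypothetical, and the check is a simplification (it ignores query strings in the Sitemap URL, for instance), so treat it as an illustration rather than Google's actual validation.

```php
<?php
// Hypothetical check (not part of the original script): does a Sitemap at
// $sitemapUrl cover the page at $pageUrl?
function sitemapCovers($sitemapUrl, $pageUrl)
{
    // Everything up to and including the last "/" of the Sitemap URL,
    // i.e. the directory the Sitemap file sits in.
    $base = substr($sitemapUrl, 0, strrpos($sitemapUrl, '/') + 1);

    // The page is covered when its URL starts with that directory.
    return strncmp($pageUrl, $base, strlen($base)) === 0;
}

$sitemap = 'http://www.example.com/wiki/sitemap.xml.php';

// Page inside the wiki directory: covered.
var_dump(sitemapCovers($sitemap, 'http://www.example.com/wiki/index.php?title=Main_Page'));

// Page outside the wiki directory: not covered.
var_dump(sitemapCovers($sitemap, 'http://www.example.com/blog/'));
```

This is why the instructions place the script in the MediaWiki directory itself: a script uploaded to a subdirectory would not cover the wiki's pages.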

Please put Q&A in the [[MediaWiki talk:Google Sitemaps|discussion page]].

== Source code ==

Version 0.3

<?php
# -----------------------------------------------------
# MediaWiki - Google Sitemaps generation. v0.3
#
# A page that'll generate valid Google Sitemaps code
# from the current MediaWiki installation.
# v0.3: Small changes to fix other situations
# v0.2: Updated for MediaWiki 1.5.x
# v0.1: First attempt for MediaWiki 1.4.x
#
# See http://www.thinklemon.com/wiki/MediaWiki:Google_Sitemaps
#
# TODO: Further refinements like caching...
# -----------------------------------------------------

# -----------------------------------------------------
# Includes
# Need to include/require some MediaWiki files,
# especially LocalSettings.php for definitions.
# -----------------------------------------------------

define( 'MEDIAWIKI', true );

require_once( './LocalSettings.php' );
require_once( 'includes/GlobalFunctions.php' );

# -----------------------------------------------------
# Send XML header, tell agents this is XML.
# -----------------------------------------------------

header("Content-Type: application/xml; charset=UTF-8");

# -----------------------------------------------------
# Send xml-prolog
# -----------------------------------------------------

echo '<'.'?xml version="1.0" encoding="utf-8" ?'.">\n"; 

# -----------------------------------------------------
# Start connection
# -----------------------------------------------------

$connWikiDB = mysql_pconnect($wgDBserver, $wgDBuser, $wgDBpassword)
	or trigger_error(mysql_error(),E_USER_ERROR);
mysql_select_db($wgDBname, $connWikiDB);

# -----------------------------------------------------
# Build query
# Skipping redirects and MediaWiki namespace
# -----------------------------------------------------

$query_rsPages = "SELECT page_namespace, page_title, page_touched ".
	"FROM ".$wgDBprefix."page ".
	"WHERE (page_is_redirect = 0 AND page_namespace NOT IN (8, 9)) ".
	"ORDER BY page_touched DESC";

# -----------------------------------------------------
# Fetch the data from the DB
# -----------------------------------------------------

$rsPages = mysql_query($query_rsPages, $connWikiDB) or die(mysql_error());
# Fetch the array of pages
$row_rsPages = mysql_fetch_assoc($rsPages);
$totalRows_rsPages = mysql_num_rows($rsPages);

# -----------------------------------------------------
# Start output
# -----------------------------------------------------

# The XML element names below (urlset, url, loc, lastmod, changefreq,
# priority) were lost in the archived copy and are restored from the
# Sitemap format. The 0.84 schema namespace is an assumption.
?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<?php
// Find Project Namespace
if($wgMetaNamespace === FALSE)
	$wgMetaNamespace = str_replace( ' ', '_', $wgSitename );

do {

	# -----------------------------------------------------
	# 1. Determine the pagetitle using namespace:page_name
	# 2. Set priority of the namespace
	# -----------------------------------------------------

	$nPriority = 0;
	switch ($row_rsPages['page_namespace']) {
		case 1:
			$sPageName = "Talk:".$row_rsPages['page_title'];
			$nPriority = 0.9;
			break;
		case 2:
			$sPageName = "User:".$row_rsPages['page_title'];
			$nPriority = 0.7;
			break;
		case 3:
			$sPageName = "User_talk:".$row_rsPages['page_title'];
			$nPriority = 0.6;
			break;
		case 4:
			$sPageName = $wgMetaNamespace.":".$row_rsPages['page_title'];
			$nPriority = 0.9;
			break;
		case 5:
			$sPageName = $wgMetaNamespace."_talk:".$row_rsPages['page_title'];
			$nPriority = 0.8;
			break;
		case 6:
			$sPageName = "Image:".$row_rsPages['page_title'];
			$nPriority = 0.5;
			break;
		case 7:
			$sPageName = "Image_talk:".$row_rsPages['page_title'];
			$nPriority = 0.4;
			break;
		case 8:
			$sPageName = "MediaWiki:".$row_rsPages['page_title'];
			$nPriority = 0.4;
			break;
		case 9:
			$sPageName = "MediaWiki_talk:".$row_rsPages['page_title'];
			$nPriority = 0.3;
			break;
		case 10:
			$sPageName = "Template:".$row_rsPages['page_title'];
			$nPriority = 0.3;
			break;
		case 11:
			$sPageName = "Template_talk:".$row_rsPages['page_title'];
			$nPriority = 0.2;
			break;
		case 12:
			$sPageName = "Help:".$row_rsPages['page_title'];
			$nPriority = 0.1;
			break;
		case 13:
			$sPageName = "Help_talk:".$row_rsPages['page_title'];
			$nPriority = 0.1;
			break;
		case 14:
			$sPageName = "Category:".$row_rsPages['page_title'];
			$nPriority = 0.6;
			break;
		case 15:
			$sPageName = "Category_talk:".$row_rsPages['page_title'];
			$nPriority = 0.5;
			break;
		default:
			$sPageName = $row_rsPages['page_title'];
			$nPriority = 1;
	}

	# -----------------------------------------------------
	# Start output
	# str_replace() substitutes the literal "$1" placeholder
	# in $wgArticlePath (the original used eregi_replace()).
	# -----------------------------------------------------
?>
	<url>
		<loc><?php echo fnXmlEncode( "http://" . $wgServerName . str_replace( '$1', $sPageName, $wgArticlePath ) ) ?></loc>
		<lastmod><?php echo fnTimestampToIso($row_rsPages['page_touched']); ?></lastmod>
		<changefreq>weekly</changefreq>
		<priority><?php echo $nPriority ?></priority>
	</url>
<?php } while ($row_rsPages = mysql_fetch_assoc($rsPages)); ?>
</urlset>
<?php

# -----------------------------------------------------
# Clear Connection
# -----------------------------------------------------

mysql_free_result($rsPages);

# -----------------------------------------------------
# General functions
# -----------------------------------------------------

// Convert timestamp to ISO format
function fnTimestampToIso($ts) {
	# $ts is a MediaWiki Timestamp (TS_MW)
	# ISO-standard timestamp (YYYY-MM-DDTHH:MM:SS+00:00)
	return gmdate( 'Y-m-d\TH:i:s\+00:00', wfTimestamp( TS_UNIX, $ts ) );
}

// Convert string to XML safe encoding
function fnXmlEncode( $string ) {
	# Normalise line endings and strip control characters
	# that are not allowed in XML before escaping.
	$string = str_replace( "\r\n", "\n", $string );
	$string = preg_replace( '/[\x00-\x08\x0b\x0c\x0e-\x1f]/', '', $string );
	return htmlspecialchars( $string );
}
?>
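The timestamp conversion in the script relies on MediaWiki's own <code>wfTimestamp()</code>. For readers who want to see what that step amounts to, here is a minimal standalone sketch that does the same conversion with plain string handling. It assumes the 14-digit <code>YYYYMMDDHHMMSS</code> format in UTC that the script's comments describe as TS_MW; the function name <code>mwTimestampToIso</code> is hypothetical.

```php
<?php
// Hypothetical standalone variant (not part of the original script):
// converts a 14-digit MediaWiki timestamp to an ISO 8601 string.
function mwTimestampToIso($ts)
{
    // Split YYYYMMDDHHMMSS into its six numeric parts;
    // return false for anything that does not match.
    if (!preg_match('/^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/', $ts, $m)) {
        return false;
    }

    // The stored value is assumed to be UTC, hence the fixed +00:00 offset.
    return $m[1] . '-' . $m[2] . '-' . $m[3]
        . 'T' . $m[4] . ':' . $m[5] . ':' . $m[6] . '+00:00';
}

echo mwTimestampToIso('20051125143000'), "\n"; // 2005-11-25T14:30:00+00:00
```

Unlike <code>wfTimestamp()</code>, this sketch does not validate that the date actually exists; it only reshapes the digits.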

== Version History ==

* V0.3: Script updated for other install situations.

:* Altered SQL Statement to exclude namespaces instead of include.

:* Added Template-namespaces and per-namespace priority.

:* Added XML encoding of pagetitles and correct path.

* V0.2: Script updated for MediaWiki 1.5.x as the database schema changed. The ‘cur’ table has moved to the ‘page’ table.

* V0.1: first attempt at Google Sitemaps for MediaWiki 1.4.x.

== Disclaimer ==

This script is provided as-is. It is not guaranteed to work on web servers other than the one hosting ThinkLemon.com.

ThinkLemon.com cannot be held liable for loss of data, crashing servers, loss of business, or loss of whatever. Take the script, TEST it, change it to make it work for you, and TEST it again.

[[Category:MediaWiki|Google Sitemaps]] [[Category:Techniques]]

