Yahoo Sitemaps
From ThinkLemon
This is a PHP script that prints a plain-text list of the pages in your wiki, one URL per line. The list can then be submitted to Yahoo! as a primitive counterpart to a Google sitemap. Elliot (psconclave.com) adapted the script from the Google Sitemap generator. It works with MediaWiki 1.5.x and 1.6.x.
Instructions
- Copy-paste the source code from below to an empty text file.
- Save the file as yahoo_sitemap.php.
- Upload the script to your root directory, e.g. http://www.mysite.com/yahoo_sitemap.php.
- Edit the .htaccess file in your root directory (create it if it doesn't exist yet) and add these lines to it:
RewriteEngine On
RewriteRule ^urllist\.txt$ /yahoo_sitemap.php [L]
- Test your Yahoo sitemap generator by going to http://www.mysite.com/urllist.txt (example: http://www.psconclave.com/urllist.txt).
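The rewrite pattern can also be checked outside Apache. The sketch below is only an approximation: it uses PHP's preg_match as a stand-in for mod_rewrite's pattern matching, and it assumes the rule lives in the root .htaccess, where the path is matched without its leading slash. The helper name matchesUrlList is hypothetical. It shows why escaping the dot gives the stricter pattern.

```php
<?php
// Hypothetical helper (not part of the script or of Apache):
// returns true when $path matches the rewrite pattern $pattern.
function matchesUrlList($path, $pattern)
{
    // '#' is used as the delimiter so the pattern can be pasted unchanged.
    return preg_match('#' . $pattern . '#', $path) === 1;
}

// Unescaped dot: "." matches any single character, so a stray path matches too.
var_dump(matchesUrlList('urllist.txt', '^urllist.txt$'));   // bool(true)
var_dump(matchesUrlList('urllistXtxt', '^urllist.txt$'));   // bool(true)

// Escaped dot: only the literal file name matches.
var_dump(matchesUrlList('urllist.txt', '^urllist\.txt$'));  // bool(true)
var_dump(matchesUrlList('urllistXtxt', '^urllist\.txt$'));  // bool(false)
```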
Source code
<?php
# MediaWiki Yahoo sitemap generator
# Edited from the Google Sitemap Generator
# Edited by the brilliant Elliot Goodrich <rebornfromtheashes@gmail.com>
# An .htaccess rewrite must be used because Yahoo only accepts urllist.txt
#
# # Yahoo rewrite
# RewriteRule ^urllist\.txt$ /yahoo_sitemap.php [L]
#
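define( 'MEDIAWIKI', true );
# The paths below assume the wiki is installed in ./w/ relative to this script;
# adjust them if your wiki lives in a different directory.
require_once( './w/LocalSettings.php' );
require_once( './w/includes/GlobalFunctions.php' );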
# -----------------------------------------------------
# Send text header, tell agents this is text file.
# -----------------------------------------------------
header("Content-Type: text/plain; charset=UTF-8");
# -----------------------------------------------------
# Start connection
# -----------------------------------------------------
$connWikiDB = mysql_pconnect($wgDBserver, $wgDBuser, $wgDBpassword)
    or trigger_error(mysql_error(), E_USER_ERROR);
mysql_select_db($wgDBname, $connWikiDB)
    or trigger_error(mysql_error(), E_USER_ERROR);
# -----------------------------------------------------
# Build query
# Skipping redirects and MediaWiki namespace
# -----------------------------------------------------
$query_rsPages = "SELECT page_namespace, page_title, page_touched ".
    "FROM ".$wgDBprefix."page ".
    "WHERE (page_is_redirect = 0 AND page_namespace NOT IN (8, 9)) ".
    "ORDER BY page_touched DESC";
# -----------------------------------------------------
# Fetch the data from the DB
# -----------------------------------------------------
$rsPages = mysql_query($query_rsPages, $connWikiDB) or die(mysql_error());
# Find the project namespace
if ($wgMetaNamespace === FALSE) {
    $wgMetaNamespace = str_replace( ' ', '_', $wgSitename );
}
# Loop over the result rows; a while loop prints nothing if the query returns no pages
while ($row_rsPages = mysql_fetch_assoc($rsPages)) {
    # -----------------------------------------------------
    # 1. Determine the page title using namespace:page_name
    # -----------------------------------------------------
    switch ($row_rsPages['page_namespace']) {
        case 1:
            $sPageName = "Talk:".$row_rsPages['page_title'];
            break;
        case 2:
            $sPageName = "User:".$row_rsPages['page_title'];
            break;
        case 3:
            $sPageName = "User_talk:".$row_rsPages['page_title'];
            break;
        case 4:
            $sPageName = $wgMetaNamespace.":".$row_rsPages['page_title'];
            break;
        case 5:
            $sPageName = $wgMetaNamespace."_talk:".$row_rsPages['page_title'];
            break;
        case 6:
            $sPageName = "Image:".$row_rsPages['page_title'];
            break;
        case 7:
            $sPageName = "Image_talk:".$row_rsPages['page_title'];
            break;
        case 8:
            $sPageName = "MediaWiki:".$row_rsPages['page_title'];
            break;
        case 9:
            $sPageName = "MediaWiki_talk:".$row_rsPages['page_title'];
            break;
        case 10:
            $sPageName = "Template:".$row_rsPages['page_title'];
            break;
        case 11:
            $sPageName = "Template_talk:".$row_rsPages['page_title'];
            break;
        case 12:
            $sPageName = "Help:".$row_rsPages['page_title'];
            break;
        case 13:
            $sPageName = "Help_talk:".$row_rsPages['page_title'];
            break;
        case 14:
            $sPageName = "Category:".$row_rsPages['page_title'];
            break;
        case 15:
            $sPageName = "Category_talk:".$row_rsPages['page_title'];
            break;
        default:
            $sPageName = $row_rsPages['page_title'];
    }
    # -----------------------------------------------------
    # Start output
    # -----------------------------------------------------
    # str_replace substitutes the literal "$1" placeholder in $wgArticlePath
    print "http://" . $wgServerName . str_replace('$1', $sPageName, $wgArticlePath) . "\n";
}
# -----------------------------------------------------
# Clear Connection
# -----------------------------------------------------
mysql_free_result($rsPages);
?>
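To see how each URL is assembled without a database, the sketch below rebuilds the same logic in a self-contained form. It is not the script's own code. The function names namespacePrefixes and buildPageUrl are hypothetical, as are the sample server name, the article path '/wiki/$1' and the project namespace 'MyWiki', which stand in for $wgServerName, $wgArticlePath and $wgMetaNamespace. A lookup array replaces the switch statement, and the MediaWiki namespaces (8 and 9) are left out because the query already skips them.

```php
<?php
// Same prefixes as the switch statement, keyed by namespace number.
// Namespace 0 (main) and any unknown number get no prefix.
function namespacePrefixes($metaNamespace)
{
    return array(
        1  => 'Talk',
        2  => 'User',
        3  => 'User_talk',
        4  => $metaNamespace,
        5  => $metaNamespace . '_talk',
        6  => 'Image',
        7  => 'Image_talk',
        10 => 'Template',
        11 => 'Template_talk',
        12 => 'Help',
        13 => 'Help_talk',
        14 => 'Category',
        15 => 'Category_talk',
    );
}

// Builds one line of urllist.txt from a namespace number and a page title.
function buildPageUrl($serverName, $articlePath, $metaNamespace, $namespace, $title)
{
    $prefixes = namespacePrefixes($metaNamespace);
    $ns = (int) $namespace; // database rows deliver the number as a string
    $pageName = isset($prefixes[$ns]) ? $prefixes[$ns] . ':' . $title : $title;
    // Literal substitution of the "$1" placeholder in the article path.
    return 'http://' . $serverName . str_replace('$1', $pageName, $articlePath);
}

// Hypothetical sample values, for illustration only.
echo buildPageUrl('www.mysite.com', '/wiki/$1', 'MyWiki', 0, 'Main_Page'), "\n";
// http://www.mysite.com/wiki/Main_Page
echo buildPageUrl('www.mysite.com', '/wiki/$1', 'MyWiki', '14', 'Sitemaps'), "\n";
// http://www.mysite.com/wiki/Category:Sitemaps
```

A lookup array keeps the prefix table in one place, which may be easier to extend if your wiki defines custom namespaces.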
Changes to the Google Sitemap Generator
- Removed the priority assignment to increase speed
- Removed all output except the URL
- Changed the Content-Type header to text/plain
