User-agent: *
Disallow: /*?*
Disallow: /*?
Disallow: /?s=
Disallow: /tag/
Disallow: /rss/
Disallow: /feed/
Disallow: /date/
Disallow: /search/
Disallow: /links-page/
Disallow: /archive/
Disallow: /archives/
Disallow: /category/
Disallow: /category/*/*
Disallow: /trackback/
Disallow: */trackback
Disallow: /contact-form/
Disallow: /page/
Disallow: /pages/
Disallow: */comments
Disallow: /comments/
Disallow: /comments/feed/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-content/cache/
Allow: /wp-content/uploads/
Disallow: /cgi-bin/

# Google Googlebot
User-agent: Googlebot
Disallow: /feed/$
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
Disallow: /*/*/feed/$
Disallow: /*/*/feed/rss/$
Disallow: /*/*/trackback/$
Disallow: /*/*/*/feed/$
Disallow: /*/*/*/feed/rss/$
Disallow: /*/*/*/trackback/$
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.wmv$
Disallow: /*.avi$
Disallow: /*.cgi$
Disallow: /*.txt$

# Google Image
User-agent: Googlebot-Image
Allow: /*

User-agent: Mediapartners-Google
Allow: /

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-Mobile
Allow: /

User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /

Sitemap:


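Adsbot-Google: the crawler Google uses specifically to evaluate the landing page quality score of AdWords advertisers' landing pages.
Googlebot: Google's crawler for its web index (Google Web Index) and news index (Google News).
Googlebot-Image: the crawler for Google's image index (Google Image Index).
Googlebot-Mobile: the crawler for Google's mobile index (Google Mobile Index).
Mediapartners-Google: the dedicated crawler Google uses to fetch sites that run ads and determine the relevance of AdSense content (Google AdSense Content).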

To check your robots.txt settings, you can use the robots analysis tool in Google Webmaster Tools; for details on how to use it, see Google's robots documentation.
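
For a quick local look before turning to an online tool, the rules declared for a single crawler can be listed from a saved copy of the file. The sketch below is only a rough aid, not a full robots.txt parser: the function name rules_for_agent is hypothetical, it assumes one User-agent line per group, and it does not evaluate wildcards or decide which rule wins.

```shell
# List the Allow/Disallow lines that follow "User-agent: <agent>" in a robots.txt file.
# Hypothetical helper for illustration; assumes one User-agent line per group.
rules_for_agent() {
    agent=$1
    file=$2
    awk -v agent="$agent" '
        # A new group starts: remember whether it names the agent we want.
        tolower($1) == "user-agent:" { active = (tolower($2) == tolower(agent)); next }
        # Inside a matching group, print each rule line.
        active && (tolower($1) == "disallow:" || tolower($1) == "allow:") { print $1, $2 }
    ' "$file"
}

# Usage (assuming the file was saved as ./robots.txt):
#   rules_for_agent Googlebot ./robots.txt
```

Comment lines and blank lines are skipped because their first field is neither a User-agent nor a rule keyword.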

Note that robots.txt only works for well-behaved spiders. For rogue spiders (see my other blog post about the soso spider crawler), it has essentially no effect.

jQuery Tools: The Content-Display Web UI Library We Have Been Waiting For - Powered by COMSHARP CMS
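jQuery Tools is an excellent Web UI library that includes tabs, collapsible containers, tooltips, overlays, scrollable containers and more, and it can give your site an unusually desktop-like feel. The toolkit's main job is displaying content, which is what the vast majority of sites need most. This remarkable UI library is only 5.59K in size and is dual-licensed under MIT and GPL.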

jQuery Tools. The missing UI library for the Web

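Many other Web UI libraries are behavior-oriented (drag and drop, scrolling, table sorting, draggable windows and so on), which makes them better suited to rich Web applications such as email clients, task managers and photo organizers. jQuery Tools, by contrast, focuses on presenting content, so it is a better fit for purely content-driven sites.

jQuery Tools is also easy to use: a few lines of calling code are enough, and the official site offers plenty of demos and sample code for reference. The main UI tools included in the library are introduced below.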

jQuery Tools / Overlay

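An overlay can float above any HTML object. In modern Web design, overlays are a very effective UI concept and can be used to:

  1. Highlight your products.
  2. Display warnings.
  3. Prompt the user for input.
  4. Browse an image gallery in lightbox style.

jQuery Tools / Overlay handles all of these scenarios and effects with ease.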

jQuery Tools / Tooltip

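Tooltips are the most practical tool on the Web, yet the Web's default tooltip is too crude. jQuery Tools / Tooltip displays tooltip content with a very attractive visual effect.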

Tooltips in action

jQuery Tools / Tabs

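The most popular UI widget on the Web is surely the tab container. Without tabs, many of our pages would be a mess, and every user is familiar with this user-friendly widget. jQuery Tools / Tabs is very easy to use, and it even lets visitors move between tabs with the browser's back and forward buttons.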

Tabs in action

jQuery Tools / Scrollable

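jQuery Tools / Scrollable implements customized, localized scrolling within a page. In modern Web design this is close to the most popular technique of all, and it can be used for:

  • Product catalogs
  • News tickers
  • Custom select boxes in forms
  • Browsing image galleries
  • Video playlists
  • All kinds of site navigation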

Scrollable in action

jQuery Tools / Flashembed

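JavaScript keeps getting faster these days, with every browser competing on its JavaScript engine, and we will see more and more JavaScript-based Web widgets appear. For now, though, Flash still has its uses, such as playing video.

jQuery Tools / Flashembed loads Flash objects into a Web page. Similar tools abound, but Flashembed is definitely the easiest of them to use.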

15 Excellent Third-Party Web Technologies to Integrate - Powered by COMSHARP CMS
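In Web development and design, doing everything yourself is not a virtue. We are often told not to reinvent the wheel: building everything in-house is not only a huge burden but also introduces more security risks. You are, after all, not an expert in every technology, and the industry offers many excellent third-party technologies that can be borrowed or integrated. We have to admit that these are far better than anything we could design ourselves. This article introduces 15 technologies that can be integrated into your Web site.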

1. RSS feeds


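Designing and managing RSS yourself on your own site is a heavy burden, especially when the number of subscribers surges. Aggregator sites such as AllTop and Technorati also request your RSS feed automatically, and sooner or later your server will buckle under the load. The three third-party solutions below can take this burden off your hands.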


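FeedBurner is a full-featured RSS management service whose many tools help you manage and analyze your RSS feed. After Google acquired it, the service was unstable for a short while, but it has since fully recovered.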

Website Features That You Can Easily Offload



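Feedity is well suited to non-CMS sites, such as purely static HTML sites. You simply enter the address of the site that needs an RSS feed, and Feedity monitors it; once it detects an update, it pushes an RSS update to subscribers. Besides automatic monitoring, you can also manually control which HTML pages the tool selects.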



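Page2RSS is a simple Web service that monitors a given Web page for updates. You can integrate it into your home page so that visitors learn about the changes you make; there is a live example for reference. Compared with a professional RSS feed this service may seem a little amateurish, but it works well as a stopgap.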


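2. Site search

Site search requires your server to run a large number of database queries, which is a considerable load. Many well-known search engines offer third-party search APIs that not only lighten the load on your server but also use search algorithms that are clearly better than your own.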

Google AJAX Search API

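This API lets Web developers use Google's data to build mashup-style search applications. Google also provides a wizard that walks you step by step through generating the integration code to place on your site.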


Yahoo! Search BOSS

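This API is similar to Google's search API, but its results are easier to integrate into your site's presentation. In addition, unlike Google's results, Yahoo BOSS search results contain no ads. There is an example of this API in use.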


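3. Hosted JavaScript libraries

Hosting JavaScript libraries such as jQuery, MooTools or Prototype on your own site adds a management burden for your server, such as version management. Moreover, because many sites use third-party hosted copies of these libraries, visitors' browser caches often already contain them, so serving your own copy needlessly increases your pages' response time.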

Google AJAX Libraries API

The Google AJAX Libraries API hosts several well-known JavaScript libraries on Google's CDN, which means visitors fetch them from a nearby server and response times improve significantly.


4. Web forms



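Wufoo makes it easy to design and manage third-party forms. Pricing ranges from free to several hundred dollars depending on usage. The free plan lets you integrate 3 forms with no more than 10 fields each, which is enough for most sites.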



This is a powerful, full-featured Web form application. The free plan lets you integrate 3 forms, but each form can be used only 10 times per month.








5. Polls and surveys

The third-party polling and survey Web services below provide very professional polling and survey features.


Vizu is a free polling Web service that integrates easily with well-known CMS and blogging systems such as WordPress, Blogger and Typepad.



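Possibly the best polling and survey Web service provider in the world, it lets you run polls and surveys on your site. Its design interface is the best Web interface I have used: you simply drag and drop buttons or objects. Pricing varies with usage, but for most sites the free plan is enough.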


This is a very popular Web survey service with an easy-to-use admin interface and many options for designing your questionnaire. The well-known site Digg uses it.


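6. Captcha

Captcha is very effective at blocking spam submitted through Web forms, but building it yourself is not easy: you need a suitable algorithm, and your server has to manage the Captcha images. The following Captcha services can be integrated into your site.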


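reCaptcha is a free service with an additional mission: with the help of users worldwide, it helps recognize old books and newspapers that OCR technology has trouble reading. reCaptcha takes a few words from those scanned documents, so users who solve a Captcha are also helping to digitize old publications.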


This service is free, even for commercial use. As long as your server supports PHP, ASP, Perl, Python, JSP or Ruby on Rails, you can integrate it into your site.


This one takes only three steps to add a Captcha to your site.




Custom Search returns homepage - Custom Search Help


Level 1
I have 2 sites using the custom search -- one is working fine (; the other ( had been working, but failed at some point. The Search that is not working (on redirects to the site's homepage when the submit button is hit. I did add Google Analytics to this site (but not the other) a few weeks ago. Both sites are registered under one account (nrhrehab)

All replies

Level 3
When the browser attempts to open with parameters specified in the URL, the web server issues a (type 302) redirect.  I think you may need to ask your web server administrator or web host tech support for help to correct the server configuration to prevent the redirect.

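1. New York, USA

  This hardly comes as a surprise. New York always tops city rankings of every kind and has all but become the model of an international metropolis. The ranking was first compiled in April 2009, when this long-established global city was staggering under the financial crisis and some Asian cities came close to stealing its crown. Yet after assessment on every indicator (architecture, art, culture, business, dining, quality of life and international image), it is still hard to find fault with New York.

2. London, UK


3. Paris, France


4. Berlin, Germany


5. Barcelona, Spain




5. Tokyo, Japan


8. Istanbul, Turkey


9. Rome, Italy


9. Sydney, Australia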


Four approaches to getting around argument length limitations on the command line.

At some point during your career as a Linux user, you may have come across the following error:

[user@localhost directory]$ mv * ../directory2
bash: /bin/mv: Argument list too long

The "Argument list too long" error, which occurs whenever a user feeds too many arguments to a single command, leaves users to fend for themselves, since all regular system commands (ls *, cp *, rm *, etc.) are subject to the same limitation. This article identifies four different workarounds for this problem, each using a different degree of complexity to solve different potential problems. The solutions are presented below in order of simplicity, following the logical principle of Occam's Razor: if you have two equally likely solutions to a problem, pick the simplest.
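
Before choosing a workaround, it can help to see what the limit actually is on the machine at hand. A minimal sketch, assuming a system that provides the standard getconf utility; the variable name arg_max is only illustrative:

```shell
# Ask the system for the maximum combined size, in bytes, of the
# arguments and environment that can be passed to a new program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is ${arg_max} bytes"
```

The value varies between systems and kernel versions, so no particular number should be expected here.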

Method #1: Manually split the command line arguments into smaller bunches.

Example 1

[user@localhost directory]$ mv [a-l]* ../directory2
[user@localhost directory]$ mv [m-z]* ../directory2

This method is the most basic of the four: it simply involves resubmitting the original command with fewer arguments, in the hope that this will solve the problem. Although this method may work as a quick fix, it is far from being the ideal solution. It works best if you have a list of files whose names are evenly distributed across the alphabet. This allows you to establish consistent divisions, making the chore slightly easier to complete. However, this method is a poor choice for handling very large quantities of files, since it involves resubmitting many commands and a good deal of guesswork.

Method #2: Use the find command.

Example 2

[user@localhost directory]$ find "$directory" -type f -name '*' -exec mv {} "$directory2"/. \;

Method #2 involves filtering the list of files through the find command, instructing it to properly handle each file based on a specified set of command-line parameters. Due to the built-in flexibility of the find command, this workaround is easy to use, successful and quite popular. It allows you to selectively work with subsets of files based on their name patterns, date stamps, permissions and even inode numbers. In addition, and perhaps most importantly, you can complete the entire task with a single command.

The main drawback to this method is the length of time required to complete the process. Unlike Method #1, where groups of files get processed as a unit, this procedure actually inspects the individual properties of each file before performing the designated operation. The overhead involved can be quite significant, and moving lots of files individually may take a long time.
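
If the per-file overhead matters, one possible variation (not part of the four methods in this article) is the "{} +" form of -exec, which hands find's results to the command in batches that fit within the argument limit. Because "{} +" has to come last while mv expects the destination last, the sketch below wraps mv in a small sh -c call. The names batch_mv, src and dest are hypothetical:

```shell
# Move every regular file found under src into dest, many files per mv call.
# Hypothetical helper; like Example 2, it descends into subdirectories.
batch_mv() {
    src=$1
    dest=$2
    # "$dest" becomes $0 of the inner shell; the batched file names become "$@".
    find "$src" -type f -exec sh -c 'mv -- "$@" "$0"' "$dest" {} +
}

# Usage:
#   batch_mv directory ../directory2
```

As with any bulk move, this is worth trying on a scratch directory first, since files with the same name in different subdirectories would overwrite one another in the destination.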

Method #3: Create a function. *

Example 3a

function large_mv () {
    # Read one file name per line from standard input and move each file.
    while IFS= read -r line1; do
        mv "directory/$line1" ../directory2
    done
}
ls -1 directory/ | large_mv

Although writing a shell function does involve a certain level of complexity, I find that this method allows for a greater degree of flexibility and control than either Method #1 or #2. The short function given in Example 3a simply mimics the functionality of the find command given in Example 2: it deals with each file individually, processing them one by one. However, by writing a function you also gain the ability to perform an unlimited number of actions per file still using a single command:

Example 3b

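function larger_mv () {
    # For each file name read from standard input: record its checksum,
    # record its long listing, then move it.
    while IFS= read -r line1; do
        md5sum "directory/$line1" >> ~/md5sums
        ls -l "directory/$line1" >> ~/backup_list
        mv "directory/$line1" ../directory2
    done
}
ls -1 directory/ | larger_mv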

Example 3b demonstrates how you easily can get an md5sum and a backup listing of each file before moving it.

Unfortunately, since this method also requires that each file be dealt with individually, it will involve a delay similar to that of Method #2. From experience I have found that Method #2 is a little faster than the function given in Example 3a, so Method #3 should be used only in cases where the extra functionality is required.
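
A variation on the same idea, offered only as a sketch, is to let the shell expand the wildcard inside a for loop instead of piping ls into the function. The expansion happens within the shell itself and is never passed to a new program, so the argument limit does not apply, and file names containing spaces are handled without extra care. The names large_mv_glob, src and dest are hypothetical:

```shell
# Move each entry of src into dest, one mv call per entry.
# Hypothetical helper; like ls -1, the * pattern skips hidden files.
large_mv_glob() {
    src=$1
    dest=$2
    for path in "$src"/*; do
        # When src is empty the pattern stays unexpanded; skip it.
        [ -e "$path" ] || continue
        mv -- "$path" "$dest"/
    done
}

# Usage:
#   large_mv_glob directory ../directory2
```

Extra per-file steps, such as the checksum and listing from Example 3b, can be added inside the loop in the same way.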

Method #4: Recompile the Linux kernel. **

This last method requires a word of caution, as it is by far the most aggressive solution to the problem. It is presented here for the sake of thoroughness, since it is a valid method of getting around the problem. However, please be advised that due to the advanced nature of the solution, only experienced Linux users should attempt this hack. In addition, make sure to thoroughly test the final result in your environment before implementing it permanently.

One of the advantages of using an open-source kernel is that you are able to examine exactly what it is configured to do and modify its parameters to suit the individual needs of your system. Method #4 involves manually increasing the number of pages that are allocated within the kernel for command-line arguments. If you look at the include/linux/binfmts.h file, you will find the following near the top:

/*
 * MAX_ARG_PAGES defines the number of pages allocated for arguments
 * and envelope for the new program. 32 should suffice, this gives
 * a maximum env+arg of 128kB w/4KB pages!
 */
#define MAX_ARG_PAGES 32

In order to increase the amount of memory dedicated to the command-line arguments, you simply need to provide the MAX_ARG_PAGES value with a higher number. Once this edit is saved, simply recompile, install and reboot into the new kernel as you would do normally.

On my own test system I managed to solve all my problems by raising this value to 64. After extensive testing, I have not experienced a single problem since the switch. This is entirely expected since even with MAX_ARG_PAGES set to 64, the longest possible command line I could produce would only occupy 256KB of system memory--not very much by today's system hardware standards.
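
The arithmetic behind these figures is simply the number of pages multiplied by the page size. A small sketch, assuming the 4KB pages mentioned in the kernel comment; the function name max_arg_bytes is hypothetical:

```shell
# Compute the space available for arguments plus environment, in bytes.
max_arg_bytes() {
    pages=$1
    page_size=$2
    echo $(( pages * page_size ))
}

max_arg_bytes 32 4096   # prints 131072 (128KB, the default)
max_arg_bytes 64 4096   # prints 262144 (256KB, after raising the value to 64)
```

On a system with a different page size, the second argument can be taken from getconf PAGESIZE instead.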

The advantages of Method #4 are clear. You are now able to simply run the command as you would normally, and it completes successfully. The disadvantages are equally clear. If you raise the amount of memory available to the command line beyond the amount of available system memory, you can effectively mount a denial-of-service (DoS) attack on your own system and cause it to crash. On multiuser systems in particular, even a small increase can have a significant impact because every user is then allocated the additional memory. Therefore always test extensively in your own environment, as this is the safest way to determine if Method #4 is a viable option for you.


While writing this article, I came across many explanations for the "Argument list too long" error. Since the error message starts with "bash:", many people placed the blame on the bash shell. Similarly, seeing the application name included in the error caused a few people to blame the application itself. Instead, as I hope to have conclusively demonstrated in Method #4, the kernel itself is to "blame" for the limitation. In spite of the enthusiastic endorsement given by the original binfmts.h author, many of us have since found that 128KB of dedicated memory for the command line is simply not enough. Hopefully, by using one of the methods above, we can all forget about this one and get back to work.


* All functions were written using the bash shell.

** The material presented in Method #4 was gathered from a discussion on the linux-kernel mailing list in March 2000. See the "Argument List too Long" thread in the linux-kernel archives for the full discussion.