Friday, July 25, 2008

Concept|10 Concepts Every Software Engineer Should Know

Author: its | Published: 2008-7-24 (7:36)

Outstanding software engineers make good use of design patterns, refactor diligently, write unit tests, and pursue simplicity with an almost religious devotion. Beyond these, a good software engineer should also understand 10 concepts that transcend programming languages and design patterns and that apply across the broader field:

  1. Interfaces
  2. Conventions and Templates
  3. Layering
  4. Algorithmic Complexity
  5. Hashing
  6. Caching
  7. Concurrency
  8. Cloud Computing
  9. Security
  10. Relational Databases

  

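10. Relational Databases

Relational databases have drawn criticism for their lack of scalability in large-scale web services. Even so, they remain one of the greatest achievements of computing in the past 20 years, and they excel at handling orders and corporate data.

At the core of a relational database is the representation of data as records. Records are stored in tables, the database uses a query language (SQL) to search and retrieve data, and the database relates tables to one another.

Normalization is the technique of partitioning data across tables in the right way to reduce redundancy and speed up access.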
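
As a rough illustration of records, related tables, and a normalized layout, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and sample rows are invented for the example; customer names are stored once and orders refer to them by id.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two normalized, related tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            item        TEXT NOT NULL
        );
        INSERT INTO customers (id, name) VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO orders (customer_id, item) VALUES
            (1, 'keyboard'), (1, 'monitor'), (2, 'mouse');
    """)
    return conn


def orders_for(conn, customer_name):
    """Join the two tables to list the items ordered by one customer."""
    rows = conn.execute(
        "SELECT o.item FROM orders AS o "
        "JOIN customers AS c ON c.id = o.customer_id "
        "WHERE c.name = ? ORDER BY o.item",
        (customer_name,),
    )
    return [item for (item,) in rows]


if __name__ == "__main__":
    print(orders_for(build_demo_db(), "Alice"))
```

Because the customer name lives only in the customers table, renaming a customer touches one row instead of every order.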

 

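9. Security

With the rise of hackers and the growing sensitivity of data, security has become very important. Security is a broad topic that covers authentication, authorization, and information transmission.

Authentication verifies a user's identity, for example by asking for a password. It usually needs to be combined with SSL (Secure Sockets Layer). Authorization matters a great deal in business systems, especially workflow systems. The recently developed OAuth protocol helps web services open the right information to the right users; Flickr uses this approach to manage access to private photos and data.

Another area of security is network protection, which involves operating systems, configuration, and monitoring. It is not only the network that is dangerous: any piece of software is. Firefox, billed as the most secure browser, still has to ship security patches frequently. To write secure code for your system you need to understand the kinds of things that can go wrong.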

 

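8. Cloud Computing

A recent RWW article on cloud computing, Reaching For The Sky Through Compute Clouds, describes how cloud computing is changing the way large-scale web applications are deployed: massive parallelism, low cost, and fast time to market.

After parallel algorithms were invented, grid computing came first: running parallel computations on idle desktop machines. The best-known example is the SETI@home project at UC Berkeley, which uses spare CPU cycles to analyze data from space. Financial institutions also deploy grid computing at scale for risk analysis. Idle resources, together with the rise of the J2EE platform, led to the concept of cloud computing: application virtualization. Applications run on demand and can scale in real time with time and the number of users.

The most vivid example of cloud computing is Amazon Web Services, a set of services invoked through APIs: a compute cloud (EC2), storage for large media files (S3), an indexing service (SimpleDB), and a queue service (SQS).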

 

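7. Concurrency

Concurrency is where software engineers most often go wrong. That is understandable, because we are used to thinking linearly; yet concurrency is essential in modern systems.

Concurrency is parallel processing within a program. Most modern programming languages have built-in support for it; in Java, that means threads. The classic example is the producer/consumer pattern: a producer generates data or tasks and hands them to worker threads to consume or execute. The complexity lies in the fact that each thread runs in its own order yet often needs to access shared data. Doug Lea wrote some of the most sophisticated concurrency classes, which are now part of core Java.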
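
The producer/consumer pattern can be sketched in a few lines of Python. This is only an illustration, not the Java classes mentioned above: the function name, the squaring "work", and the worker count are assumptions made for the example. A thread-safe queue carries the tasks, and a lock protects the shared results list.

```python
import queue
import threading


def run_pipeline(tasks, worker_count=3):
    """Producer puts tasks on a queue; worker threads consume and square them."""
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()  # guards the shared results list

    def consumer():
        while True:
            item = work.get()
            if item is None:  # sentinel: no more work for this worker
                break
            with results_lock:
                results.append(item * item)

    workers = [threading.Thread(target=consumer) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for task in tasks:  # the producer side
        work.put(task)
    for _ in workers:  # one sentinel per worker so each one stops
        work.put(None)
    for w in workers:
        w.join()
    # Threads finish in no fixed order, so sort for a stable result.
    return sorted(results)


if __name__ == "__main__":
    print(run_pipeline(range(5)))
```

The sort at the end is there precisely because each thread runs in its own order; without the lock and the queue, access to the shared list would be the error-prone part.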

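6. Caching

Caching is indispensable to modern web applications. A cache holds data that has been fetched from the database and kept in memory. Because going to the database directly is expensive, it pays to fetch data once and serve it from the cache. For example, if your website shows last week's bestselling books, you can fetch the bestseller list from the database once and keep it in the cache rather than querying the database on every request.

Caching has a cost: only the most frequently used content should go into the cache. Many modern applications, including Facebook, rely on a distributed caching system called Memcached, which Brad Fitzpatrick developed while working on LiveJournal. Memcached builds its cache from spare memory across the network, and client libraries exist for many popular languages, including Java and PHP.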
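
The bestseller example can be sketched as a small in-process cache with an expiry time. This is not Memcached or its API; the class name, the loader function, and the book titles are all hypothetical stand-ins.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expiry_time, value)

    def get(self, key, loader):
        """Return the cached value, calling loader() only on a miss or expiry."""
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = loader()
        self._store[key] = (now + self._ttl, value)
        return value


db_calls = []  # records how often the "database" was actually queried


def load_bestsellers():
    """Stand-in for an expensive database query (hypothetical data)."""
    db_calls.append(1)
    return ["Book A", "Book B", "Book C"]


cache = TTLCache(ttl_seconds=3600)
first = cache.get("bestsellers", load_bestsellers)   # miss: runs the query
second = cache.get("bestsellers", load_bestsellers)  # hit: served from memory
```

After both calls the loader has run only once; the second request never reaches the database.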

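5. Hashing

The purpose of hashing is fast access. If data is stored sequentially, the time to find an item depends on the size of the list. Hashing instead computes a number for each item and uses it as an index; with a good hash function, lookup time stays roughly the same regardless of how much data there is.

Beyond storing data, hashing is also important in distributed systems. A uniform hash is used to spread data across machines in a cloud database. Google's indexing service is one form of this approach: each URL is hashed to a particular computer.

Hash functions can be very complex, but modern libraries ship ready-made implementations. What matters is knowing how to tune hashing for the best performance.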
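
To make the idea of hashing a URL to a particular machine concrete, here is a minimal sketch. It is an assumption-laden illustration, not how any real indexing service works: the function name, the choice of SHA-1, and the simple modulo step are all picked for brevity.

```python
import hashlib


def shard_for(url, shard_count):
    """Map a URL to one of shard_count machines using a stable hash."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    for url in ("http://example.com/a", "http://example.com/b"):
        print(url, "->", shard_for(url, 4))
```

The same URL always maps to the same shard, so any machine can work out where a record lives without a lookup table. One known weakness of plain modulo is that changing shard_count moves most keys to a different shard.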

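4. Algorithmic Complexity

There are a few things software engineers need to understand about algorithmic complexity. First, big O notation. Second, avoid nested loops (a loop inside a loop) where a hash table, an array, or a single loop will do. Third, good libraries abound these days, so there is no need to dwell on differences in their efficiency up front; there will be a chance to tune later. Finally, do not neglect the elegance and performance of your algorithms: compact, readable code makes an algorithm simpler and cleaner.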
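
As a small illustration of the second point, the two functions below find the items common to two lists. The function names and inputs are made up for the example. The first uses a nested loop and compares every pair; the second builds a hash-based set once and then makes a single pass.

```python
def common_items_nested(a, b):
    """Quadratic: compares every element of a with every element of b."""
    result = []
    for x in a:
        for y in b:
            if x == y and x not in result:
                result.append(x)
    return result


def common_items_hashed(a, b):
    """Linear on average: one pass to build a set, one pass to probe it."""
    in_b = set(b)
    emitted = set()
    result = []
    for x in a:
        if x in in_b and x not in emitted:
            emitted.add(x)
            result.append(x)
    return result
```

Both return the same answer in the same order; for two lists of 10,000 items each, the nested version makes on the order of 100 million comparisons while the hashed version makes on the order of 20,000 set operations.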

 

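3. Layering

Layering is the easiest way to discuss software architecture. John Lakos published a book on large-scale C++ systems in which he argues that software consists of layers. The book introduces the concept of layers: for each software component, count the components it depends on, and that number is a measure of how complex the component is.

Lakos argues that good software has a pyramid structure: complexity accumulates layer by layer, but each component must itself be simple. A good system consists of many small, reusable modules, each with its own responsibility. In a good system no dependencies between components are cyclic, and the whole system is a stack of components that forms a pyramid.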
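
One way to picture the pyramid is to assign every component a level: 0 if it depends on nothing, otherwise one more than the highest level among the components it uses. The sketch below is a simplified measure in that spirit, not Lakos's exact metric, and the component names are hypothetical. It also rejects cyclic dependencies.

```python
def component_levels(dependencies):
    """Map each component to its level in the dependency pyramid.

    dependencies: dict of component name -> list of components it uses.
    Raises ValueError if the dependencies contain a cycle.
    """
    levels = {}
    in_progress = set()  # components on the current traversal path

    def level_of(name):
        if name in levels:
            return levels[name]
        if name in in_progress:
            raise ValueError("cyclic dependency involving %r" % name)
        in_progress.add(name)
        deps = dependencies.get(name, ())
        lvl = 1 + max(map(level_of, deps)) if deps else 0
        in_progress.discard(name)
        levels[name] = lvl
        return lvl

    for name in dependencies:
        level_of(name)
    return levels


if __name__ == "__main__":
    system = {"app": ["orders", "ui"], "orders": ["db"], "ui": [], "db": []}
    print(component_levels(system))
```

A system whose levels can be computed this way has no cycles by construction, which is the property the pyramid picture relies on.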

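Lakos's work anticipated many later developments in software engineering, most notably refactoring: continuously reshaping code during development to keep its structure sound and flexible.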

 

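2. Conventions and Templates

Naming conventions and basic templates are often overlooked among programming patterns, yet they may be the most powerful technique of all. Naming conventions make software automation possible. For example, the Java Beans framework relies on a simple naming convention for getter and setter methods. URLs on del.icio.us also follow a uniform format: http://del.icio.us/tag/software takes the user to all pages tagged software.

Many social networks use simple naming as well. If your user name is johnsmith, your avatar is probably named johnsmith.jpg and your RSS feed is most likely johnsmith.xml.

Naming conventions are also used in unit testing: JUnit automatically recognizes the methods in a class whose names start with test.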
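
The way a naming convention enables automation can be sketched in Python: a runner finds and calls every method whose name starts with a given prefix, with no registration step. This only mimics the idea; it is not JUnit, and the class and function names are invented for the example.

```python
class CalculatorChecks:
    def test_addition(self):
        return 1 + 1 == 2

    def test_subtraction(self):
        return 3 - 1 == 2

    def helper(self):  # ignored by the runner: name lacks the prefix
        return False


def run_by_convention(obj, prefix="test"):
    """Call every method whose name starts with prefix; return name -> result."""
    results = {}
    for name in sorted(dir(obj)):
        attr = getattr(obj, name)
        if name.startswith(prefix) and callable(attr):
            results[name] = attr()
    return results


if __name__ == "__main__":
    print(run_by_convention(CalculatorChecks()))
```

Adding a new check is just a matter of naming the method correctly; the runner never needs to change.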

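The templates we mean here are not the C++ or Java language constructs. We mean template files that contain variables, which are replaced with values to produce the final output.

Cold Fusion was one of the first to use templates; later, Java followed with JSP. Apache more recently developed Velocity, a handy general-purpose template engine for Java. PHP can serve as its own template engine because it supports an eval function.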
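
A template with variables can be shown with Python's standard string.Template class. The template text and the function name below are assumptions for illustration; the johnsmith naming follows the example given earlier in this section.

```python
from string import Template

# $username is the variable; everything else is fixed text.
PROFILE_TEMPLATE = Template(
    "Hello, $username! Your avatar is $username.jpg "
    "and your feed is $username.xml."
)


def render_profile(username):
    """Substitute the variable and return the final text."""
    return PROFILE_TEMPLATE.substitute(username=username)


if __name__ == "__main__":
    print(render_profile("johnsmith"))
```

The same template serves every user; only the substituted value changes.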

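1. Interfaces

The most important concept in software engineering is the interface. Any piece of software is a model of a real system, and knowing how to model the problem in terms of simple interfaces is crucial. Many software systems fall into one of two extremes: long-winded code with no abstractions, or over-design that creates needless complexity.

Among the many books on software engineering, Robert Martin's Agile Programming is worth reading.

When modeling, the following practices help. First, remove methods that might only be needed in the future; the leaner the code, the better. Second, do not assume that what was done before is right; be willing to change. Third, be patient and enjoy the process.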

Sunday, July 20, 2008

xargs|Linux command usage

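Use this to move a list of files into the current directory (here, entries under ../../NewPurge/data/ whose status changed more than 3 days ago):
WSGPD1@wgwprddbsl001 $ find ../../NewPurge/data/ -ctime +3 -print | xargs -I {} mv {} .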

Sunday, June 1, 2008

Django|How to internationalize in Django


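Author: bluecrystal  Link: http://bluecrystal.javaeye.com/blog/138106  Published: November 5, 2007

Notice: This is an original blog article published on the JavaEye website. No website may reproduce it without the author's written permission.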

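In web development you often need to show different pages to users in different languages and regions, or you want to manage your pages' prompts and warning messages in one place. In these cases we usually rely on the framework's built-in internationalization support.
Below I describe, step by step and as concisely as possible, how to use internationalization in Django, starting from creating a project. First, the development environment: Windows XP Pro SP2 + Python 2.5 + Django 0.96. Many of the Django commands below live in the bin directory under the Django installation root, so add it to your path beforehand.
Step 1: Create a project
Run django-admin.py startproject djtest to create the project.

Step 2: Create an application
Run manage.py startapp international to create the application.

Step 3: Modify the settings file
In the djtest directory, edit settings.py and set DATABASE_ENGINE, DATABASE_NAME, DATABASE_USER and DATABASE_PASSWORD. The values are up to you, but they must allow a connection to the database; otherwise Django's built-in test server will not start and every request will raise an error :). Also set USE_I18N = True.
Step 4: Configure urls.py
Add this line to urlpatterns: (r'^international/test/', 'djtest.international.views.test')
Step 5: Write a simple view function
Open views.py and add the following code:
from django.shortcuts import render_to_response

def test(request):
    return render_to_response('international/test.html')

Step 6: Write a simple template file
In the djtest directory create a templates/international directory, and in it create a template file test.html containing mainly the following two lines (see the uploaded source for details):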
{% load i18n %}
{% trans 'hello test' %}
 
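Step 7: Create the files containing the translatable strings
In the djtest directory, first create a directory named locale, then run make-messages.py -l zh_CN. This generates the file django.po under locale/zh_CN/LC_MESSAGES in djtest. The command builds the file by scanning the source code and template directories under djtest, so when you open it you will find these two lines:
msgid "hello test"
msgstr "中文测试"
Put the string you want to display inside the double quotes of msgstr, for example "中文测试", and save the django.po file as UTF-8; on Windows it is best to save it without a BOM.
Next, still in djtest, run make-messages.py -l en to generate django.po under locale/en/LC_MESSAGES. Handle it the same way, but set msgstr to "english test", and again save the file as UTF-8 without a BOM.
Note: in every .po file, set the charset in Content-Type: text/plain; charset to utf-8.

Step 8: Compile the .po files
In the djtest directory, run compile-messages.py. It generates a .mo file for each .po file, for Django to use.

Step 9: Update settings.py
Add 'djtest.international' to INSTALLED_APPS and set LANGUAGE_CODE = 'zh-cn'.
Step 10: Start Django's test server
In the djtest directory, run manage.py runserver, then open http://localhost:8000/international/test/ to see the result. Change LANGUAGE_CODE to 'en' in settings.py and you will see the English message.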
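
To show what the msgid/msgstr pairs from step 7 amount to, here is a deliberately simplified sketch that reads such pairs from .po-style text and looks a string up. It is not Django's or gettext's parser: it ignores escapes, multi-line strings, and plural forms, and the function names are invented for the example.

```python
import re

PO_TEXT = '''
msgid "hello test"
msgstr "english test"
'''

# Matches one single-line msgid followed directly by its msgstr.
ENTRY = re.compile(r'msgid "(.*)"\s*\nmsgstr "(.*)"')


def read_catalog(po_text):
    """Return a dict mapping each msgid to its msgstr."""
    return dict(ENTRY.findall(po_text))


def translate(catalog, msgid):
    """Fall back to the msgid itself when no translation exists."""
    return catalog.get(msgid) or msgid


if __name__ == "__main__":
    catalog = read_catalog(PO_TEXT)
    print(translate(catalog, "hello test"))
```

Falling back to the msgid mirrors what a reader sees when a string has not been translated yet: the original text from the template.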

FreeBSD|The configuration menu no longer appears on make install


Posted by 刘 敏 as FreeBSD

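FreeBSD remembers the options you selected in the configuration menu when a port was installed. As a result, when an installation has gone wrong or you install the port again, the menu is not shown. How do you bring it back?

make clean
make showconfig # show the saved options
make rmconfig # remove the saved options
make config

By default, the options previously chosen for a port are recorded in /var/db/ports/{ports_name}/options. To view the options previously selected for python: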

cat /var/db/ports/python/options

Saturday, May 31, 2008

Django|django crash problem - Size does matter — tomster.org


From the obscure-obscenity-Department

Today, the Zope instance hosting this site experienced frequent crashes (as in "every couple of minutes") in the form of coredumps of the zope process. Stumped, bewildered, frustrated and near tears I did what I always do in such cases: pour my little heart out on #plone ;-)

Sure enough, help was at hand. And today's tip of the hat goes to... drumroll... lurker, who alerted me to the fact that python needs explicit 'huge stack' support on FreeBSD. A bit of googling supported this view and so I reinstalled python from the ports, but first ran make config and then enabled HUGE_STACK_SIZE.

After that I ran a few stress tests with ab and the instance easily withstood several thousand requests for the front page. Why this option isn't enabled by default is beyond me. So keep that in mind, kids, when you install python on your BSD boxen: size does matter ;-) (As if the term 'huge stack size' wasn't already obscene enough without such a foot-in-mouth reference...)


Could this be a stack size problem?

On FreeBSD, the default stack size is configured smaller than in most other distributions. This already bites on a regular basis with Zope installations. (e.g. things like http://tomster.org/blog/archive/2006/09/27/size-does-matter)

From a current FreeBSD port /usr/ports/lang/python24/Makefile:

.if defined(WITHOUT_HUGE_STACK_SIZE)
CFLAGS+= -DTHREAD_STACK_SIZE=0x20000
.else
CFLAGS+= -DTHREAD_STACK_SIZE=0x100000
.endif # defined(WITHOUT_HUGE_STACK_SIZE)
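
The Makefile fragment above fixes the thread stack size at build time. For comparison, Python also exposes a runtime knob, threading.stack_size(), which sets the stack size for threads created afterwards. The sketch below is an illustration only: the helper names and the recursion depth are made up, and it is an assumption, not something the posts here tested, that this would have helped the Zope case.

```python
import threading


def recurse(depth):
    """Use some stack: recurse depth levels and count them on the way back."""
    return 0 if depth == 0 else 1 + recurse(depth - 1)


def run_with_stack(stack_bytes, depth):
    """Run recurse(depth) in a new thread created with the given stack size."""
    # stack_size() returns the previous setting (0 means platform default).
    previous = threading.stack_size(stack_bytes)
    result = []
    try:
        t = threading.Thread(target=lambda: result.append(recurse(depth)))
        t.start()
        t.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return result[0]


if __name__ == "__main__":
    # 0x100000 bytes (1 MiB) matches the larger value in the port Makefile.
    print(run_with_stack(0x100000, 100))
```

The setting only affects threads started after the call, so it has to run before the server spawns its worker threads.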



Wyatt Baldwin wrote:

On 6/15/06, Philip Jenvey <pjen@groovie.org> wrote:

On Jun 15, 2006, at 6:56 PM, uuellbee wrote:

One known cause of core dumps on FreeBSD is when a python app needs a large stack size (this can be avoided by enabling the python port's HUGE_STACK_SIZE option), but since you're loading a simple app this can't be the problem.

I found some info on this and one suggestion was to #define THREAD_STACK_SIZE in thread_pthread.h. [See http://www.pythomnic.org/step_by_step.html.] I tried this instead of setting HUGE_STACK_SIZE because I'm compiling Python from source.

Here is the line I added:

#define THREAD_STACK_SIZE (0x100000)

Now when I visit the test site, it just keeps Loading... apparently forever. (It's been going in another tab for a while now).

You might try switching the threading library via libmap.conf in case there's something strange related to threads.

I can't find libmap.conf on my system.

I tried compiling without threads, but something complained about not finding threads when I started the server. I also tried using the --with-pth option (GNU pth threading libraries), but that didn't change anything.

Otherwise, to get some kind of information on why the core dump occurred you'll need to recompile python with debugging symbols. You can do this by putting the following line in /etc/make.conf prior to building the port:

CFLAGS=-g

Then you can run 'gdb python.core' and issue the 'bt' command to gdb to see a backtrace.

I'll recompile with the THREAD_STACK_SIZE hack removed and try this......

Here's what I get from running 'gdb ~/bin/python python.core':

[Copyright, etc]
This GDB was configured as "i386-marcel-freebsd"...
Core was generated by `python'.
Program terminated with signal 10, Bus error.
[Bunch of lines of reading/loading symbols]
#0 0x2825a31b in pthread_testcancel () from /usr/lib/libpthread.so.1

Here is the output of bt:

#0 0x2825a31b in pthread_testcancel () from /usr/lib/libpthread.so.1
#1 0x28252902 in pthread_mutexattr_init () from /usr/lib/libpthread.so.1
#2 0x00000000 in ?? ()

This sounds pretty similar to this issue:

http://mail.python.org/pipermail/python-list/2005-April/276728.html

What's strange is he wasn't able to immediately reproduce the core dump while running python through GDB (mentioned here: http://mail.python.org/pipermail/python-list/2005-February/265137.html ). You might want to also try what he did -- running python through gdb with the symbols enabled. Enabling the debugging symbols will hopefully provide a more thorough back trace and possibly explain why the larger thread stack size changed the behavior (I suspect there's a bigger problem that blows out the stack and the larger one simply postpones the problem).

Before doing that I would play with libmap.conf. There isn't an /etc/defaults/libmap.conf, but there should be a man page for it on your system (libmap.conf(5)). You're currently using the libpthread threading library: what you want is to switch to the libthr library via libmap.conf (and hope the problem doesn't occur there).

I am currently doing this on a 7.0-CURRENT machine, and my libmap.conf looks like this:

libpthread.so.2 libthr.so.2
libpthread.so libthr.so

The version numbers are going to be different for FreeBSD 5.4. Check the man page and google for more info/examples.

I'll take a deeper look into libmap. One thing that might be interesting is I installed Myghty by itself and had no problems, which would perhaps imply that the problem is somewhere in Paste (not that I really know what I'm talking about).

~wyatt

I never got around to messing around with gdb or libmap. I just set HUGE_STACK_SIZE in the FreeBSD Python port and everything works fine.

~wyatt

Thursday, May 29, 2008

Cygwin|Set SSH daemon in Cygwin on a Windows 2003 server

HOWTO setup the Cygwin SSH daemon on a Windows 2003 server

Installing the Cygwin SSH daemon

How to setup the secure shell daemon on a Windows 2003 server

Note : This set of instructions has worked for me at our institution. You should read /usr/share/doc/Cygwin/openssh.README after installing cygwin and check the cygwin mailing list if you encounter problems.

Installing and Testing cygwin

  • Create the destination folder (C:\cygwin or D:\cygwin as appropriate). Default permissions will be for administrators and SYSTEM only. Add SERVER\Users with modify control to the list. These permissions will be inherited to the rest of the folder as it is populated.
  • Create a directory to locally store the cygwin packages e.g. C:\temp\cygwinarchive. Open a browser window to the following URL http://www.cygwin.com/setup.exe and save the installation file setup.exe to the archive directory just created (C:\temp\cygwinarchive in this example)
  • Double click on the downloaded cygwin setup program. The current version is 2.510.2.2 (February 3rd, 2006). Click 'Next' and answer the prompts :
    • Leave default "install from internet"
    • Install to root directory c:\cygwin
    • leave default "install for all users"
    • leave default text file type "unix / binary"
    • Set local package directory to c:\temp\cygwinarchive (the directory created in the previous step). This should be the default.
    • Leave the default "direct connection"
    • Select a mirror (any of the ones starting with http://mirror in the name). The package list will be downloaded.
    • The 'Select Packages' window can be stretched. Click on the plus sign to expand the categories. Install at least the following list of packages.
      • From Admin, select all packages.
      • From Archive, select unzip and zip packages.
      • From Base, leave the default, select all packages.
      • From Doc, leave the default, man and 'cygwin doc' packages.
      • From Editors, select vim package.
      • From Net, select openssh (openssl will get checked automatically), rsync and tcp_wrappers packages.
    • When you've selected these packages, click 'Next'. The installation tells you which packages it is installing as it progresses.
    • Uncheck 'Create desktop icon'. Leave default 'Add to start menu'. Click 'Finish'.
    • A post install script runs a few final commands. Then you should see a message saying 'Installation complete'. Click 'OK'.
  • Edit C:\cygwin\cygwin.bat. Make sure it contains these lines - you will need to add the line setting the CYGWIN environment variable.
    @echo off
    set CYGWIN=binmode tty ntsec
    C:
    chdir \cygwin\bin
    bash --login -i
  • Test cygwin to make sure it works. Start, Programs, Cygnus Solutions, Cygwin Bash Shell - should get a command window with a prompt saying 'Administrator@servername'. This is a bash shell and you can use unix or DOS / NT type commands e.g.
    • 'ls /bin' to see the cygwin bin directory
    • 'dir c:' to see the contents of the C: directory
    Type "control d" or "logout" to exit the shell.

  • If you get a message saying 'cannot create /home/userid', run this command from the cygwin window "mkpasswd -l >/etc/passwd".

  • While you're in the cygwin shell window, run this command to change the mount prefix from "/cygdrive" to "/". You should logout and back in again after running this command in order to reset your PATH environment variable properly.
    mount -s --change-cygdrive-prefix / 
  • Also, create a home directory where you can place user startup files. The default location is the "Documents and Settings" folder. Creating a /home directory and using the -p switch to assign the home directory when adding a new user keeps all the cygwin files under the c:\cygwin directory.
    mkdir -p /home 

Installing the SSH daemon service

  • From a cygwin prompt (Start, All Programs, Cygwin ?), run ssh-host-config to create the service, set up the ssh host keys and create the sshd_config file in /etc/. Note that 2 local users are created, one called sshd to handle privilege separation and one that is required on Windows 2003 called sshd_server that runs the service in order to use public key authentication. You should see output like this:
    $ ssh-host-config
    Generating /etc/ssh_host_key
    Generating /etc/ssh_host_rsa_key
    Generating /etc/ssh_host_dsa_key
    Overwrite existing /etc/ssh_config file? (yes/no) yes
    Generating /etc/ssh_config file
    Overwrite existing /etc/sshd_config file? (yes/no) yes
    Privilege separation is set to yes by default since OpenSSH 3.3.
    However, this requires a non-privileged account called 'sshd'.
    For more info on privilege separation read /usr/share/doc/openssh/README.privsep.

    Should privilege separation be used? (yes/no) yes
    Warning: The following function requires administrator privileges!
    Should this script create a local user 'sshd' on this machine? (yes/no) yes
    Generating /etc/sshd_config file
    Added ssh to C:\WINDOWS\system32\drivers\etc\services

    Warning: The following functions require administrator privileges!

    Do you want to install sshd as service?
    (Say "no" if it's already installed as service) (yes/no) yes

    You appear to be running Windows 2003 Server or later.
    On 2003 and later systems, it's not possible to use the LocalSystem account
    if sshd should allow passwordless logon (e. g. public key authentication).
    If you want to enable that functionality, it's required to create a new
    account 'sshd_server' with special privileges, which is then used to run
    the sshd service under.

    Should this script create a new local account 'sshd_server' which has
    the required privileges? (yes/no) yes

    Please enter a password for new user 'sshd_server'.
    Please be sure that this password matches the password rules given on your system.
    Entering no password will exit the configuration.
    PASSWORD=xxxxxxx

    User 'sshd_server' has been created with password 'xxxxxxxx'.
    If you change the password, please keep in mind to change the password
    for the sshd service, too.

    Also keep in mind that the user sshd_server needs read permissions on all
    users' .ssh/authorized_keys file to allow public key authentication for
    these users!.
    (Re-)running ssh-user-config for each user will set the required
    permissions correctly.

    Which value should the environment variable CYGWIN have when sshd starts?
    It's recommended to set at least "ntsec" to be able to change user context
    without password. Default is "ntsec".
    CYGWIN=binmode ntsec tty

    The service has been installed under sshd_server account.
    To start the service, call `net start sshd' or `cygrunsrv -S sshd'.

    Host configuration finished. Have fun!
  • You can start the service from the services MMC panel, or using either of the commands listed above ("net start sshd" or "cygrunsrv -S sshd").

Generating public/private SSH keys for a user

  • If you need to generate ssh public and private keys for a user on this machine who will be uploading data or logging in to a remote machine, you will need to carry out this step. Sign on as the user who needs the keys created. They will automatically be in their home directory. Run ssh-user-config to setup the ssh keys. Create only an SSH2 RSA identity (use a null passphrase - just press return). Output should be similar to this :
    cygwinadmin@HICKORY ~
    $ ssh-user-config
    Shall I create an SSH1 RSA identity file for you? (yes/no) no
    Shall I create an SSH2 RSA identity file for you? (yes/no)  (yes/no) yes
    Generating /home/pswander/.ssh/id_rsa
    Enter passphrase (empty for no passphrase): Press ENTER
    Enter same passphrase again: Press ENTER
    Do you want to use this identity to login to this machine? (yes/no) yes
    Shall I create an SSH2 DSA identity file for you? (yes/no)  (yes/no) no

    Configuration finished. Have fun!
  • Update the file /home/userid/.ssh/authorized_keys with any public keys from other users who you wish to be able to connect to this user's account. Refer to this document for more information. Make sure each entry you add is all on one line.
  • Make sure the service is running (state 4 = running)
    $ sc query sshd

    SERVICE_NAME: sshd
            TYPE               : 10  WIN32_OWN_PROCESS
            STATE              : 4  RUNNING
                                    (STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)
            WIN32_EXIT_CODE    : 0  (0x0)
            SERVICE_EXIT_CODE  : 0  (0x0)
            CHECKPOINT         : 0x0
            WAIT_HINT          : 0x0
  • Test the service from the cygwin prompt using "ssh -v localhost". You will get challenged with the new host key and will have to enter your password as you connect. You should see output like this:
    The authenticity of host 'localhost (127.0.0.1)' can't be established.
    RSA key fingerprint is 75:8a:67:20:0d:75:dd:06:64:04:d0:ac:23:c7:74:ba.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'localhost' (RSA) to the list of known hosts.

    The last line is: You are successfully logged in to this server!!!
  • Test the service from a remote host. You can now update the authorized_keys file with the public key file from the user and host you want to connect from. Then test your connection from that host by issuing the command "ssh userid@servername dir c:\"

Adding and removing users from the passwd file

  • You can add domain or local users using the mkpasswd command. Test what would be added for a domain user with this command:
    mkpasswd -d domain_name -u joeuser 
  • You can add an ads domain user to the passwd file and give him a home directory in /home with this command:
    mkpasswd -d ads -p /home -u kscully >>/etc/passwd 
  • You can add local users using the -l switch instead of the -d switch. Be careful not to use the -d domain_name switch without specifying a user or you will get entries for ALL domain users in the passwd file.
  • Users can be removed and both users and groups can be updated by starting a cygwin shell and using vi to edit the /etc/passwd and /etc/group files.

Restricting SSH access to specific servers

Working on a netsh script to restrict access to specific servers.

cygrunsrv --install sshd --path '/usr/sbin/sshd' --env 'PATH=/bin;/sbin' --env 'CYGWIN=ntsec tty' -a -D

Switching the user who runs the service

In a normal installation, the ssh-host-config script creates a local user called sshd_server under whose credentials the ssh daemon runs. This is fine for local shell access to the server and secure file transfers to and from the server, but it is not possible to access any network resources while the service is running under the local user account.

The solution is to run the service under a domain user account - one that has access to the shares or servers remote from the server running sshd. In order to switch the service to run under a different user, these steps must be carried out :

  • Open "Computer Management", open the Services tab, right click on the "Cygwin sshd" service and stop the service.
  • Right click on the "Cygwin sshd" service again and select properties. Under the 'Log On' tab, switch the name of the account the service is running from ".\sshd_server" to domain\userid, where domain and userid correspond to a userid with access to the resources you require in the domain. You will be prompted for this user's password.
  • Open Control Panel -> Administrative Tools -> Local Security Settings -> Local Policies. Then click on 'User Rights Assignment'. Make sure the domain user you specified in the previous step is in the list for these 4 rights :
    1. Adjust memory quotas for a process
    2. Create a token object
    3. Log on as a service (already granted if you completed the previous step)
    4. Replace a process level token
  • Add the domain user to the local password file
        mkpasswd -d domain -u userid >> /etc/passwd     
  • Change the ownership of the files required by the sshd service owner. Open a cygwin bash session and run these commands for your userid
        $ chown userid /var/log/sshd.log
        $ chown -R userid /var/empty
        $ chown userid /etc/ssh*
  • In the services tab again, right click on the 'Cygwin sshd' service and select 'start'. Check the event log for a successful start, or for errors in case the service does not start successfully.

Wednesday, May 28, 2008

Proxy|Python code across proxy

Python code across proxy
=====================================
#!/usr/bin/env python
"""
Fetch an HTML page from the internet through an intranet proxy.

Written for Python 2 (urllib2).
"""

import urllib2


def startTask():
    # Proxy credentials and address; replace with real values.
    proxy = {
        'user': 'Stephen',
        'pass': 'pass',
        'host': 'proxy.com',
        'port': 8080,
    }

    # Build the proxy URL in the form http://user:pass@host:port
    proxy_support = urllib2.ProxyHandler(
        {"http": "http://%(user)s:%(pass)s@%(host)s:%(port)d" % proxy})
    #proxy_support = urllib2.ProxyHandler({"http":"http://username:password@proxy.com:8080"})

    opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)

    # Install the opener so that urllib2.urlopen() goes through the proxy.
    urllib2.install_opener(opener)

    sock = urllib2.urlopen('http://www.python.org/')
    print sock.headers
    print sock.read()


if __name__ == "__main__":
    startTask()

Friday, May 16, 2008

Raid|RAID 0+1 or 1+0: which is better?

RAID 0+1 or 1+0: which is better? at Hsiaoi Collection

Published on August 3, 2007 in Technology.

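What is RAID? How do 0+1 and 1+0 differ?
RAID stands for "Redundant Array of Independent Disks". It is also expanded as "Redundant Array of Inexpensive Disks"; that was the full name used when RAID technology was first published, but the former is now more widely used.

"Redundant" means "excess" or "surplus". A drive normally needs only one hard disk, and a single disk can even be partitioned into several volumes. A RAID drive, however, uses more than one disk, that is, two or more.

A RAID is therefore physically several disks that the system treats as one, and under the operating system it can still be divided into one or more partitions. Once built, a RAID is used exactly like a single disk; depending on how it is composed, it can provide more capacity, higher read/write performance, or additional "safety". (Here "safety" means the ability to rebuild and recover data after a disk fails; it has nothing to do with encryption or protection against intrusion.)

Moreover, the price/performance ratio that RAID achieves by combining capacity, speed, and safety can far exceed that of ultra-high-end disks with equivalent performance, which is the main reason many enthusiasts adopt it. There are many ways to build a RAID; before getting into our tests, let us look at the different RAID types.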

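How do 0+1 and 1+0 differ?

RAID 0+1 applies RAID 0 (stripe) first and then RAID 1 (mirror), as in the figure below:
[Figure: 01.jpg]

RAID 1+0 applies RAID 1 (mirror) first and then RAID 0 (stripe), as in the figure below:
[Figure: 10.jpg]

The two RAID schemes differ mainly in performance and reliability. The disk failure cases below illustrate this. Each [] represents one disk.

Case 1. Any single disk fails
RAID 0+1 : the other stripe keeps running, but it becomes a single point of failure (SPOF).
RAID 1+0 : both segments keep running, so there is no SPOF concern. (wins)

Case 2. One disk fails in each of the two stripes/segments
RAID 0+1 : neither stripe can keep running.
RAID 1+0 : both segments keep running, so there is no SPOF concern. (wins)

Case 3. Two disks fail in the same stripe/segment
RAID 0+1 : the other stripe keeps running, but it becomes a SPOF.
RAID 1+0 : if the two failed disks belong to the same mirror set, the array cannot keep running (loses); if they belong to different mirror sets, both segments keep running, so there is no SPOF concern. (wins)

In summary, RAID 1+0 is not more reliable than RAID 0+1 in every case, but it is in most cases. RAID 0+1 therefore suits environments where performance matters more than reliability; RAID 1+0 suits the opposite.

In addition, on recovery RAID 0+1 must re-mirror an entire stripe, whereas RAID 1+0 only needs to re-mirror one disk.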
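
The failure cases above can be checked by brute force. The sketch below assumes the smallest layout, four disks in two groups of two (two stripes for RAID 0+1, two mirror pairs for RAID 1+0), and counts how many two-disk failures each scheme survives. The function names and the disk numbering are made up for the example.

```python
from itertools import combinations

DISKS = (0, 1, 2, 3)
# Two stripes of two disks for RAID 0+1; two mirror pairs for RAID 1+0.
GROUPS = ((0, 1), (2, 3))


def raid01_survives(failed):
    """RAID 0+1: at least one whole stripe must be fully intact."""
    return any(all(d not in failed for d in g) for g in GROUPS)


def raid10_survives(failed):
    """RAID 1+0: every mirror pair must keep at least one working disk."""
    return all(any(d not in failed for d in g) for g in GROUPS)


def survivable_double_failures(survives):
    """Return (survivable, total) over all possible two-disk failures."""
    pairs = list(combinations(DISKS, 2))
    return sum(survives(set(p)) for p in pairs), len(pairs)


if __name__ == "__main__":
    print("RAID 0+1:", survivable_double_failures(raid01_survives))
    print("RAID 1+0:", survivable_double_failures(raid10_survives))
```

Under these assumptions RAID 0+1 survives 2 of the 6 possible double failures and RAID 1+0 survives 4 of 6, which matches the conclusion that 1+0 is more reliable in most, but not all, cases.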

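Apart from the differences in performance and reliability, the two share the same poor scalability and high cost.

JBOD (Just a Bunch of Disks)

Strictly speaking this is not RAID, because it does exactly what its name says: it just lumps several disks together to be used as one very large disk. Four 250GB disks in JBOD mode become one 1TB (=1000GB) disk, but apart from the larger capacity, its speed is the same as a single disk and there is no extra safety.

RAID 0 (Striped)

This is the simplest and most aggressive kind of disk array. On write, it splits the data into small blocks and stores them on different disks, which raises write speed. On read, the blocks are fetched from all disks at once, so read speed is higher too.

Its drawback is that if any one disk fails, or merely develops a small problem, the missing piece of data can make the whole array unreadable, and all the data is lost at once: there is no safety at all. Even so, the way RAID 0 performance rises with the number of disks still appeals to enthusiasts who put performance first.

RAID 1 (Mirrored)

This array mirrors a single disk: on write, the same data goes to two disks at once, so a backup copy of everything exists at all times. Because the amount of data written to each disk is unchanged, write speed is no different from a non-RAID drive, but reads can pull data from both disks at once, so read speed still improves somewhat.

RAID 10 / 01 (Striped & Mirrored)

This array combines the RAID 0 and RAID 1 architectures. 10 and 01 differ only in whether data is mirrored first and then striped, or striped first and then mirrored to two sets of disks; the function is the same, and both require four disks. The combination offers both the performance gain and the data backup; as long as the two disks of the same mirror pair do not fail together, the data can be recovered.

RAID 2, 3, 4

These RAID levels never became mainstream, and little hardware supports them. All are refinements of RAID 0. RAID 2 stripes data at the bit level and adds an error-correcting code (ECC) based on Hamming code, so that when a single disk fails all of its data can be reconstructed.

RAID 3 switches to parity checking for encoding and dedicates one disk to storing the parity data. RAID 4 likewise uses parity with a dedicated parity disk, but goes back to splitting data in blocks. Both need at least three disks.

RAID 5 (Parity RAID)

RAID 5 is a refinement of RAID 2, 3 and 4, and the one that finally became fairly widespread. It combines the original data with parity information and distributes both, block by block, across all disks, so no dedicated parity disk is needed. RAID 5 still consumes one disk's worth of capacity for parity, so the usable capacity equals the total capacity minus one disk; the overhead is simply spread across the disks. Because access is distributed, the performance gain is noticeable, and the data can be recovered after any single disk fails. It does sacrifice one disk of capacity, but compared with the half of total capacity that RAID 10 / 01 gives up, the safety RAID 5 buys for a single disk's capacity is a very good deal.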
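
The parity idea behind RAID 5 can be sketched with XOR: the parity block is the XOR of the data blocks, and any one missing block is the XOR of everything that remains. The block contents and names below are made up, and the sketch leaves out how real RAID 5 rotates parity across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as if stored on three different disks.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # stored on a fourth disk's share of capacity

# Suppose the disk holding data[1] fails: rebuild it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])
```

This is why the usable capacity is N-1 disks: one block out of every N holds parity rather than data, and a second simultaneous failure leaves too little information to rebuild.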

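Comparison of RAID architectures

RAID scheme | Disks | Usable capacity | Performance | Safety | Main use
JBOD | 2 or more | All | Unchanged | Almost zero | Capacity above all
RAID 0 | 2 or more | All | Highest | Dangerous | Enthusiasts chasing performance
RAID 1 | 2 | 50% of total | Slightly improved | Highest | Backups of data that must never be lost
RAID 0+1 | Even number, 4 or more | 50% of total | Very high | - | Both backup and performance needed, with no budget limit
RAID 5 | 3 or more | N-1 disks | Fast reads, slow writes | - | Same as RAID 0+1 but on a limited budget

Original reference: Sata Raid 完全攻略 (The Complete SATA RAID Guide)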

Monday, March 17, 2008

Yum|yum command: Update / Install Packages

yum command: Update / Install Packages under Redhat Enterprise / CentOS Linux Version 5.x

Task: Register my system with RHN

To register your system with RHN type the following command and just follow on screen instructions (CentOS user skip to next step):
# rhn_register

WARNING! These examples only work with RHEL / CentOS Linux version 5.x or above. For RHEL 4.x and older versions use the up2date command.

Task: Display list of updated software (security fix)

Type the following command at shell prompt:
# yum list updates

Task: Patch up system by applying all updates

To download and install all updates type the following command:
# yum update

Task: List all installed packages

List all installed packages, enter:
# rpm -qa
# yum list installed

Find out whether the httpd package is installed, enter:
# rpm -qa | grep httpd
# yum list installed httpd

Task: Check for and update specified packages

# yum update {package-name-1}
To check for and update httpd package, enter:
# yum update httpd

Task: Search for packages by name

Search for httpd and for all packages matching perl*, enter:
# yum list {package-name}
# yum list {glob-pattern}
# yum list httpd
# yum list 'perl*'

Sample output:

Loading "installonlyn" plugin
Loading "security" plugin
Setting up repositories
Reading repository metadata in from local files
Installed Packages
perl.i386 4:5.8.8-10.el5_0.2 installed
perl-Archive-Tar.noarch 1.30-1.fc6 installed
perl-BSD-Resource.i386 1.28-1.fc6.1 installed
perl-Compress-Zlib.i386 1.42-1.fc6 installed
perl-DBD-MySQL.i386 3.0007-1.fc6 installed
perl-DBI.i386 1.52-1.fc6 installed
perl-Digest-HMAC.noarch 1.01-15 installed
perl-Digest-SHA1.i386 2.11-1.2.1 installed
perl-HTML-Parser.i386 3.55-1.fc6 installed
..... ....... ..
perl-libxml-perl.noarch 0.08-1.2.1 base
perl-suidperl.i386 4:5.8.8-10.el5_0.2 updates

Task: Install the specified packages [ RPM(s) ]

Install package called httpd:
# yum install {package-name-1} {package-name-2}
# yum install httpd

Task: Remove / Uninstall the specified packages [ RPM(s) ]

Remove package called httpd, enter:
# yum remove {package-name-1} {package-name-2}
# yum remove httpd

Task: Display the list of available packages

# yum list all

Task: Display list of group software

Type the following command:
# yum grouplist
Output:

Installed Groups:
  Engineering and Scientific
  MySQL Database
  Editors
  System Tools
  Text-based Internet
  Legacy Network Server
  DNS Name Server
  Dialup Networking Support
  FTP Server
  Network Servers
  Legacy Software Development
  Legacy Software Support
  Development Libraries
  Graphics
  Web Server
  Ruby
  Printing Support
  Mail Server
  Server Configuration Tools
  PostgreSQL Database
Available Groups:
  Office/Productivity
  Administration Tools
  Beagle
  Development Tools
  GNOME Software Development
  X Software Development
  Virtualization
  GNOME Desktop Environment
  Authoring and Publishing
  Mono
  Games and Entertainment
  XFCE-4.4
  Tomboy
  Java
  Java Development
  Emacs
  X Window System
  Windows File Server
  KDE Software Development
  KDE (K Desktop Environment)
  Horde
  Sound and Video
  FreeNX and NX
  News Server
  Yum Utilities
  Graphical Internet
Done

Task: Install all the default packages by group

Install all 'Development Tools' group packages, enter:
# yum groupinstall "Development Tools"

Task: Update all the default packages by group

Update all 'Development Tools' group packages, enter:
# yum groupupdate "Development Tools"

Task: Remove all packages in a group

Remove all 'Development Tools' group packages, enter:
# yum groupremove "Development Tools"

Task: Install particular architecture package

If you are using the 64-bit version of RHEL, it is possible to install 32-bit packages:
# yum install {package-name}.{architecture}
# yum install mysql.i386

Task: Display packages not installed via official RHN subscribed repos

Show all packages that are not available via subscribed channels or repositories, i.e. packages installed from other repos:
# yum list extras
Sample output:

Loading "installonlyn" plugin
Loading "security" plugin
Setting up repositories
Reading repository metadata in from local files
Extra Packages
DenyHosts.noarch            2.6-python2.4        installed
VMwareTools.i386            6532-44356           installed
john.i386                   1.7.0.2-3.el5.rf     installed
kernel.i686                 2.6.18-8.1.15.el5    installed
kernel-devel.i686           2.6.18-8.1.15.el5    installed
lighttpd.i386               1.4.18-1.el5.rf      installed
lighttpd-fastcgi.i386       1.4.18-1.el5.rf      installed
psad.i386                   2.1-1                installed
rssh.i386                   2.3.2-1.2.el5.rf     installed

Task: Display what package provides the file

You can easily find out which RPM package provides a given file. For example, to find out what provides the /etc/passwd file:
# yum whatprovides /etc/passwd
Sample output:

Loading "installonlyn" plugin
Loading "security" plugin
Setting up repositories
Reading repository metadata in from local files

setup.noarch                2.5.58-1.el5         base
Matched from:
/etc/passwd

setup.noarch                2.5.58-1.el5         installed
Matched from:
/etc/passwd

You can use the same command to list packages that satisfy dependencies:
# yum whatprovides {dependency-1} {dependency-2}
Refer to the yum man page for more information:
# man yum

Friday, March 14, 2008

SOA|SOAP Web Services : awkward. Python and Web Services : painful

SOAP Web Services : awkward. Python and Web Services : painful. - Nicolas Lehuen's Weblog

By Nico on Tuesday, May 30 2006, 21:47 - General - Permalink

So, I had managed to dodge it until now, but that's it, we're in 2006 and I need to use SOAP web services. Back in 2002 I was implementing and consuming REST+XML or XML-RPC web services all over the place. At the time, SOAP smelled like... well, let's say it definitely didn't smell like soap. Today, I'm not the slightest bit more convinced, but when you gotta do it, you gotta do it.

So yes, I used Java and the AXIS web services library to consume web services, namely the Google Adwords API and the Amazon Electronic Commerce Services. Of course, Java isn't a dynamic programming language, so you have to generate a bunch of code using the WSDL2Java compiler, but it's not that difficult. You then get a true Java API that smells like... well, it doesn't smell very soapy either.

I mean, I've already given a sample of the need to instantiate a factory to get a locator service that will build a service implementation that you need to configure. It isn't pretty. But that's only the initialization part. Then, you use an API that has obviously been generated by a computer and doesn't feel user-friendly at all. It's a bit like a DOM API from another dimension. Well, let me pat myself on the back for this cunning analogy, because it is in fact exactly that: the AXIS toolkit did its best to build wrappers around the structure of request and response documents. XML documents, mind you, and I won't even add insult to injury by reminding my wide audience that those structures are specified in the Rube Goldberg XML Schema format.

Well, give me embedded XML literals like in E4X or C# 3.0, and native XPath support like in... well, LINQ (also in C# 3.0, there's a thing going on here), and you've got a proper toolkit to consume web services. But here, AXIS generates a bunch of wrappers that are quite awkward to use as soon as the document structure is a tad complicated. Pretty soon you wish you could stop using those wrappers and access the raw XML nodes (preferably using the XOM API, but that's another story). I'm sure there's an option somewhere that allows you to do that, but I could not find it, and given what I saw, I expect the result to be quite messy.

Anyway, while struggling with all those bells and whistles, I told myself "Hey, all this mess is caused by the static typing of Java, why not try to do it in Python?". Why not, indeed? Well, because if using SOAP Web Services in Java is awkward, in Python it's actually painful, because It Just Won't Work™.

Oh, I know the documentation is wonderful, and dynamic typing allows for wonderful proxy thingies that all work by magic... CORBA is just sweet in Python (see omniORB), because you don't have to worry about code generators and compilers and so on. You give your ORB an IDL file and bam, it works. Or not. But that's another story.

Well, it's not, actually, because SOAP Web Services are actually a stripped-down, crappy version of CORBA, WSDL being a tricky, complicated version of IDL. So you'd expect the experience to be similar, and it is: you give your SOAP library a WSDL and bam, it works.

Except it doesn't. Because in the Python world, XML libraries in general and Web Services libraries in particular are either half implemented or unmaintained since the heady days of 2000-2002. One of the brand new things in Python 2.5 is the inclusion of ElementTree, a Python-friendly DOM API that replaces the pile of non-standard crap that we have right now. Isn't it about time?

For web services, the situation is much worse, and the sad truth is that you can use ZSI or SOAPy to consume web services, but there's a very high chance that they won't be compatible with obscure web services like the ones offered by Google, Amazon or Yahoo. Thank god the latter two also provide REST APIs... But right now it's the Google APIs that interest me. And Google said "thou shall use SOAP, SOAP thou will use". And the Python SOAP APIs don't understand. We might as well use CORBA, for that kind of "interesting" incompatibility.

raimondas says "Python Web Services - Not Quite Painless", but I feel that for the sake of precision, we should simply say "Python Web Services - Actually Quite Painful".

The good thing is, like for all painful things, it's a relief when it stops. I really didn't expect to enjoy using Java and AXIS for this. Oh, well...
