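After clients disconnect, a large number of zombie processes accumulate on the server. When a server child process terminates, SIGCHLD is sent to the parent, and the parent ignores that signal by default, so the child's exit status is never collected. To avoid zombies, the parent must wait for every child it creates so that the child can be cleaned up. To do this, we add a SIGCHLD signal handler to the server program.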
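#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <strings.h>   /* bzero */
#include <signal.h>    /* signal, SIGCHLD */
#include <unistd.h>
#include <sys/socket.h>
#include <sys/wait.h>  /* wait, waitpid */
#include <netinet/in.h>
#include <sys/types.h>
#include <netdb.h>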
#define SERV_PORT 1113
#define LISTENQ 32
#define MAXLINE 1024
/*** connection handler ***/
void str_echo(int fd);
void
sig_chld(int signo)
{
pid_t pid;
int stat;
pid = wait(&stat); /* reap the child and get its PID */
printf("child %d terminated\n", pid);
return;
}
int
main(int argc, char *argv[]){
int listenfd,connfd;
pid_t childpid;
socklen_t clilen;
struct sockaddr_in servaddr;
struct sockaddr_in cliaddr;
if((listenfd = socket(AF_INET, SOCK_STREAM,0))==-1){
fprintf(stderr,"Socket error:%s\n\a",strerror(errno));
exit(1);
}
/* fill in the server's sockaddr structure */
bzero(&servaddr, sizeof(servaddr));
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl (INADDR_ANY);
servaddr.sin_port = htons(SERV_PORT);
signal(SIGCHLD,sig_chld); /* install the SIGCHLD handler */
/* bind the address to listenfd */
if(bind(listenfd,(struct sockaddr*)&servaddr,sizeof(servaddr))==-1){
fprintf(stderr,"Bind error:%s\n\a",strerror(errno));
exit(1);
}
/* listen on listenfd */
if(listen(listenfd,5)==-1){
fprintf(stderr,"Listen error:%s\n\a",strerror(errno));
exit(1);
}
for ( ; ; ) {
clilen = sizeof(cliaddr);
/* block until a client establishes a connection */
if((connfd=accept(listenfd,(struct sockaddr*)&cliaddr,&clilen))<0){
/* When a child terminates, the sig_chld handler runs. When it
returns, the accept system call may fail with EINTR. Some kernels
restart interrupted system calls automatically; for portability
we handle EINTR ourselves. */
if(errno==EINTR)
continue;
fprintf(stderr,"Accept error:%s\n\a",strerror(errno));
exit(1);
}
/* a client has connected */
if ( (childpid = fork()) == 0) { /* child process */
close(listenfd); /* close the listening socket */
str_echo(connfd); /* serve this client's requests */
exit (0);
}
close(connfd); /* parent closes the connected socket and waits for the next connection */
}
}
void str_echo(int sockfd){
ssize_t n;
char buf[MAXLINE];
again:
while ( (n = read(sockfd, buf, MAXLINE)) > 0)
write(sockfd, buf, n);
if (n < 0 && errno == EINTR) /* interrupted: retry */
goto again;
else if (n < 0){ /* error */
fprintf(stderr,"read error:%s\n\a",strerror(errno));
exit(1);
}
}
With this change, when a client disconnects, the server's parent process receives SIGCHLD for the child and runs sig_chld, which reaps the child, so no zombie process is left behind. When a client actively closes its connection, the server prints something like:
child 12306 terminated
wait and waitpid
In the sig_chld function above we used wait() to reap the terminated child. There is a similar function, waitpid. Here are the two prototypes:
pid_t wait(int *status);
pid_t waitpid(pid_t pid, int *status, int options);
From the man page: All of these system calls are used to wait for state changes in a child of the calling process, and obtain information about the child whose state has changed. A state change is considered to be: the child terminated; the child was stopped by a signal; or the child was resumed by a signal. In the case of a terminated child, performing a wait allows the system to release the resources associated with the child; if a wait is not performed, then the terminated child remains in a "zombie" state (see NOTES below).
On the differences and the relationship between wait and waitpid:
The wait() system call suspends execution of the calling process until one of its children terminates. The call wait(&status) is equivalent to:
waitpid(-1, &status, 0);
The waitpid() system call suspends execution of the calling process until a child specified by pid argument has changed state. By default, waitpid() waits only for terminated children, but this behavior is modifiable via the options argument, as described below.
In other words, the wait() system call suspends the calling process until any one of its children terminates. Calling wait(&status) has exactly the same effect as calling waitpid(-1, &status, 0).
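A minimal sketch of that equivalence, assuming a POSIX system: the hypothetical helper child_exit_code forks a child that exits with a given code, reaps it through either call, and decodes the status with the standard WIFEXITED and WEXITSTATUS macros.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, reap it with
 * wait() or with waitpid(-1, ..., 0), and return the decoded exit status. */
int child_exit_code(int code, int use_waitpid)
{
    int status;
    pid_t child, reaped;

    child = fork();
    if (child == -1)
        return -1;
    if (child == 0)
        _exit(code);                          /* child: exit right away */

    if (use_waitpid)
        reaped = waitpid(-1, &status, 0);     /* any child, block until it exits */
    else
        reaped = wait(&status);               /* same behavior, shorter spelling */

    if (reaped != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With a single child outstanding, child_exit_code(7, 0) and child_exit_code(7, 1) should both return 7.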
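waitpid() suspends the calling process until the child specified by the pid argument changes state. By default waitpid() waits only for terminated children; if needed, the options argument can be set so that non-terminated state changes are reported as well. For example: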
The value of options is an OR of zero or more of the following constants:
WNOHANG return immediately if no child has exited.
WUNTRACED also return if a child has stopped (but not traced via ptrace(2)). Status for traced children which have stopped is provided even if this option is not specified.
WCONTINUED (since Linux 2.6.10) also return if a stopped child has been resumed by delivery of SIGCONT.
and so on, for the non-terminated states.
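Now let's look at the difference between wait() and waitpid() through an example.
We modify the client program so that it opens five socket connections to the server at once. The resulting state is shown in the figure below (code attached):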
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <strings.h>   /* bzero */
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h> /* inet_pton */
#include <sys/types.h>
#include <netdb.h>
#define SERV_PORT 1113
#define MAXLINE 1024
void str_cli(FILE *fp, int sockfd);
int
main(int argc, char **argv)
{
int i,sockfd[5];
struct sockaddr_in servaddr;
if (argc != 2){
fprintf(stderr,"usage: tcpcli <IPaddress>\n\a");
exit(0);
}
for(i=0;i<5;++i){ /* open five connections so that the server forks five children */
if((sockfd[i]=socket(AF_INET,SOCK_STREAM,0))==-1){
fprintf(stderr,"Socket error:%s\n\a",strerror(errno));
exit(1);
}
/* fill in the server's address */
bzero(&servaddr,sizeof(servaddr));
servaddr.sin_family=AF_INET;
servaddr.sin_port=htons(SERV_PORT);
if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0){
fprintf(stderr,"inet_pton Error:%s\a\n",strerror(errno));
exit(1);
}
/* connect to the server */
if(connect(sockfd[i],(struct sockaddr *)&servaddr,sizeof(servaddr))==-1){
fprintf(stderr,"connect Error:%s\a\n",strerror(errno));
exit(1);
}
}
str_cli(stdin, sockfd[0]); /* only the first socket talks to the server */
exit(0);
}
void
str_cli(FILE *fp, int sockfd)
{
int nbytes=0;
char sendline[MAXLINE],recvline[MAXLINE];
while (fgets(sendline, MAXLINE, fp) != NULL){ /* read a line from standard input */
write(sockfd, sendline, strlen(sendline)); /* send the line to the server */
if ((nbytes=read(sockfd, recvline, MAXLINE-1)) <= 0){ /* read the server's reply, leaving room for '\0' */
fprintf(stderr,"str_cli: server terminated prematurely\n");
exit(1);
}
recvline[nbytes]='\0';
fputs(recvline, stdout);
}
}
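When the client terminates, the kernel closes all of its open descriptors automatically, so the five connections are closed at essentially the same moment. Five FINs are sent to the server almost simultaneously, the five server children terminate at about the same time, and five SIGCHLD signals are delivered to the server parent almost at once, as the figure below shows:
In other words, five SIGCHLD signals reach the parent at nearly the same moment, and zombie processes appear again. Unix generally does not queue signals, so the five SIGCHLD signals may cause sig_chld to run only once, in which case the remaining four children become zombies; the exact number depends on timing. The correct approach in this situation is to call waitpid() in a loop rather than wait().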
We therefore make one small change to the signal handler in the final server code:
void
sig_chld(int signo)
{
pid_t pid;
int stat;
while ( (pid = waitpid(-1, &stat, WNOHANG)) > 0)
printf("child %d terminated\n", pid);
return;
}
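The WNOHANG flag matters because the loop must stop as soon as no terminated child is left, even though other children may still be running. A minimal sketch of that behavior, assuming a POSIX system (wnohang_does_not_block and the gating pipe are illustrative, not taken from the server code):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if waitpid(..., WNOHANG) returned 0 while the
 * child was still running and the child was reaped normally afterwards. */
int wnohang_does_not_block(void)
{
    int gate[2], status;
    char c;
    pid_t child, r;

    if (pipe(gate) == -1)
        return -1;
    child = fork();
    if (child == -1)
        return -1;
    if (child == 0) {
        close(gate[1]);
        if (read(gate[0], &c, 1) < 0)    /* blocks until the parent closes the pipe */
            _exit(2);
        _exit(0);
    }
    close(gate[0]);

    /* The child is alive and has not changed state: WNOHANG returns 0 at once. */
    r = waitpid(-1, &status, WNOHANG);

    close(gate[1]);                      /* EOF on the pipe lets the child exit */
    if (waitpid(child, &status, 0) != child)
        return -1;
    return r == 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Without WNOHANG, the last iteration of the handler's loop would block inside the signal handler until some other child terminated, which would stall the server's accept loop.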
At this point we have handled three situations that can arise in network programming:
1. When forking child processes, SIGCHLD must be caught. Code: signal(SIGCHLD,sig_chld);
2. When catching signals, interrupted system calls must be handled. Code: if(errno==EINTR) continue;
3. The SIGCHLD handler must be written correctly to prevent zombie processes. Code: while ( (pid = waitpid(-1, &stat, WNOHANG)) > 0)