fork & Parallel::ForkManager

fork

How to use the built-in fork() function.

use strict;
use warnings;

my $pid = fork();
# fork() returns undef on failure; check this first, because undef == 0 is true
die "fork error: $!" unless defined $pid;

if ($pid) {
    # parent: fork() returned the child's PID
    print "here is parent process.\n";
} else {
    # child: fork() returned 0
    sleep(3);
    print "child process ends.\n";
    exit 0;
}
print "parent process ends.\n";
kotaro@script$ perl fork.pl
here is parent process.
parent process ends.
kotaro@script$ child process ends.
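
In the run above the shell prompt comes back before the child prints its message, because the parent never waits for the child. A minimal sketch of waiting with waitpid() follows. The helper name run_child_and_wait is made up for this example, and the child does nothing except exit with the given code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script):
# fork a child that exits with $code, wait for it, and return its exit code.
sub run_child_and_wait {
    my ($code) = @_;
    my $pid = fork();
    die "fork error: $!" unless defined $pid;
    if ($pid == 0) {
        exit $code;        # child: exit immediately with the requested code
    }
    waitpid($pid, 0);      # parent: block until this child has ended
    return $? >> 8;        # the high byte of $? holds the child's exit code
}

print "child exit code: ", run_child_and_wait(3), "\n";
```

With the waitpid() call in place, the parent cannot reach its last line until the child has finished.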

Parallel::ForkManager

That said, using Parallel::ForkManager seems to be the usual approach these days.
I wrote the code below by copying almost verbatim from
http://perldoc.jp/docs/modules/Parallel-ForkManager-0.7.5/ForkManager.pod

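#!/usr/bin/perl
use strict;
use Parallel::ForkManager;

# Allow up to 10 child processes at a time
my $pm = Parallel::ForkManager->new(10);

# Called in the parent each time a child is started
$pm->run_on_start(
    sub {
        my ($pid, $ident) = @_;
        print "** $ident started, pid: $pid\n";
    }
);

# Called in the parent each time a child is reaped
$pm->run_on_finish(
    sub {
        my ($pid, $exit_code, $ident) = @_;
        print "** $ident just got out of the pool ".
                "with PID $pid and exit code: $exit_code\n";
    }
);

for (1..10) {
    # start() returns the child's PID in the parent and 0 in the child
    $pm->start and next;
    # child process ($ident is empty above because start() got no identifier)
    sleep(3);
    $pm->finish;
}
print "waiting child process..\n";
$pm->wait_all_children;
print "end\n";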
**  started, pid: 4105
**  started, pid: 4106
**  started, pid: 4107
**  started, pid: 4108
**  started, pid: 4109
**  started, pid: 4110
**  started, pid: 4111
**  started, pid: 4112
**  started, pid: 4113
**  started, pid: 4114
waiting child process..
**  just got out of the pool with PID 4105 and exit code: 0
**  just got out of the pool with PID 4106 and exit code: 0
**  just got out of the pool with PID 4107 and exit code: 0
**  just got out of the pool with PID 4108 and exit code: 0
**  just got out of the pool with PID 4109 and exit code: 0
**  just got out of the pool with PID 4110 and exit code: 0
**  just got out of the pool with PID 4111 and exit code: 0
**  just got out of the pool with PID 4112 and exit code: 0
**  just got out of the pool with PID 4113 and exit code: 0
**  just got out of the pool with PID 4114 and exit code: 0
end
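
The identifier is blank in the output above because start() was called without an argument. A minimal sketch of labeling each child by passing an identifier to start() follows. The helper run_labeled and the labels "job-a" and "job-b" are made up for this example.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical helper (not from the original post):
# start one child per label and return the labels reported to run_on_finish.
sub run_labeled {
    my (@labels) = @_;
    my @finished;
    my $pm = Parallel::ForkManager->new(scalar @labels);
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident) = @_;
        push @finished, $ident;          # runs in the parent
    });
    for my $label (@labels) {
        $pm->start($label) and next;     # the label becomes $ident in the callbacks
        $pm->finish;                     # child: exit right away with code 0
    }
    $pm->wait_all_children;
    my @sorted = sort @finished;         # children may finish in any order
    return @sorted;
}

print join(", ", run_labeled("job-a", "job-b")), "\n";
```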

Digging a little deeper

use Parallel::ForkManager;
my $MAX_PROCESSES = 5;

my $pm = Parallel::ForkManager->new($MAX_PROCESSES);

my $who_am_i = 'parent';
for my $i (1..10) {
    $pm->start and next;
    # child: sees a copy of the parent's variables
    print "$i: $MAX_PROCESSES\n";
    $who_am_i = 'child';    # changes only this child's copy
    $pm->finish;
}
$pm->wait_all_children;
print "who_am_i: $who_am_i\n";
1: 5
2: 5
3: 5
4: 5
5: 5
6: 5
7: 5
8: 5
9: 5
10: 5
who_am_i: parent

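A child process is created at the moment $pm->start is called, and it appears to get a complete copy of the parent's variables as of that point. Naturally, variables are not shared between processes, so $who_am_i in the parent is still 'parent' at the end.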
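
The same copy behavior can be seen with plain fork(). The sketch below uses only core Perl. The helper copy_demo is made up for this example, and the pipe is there only so the parent can report what the child saw.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post):
# the child changes its copy of $value and sends it back through a pipe;
# the parent returns its own (unchanged) value and the child's value.
sub copy_demo {
    my $value = 'parent';
    pipe(my $reader, my $writer) or die "pipe error: $!";
    my $pid = fork();
    die "fork error: $!" unless defined $pid;
    if ($pid == 0) {
        close $reader;
        $value = 'child';            # changes only the child's copy
        print {$writer} $value;
        close $writer;
        exit 0;
    }
    close $writer;
    my $seen_in_child = do { local $/; <$reader> };   # read everything the child wrote
    close $reader;
    waitpid($pid, 0);
    return ($value, $seen_in_child);
}

my ($in_parent, $in_child) = copy_demo();
print "parent sees: $in_parent\n";   # parent sees: parent
print "child saw:   $in_child\n";    # child saw:   child
```

Anything the parent needs from a child has to travel over some channel such as a pipe, a file, or the exit code.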

Digging further

What happens if a child dies before calling $pm->finish?

use Parallel::ForkManager;
my $MAX_PROCESSES = 5;

my $pm = Parallel::ForkManager->new($MAX_PROCESSES);
for my $i (1..10) {
    $pm->start and next;
    die "child($i) die.\n";   # the child dies here
    $pm->finish;              # never reached
}
$pm->wait_all_children;
print "parent finish.\n";
child(1) die.
child(2) die.
child(3) die.
child(4) die.
child(5) die.
child(6) die.
child(7) die.
child(8) die.
child(9) die.
child(10) die.
parent finish.

Even when the child processes die, the parent runs through to the end.
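
A die in a child ends only that child, and the parent sees a non-zero exit status. A minimal sketch with plain fork() follows. The helper exit_code_of_dying_child is made up for this example, and the exact non-zero value is deliberately not assumed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post):
# fork a child that dies at once and return the exit code the parent observes.
sub exit_code_of_dying_child {
    my $pid = fork();
    die "fork error: $!" unless defined $pid;
    if ($pid == 0) {
        die "child die.\n";    # terminates only the child process
    }
    waitpid($pid, 0);          # the parent keeps running and reaps the child
    return $? >> 8;
}

my $code = exit_code_of_dying_child();
print "parent still running, child exit code is ",
      ($code != 0 ? "non-zero" : "zero"), "\n";
```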

Digging deeper still

use Parallel::ForkManager;
my $MAX_PROCESSES = 5;

my $pm = Parallel::ForkManager->new($MAX_PROCESSES);
eval {
    for my $i (1..10) {
        $pm->start and next;
        die "child($i) die.\n";   # caught by the child's copy of the eval
        $pm->finish;              # never reached
        print "never executed!";
    }

    $pm->wait_all_children;
    print "parent finish.\n";
};
if ($@) {
    # reached only in the children; the parent's eval succeeds
    print "trap\n";
}
trap
trap
trap
trap
trap
trap
trap
trap
trap
trap
parent finish.

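If you think of each child as getting a copy of the parent's memory as of the moment it is spawned, this behavior makes sense: the child is already inside the eval when it starts, so its die is caught by its own copy of the eval and it prints "trap". Still, the proper approach is to eval errors inside the child and, where needed, exit with a status that run_on_finish can pick up.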
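
A minimal sketch of that approach follows. The helper run_jobs, the job names "good" and "bad", and the choice of exit code 1 for a failed job are all made up for this example. Only start, finish, run_on_finish and wait_all_children come from Parallel::ForkManager.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical helper (not from the original post):
# run each job (a code ref that may die) in its own child and
# return a hash of job name => exit code as seen by the parent.
sub run_jobs {
    my (%jobs) = @_;
    my %exit_code_of;
    my $pm = Parallel::ForkManager->new(5);
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident) = @_;
        $exit_code_of{$ident} = $exit_code;    # runs in the parent
    });
    for my $name (sort keys %jobs) {
        $pm->start($name) and next;            # parent: move on to the next job
        # child from here on: trap the error locally
        my $ok = eval { $jobs{$name}->(); 1 };
        warn "job $name failed: $@" unless $ok;
        $pm->finish($ok ? 0 : 1);              # exit code is handed to run_on_finish
    }
    $pm->wait_all_children;
    return %exit_code_of;
}

my %result = run_jobs(
    good => sub { return 1 },
    bad  => sub { die "something went wrong\n" },
);
print "$_: exit code $result{$_}\n" for sort keys %result;
```

Because every child always reaches finish(), none of them falls through into the parent's code after the loop, and the parent can decide what to do about failures in one place.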