Leela Zero has been released

2017/11/15 (updated 2018/03/20) | Apps

The author of Leela has released Leela Zero. The source code is available on GitHub. It was developed on the basis of the AlphaGo Zero paper.

This is a fairly faithful reimplementation of the system described in the Alpha Go Zero paper “Mastering the Game of Go without Human Knowledge”.
gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper.

Files needed to run it

Running Leela Zero requires the following three files.

The Leela Zero engine itself (Releases · gcp/leela-zero)
The weight file (https://sjeng.org/zero/best_v1.txt.zip)
A display and control program (GUI) that speaks the GTP protocol (Sabaki: http://sabaki.yichuanshen.de/)

The location of each file is given at gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper.

Leela Zero command-line arguments

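It appears to accept the arguments listed below. For games between two Leela instances, it is better to add --noponder (no pondering, i.e. no thinking on the opponent's time). On a low-powered PC, --playouts may also be needed.
As the help text indicates, --gtp and -w are required.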

D:\Leela-Zero\leela-zero-0.3-windows>leelaz.exe -h
Leela Zero  Copyright (C) 2017  Gian-Carlo Pascutto
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see the COPYING file for details.

Allowed options:
  -h [ --help ]                 Show commandline options.
  -g [ --gtp ]                  Enable GTP mode.
  -t [ --threads ] arg (=2)     Number of threads to use.
  -p [ --playouts ] arg         Weaken engine by limiting the number of
                                playouts. Requires --noponder.
  -b [ --lagbuffer ] arg (=100) Safety margin for time usage in centiseconds.
  -r [ --resignpct ] arg (=10)  Resign when winrate is less than x%.
  -m [ --randomcnt ] arg (=0)   Play more randomly the first x moves.
  -n [ --noise ]                Enable policy network randomization.
  -w [ --weights ] arg          File with network weights.
  -l [ --logfile ] arg          File to log input/output to.
  -q [ --quiet ]                Disable all diagnostic output.
  --noponder                    Disable thinking on opponent's time.
  --gpu arg                     ID of the OpenCL device(s) to use (disables
                                autodetection).
  --rowtiles arg (=5)           Split up the board in # tiles.
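
As a rough illustration of how the options above combine, here is a minimal Python sketch that assembles an argument list for leelaz. It uses only flags that appear in the help output; the helper name `build_leelaz_command` and the file names in the usage example are hypothetical, chosen for this example only.

```python
def build_leelaz_command(exe, weights, playouts=None, threads=None, noponder=True):
    """Build an argument list for leelaz from flags shown in its help text.

    `exe` and `weights` are paths supplied by the caller (hypothetical here).
    """
    # --gtp and --weights are the two options the article treats as required.
    cmd = [exe, "--gtp", "--weights", weights]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if playouts is not None:
        # The help text states that --playouts requires --noponder.
        noponder = True
        cmd += ["--playouts", str(playouts)]
    if noponder:
        cmd.append("--noponder")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; adjust to wherever the files were unpacked.
    print(" ".join(build_leelaz_command("leelaz.exe", "best_v1.txt",
                                        playouts=500, threads=4)))
```

The resulting list can be passed to `subprocess.Popen`, or joined into a single line and pasted into a GUI's engine settings.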

Sabaki settings

I opened Engines → Manage Engines, clicked Add, and added Leela Zero. Because I also want Leela Zero to play against itself, I set --noponder as well.

After that, choose File → New, set the players, and you can start a game.
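
A GUI such as Sabaki talks to the engine over GTP, a plain-text protocol: the GUI writes a command such as `genmove b` to the engine's standard input, and the engine answers with a line starting with `=` (success) or `?` (failure), optionally followed by a command id, and ending with a blank line. The following is a minimal sketch of parsing such a reply; the function name `parse_gtp_response` is hypothetical and this is not Sabaki's actual code.

```python
import re


def parse_gtp_response(raw):
    """Split a raw GTP reply into (ok, command_id, body).

    ok is True for '=' replies and False for '?' replies;
    command_id is an int, or None when the reply carries no id.
    """
    text = raw.strip()
    match = re.match(r"([=?])(\d*)\s*(.*)", text, re.S)
    if match is None:
        raise ValueError("not a GTP response: %r" % raw)
    status, id_digits, body = match.groups()
    command_id = int(id_digits) if id_digits else None
    return status == "=", command_id, body.strip()


if __name__ == "__main__":
    # Example replies as an engine might send them after "genmove b".
    print(parse_gtp_response("= Q16\n\n"))
    print(parse_gtp_response("?3 unknown command\n\n"))
```

In a real session the raw text would come from the engine's standard output, read up to the blank line that ends each reply.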

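My PC has an Intel Core i7 4790 @ 3.6GHz with 4 cores, so I also ran it with --threads 4, but each move took about 50 seconds. It has no dedicated GPU (only the CPU's integrated GPU), so that is probably to be expected. Reducing playouts shortened the thinking time. With -p 500 it plays briskly, only occasionally stopping to think for a long time. Even so, it is stronger than I am (or rather, I am too weak for the comparison to mean anything).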

Buying a dedicated GPU would presumably speed things up, but the GPUs the author recommends are fairly expensive.

Why the source code was published

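Obtaining the same training result (weights) as AlphaGo Zero would reportedly take about 1700 years on commodity hardware. Without an organization with funding on the scale of Google, it cannot be reproduced. By publishing this program, the author apparently hopes to turn this into a collaborative effort. Details are to be announced later.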

Gimme the weights
Recomputing the AlphaGo Zero weights will take about 1700 years on commodity hardware, see for example: http://computer-go.org/pipermail/computer-go/2017-October/010307.html
One reason for publishing this program is that we are setting up a public, distributed effort to repeat the work. Working together, and especially when starting on a smaller scale, it will take less than 1700 years to get a good network (which you can feed into this program, suddenly making it strong). Further details about this will be announced soon.
gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper.

The documentation above also describes how to train a weight file yourself, so if you have a good GPU it may be worth trying.


Posted by ず