One Way to Implement AOP in golang

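The title is honestly a bit of an exaggeration. I never set out to implement AOP for golang. My starting point was simply APM (Application Performance Management). We keep saying that golang is such a wonderfully handy language, yet even plain AOP is hard to implement elegantly in it. Take opentracing: it defines a whole API for distributed tracing across Java, golang, and other languages, but people who only use Java may not need opentracing at all. Java bytecode instrumentation is so convenient that there is no need to call the opentracing API explicitly all over the code. So does golang have anything like Java's bytecode instrumentation? Of course not.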

Let's look at how Go code becomes machine code.

How a Go Program Compiles down to Machine Code

The gist is:

*.go -> AST(Abstract Syntax Tree) -> SSA(Static Single Assignment) -> machine-specific SSA -> Machine Code

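Clearly, somewhere between the AST and SSA we can modify the syntax tree and get an effect similar to Java bytecode instrumentation. That means changing the golang compiler.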

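Speaking of APM: when I surveyed APM products two years ago, I looked at OneAPM. According to their salesperson at the time, nobody offered an APM product for golang. Since then OneAPM has added golang support. Judging from the installation manual of its golang agent, OneAPM appears to use an approach similar to jaeger, which requires manual instrumentation.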
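To make "manual instrumentation" concrete, here is a minimal sketch of what it costs in code. The span type, startSpan, and finish are hypothetical names made up for this illustration; they are not the opentracing or jaeger API.

```go
package main

import "fmt"

// span is a hypothetical stand-in for a tracing span.
type span struct {
	name string
}

// events records what the sketch "reports"; a real agent would ship
// this data to a collector instead.
var events []string

// startSpan marks the beginning of an operation.
func startSpan(name string) *span {
	events = append(events, "start "+name)
	return &span{name: name}
}

// finish marks the end of the operation.
func (s *span) finish() {
	events = append(events, "stop "+s.name)
}

// hello has to call the tracing helpers by hand. Every traced
// function needs these same two extra lines.
func hello() {
	sp := startSpan("hello")
	defer sp.finish()
	events = append(events, "hello world")
}

func main() {
	hello()
	for _, e := range events {
		fmt.Println(e)
	}
}
```

The two bookkeeping lines at the top of hello are exactly what an AOP-style approach would like to generate automatically.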

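Preparing the development environment

To do a good job, one must first sharpen one's tools. I admire the experts who write golang in vim and emacs (although I write golang in emacs myself), but for a project as large as the golang source tree I chose GoLand. Because the work involves switching between many environment variables, I recommend direnv to make switching configurations easy.

Getting the source code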

mkdir -p $HOME/src/github.com/golang
cd $HOME/src/github.com/golang
git clone https://github.com/golang/go.git
cd $HOME/src/github.com/golang/go/src
echo 'export GOROOT=$HOME/src/github.com/golang/go
export PATH=$GOROOT/bin:$PATH
export GOBIN=$GOROOT/bin' > .envrc
direnv allow

Install the BOOTSTRAP environment (the Go compiler is built with Go, just as gcc is built with gcc)

mkdir $HOME/gos
cd $HOME/gos
curl https://dl.google.com/go/go1.12.10.darwin-amd64.tar.gz | tar xvzf -
mv go go1.12.10
cd $HOME/src/github.com/golang/go/src
echo 'export GOROOT_BOOTSTRAP=$HOME/gos/go1.12.10' >> .envrc
direnv allow

Switch to the latest tag and create a branch

cd $HOME/src/github.com/golang/go
git checkout go1.13.1
git checkout -b go1.13.1-playground

First attempt at building the Go compiler

cd $HOME/src/github.com/golang/go/src
./make.bash

You will see output like the following

Building Go cmd/dist using /Users/shane/gos/go1.12.10.
Building Go toolchain1 using /Users/shane/gos/go1.12.10.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for darwin/amd64.
---
Installed Go for darwin/amd64 in /Users/shane/src/github.com/golang/go
Installed commands in /Users/shane/src/github.com/golang/go/bin

Verify the build

which go
go version

GoLand setup

Create the project

File -> Open, then select the project root directory

[Screenshot: 66475611-3a39e980-ea83-11e9-802e-3b118d1ac906.png]

Set GOROOT

$HOME/src/github.com/golang/go

[Screenshot: 66475821-a4528e80-ea83-11e9-970b-8982bafbab77.png]

Create a playground project for test builds

Clone my test project

mkdir -p $HOME/src/github.com/shanexu
cd $HOME/src/github.com/shanexu
git clone https://github.com/shanexu/go-playground.git

Configure the environment variables

cd $HOME/src/github.com/shanexu/go-playground
echo 'export GOROOT=$HOME/src/github.com/golang/go
export PATH=$GOROOT/bin:$PATH
export GOBIN=$(pwd)/bin' > .envrc
direnv allow

That completes the development environment setup.

Analyzing the go build process

cd $HOME/src/github.com/shanexu/go-playground
go build -o bin/helloworld helloworld/main.go

Start with the go build command. The go command is itself an entry point for many subcommands, and build, the one we are studying, is one of them. Its source is in src/cmd/go/internal/work/build.go.

var CmdBuild = &base.Command{
  UsageLine: "go build [-o output] [-i] [build flags] [packages]",
  Short:     "compile packages and dependencies",
  // ... (remaining fields omitted)
}

Set up a run configuration as shown in the screenshot: 66554844-37033400-eb3d-11e9-8558-f42b8458b1e7.png

After stepping through with breakpoints and reading the code, the go build process looks roughly like this:

digraph G {
    "main.main at main.go" -> "cmd/go/internal/work.runBuild at build.go"
    "cmd/go/internal/work.runBuild at build.go" -> "cmd/go/internal/work.(*Builder).Do at exec.go"
    "cmd/go/internal/work.(*Builder).Do at exec.go" -> "writeActionGraph"
    "cmd/go/internal/work.(*Builder).Do at exec.go" -> "cmd/go/internal/work.(*Builder).Do.func3 at exec.go:177 handle(a)"
    "cmd/go/internal/work.(*Builder).Do.func3 at exec.go:177 handle(a)" -> "cmd/go/internal/work.(*Builder).Do.func2 at exec.go:117 err = a.Func(b, a)"
    "cmd/go/internal/work.(*Builder).Do.func2 at exec.go:117 err = a.Func(b, a)" -> "cmd/go/internal/work.(*Builder).build at exec.go:380"
    "cmd/go/internal/work.(*Builder).Do.func2 at exec.go:117 err = a.Func(b, a)" -> "cmd/go/internal/work.(*Builder).link at exec.go:1183"
    "cmd/go/internal/work.(*Builder).Do.func2 at exec.go:117 err = a.Func(b, a)" -> "cmd/go/internal/work.BuildInstallFunc at exec.go:1438"
}

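main calls runBuild, and runBuild calls (*Builder).Do. Inside Do, each Action's Func is invoked according to the dependencies among the actions. There is also a writeActionGraph function that dumps the action graph, but it is controlled by a command-line flag, -debug-actiongraph.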
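The way Do respects dependencies can be pictured with a small sketch. The action type and the do function below are hypothetical simplifications, not cmd/go code: as far as I can tell the real Builder.Do hands ready actions to worker goroutines, whereas this sketch simply walks the graph depth-first.

```go
package main

import "fmt"

// action is a hypothetical, much simplified stand-in for cmd/go's Action.
type action struct {
	name string
	deps []*action
	run  func() // plays the role of Action.Func
}

// do runs every dependency of root before root itself, each action
// exactly once, and returns the order in which actions completed.
func do(root *action) []string {
	var order []string
	done := map[*action]bool{}
	var visit func(a *action)
	visit = func(a *action) {
		if done[a] {
			return
		}
		done[a] = true
		for _, d := range a.deps {
			visit(d)
		}
		if a.run != nil {
			a.run()
		}
		order = append(order, a.name)
	}
	visit(root)
	return order
}

func main() {
	rt := &action{name: "build runtime"}
	fmtPkg := &action{name: "build fmt", deps: []*action{rt}}
	mainPkg := &action{name: "build command-line-arguments", deps: []*action{fmtPkg, rt}}
	link := &action{name: "link command-line-arguments", deps: []*action{mainPkg, fmtPkg, rt}}
	for _, name := range do(link) {
		fmt.Println(name)
	}
}
```

Running it lists the runtime build first and the link step last, which is the same constraint the real action graph expresses.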

  // Undocumented, unstable debugging flags.
  cmd.Flag.StringVar(&cfg.DebugActiongraph, "debug-actiongraph", "", "")

The full command line is shown below; -p 1 sets the concurrency for running actions to 1.

go build -debug-actiongraph /tmp/build.txt -p 1 -v -o bin/helloworld helloworld/main.go

The complete actionGraph is written to /tmp/build.txt. Converted to a graph, it looks like this:

digraph G {
"0 link-install command-line-arguments" -> "1 link command-line-arguments"
"1 link command-line-arguments" -> "2 build command-line-arguments"
"1 link command-line-arguments" -> "3 build context"
"1 link command-line-arguments" -> "4 build fmt"
"1 link command-line-arguments" -> "5 build runtime"
"1 link command-line-arguments" -> "6 build errors"
"1 link command-line-arguments" -> "7 build internal/reflectlite"
"1 link command-line-arguments" -> "8 build sync"
"1 link command-line-arguments" -> "9 build time"
"1 link command-line-arguments" -> "10 build internal/fmtsort"
"1 link command-line-arguments" -> "11 build io"
"1 link command-line-arguments" -> "12 build math"
"1 link command-line-arguments" -> "13 build os"
"1 link command-line-arguments" -> "14 build reflect"
"1 link command-line-arguments" -> "15 build strconv"
"1 link command-line-arguments" -> "16 build unicode/utf8"
"1 link command-line-arguments" -> "17 build internal/bytealg"
"1 link command-line-arguments" -> "18 build internal/cpu"
"1 link command-line-arguments" -> "19 build runtime/internal/atomic"
"1 link command-line-arguments" -> "20 build runtime/internal/math"
"1 link command-line-arguments" -> "21 build runtime/internal/sys"
"1 link command-line-arguments" -> "22 build internal/race"
"1 link command-line-arguments" -> "23 build sync/atomic"
"1 link command-line-arguments" -> "24 build syscall"
"1 link command-line-arguments" -> "25 build sort"
"1 link command-line-arguments" -> "26 build math/bits"
"1 link command-line-arguments" -> "27 build internal/oserror"
"1 link command-line-arguments" -> "28 build internal/poll"
"1 link command-line-arguments" -> "29 build internal/syscall/unix"
"1 link command-line-arguments" -> "30 build internal/testlog"
"1 link command-line-arguments" -> "31 build unicode"
"2 build command-line-arguments" -> "3 build context"
"2 build command-line-arguments" -> "4 build fmt"
"2 build command-line-arguments" -> "5 build runtime"
"2 build command-line-arguments" -> "32 nop "
"3 build context" -> "6 build errors"
"3 build context" -> "7 build internal/reflectlite"
"3 build context" -> "8 build sync"
"3 build context" -> "9 build time"
"4 build fmt" -> "6 build errors"
"4 build fmt" -> "10 build internal/fmtsort"
"4 build fmt" -> "11 build io"
"4 build fmt" -> "12 build math"
"4 build fmt" -> "13 build os"
"4 build fmt" -> "14 build reflect"
"4 build fmt" -> "15 build strconv"
"4 build fmt" -> "8 build sync"
"4 build fmt" -> "16 build unicode/utf8"
"5 build runtime" -> "17 build internal/bytealg"
"5 build runtime" -> "18 build internal/cpu"
"5 build runtime" -> "19 build runtime/internal/atomic"
"5 build runtime" -> "20 build runtime/internal/math"
"5 build runtime" -> "21 build runtime/internal/sys"
"5 build runtime" -> "33 built-in package unsafe"
"6 build errors" -> "7 build internal/reflectlite"
"7 build internal/reflectlite" -> "5 build runtime"
"7 build internal/reflectlite" -> "33 built-in package unsafe"
"8 build sync" -> "22 build internal/race"
"8 build sync" -> "5 build runtime"
"8 build sync" -> "23 build sync/atomic"
"8 build sync" -> "33 built-in package unsafe"
"9 build time" -> "6 build errors"
"9 build time" -> "5 build runtime"
"9 build time" -> "8 build sync"
"9 build time" -> "24 build syscall"
"9 build time" -> "33 built-in package unsafe"
"10 build internal/fmtsort" -> "14 build reflect"
"10 build internal/fmtsort" -> "25 build sort"
"11 build io" -> "6 build errors"
"11 build io" -> "8 build sync"
"11 build io" -> "23 build sync/atomic"
"12 build math" -> "18 build internal/cpu"
"12 build math" -> "26 build math/bits"
"12 build math" -> "33 built-in package unsafe"
"13 build os" -> "6 build errors"
"13 build os" -> "27 build internal/oserror"
"13 build os" -> "28 build internal/poll"
"13 build os" -> "29 build internal/syscall/unix"
"13 build os" -> "30 build internal/testlog"
"13 build os" -> "11 build io"
"13 build os" -> "5 build runtime"
"13 build os" -> "8 build sync"
"13 build os" -> "23 build sync/atomic"
"13 build os" -> "24 build syscall"
"13 build os" -> "9 build time"
"13 build os" -> "33 built-in package unsafe"
"14 build reflect" -> "12 build math"
"14 build reflect" -> "5 build runtime"
"14 build reflect" -> "15 build strconv"
"14 build reflect" -> "8 build sync"
"14 build reflect" -> "31 build unicode"
"14 build reflect" -> "16 build unicode/utf8"
"14 build reflect" -> "33 built-in package unsafe"
"15 build strconv" -> "6 build errors"
"15 build strconv" -> "17 build internal/bytealg"
"15 build strconv" -> "12 build math"
"15 build strconv" -> "26 build math/bits"
"15 build strconv" -> "16 build unicode/utf8"
"17 build internal/bytealg" -> "18 build internal/cpu"
"17 build internal/bytealg" -> "33 built-in package unsafe"
"19 build runtime/internal/atomic" -> "33 built-in package unsafe"
"20 build runtime/internal/math" -> "21 build runtime/internal/sys"
"22 build internal/race" -> "33 built-in package unsafe"
"23 build sync/atomic" -> "33 built-in package unsafe"
"24 build syscall" -> "6 build errors"
"24 build syscall" -> "17 build internal/bytealg"
"24 build syscall" -> "27 build internal/oserror"
"24 build syscall" -> "22 build internal/race"
"24 build syscall" -> "5 build runtime"
"24 build syscall" -> "8 build sync"
"24 build syscall" -> "33 built-in package unsafe"
"25 build sort" -> "7 build internal/reflectlite"
"26 build math/bits" -> "33 built-in package unsafe"
"27 build internal/oserror" -> "6 build errors"
"28 build internal/poll" -> "6 build errors"
"28 build internal/poll" -> "11 build io"
"28 build internal/poll" -> "5 build runtime"
"28 build internal/poll" -> "8 build sync"
"28 build internal/poll" -> "23 build sync/atomic"
"28 build internal/poll" -> "24 build syscall"
"28 build internal/poll" -> "9 build time"
"28 build internal/poll" -> "33 built-in package unsafe"
"29 build internal/syscall/unix" -> "24 build syscall"
"29 build internal/syscall/unix" -> "33 built-in package unsafe"
"30 build internal/testlog" -> "23 build sync/atomic"
"32 nop " -> "3 build context"
"32 nop " -> "4 build fmt"
"32 nop " -> "5 build runtime"
"32 nop " -> "6 build errors"
"32 nop " -> "7 build internal/reflectlite"
"32 nop " -> "8 build sync"
"32 nop " -> "9 build time"
"32 nop " -> "10 build internal/fmtsort"
"32 nop " -> "11 build io"
"32 nop " -> "12 build math"
"32 nop " -> "13 build os"
"32 nop " -> "14 build reflect"
"32 nop " -> "15 build strconv"
"32 nop " -> "16 build unicode/utf8"
"32 nop " -> "17 build internal/bytealg"
"32 nop " -> "18 build internal/cpu"
"32 nop " -> "19 build runtime/internal/atomic"
"32 nop " -> "20 build runtime/internal/math"
"32 nop " -> "21 build runtime/internal/sys"
"32 nop " -> "22 build internal/race"
"32 nop " -> "23 build sync/atomic"
"32 nop " -> "24 build syscall"
"32 nop " -> "25 build sort"
"32 nop " -> "26 build math/bits"
"32 nop " -> "27 build internal/oserror"
"32 nop " -> "28 build internal/poll"
"32 nop " -> "29 build internal/syscall/unix"
"32 nop " -> "30 build internal/testlog"
"32 nop " -> "31 build unicode"
}
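Turning the dump into the DOT text above takes only a few lines. A minimal sketch, assuming the dump is a JSON array whose entries carry ID, Mode, Package, and Deps fields (field names assumed from the go1.13 output; real entries contain more fields, which this sketch ignores):

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// actionJSON lists only the fields this sketch needs. The names are an
// assumption; check them against your own dump.
type actionJSON struct {
	ID      int
	Mode    string
	Package string
	Deps    []int
}

// toDot renders the action list as a Graphviz digraph with one edge
// per dependency. Node labels are "<ID> <Mode> <Package>".
func toDot(data []byte) (string, error) {
	var actions []actionJSON
	if err := json.Unmarshal(data, &actions); err != nil {
		return "", err
	}
	label := map[int]string{}
	for _, a := range actions {
		label[a.ID] = fmt.Sprintf("%d %s %s", a.ID, a.Mode, a.Package)
	}
	var sb strings.Builder
	sb.WriteString("digraph G {\n")
	for _, a := range actions {
		for _, dep := range a.Deps {
			fmt.Fprintf(&sb, "%q -> %q\n", label[a.ID], label[dep])
		}
	}
	sb.WriteString("}\n")
	return sb.String(), nil
}

func main() {
	// A hand-written three-node sample, not real build output.
	sample := []byte(`[
  {"ID": 0, "Mode": "link-install", "Package": "command-line-arguments", "Deps": [1]},
  {"ID": 1, "Mode": "link", "Package": "command-line-arguments", "Deps": [2]},
  {"ID": 2, "Mode": "build", "Package": "command-line-arguments"}
]`)
	dot, err := toDot(sample)
	if err != nil {
		panic(err)
	}
	fmt.Print(dot)
}
```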

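Look at the contents of actions[2].Cmd. It shows that go build actually invokes the compiler command (compile) for the target operating system (OS) and architecture (ARCH) to compile the source code.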

[
  "/Users/shane/src/github.com/golang/go/pkg/tool/darwin_amd64/compile -o /var/folders/8x/6h3nms2s34z7vwk5blbsz3100000gn/T/go-build730813966/b001/_pkg_.a -trimpath \"/var/folders/8x/6h3nms2s34z7vwk5blbsz3100000gn/T/go-build730813966/b001=>\" -p main -lang=go1.13 -complete -buildid z5Cb5jRJruTRtEF3nuzz/z5Cb5jRJruTRtEF3nuzz -goversion go1.13.1 -D _/Users/shane/src/github.com/shanexu/go-playground/helloworld -importcfg /var/folders/8x/6h3nms2s34z7vwk5blbsz3100000gn/T/go-build730813966/b001/importcfg -pack -c=12 /Users/shane/src/github.com/shanexu/go-playground/helloworld/main.go /var/folders/8x/6h3nms2s34z7vwk5blbsz3100000gn/T/go-build730813966/b001/_gomod_.go"
]

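Two files on this command line caught my interest: importcfg and _gomod_.go.

However, these files are deleted once go build finishes. To keep them around, I added two conditional breakpoints while go build was running, at lines 117 and 119 of cmd/go/internal/work/exec.go, with the condition a.json.ID == 2. The action with ID 2 is exactly the compilation of main.go.

Line 117 starts executing the Action; at line 119 the Action has finished.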

109:   // Handle runs a single action and takes care of triggering
110:   // any actions that are runnable as a result.
111:   handle := func(a *Action) {
112:     if a.json != nil {
113:       a.json.TimeStart = time.Now()
114:     }
115:     var err error
116:     if a.Func != nil && (!a.Failed || a.IgnoreFail) {
117:       err = a.Func(b, a)
118:     }
119:     if a.json != nil {
120:       a.json.TimeDone = time.Now()
121:     }
122: 
123:     // The actions run in parallel but all the updates to the
124:     // shared work state are serialized through b.exec.

Once execution reaches line 119, the file contents can be read.

_gomod_.go

package main
import _ "unsafe"
//go:linkname __debug_modinfo__ runtime.modinfo
var __debug_modinfo__ = "0w\xaf\f\x92t\b\x02A\xe1\xc1\a\xe6\xd6\x18\xe6path\tcommand-line-arguments\nmod\tgithub.com/shanexu/go-playground\t(devel)\t\n\xf92C1\x86\x18 r\x00\x82B\x10A\x16\xd8\xf2"

importcfg

# import config
packagefile context=/Users/shane/src/github.com/golang/go/pkg/darwin_amd64/context.a
packagefile fmt=/Users/shane/src/github.com/golang/go/pkg/darwin_amd64/fmt.a
packagefile runtime=/Users/shane/src/github.com/golang/go/pkg/darwin_amd64/runtime.a  

With these two files and the command-line arguments, we can run the compile command by hand. Add a new run configuration in GoLand.

[Screenshot: 66623436-39b46680-ebdb-11e9-8591-bd88617988ea.png]

Fill in Program arguments with the following value.

-o /tmp/test/_pkg_.a -trimpath "/tmp/test=>" -p main -complete -buildid dcQ8aaV0cfiucttoOzOD/dcQ8aaV0cfiucttoOzOD -D /Users/shane/src/github.com/shanexu/go-playground -importcfg /tmp/test/importcfg -pack -c=12 /Users/shane/src/github.com/shanexu/go-playground/helloworld/main.go /tmp/test/_gomod_.go

Now we can move on to the next stage, the analysis of the compile process.

Analyzing the compile process

Start with the entry file, cmd/compile/main.go.

40: func main() {
41:   // disable timestamps for reproducible output
42:   log.SetFlags(0)
43:   log.SetPrefix("compile: ")
44: 
45:   archInit, ok := archInits[objabi.GOARCH]
46:   if !ok {
47:     fmt.Fprintf(os.Stderr, "compile: unknown architecture %q\n", objabi.GOARCH)
48:     os.Exit(2)
49:   }
50: 
51:   gc.Main(archInit)
52:   gc.Exit(0)
53: }

The real compilation starts at line 51; its main logic lives in cmd/compile/internal/gc/main.go. The whole compilation can be divided into several phases.

digraph G {
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:133 fe:init"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:508 fe:loadsys"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:511 fe:parse"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:534 fe:typecheck:top1"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:546 fe:typecheck:top2"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:553 fe:typecheck:func"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:589 fe:capturevars"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:605 fe:inlining"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:645 fe:escapes"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:659 fe:xclosures"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:680 be:compilefuncs"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:714 be:externaldcls"
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:729 be:dumpobj"

"cmd/compile/internal/gc.Main at main.go:511 fe:parse" -> "cmd/compile/internal/gc.parseFiles at noder.go:27"
"cmd/compile/internal/gc.parseFiles at noder.go:27" -> "cmd/compile/internal/gc.parseFiles.func1 at noder.go:52"
"cmd/compile/internal/gc.parseFiles.func1 at noder.go:52" -> "cmd/compile/internal/syntax.Parse at syntax.go:58"

"cmd/compile/internal/gc.parseFiles at noder.go:27" -> "cmd/compile/internal/gc.parseFiles at noder.go:62"
"cmd/compile/internal/gc.parseFiles at noder.go:62" -> "cmd/compile/internal/gc.(*noder).node at noder.go:237"

}  

main.go contains timings that record how long each step takes, for example these lines.

511:   timings.Start("fe", "parse")
512:   lines := parseFiles(flag.Args())
513:   timings.Stop()

After the whole compilation finishes, the value of the benchfile variable decides whether the bench results are written out.

  if benchfile != "" {
    if err := writebench(benchfile); err != nil {
      log.Fatalf("cannot write benchmark data: %v", err)
    }
  }

So by adding the following command-line argument, we get the bench results.

  -bench=/tmp/test/bench.txt

The result:

commit: go1.13.1
goos: darwin
goarch: amd64
BenchmarkCompile:main:fe:init              1     889956 ns/op     10.22 %
BenchmarkCompile:main:fe:loadsys           1     323673 ns/op      3.72 %
BenchmarkCompile:main:fe:parse             1    1147490 ns/op     13.17 %    28 lines    24401 lines/s
BenchmarkCompile:main:fe:typecheck:top1    1     364684 ns/op      4.19 %
BenchmarkCompile:main:fe:typecheck:top2    1      19438 ns/op      0.22 %
BenchmarkCompile:main:fe:typecheck:func    1      36286 ns/op      0.42 %     2 funcs    55118 funcs/s
BenchmarkCompile:main:fe:capturevars       1        326 ns/op      0.00 %
BenchmarkCompile:main:fe:inlining          1    1799996 ns/op     20.67 %
BenchmarkCompile:main:fe:escapes           1     345481 ns/op      3.97 %
BenchmarkCompile:main:fe:xclosures         1     939317 ns/op     10.78 %
BenchmarkCompile:main:fe:subtotal          1    5866647 ns/op     67.36 %
BenchmarkCompile:main:be:compilefuncs      1    2145648 ns/op     24.63 %     2 funcs      932 funcs/s
BenchmarkCompile:main:be:externaldcls      1       1618 ns/op      0.02 %
BenchmarkCompile:main:be:dumpobj           1     666268 ns/op      7.65 %
BenchmarkCompile:main:be:subtotal          1    2813534 ns/op     32.30 %
BenchmarkCompile:main:unaccounted          1      29703 ns/op      0.34 %
BenchmarkCompile:main:total                1    8709884 ns/op    100.00 %  
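The bookkeeping behind a table like this can be sketched in a few lines. The phaseTimer type below is a hypothetical miniature written for this post, not the compiler's own timings implementation; it only mimics the Start/Stop usage pattern and the colon-joined labels.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// phaseTimer is a hypothetical miniature of a per-phase timer.
type phaseTimer struct {
	labels []string
	starts []time.Time
	stops  []time.Time
}

// Start opens a new phase whose label is the parts joined with ':'.
// A phase that is still open is closed first.
func (t *phaseTimer) Start(parts ...string) {
	t.Stop()
	t.labels = append(t.labels, strings.Join(parts, ":"))
	t.starts = append(t.starts, time.Now())
}

// Stop closes the currently open phase, if there is one.
func (t *phaseTimer) Stop() {
	if len(t.stops) < len(t.starts) {
		t.stops = append(t.stops, time.Now())
	}
}

// Report prints one line per phase in a benchmark-like layout.
func (t *phaseTimer) Report(prefix string) {
	t.Stop()
	for i, l := range t.labels {
		d := t.stops[i].Sub(t.starts[i])
		fmt.Printf("%s:%s 1 %d ns/op\n", prefix, l, d.Nanoseconds())
	}
}

func main() {
	var timer phaseTimer
	timer.Start("fe", "parse")
	time.Sleep(time.Millisecond) // stand-in for parsing
	timer.Start("fe", "typecheck", "top1")
	time.Sleep(time.Millisecond) // stand-in for type checking
	timer.Stop()
	timer.Report("BenchmarkCompile:main")
}
```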

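At noder.go:52, cmd/compile/internal/gc.parseFiles.func1 calls syntax.Parse to parse the whole Go file. Intuitively, modifying the syntax tree produced here should be enough to insert custom code.

Inserting code at compile time

Before going further, it is worth reading Understanding Go programs with go/parser to get a feel for Go's AST.

Take the main.go that we have been quietly testing with all along: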

package main

import (
  "context"
  "fmt"
)

func hello(ctx context.Context) {
  fmt.Println("hello world")
}

func main() {
  hello(context.Background())
}

Run it through go/parser.ParseFile:

package main

import (
  "go/parser"
  "go/token"
  "io/ioutil"
  "os"
  "path/filepath"

  "github.com/davecgh/go-spew/spew"
)

func main() {
  home, err := os.UserHomeDir()
  if err != nil {
    panic(err)
  }

  // Read the playground's main.go.
  file := filepath.Join(home, "src", "github.com", "shanexu", "go-playground", "helloworld", "main.go")
  src, err := ioutil.ReadFile(file)
  if err != nil {
    panic(err)
  }

  // Parse it into a go/ast tree and dump the whole tree.
  fset := token.NewFileSet()
  f, err := parser.ParseFile(fset, "main.go", src, parser.AllErrors)
  if err != nil {
    panic(err)
  }

  spew.Dump(f)
}

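This yields the AST.

The result of cmd/compile/internal/syntax.Parse, by contrast, consists entirely of types from the cmd/compile/internal/syntax package. Nearly all of them map one-to-one onto types in go/ast, for example ast.File and syntax.File, ast.ExprStmt and syntax.ExprStmt, ast.Ident and syntax.Name, ast.CallExpr and syntax.CallExpr, ast.SelectorExpr and syntax.SelectorExpr.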
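To see a few of these node types in action without dumping the whole tree, the go/ast side can be walked with ast.Inspect. This is a small sketch using only the public go/ast API (the compiler's syntax package is internal and cannot be imported from ordinary programs); the describe function is a name made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package main

import (
	"context"
	"fmt"
)

func hello(ctx context.Context) {
	fmt.Println("hello world")
}

func main() {
	hello(context.Background())
}
`

// describe returns one entry per function declaration and per call
// expression, in the order ast.Inspect visits them.
func describe(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "main.go", source, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	ast.Inspect(f, func(n ast.Node) bool {
		switch x := n.(type) {
		case *ast.FuncDecl:
			out = append(out, "func "+x.Name.Name)
		case *ast.CallExpr:
			switch fun := x.Fun.(type) {
			case *ast.SelectorExpr: // pkg.Func(...)
				if pkg, ok := fun.X.(*ast.Ident); ok {
					out = append(out, "call "+pkg.Name+"."+fun.Sel.Name)
				}
			case *ast.Ident: // Func(...)
				out = append(out, "call "+fun.Name)
			}
		}
		return true
	})
	return out, nil
}

func main() {
	lines, err := describe(src)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```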

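Now we can get our hands on the AST. Suppose that, for every function in package main whose name starts with hello, we want to print "start ${function name}..." on entry and "stop ${function name}..." on exit. We can do that in cmd/compile/internal/gc/noder.go by modifying p.file right after Parse returns. The code: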

      p.file, _ = syntax.Parse(base, f, p.error, p.pragma, syntax.CheckBranches) // errors are tracked via p.error
      // Only instrument files that belong to package main.
      if p.file.PkgName.Value == "main" {
        for _, d := range p.file.DeclList {
          // Skip everything that is not a function with a body.
          d, _ := d.(*syntax.FuncDecl)
          if d == nil || d.Body == nil {
            continue
          }
          // Only functions whose name starts with "hello" are instrumented.
          if !strings.HasPrefix(d.Name.Value, "hello") {
            continue
          }
          // Prepend two statements to the function body.
          d.Body.List = append([]syntax.Stmt{
            // fmt.Println("start <name>...")
            &syntax.ExprStmt{
              X: &syntax.CallExpr{
                Fun: &syntax.SelectorExpr{
                  X:   &syntax.Name{Value: "fmt"},
                  Sel: &syntax.Name{Value: "Println"},
                },
                ArgList: []syntax.Expr{
                  &syntax.BasicLit{
                    Value: strconv.Quote("start " + d.Name.Value + "..."),
                    Kind:  syntax.StringLit,
                  },
                },
              },
            },
            // defer fmt.Println("stop <name>...")
            &syntax.CallStmt{
              Tok: syntax.Defer,
              Call: &syntax.CallExpr{
                Fun: &syntax.SelectorExpr{
                  X:   &syntax.Name{Value: "fmt"},
                  Sel: &syntax.Name{Value: "Println"},
                },
                ArgList: []syntax.Expr{
                  &syntax.BasicLit{
                    Value: strconv.Quote("stop " + d.Name.Value + "..."),
                    Kind:  syntax.StringLit,
                  },
                },
              },
            },
          }, d.Body.List...)
        }
      }

After the change, golang has to be rebuilt.

cd $HOME/src/github.com/golang/go/src
./make.bash

cd $HOME/src/github.com/shanexu/go-playground
go clean -cache
go run helloworld/main.go

The result:

start hello...
hello world
stop hello...  

After the AST modification, the source of the hello function effectively looks like this:

func hello(ctx context.Context) {
  fmt.Println("start hello...")
  defer fmt.Println("stop hello...")
  fmt.Println("hello world")
}

There is a catch: the inserted code introduces a dependency on the fmt package. What happens if the original code does not import fmt?

$ go clean -cache; go run helloworld/main.go
# command-line-arguments
helloworld/main.go:7:16: undefined: fmt in fmt.Println  

Sure enough, the build fails.

What about inserting the import into the syntax tree on demand?

      // Placed right after the syntax.Parse call.
      hasHello := false // set once at least one function has been instrumented
      if p.file.PkgName.Value == "main" {
        for _, d := range p.file.DeclList {
          d, _ := d.(*syntax.FuncDecl)
          if d == nil || d.Body == nil {
            continue
          }
          if !strings.HasPrefix(d.Name.Value, "hello") {
            continue
          }
          hasHello = true
          d.Body.List = append([]syntax.Stmt{
            &syntax.ExprStmt{
              X: &syntax.CallExpr{
                Fun: &syntax.SelectorExpr{
                  X:   &syntax.Name{Value: "fmt"},
                  Sel: &syntax.Name{Value: "Println"},
                },
                ArgList: []syntax.Expr{
                  &syntax.BasicLit{
                    Value: strconv.Quote("start " + d.Name.Value + "..."),
                    Kind:  syntax.StringLit,
                  },
                },
              },
            },
            &syntax.CallStmt{
              Tok: syntax.Defer,
              Call: &syntax.CallExpr{
                Fun: &syntax.SelectorExpr{
                  X:   &syntax.Name{Value: "fmt"},
                  Sel: &syntax.Name{Value: "Println"},
                },
                ArgList: []syntax.Expr{
                  &syntax.BasicLit{
                    Value: strconv.Quote("stop " + d.Name.Value + "..."),
                    Kind:  syntax.StringLit,
                  },
                },
              },
            },
          }, d.Body.List...)
        }
      }
      if hasHello {
        // Check whether the file already imports fmt.
        hasFmtImport := false
        for _, d := range p.file.DeclList {
          d, _ := d.(*syntax.ImportDecl)
          if d == nil {
            continue
          }
          // Path.Value keeps the surrounding quotes, so compare with "\"fmt\"".
          if d.Path.Value != strconv.Quote("fmt") {
            continue
          }
          hasFmtImport = true
          break
        }
        // If not, prepend an import declaration for it.
        if !hasFmtImport {
          p.file.DeclList = append([]syntax.Decl{
            &syntax.ImportDecl{
              Path: &syntax.BasicLit{
                Value: `"fmt"`, Kind: syntax.StringLit,
              },
            },
          }, p.file.DeclList...)
        }
      }

Rebuild golang and run go run:

$ go clean -cache; go run helloworld/main.go
# command-line-arguments
helloworld/main.go:1:9: can't find import: "fmt"

Tracing the error message back to where it is raised shows that parseFiles includes a findpkg step. Its call stack is as follows.

digraph G {
"cmd/compile/internal/gc.Main at main.go:512" -> "cmd/compile/internal/gc.parseFiles at noder.go:128" -> "cmd/compile/internal/gc.(*noder).node at noder.go:310" -> "cmd/compile/internal/gc.(*noder).decls at noder.go:355" -> "cmd/compile/internal/gc.(*noder).importDecl at noder.go:379" -> "cmd/compile/internal/gc.importfile at main.go:1120" -> "cmd/compile/internal/gc.findpkg at main.go:989"
}

findpkg looks the package up in the package-level variable packageFile, which is a map.

988:   if packageFile != nil {
989:     file, ok = packageFile[name]
990:     return file, ok
991:   }  

The package-level variable packageFile is initialized in the readImportCfg function.

digraph G {
"main.main at main.go:51" -> "cmd/compile/internal/gc.Main at main.go:269" -> "cmd/internal/objabi.Flagparse at flag.go:34" -> "flag.Parse at flag.go:996" -> "flag.(*FlagSet).Parse at flag.go:968" -> "flag.(*FlagSet).parseOne at flag.go:949" -> "cmd/internal/objabi.fn1.Set at flag.go:158" -> "cmd/compile/internal/gc.readImportCfg at main.go:806"
}

So the argument to readImportCfg is the value of the -importcfg option that is passed to the compile command.

805: func readImportCfg(file string) {
806:   packageFile = map[string]string{}
807:   data, err := ioutil.ReadFile(file)
808:   if err != nil {
809:     log.Fatalf("-importcfg: %v", err)
810:   }
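What happens to the rest of the file can be sketched as follows. This is a simplified stand-in, not the compiler's readImportCfg: parseImportCfg is a hypothetical function that understands only the two directives that appear in this post, packagefile and importmap.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseImportCfg reads importcfg text into two maps: import path to
// archive file, and source import path to resolved import path.
// Blank lines and lines starting with '#' are ignored.
func parseImportCfg(data string) (packageFile, importMap map[string]string, err error) {
	packageFile = map[string]string{}
	importMap = map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(data))
	for lineNum := 1; sc.Scan(); lineNum++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.SplitN(line, " ", 2)
		if len(fields) != 2 {
			return nil, nil, fmt.Errorf("line %d: missing arguments", lineNum)
		}
		kv := strings.SplitN(fields[1], "=", 2)
		if len(kv) != 2 {
			return nil, nil, fmt.Errorf("line %d: want key=value", lineNum)
		}
		switch fields[0] {
		case "packagefile":
			packageFile[kv[0]] = kv[1]
		case "importmap":
			importMap[kv[0]] = kv[1]
		default:
			return nil, nil, fmt.Errorf("line %d: unknown directive %q", lineNum, fields[0])
		}
	}
	return packageFile, importMap, sc.Err()
}

func main() {
	cfg := "# import config\npackagefile context=/go/pkg/darwin_amd64/context.a\n"
	files, _, err := parseImportCfg(cfg)
	if err != nil {
		panic(err)
	}
	_, ok := files["fmt"]
	fmt.Println("fmt known to the compiler:", ok)
}
```

With a config that lists only context, a lookup of fmt fails, which matches the "can't find import" error above.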

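To solve this, the newly added package has to be listed in the importcfg file.

So the problem leads back to how the importcfg file is generated.

The logic that produces the importcfg content starts at line 634 of cmd/go/internal/work/exec.go.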

634:   // Prepare Go import config.
635:   // We start it off with a comment so it can't be empty, so icfg.Bytes() below is never nil.
636:   // It should never be empty anyway, but there have been bugs in the past that resulted
637:   // in empty configs, which then unfortunately turn into "no config passed to compiler",
638:   // and the compiler falls back to looking in pkg itself, which mostly works,
639:   // except when it doesn't.
640:   var icfg bytes.Buffer
641:   fmt.Fprintf(&icfg, "# import config\n")
642:   for i, raw := range a.Package.Internal.RawImports {
643:     final := a.Package.Imports[i]
644:     if final != raw {
645:       fmt.Fprintf(&icfg, "importmap %s=%s\n", raw, final)
646:     }
647:   }  
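To see why the import list matters, here is a rough sketch of how such a file could be assembled. writeImportCfg and its three parameters are hypothetical simplifications made up for this post; only the line formats are taken from the excerpt above and from the importcfg file shown earlier.

```go
package main

import (
	"bytes"
	"fmt"
)

// writeImportCfg builds importcfg text. rawImports are the paths as
// written in the source, imports the resolved paths (same order), and
// files maps a resolved path to its compiled archive.
func writeImportCfg(rawImports, imports []string, files map[string]string) string {
	var icfg bytes.Buffer
	fmt.Fprintf(&icfg, "# import config\n")
	for i, raw := range rawImports {
		if final := imports[i]; final != raw {
			fmt.Fprintf(&icfg, "importmap %s=%s\n", raw, final)
		}
	}
	// Only packages that appear in the import list get a packagefile line.
	for _, path := range imports {
		if file, ok := files[path]; ok {
			fmt.Fprintf(&icfg, "packagefile %s=%s\n", path, file)
		}
	}
	return icfg.String()
}

func main() {
	files := map[string]string{
		"context": "/go/pkg/darwin_amd64/context.a",
		"fmt":     "/go/pkg/darwin_amd64/fmt.a",
	}
	// fmt.a exists on disk, but the source only imports context,
	// so fmt never reaches the generated config.
	fmt.Print(writeImportCfg([]string{"context"}, []string{"context"}, files))
}
```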

So it now comes down to how Package.Imports gets its value.

In go/build/build.go, the source file is first parsed with go/parser.ParseFile, and the imports are then collected from the result.

    pf, err := parser.ParseFile(fset, filename, data, parser.ImportsOnly|parser.ParseComments)
    if err != nil {
      badFile(err)
      continue
    }
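What the parser.ImportsOnly mode yields can be checked with a few lines of standalone code. This is a sketch using the public go/parser API; listImports is a hypothetical helper, not a function from go/build.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// listImports parses only the package clause and the import
// declarations of src and returns the unquoted import paths.
func listImports(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "main.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	paths, err := listImports(`package main

import (
	"context"
	"fmt"
)

func main() {}
`)
	if err != nil {
		panic(err)
	}
	fmt.Println(paths)
}
```

Because the import list is derived purely from the source text, an import that exists only in the compiler's modified syntax tree is invisible at this stage.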

Modify the code as follows:

    pf, err := parser.ParseFile(fset, filename, data, parser.ImportsOnly|parser.ParseComments)
    if err != nil {
      badFile(err)
      continue
    }

    // Hack: only the playground's helloworld/main.go is touched.
    if strings.HasSuffix(filename, "helloworld/main.go") {
      // Check whether fmt is already imported (Path.Value keeps its quotes).
      hasFmtImport := false
      for _, i := range pf.Imports {
        if i.Path.Value == `"fmt"` {
          hasFmtImport = true
          break
        }
      }
      if !hasFmtImport {
        // Record the import on the file itself...
        pf.Imports = append(pf.Imports, &ast.ImportSpec{
          Path: &ast.BasicLit{
            Value: `"fmt"`,
            Kind:  token.STRING,
          },
        })
        // ...and on the first declaration, which is an import declaration
        // here because the file was parsed with parser.ImportsOnly.
        // A file without any import declaration is left unchanged.
        if len(pf.Decls) > 0 {
          d, ok := pf.Decls[0].(*ast.GenDecl)
          if ok {
            d.Specs = append(d.Specs, &ast.ImportSpec{
              Path: &ast.BasicLit{
                Kind:  token.STRING,
                Value: `"fmt"`,
              },
            })
          }
        }
      }
    }

Rebuild golang and run go run:

$ go clean -cache; go run helloworld/main.go
start hello...
stop hello...  

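The import was added successfully.

The job is only half done

This concludes the walkthrough and hack of the golang compiler. By modifying the AST that the compiler generates, we can add custom code to specific files, specific packages, and specific functions. That offers an unconventional route to AOP in golang. Admittedly, there is still a long way to go before this becomes a complete solution like AspectJ.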
