
Velodrome

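velodrome.k8s.io/ is a dashboard, monitoring and metrics for Kubernetes Developer Productivity. It is a set of components for monitoring developer productivity. It looks fairly generic and could probably be used for other repos with small modifications.

The code that talks to GitHub should also be easy to reuse in a GitHub robot.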

Architecture

  • Grafana stack: the front end (all open-source components; the directory only holds their configuration)
    • InfluxDB: save precalculated metrics
    • Prometheus: save poll-based metrics
    • Grafana: display graphs based on these metrics
    • nginx: proxy all of these services in a single URL
  • SQL base: containing a copy of the issues, events, and PRs in Github repositories. It is used for calculating statistics about developer productivity.
    • Fetcher: fetches GitHub data and stores it in a SQL database (most of the Go code lives here; it calls the GitHub SDK to pull the data)
    • SQL Proxy: SQL Proxy deployment to Cloud SQL (configuration only)
    • Transform: Transform SQL (Github db) into valuable metrics
  • Other monitoring tools
    • token-counter: Monitors RateLimit usage of your github

Fetcher

Resources used

  • Issues (including pull-requests)
  • Events (associated to issues)
  • Comments (regular comments and review comments)

These resources are used to:

  • Compute average time-to-resolution for an issue/pull-request
  • Compute time between label creation/removal: lgtm'd, merged
  • Break down based on specific flags (size, priority, ...)

// ClientInterface describes what a client should be able to do
type ClientInterface interface {
    RepositoryName() string
    FetchIssues(last time.Time, c chan *github.Issue)
    FetchIssueEvents(issueID int, last *int, c chan *github.IssueEvent)
    FetchIssueComments(issueID int, last time.Time, c chan *github.IssueComment)
    FetchPullComments(issueID int, last time.Time, c chan *github.PullRequestComment)
}
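As an illustration of the first metric in the list above (average time-to-resolution), here is a minimal Go sketch. The `issue` type, its fields, and the `sampleIssues` helper are hypothetical stand-ins for the rows stored in the SQL database, not Velodrome's actual models.

```go
package main

import (
	"fmt"
	"time"
)

// issue is a hypothetical, simplified stand-in for a stored issue row.
type issue struct {
	CreatedAt time.Time
	ClosedAt  *time.Time // nil while the issue is still open
}

// averageResolution returns the mean time from creation to close over the
// closed issues, and false if no issue has been closed yet.
func averageResolution(issues []issue) (time.Duration, bool) {
	var total time.Duration
	closed := 0
	for _, i := range issues {
		if i.ClosedAt == nil {
			continue // open issues have no resolution time yet
		}
		total += i.ClosedAt.Sub(i.CreatedAt)
		closed++
	}
	if closed == 0 {
		return 0, false
	}
	return total / time.Duration(closed), true
}

// sampleIssues builds two closed issues (2h and 4h) and one open issue.
func sampleIssues() []issue {
	start := time.Date(2017, 1, 1, 0, 0, 0, 0, time.UTC)
	c1 := start.Add(2 * time.Hour)
	c2 := start.Add(4 * time.Hour)
	return []issue{
		{CreatedAt: start, ClosedAt: &c1},
		{CreatedAt: start, ClosedAt: &c2},
		{CreatedAt: start},
	}
}

func main() {
	avg, ok := averageResolution(sampleIssues())
	fmt.Println(avg, ok) // prints "3h0m0s true"
}
```

Open issues are skipped rather than counted as zero, so a backlog of unresolved issues does not pull the average down.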

Flow

// 1. Entry point
// main -> cobra.Command(root) -> runProgram -> UpdateIssues
// 2. UpdateIssues test-infra/velodrome/fetcher/issues.go
// Calls the client's FetchIssues; issue models are passed over the channel
go client.FetchIssues(latest, c)
for issue := range c {
    // 2.1 
    NewIssue(..)

    UpdateComments(*issue.Number, issueOrm.IsPR, db, client)
    // and find if we have new events
    UpdateIssueEvents(*issue.Number, db, client)
}

// 2.2 UpdateComments test-infra/velodrome/fetcher/comments.go
func UpdateComments(issueID int, pullRequest bool, db *gorm.DB, client ClientInterface) {
    latest := findLatestCommentUpdate(issueID, db, client.RepositoryName())

    updateIssueComments(issueID, latest, db, client)
    if pullRequest {
        updatePullComments(issueID, latest, db, client)
    }
}

func updateIssueComments(issueID int, latest time.Time, db *gorm.DB, client ClientInterface) {
    c := make(chan *github.IssueComment, 200)
    go client.FetchIssueComments(issueID, latest, c)
    for comment := range c {
        commentOrm, err := NewIssueComment(issueID, comment, client.RepositoryName())
        ...
    }
}

func updatePullComments(issueID int, latest time.Time, db *gorm.DB, client ClientInterface) {
    c := make(chan *github.PullRequestComment, 200)

    go client.FetchPullComments(issueID, latest, c)

    for comment := range c {
        commentOrm, err := NewPullComment(issueID, comment, client.RepositoryName())
        ...
    }
}


// 2.3 UpdateIssueEvents test-infra/velodrome/fetcher/issue-events.go
func UpdateIssueEvents(issueID int, db *gorm.DB, client ClientInterface) {
    ...
    c := make(chan *github.IssueEvent, 500)

    go client.FetchIssueEvents(issueID, latest, c)
    for event := range c {
        eventOrm, err := NewIssueEvent(event, issueID, client.RepositoryName())
        ...
    }
}
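Every step in the flow above uses the same shape: a buffered channel, a fetch running in its own goroutine, and a `range` loop that consumes results. The sketch below reproduces that shape with hypothetical types (`item`, `fetch`, `update`); it uses an integer watermark for brevity, whereas the real interface takes a `time.Time` or `*int`. Because the consumer ranges over the channel, the producer has to close it, which the sketch makes explicit.

```go
package main

import "fmt"

// item is a hypothetical stand-in for *github.Issue.
type item struct {
	Number int
}

// fetch plays the role of client.FetchIssues: it sends every item newer
// than last and closes the channel when it is done.
func fetch(all []item, last int, c chan<- item) {
	defer close(c) // without this the consumer's range loop never ends
	for _, it := range all {
		if it.Number > last {
			c <- it
		}
	}
}

// update plays the role of UpdateIssues: it starts the fetch in a
// goroutine and handles results as they arrive.
func update(all []item, last int) []int {
	c := make(chan item, 200)
	go fetch(all, last, c)

	var seen []int
	for it := range c {
		seen = append(seen, it.Number)
	}
	return seen
}

func main() {
	all := []item{{1}, {2}, {3}, {4}}
	fmt.Println(update(all, 2)) // prints "[3 4]"
}
```

The buffer lets the fetcher run ahead of the database writes without either side blocking on every item.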

token-counter

transform

GitHub data in SQL --> transform --> metrics

func (config *transformConfig) run(plugin plugins.Plugin) error {
    ...

    // Turn issue and comment data into points -> InfluxDB
    go Dispatch(plugin, influxdb, fetcher.IssuesChannel,
        fetcher.EventsCommentsChannel)

    ticker := time.Tick(time.Hour / time.Duration(config.frequency))
    for {
        // Fetch new events from MySQL, push it to plugins
        if err := fetcher.Fetch(mysqldb); err != nil {
            return err
        }
        // Push the processed batch points to InfluxDB in bulk
        if err := influxdb.PushBatchPoints(); err != nil {
            return err
        }

        if config.once {
            break
        }
        // Minimum interval between two runs
        <-ticker
    }
}

// Dispatch receives channels to each type of events, and dispatch them to each plugins.
func Dispatch(plugin plugins.Plugin, DB *InfluxDB, issues chan sql.Issue, eventsCommentsChannel chan interface{}) {
    for {
        var points []plugins.Point
        select {
        case issue, ok := <-issues:
            if !ok {
                return
            }
            points = plugin.ReceiveIssue(issue)
        case event, ok := <-eventsCommentsChannel:
            if !ok {
                return
            }
            switch event := event.(type) {
            case sql.IssueEvent:
                points = plugin.ReceiveIssueEvent(event)
            case sql.Comment:
                points = plugin.ReceiveComment(event)
            default:
                glog.Fatal("Received invalid object: ", event)
            }
        }

        for _, point := range points {
            if err := DB.Push(point.Tags, point.Values, point.Date); err != nil {
                glog.Fatal("Failed to push point: ", err)
            }
        }
    }
}
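The ticker expression in `run`, `time.Hour / time.Duration(config.frequency)`, implies that the frequency is a number of runs per hour. A small sketch of that arithmetic follows; the `interval` helper and its fallback for non-positive values are assumptions of this example, not part of the original code.

```go
package main

import (
	"fmt"
	"time"
)

// interval mirrors the expression used in run: frequency is the number of
// runs per hour. A non-positive frequency would divide by zero, so this
// sketch falls back to once per hour (an assumption of the example).
func interval(frequency int) time.Duration {
	if frequency <= 0 {
		return time.Hour
	}
	return time.Hour / time.Duration(frequency)
}

func main() {
	for _, f := range []int{1, 2, 4, 60} {
		fmt.Printf("frequency=%d -> every %v\n", f, interval(f))
	}
}
```

With a frequency of 4, for example, the loop waits at least 15 minutes between two fetch-and-push cycles.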

plugin

A plugin needs to implement the Plugin interface:

type Plugin interface {
    ReceiveIssue(sql.Issue) []Point
    ReceiveComment(sql.Comment) []Point
    ReceiveIssueEvent(sql.IssueEvent) []Point
}
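To make the interface concrete, here is a minimal sketch of a plugin that emits one point per comment. The `Issue`, `Comment`, `IssueEvent` and `Point` types are hypothetical, trimmed-down stand-ins for `sql.*` and `plugins.Point`; only the field names `Tags`, `Values` and `Date` come from the `Dispatch` code above, and their types are assumed.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical stand-ins for the sql.* models.
type Issue struct{ User string }
type Comment struct {
	User      string
	CreatedAt time.Time
}
type IssueEvent struct{ Event string }

// Point is a hypothetical version of plugins.Point; the field types are assumed.
type Point struct {
	Tags   map[string]string
	Values map[string]interface{}
	Date   time.Time
}

// Plugin has the same three methods as the interface above.
type Plugin interface {
	ReceiveIssue(Issue) []Point
	ReceiveComment(Comment) []Point
	ReceiveIssueEvent(IssueEvent) []Point
}

// commentCounter emits one point per comment and ignores everything else.
type commentCounter struct{}

func (commentCounter) ReceiveIssue(Issue) []Point           { return nil }
func (commentCounter) ReceiveIssueEvent(IssueEvent) []Point { return nil }
func (commentCounter) ReceiveComment(c Comment) []Point {
	return []Point{{
		Tags:   map[string]string{"author": c.User},
		Values: map[string]interface{}{"comment": 1},
		Date:   c.CreatedAt,
	}}
}

func main() {
	var p Plugin = commentCounter{}
	points := p.ReceiveComment(Comment{User: "alice", CreatedAt: time.Now()})
	fmt.Println(len(points), points[0].Tags["author"]) // prints "1 alice"
}
```

Returning a slice lets a plugin emit zero, one, or several points for a single input, which is why the dispatcher loops over the result.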

Entry point: root.AddCommand(plugins.NewCountPlugin(config.run))
The plugin is authorFilter, the outermost wrapper (test-infra/velodrome/transform/plugins/count.go)

// test-infra/velodrome/transform/plugins/count.go
// Multiple plugins are wrapped into one
func NewCountPlugin(runner func(Plugin) error) *cobra.Command {
    stateCounter := &StatePlugin{}
    eventCounter := &EventCounterPlugin{}
    commentsAsEvents := NewFakeCommentPluginWrapper(eventCounter)
    commentCounter := &CommentCounterPlugin{}
    authorLoggable := NewMultiplexerPluginWrapper(
        commentsAsEvents,
        commentCounter,
    )
    authorLogged := NewAuthorLoggerPluginWrapper(authorLoggable)
    fullMultiplex := NewMultiplexerPluginWrapper(authorLogged, stateCounter)

    fakeOpen := NewFakeOpenPluginWrapper(fullMultiplex)
    typeFilter := NewTypeFilterWrapperPlugin(fakeOpen)
    authorFilter := NewAuthorFilterPluginWrapper(typeFilter)
    ...
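The wrapping in `NewCountPlugin` is a decorator chain: multiplexers fan events out to several plugins, and filters decide whether an event reaches the plugin they wrap. The sketch below shows the idea with a single-method interface. All names (`Handler`, `counter`, `multiplexer`, `authorFilter`, `build`) and the ignore-list behaviour are hypothetical simplifications, not the real wrappers.

```go
package main

import "fmt"

// Comment and Handler are hypothetical simplifications: a single-method
// plugin keeps the example short.
type Comment struct{ Author, Body string }

type Handler interface {
	Receive(Comment) int // number of points produced
}

// counter is a leaf plugin that produces one point per comment.
type counter struct{ seen int }

func (c *counter) Receive(Comment) int { c.seen++; return 1 }

// multiplexer fans a comment out to every wrapped plugin.
type multiplexer struct{ plugins []Handler }

func (m multiplexer) Receive(c Comment) int {
	total := 0
	for _, p := range m.plugins {
		total += p.Receive(c)
	}
	return total
}

// authorFilter drops comments from ignored authors before they reach the
// wrapped plugin.
type authorFilter struct {
	ignored map[string]bool
	next    Handler
}

func (f authorFilter) Receive(c Comment) int {
	if f.ignored[c.Author] {
		return 0
	}
	return f.next.Receive(c)
}

// build wires two counters behind a multiplexer and an author filter.
func build(a, b *counter) Handler {
	return authorFilter{
		ignored: map[string]bool{"k8s-ci-robot": true},
		next:    multiplexer{plugins: []Handler{a, b}},
	}
}

func main() {
	a, b := &counter{}, &counter{}
	h := build(a, b)
	fmt.Println(h.Receive(Comment{Author: "alice"}))        // prints "2"
	fmt.Println(h.Receive(Comment{Author: "k8s-ci-robot"})) // prints "0"
	fmt.Println(a.seen, b.seen)                             // prints "1 1"
}
```

Since every wrapper satisfies the same interface as the plugin it wraps, the runner only ever sees one plugin, however deep the chain is.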

triage

A metrics page whose code is very simple but whose UI is rich; I have not yet figured out what it is for.
=> Kubernetes Aggregated Failures

testgrid

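The front end and back end of testgrid.k8s.io. It aggregates metrics for Jenkins tests and presents them as a grid, which is very intuitive.
The front end is configurable; config.yaml lists all the tests.
For example, 1.6-1.7-kubectl-skew is one of the dashboards. It has several tabs, each of which is a test group, such as gce-1.6-1-7-cvm.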

The testgrid site is accessible at testgrid.k8s.io. The site is
configured by [config.yaml].
Updates to the config are automatically tested and pushed to production.

Testgrid is composed of:

  • A list of test groups that contain results for a job over time.
  • A list of dashboards that are composed of tabs that display a test group
  • A list of dashboard groups of related dashboards.
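The three lists above can be modelled as plain data, which also makes it easy to check that every tab points at a declared test group. This is only a sketch: the type names, field names, `danglingTabs` and `sampleConfig` are hypothetical and do not reflect the real config.yaml schema. The dashboard and test group names are the ones mentioned earlier.

```go
package main

import "fmt"

// All type and field names are hypothetical; they only mirror the three
// lists described above.
type TestGroup struct{ Name string }
type Tab struct{ Name, TestGroup string }
type Dashboard struct {
	Name string
	Tabs []Tab
}
type DashboardGroup struct {
	Name       string
	Dashboards []string
}
type Config struct {
	TestGroups      []TestGroup
	Dashboards      []Dashboard
	DashboardGroups []DashboardGroup
}

// danglingTabs returns "dashboard/tab" for each tab whose test group is
// not declared in the config.
func danglingTabs(c Config) []string {
	known := map[string]bool{}
	for _, g := range c.TestGroups {
		known[g.Name] = true
	}
	var bad []string
	for _, d := range c.Dashboards {
		for _, t := range d.Tabs {
			if !known[t.TestGroup] {
				bad = append(bad, d.Name+"/"+t.Name)
			}
		}
	}
	return bad
}

// sampleConfig has one valid tab and one tab with an undeclared test group.
func sampleConfig() Config {
	return Config{
		TestGroups: []TestGroup{{Name: "gce-1.6-1-7-cvm"}},
		Dashboards: []Dashboard{{
			Name: "1.6-1.7-kubectl-skew",
			Tabs: []Tab{
				{Name: "gce-1.6-1-7-cvm", TestGroup: "gce-1.6-1-7-cvm"},
				{Name: "broken", TestGroup: "does-not-exist"},
			},
		}},
	}
}

func main() {
	fmt.Println(danglingTabs(sampleConfig())) // prints "[1.6-1.7-kubectl-skew/broken]"
}
```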

scenarios

Test scripts, written in Python, that call k8s.io/kubernetes/test/....go

Test jobs are composed of two things:
1) A scenario to test
2) Configuration options for the scenario.

Three example scenarios are:

  • Unit tests
  • Node e2e tests
  • e2e tests

Example configurations are:

  • Parallel tests on gce
  • Build all platforms

The assumption is that each scenario will be called a variety of times with
different configuration options. For example at the time of this writing there
are over 300 e2e jobs, each run with a slightly different set of options.

robots

issue-creator

A source is where an issue originates; for example, FlakyJobReporter creates issues for flaky Jenkins jobs.

queue health

This app monitors the submit queue and produces the chart at submit-queue.k8s.io/#/e2e.
In other words, it generates the statistics charts for submit-queue.k8s.io, for example https://storage.go…

It does this with two components:

  • a poller, which polls the current state of the queue and appends it to a
    historical log.
  • a grapher, which gets the historical log and renders it into charts.

prow

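This is the most interesting app in the repo: it can serve as a robot that handles GitHub commands, and it is built around a plugin design.
Its components include: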

  • cmd/hook is the most important piece. It is a stateless server that listens
    for GitHub webhooks and dispatches them to the appropriate handlers.
  • cmd/plank is the controller that manages jobs running in k8s pods.
  • cmd/jenkins-operator is the controller that manages jobs running in Jenkins.
  • cmd/sinker cleans up old jobs and pods.
  • cmd/splice regularly schedules batch jobs.
  • cmd/deck presents a nice view of recent jobs.
  • cmd/phony sends fake webhooks.
  • cmd/tot vends incrementing build numbers.
  • cmd/horologium starts periodic jobs when necessary.
  • cmd/mkpj creates ProwJobs.

config

deck

The front end and back end of prow.k8s.io/, showing rece… prow jobs (third party resource).

hook

The core piece: it listens for GitHub webhooks and dispatches them, mostly handing them to plugins.
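A rough sketch of that dispatch step: GitHub sends the event type in the `X-GitHub-Event` request header, and the server picks a handler from it. The handler names and the `handlerFor` and `serveHook` functions are hypothetical. A real hook would also verify the webhook signature and parse the JSON payload, both of which are omitted here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// handlerFor maps the value of the X-GitHub-Event header to the name of a
// handler. The handler names are hypothetical.
func handlerFor(eventType string) (string, bool) {
	switch eventType {
	case "issue_comment":
		return "handleIssueComment", true
	case "pull_request":
		return "handlePullRequest", true
	case "push":
		return "handlePush", true
	}
	return "", false
}

// serveHook is a stateless HTTP handler: everything it needs is in the request.
func serveHook(w http.ResponseWriter, r *http.Request) {
	name, ok := handlerFor(r.Header.Get("X-GitHub-Event"))
	if !ok {
		http.Error(w, "unsupported event", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "dispatched to %s", name)
}

func main() {
	// Simulate a webhook delivery without opening a network port.
	req := httptest.NewRequest(http.MethodPost, "/hook", strings.NewReader("{}"))
	req.Header.Set("X-GitHub-Event", "issue_comment")
	rec := httptest.NewRecorder()
	serveHook(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```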

k8s Bot Commands

k8s-ci-robot and k8s-merge-robot understand several commands. They should all be uttered on their own line, and they are case-sensitive.

| Command | Implemented By | Who can run it | Description |
| --- | --- | --- | --- |
| /approve | mungegithub approvers | owners | approve all the files for which you are an approver |
| /approve no-issue | mungegithub approvers | owners | approve when a PR doesn't have an associated issue |
| /approve cancel | mungegithub approvers | owners | removes your approval on this pull-request |
| /area [label1 label2 ...] | prow label | anyone | adds an area/<> label(s) if it exists |
| /remove-area [label1 label2 ...] | prow label | anyone | removes an area/<> label(s) if it exists |
| /assign [@userA @userB @etc] | prow assign | anyone | Assigns specified people (or yourself if no one is specified). Target must be a kubernetes org member. |
| /unassign [@userA @userB @etc] | prow assign | anyone | Unassigns specified people (or yourself if no one is specified). Target must already be assigned. |
| /cc [@userA @userB @etc] | prow assign | anyone | Request review from specified people (or yourself if no one is specified). Target must be a kubernetes org member. |
| /uncc [@userA @userB @etc] | prow assign | anyone | Dismiss review request for specified people (or yourself if no one is specified). Target must already have had a review requested. |
| /close | prow close | authors and assignees | closes the issue/PR |
| /reopen | prow reopen | authors and assignees | reopens a closed issue/PR |
| /hold | prow hold | anyone | adds the do-not-merge/hold label |
| /hold cancel | prow hold | anyone | removes the do-not-merge/hold label |
| /joke | prow yuks | anyone | tells a bad joke, sometimes |
| /kind [label1 label2 ...] | prow label | anyone | adds a kind/<> label(s) if it exists |
| /remove-kind [label1 label2 ...] | prow label | anyone | removes a kind/<> label(s) if it exists |
| /lgtm | prow lgtm | assignees | adds the lgtm label |
| /lgtm cancel | prow lgtm | authors and assignees | removes the lgtm label |
| /ok-to-test | prow trigger | kubernetes org members | allows the PR author to /test all |
| /test all, /test <some-test-name> | prow trigger | anyone on trusted PRs | runs tests defined in config.yaml |
| /retest | prow trigger | anyone on trusted PRs | reruns failed tests |
| /priority [label1 label2 ...] | prow label | anyone | adds a priority/<> label(s) if it exists |
| /remove-priority [label1 label2 ...] | prow label | anyone | removes a priority/<> label(s) if it exists |
| /sig [label1 label2 ...] | prow label | anyone | adds a sig/<> label(s) if it exists |
| @kubernetes/sig-<some-github-team> | prow label | kubernetes org members | adds the corresponding sig label |
| /remove-sig [label1 label2 ...] | prow label | anyone | removes a sig/<> label(s) if it exists |
| /release-note | prow releasenote | authors and kubernetes org members | adds the release-note label |
| /release-note-action-required | prow releasenote | authors and kubernetes org members | adds the release-note-action-required label |
| /release-note-none | prow releasenote | authors and kubernetes org members | adds the release-note-none label |

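The two rules stated above, one command per line and case-sensitive matching, can be sketched as follows. `parseCommands` and `isLGTM` are hypothetical helpers, not the parsers the bots actually use. Trimming surrounding whitespace is a leniency added by this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCommands returns every line of a comment body that starts with "/".
// A command in the middle of a sentence is not picked up, because commands
// must sit on their own line.
func parseCommands(body string) []string {
	var cmds []string
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "/") && len(line) > 1 {
			cmds = append(cmds, line)
		}
	}
	return cmds
}

// isLGTM reports whether a command is exactly "/lgtm"; the comparison is
// case-sensitive, so "/LGTM" does not match.
func isLGTM(cmd string) bool { return cmd == "/lgtm" }

func main() {
	body := "Thanks for the fix!\n/lgtm\nplease also /retest\n/LGTM"
	for _, cmd := range parseCommands(body) {
		fmt.Printf("%q lgtm=%v\n", cmd, isLGTM(cmd))
	}
}
```

In this example `/retest` is ignored because it appears mid-sentence, and `/LGTM` is parsed as a line but not recognised as the lgtm command.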
// An example of handing an event over to plugins: k8s.io/test-infra/prow/hook/events.go
func (s *Server) handleGenericComment(ce *github.GenericCommentEvent, log *logrus.Entry) {
    for p, h := range s.Plugins.GenericCommentHandlers(ce.Repo.Owner.Login, ce.Repo.Name) {
        go func(p string, h plugins.GenericCommentHandler) {
            pc := s.Plugins.PluginClient
            pc.Logger = log.WithField("plugin", p)
            pc.Config = s.ConfigAgent.Config()
            pc.PluginConfig = s.Plugins.Config()
            if err := h(pc, *ce); err != nil {
                pc.Logger.WithError(err).Error("Error handling GenericCommentEvent.")
            }
        }(p, h)
    }
}


// List of enabled plugins: k8s.io/test-infra/prow/hook/plugins.go

import (
    _ "k8s.io/test-infra/prow/plugins/assign"
    _ "k8s.io/test-infra/prow/plugins/cla"
    _ "k8s.io/test-infra/prow/plugins/close"
    _ "k8s.io/test-infra/prow/plugins/golint"
    _ "k8s.io/test-infra/prow/plugins/heart"
    _ "k8s.io/test-infra/prow/plugins/hold"
    _ "k8s.io/test-infra/prow/plugins/label"
    _ "k8s.io/test-infra/prow/plugins/lgtm"
    _ "k8s.io/test-infra/prow/plugins/releasenote"
    _ "k8s.io/test-infra/prow/plugins/reopen"
    _ "k8s.io/test-infra/prow/plugins/shrug"
    _ "k8s.io/test-infra/prow/plugins/size"
    _ "k8s.io/test-infra/prow/plugins/slackevents"
    _ "k8s.io/test-infra/prow/plugins/trigger"
    _ "k8s.io/test-infra/prow/plugins/updateconfig"
    _ "k8s.io/test-infra/prow/plugins/wip"
    _ "k8s.io/test-infra/prow/plugins/yuks"
)

plugin-lgtm

// k8s.io/test-infra/prow/plugins/lgtm/lgtm.go
// Plugin types: a plugin basically implements one or more of the handlers below
genericCommentHandlers     = map[string]GenericCommentHandler{}
issueHandlers              = map[string]IssueHandler{}
issueCommentHandlers       = map[string]IssueCommentHandler{}
pullRequestHandlers        = map[string]PullRequestHandler{}
pushEventHandlers          = map[string]PushEventHandler{}
reviewEventHandlers        = map[string]ReviewEventHandler{}
reviewCommentEventHandlers = map[string]ReviewCommentEventHandler{}
statusEventHandlers        = map[string]StatusEventHandler{}

// For example, lgtm mainly checks whether an "lgtm (cancel)" command is valid, assigns the issue,
// adds or removes the lgtm label, and creates a comment if something is wrong
func init() {
    plugins.RegisterIssueCommentHandler(pluginName, handleIssueComment)
    plugins.RegisterReviewEventHandler(pluginName, handleReview)
    plugins.RegisterReviewCommentEventHandler(pluginName, handleReviewComment)
}
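The blank imports in hook/plugins.go and the `init` functions in each plugin work together as a registry: importing a plugin package runs its `init`, which adds its handlers to a package-level map. A single-file sketch of that pattern follows. The handler signature and the `registered` helper are hypothetical, and both registrations share one `init` here because the example has only one package.

```go
package main

import (
	"fmt"
	"sort"
)

// IssueCommentHandler is a hypothetical, simplified handler signature.
type IssueCommentHandler func(body string) error

// issueCommentHandlers plays the role of the package-level maps above.
var issueCommentHandlers = map[string]IssueCommentHandler{}

// RegisterIssueCommentHandler is what each plugin calls from init().
func RegisterIssueCommentHandler(name string, h IssueCommentHandler) {
	issueCommentHandlers[name] = h
}

// In the real layout each plugin lives in its own package and is pulled in
// with a blank import; here both registrations share one init().
func init() {
	RegisterIssueCommentHandler("lgtm", func(string) error { return nil })
	RegisterIssueCommentHandler("assign", func(string) error { return nil })
}

// registered lists the registered plugin names in sorted order.
func registered() []string {
	names := make([]string, 0, len(issueCommentHandlers))
	for n := range issueCommentHandlers {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(registered()) // prints "[assign lgtm]"
}
```

With this design, enabling a plugin in the binary is a one-line import, and the hook server does not need to know plugin names at compile time.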

plugin-assign

// Handles (un)assign xx and (un)cc; assigns the issue to xxx
func init() {
    plugins.RegisterIssueCommentHandler(pluginName, handleIssueComment)
    plugins.RegisterIssueHandler(pluginName, handleIssue)
    plugins.RegisterPullRequestHandler(pluginName, handlePullRequest)
}

plugin-golint

// Checks out the code with git and runs golint on the changed code
func init() {
    plugins.RegisterIssueCommentHandler(pluginName, handleIC)
}

plugin-heart

I could not figure out what this one does.

plugin-hold

Adds or removes the do-not-merge/hold label.

plugin-label

The most heavily used plugin; many commands are implemented by it, such as area, sig, kind...

(?m)^/(area|priority|kind|sig)\s*(.*)$
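To see what that expression captures, the sketch below applies it to a comment body with Go's `regexp` package and turns each match into `prefix/name` labels, following the `area/<>` naming from the command table. The `labelsFrom` helper is hypothetical; the plugin's own post-processing may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// labelRegex is the expression quoted above. (?m) makes ^ and $ match at
// line boundaries, so each command line is matched on its own.
var labelRegex = regexp.MustCompile(`(?m)^/(area|priority|kind|sig)\s*(.*)$`)

// labelsFrom turns "/area kubectl api" into "area/kubectl" and "area/api".
func labelsFrom(body string) []string {
	var labels []string
	for _, m := range labelRegex.FindAllStringSubmatch(body, -1) {
		// m[1] is the label prefix, m[2] the space-separated names.
		for _, name := range strings.Fields(m[2]) {
			labels = append(labels, m[1]+"/"+name)
		}
	}
	return labels
}

func main() {
	body := "/area kubectl api\nlooks good to me\n/sig node"
	fmt.Println(labelsFrom(body)) // prints "[area/kubectl area/api sig/node]"
}
```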

plugin-trigger

The commands this plugin handles are important and somewhat special: they trigger ProwJobs.

// Handles /retest and /ok-to-test

func init() {
    plugins.RegisterIssueCommentHandler(pluginName, handleIssueComment)
    plugins.RegisterPullRequestHandler(pluginName, handlePullRequest)
    plugins.RegisterPushEventHandler(pluginName, handlePush)
}

// Handling of PRs: k8s.io/test-infra/prow/plugins/trigger/pr.go
func handlePR(c client, trustedOrg string, pr github.PullRequestEvent) error {
    author := pr.PullRequest.User.Login
    switch pr.Action {
    case github.PullRequestActionOpened:
        // ismember -> buildAll
        // else welcome
    case github.PullRequestActionReopened, github.PullRequestActionSynchronize:
        // if trusted -> buildAll
    case github.PullRequestActionLabeled:
        // When a PR is LGTMd, if it is untrusted then build it once.
    }
    return nil
}

// buildAll -> CreateProwJob


// Handling of the /retest comment:
// collects the failed statuses -> a presubmit structure -> ProwJob
// See the GitHub status API: https://developer.github.com/v3/repos/statuses/
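The /retest step of collecting failed statuses can be sketched as a simple filter. The GitHub status API reports one of the states `success`, `failure`, `error` or `pending` per context. The `status` type, the `failedContexts` helper, and the choice to treat both `failure` and `error` as failed are assumptions of this example, and the context names are made up.

```go
package main

import "fmt"

// status is a hypothetical, trimmed-down view of a GitHub commit status.
type status struct {
	Context string // name of the check, typically the job name
	State   string // "success", "failure", "error" or "pending"
}

// failedContexts returns the contexts that should be rerun. Treating both
// "failure" and "error" as failed is an assumption of this sketch.
func failedContexts(statuses []status) []string {
	var failed []string
	for _, s := range statuses {
		if s.State == "failure" || s.State == "error" {
			failed = append(failed, s.Context)
		}
	}
	return failed
}

func main() {
	statuses := []status{
		{Context: "unit", State: "success"},
		{Context: "e2e-gce", State: "failure"},
		{Context: "verify", State: "error"},
		{Context: "node-e2e", State: "pending"},
	}
	fmt.Println(failedContexts(statuses)) // prints "[e2e-gce verify]"
}
```

Each returned context would then be matched to a presubmit definition and turned into a ProwJob, as the comments above outline.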

horologium

starts periodic jobs when necessary.

jenkins-operator

The controller that manages jobs running in Jenkins. It basically syncs the state of the Jenkins job to the status of the ProwJob.

mkpj

A command for creating ProwJobs manually.

plank

The controller that manages jobs running in k8s pods. It syncs the state of the pod to the status of the ProwJob.

sinker

cleans up old jobs and pods.

splice

regularly schedules batch jobs.

tide

todo

tot

todo

Planter

Planter is a container + wrapper script for your bazel builds.
It will run a docker container as the current user that can run bazel builds
in your $PWD.

mungegithub

A deprecated system that predates prow.

metrics

todo

maintenance/migratestatus

Status Context Migrator
The migratestatus tool is a maintenance utility used to safely switch a repo from one status context to another.
For example, if there is a context named "CI tests" that needs to be replaced by "CI tests v2", this tool can be used to copy every "CI tests" status into a "CI tests v2" status context and then mark every "CI tests" context as passing and retired. This ensures that no PRs are ever passing when they shouldn't be and doesn't block PRs that should be passing. The copy and retire phases can be run separately or together at once in move mode.
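The copy, retire and move phases can be sketched on an in-memory map of context name to state for a single PR. The `statuses` type and the three functions are hypothetical. Skipping the copy when the destination already exists is an assumption of this sketch, and the real tool works through the GitHub API rather than on a map.

```go
package main

import "fmt"

// statuses maps a context name to its state for one commit.
type statuses map[string]string

// copyContext creates dst with the same state as src. It does nothing if
// src is missing or dst already exists (an assumption of this sketch).
func copyContext(s statuses, src, dst string) {
	state, ok := s[src]
	if !ok {
		return
	}
	if _, exists := s[dst]; !exists {
		s[dst] = state
	}
}

// retireContext marks src as passing so it can no longer block a merge.
func retireContext(s statuses, src string) {
	if _, ok := s[src]; ok {
		s[src] = "success"
	}
}

// moveContext is copy followed by retire.
func moveContext(s statuses, src, dst string) {
	copyContext(s, src, dst)
	retireContext(s, src)
}

func main() {
	pr := statuses{"CI tests": "failure"}
	moveContext(pr, "CI tests", "CI tests v2")
	fmt.Println(pr["CI tests"], pr["CI tests v2"]) // prints "success failure"
}
```

Copying before retiring is what keeps a failing PR blocked throughout: the failure is already recorded under the new context by the time the old one turns green.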

LogExporter

LogExporter is a tool that runs post-test on our kubernetes test clusters.
It does the job of computing the set of logfiles to be exported (based on the
node type (master/node), cloud provider, and the node's system services), and
then actually exports them to the GCS path provided to it.

In short, it uploads log files to GCS.

label_sync

todo

kubetest

Kubetest is the interface for launching and running e2e tests.

See github.com/kubernetes/…

KETTLE

Kubernetes Extract Tests/Transform/Load Engine

This collects test results scattered across a variety of GCS buckets,
stores them in a local SQLite database, and outputs newline-delimited JSON files
for import into BigQuery.
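Newline-delimited JSON means one complete JSON object per line. In Go, `json.Encoder.Encode` writes a newline after each value, so the last stage of such a pipeline can be sketched as below. The `build` record and its fields are hypothetical; the real schema is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// build is a hypothetical record; the real schema is not shown here.
type build struct {
	Job    string `json:"job"`
	Number int    `json:"number"`
	Passed bool   `json:"passed"`
}

// toNDJSON writes one JSON object per line.
func toNDJSON(builds []build) (string, error) {
	var sb strings.Builder
	enc := json.NewEncoder(&sb)
	for _, b := range builds {
		// Encode appends a newline after each object.
		if err := enc.Encode(b); err != nil {
			return "", err
		}
	}
	return sb.String(), nil
}

func main() {
	out, err := toNDJSON([]build{
		{Job: "e2e-gce", Number: 1, Passed: true},
		{Job: "e2e-gce", Number: 2, Passed: false},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Because every line is a complete object, the file can be streamed and split by line without loading it all into memory.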

jenkins

Deprecated; see github.com/kubernetes/…

Jenkins jobs are no longer created directly; ProwJobs are used instead.

gubernator

The front end for k8s-gubernator.appspot.com/, probably the most important front end in test-infra. The links in status checks, such as k8s-gubernator.appspot.com/build/kuber…, also point here.

gcsweb

Reposted from: https://juejin.im/post/5a1141a7f265da43284072de
