Spring Cloud 全链路日志traceId

​ 随着业务量的增加,线上出现越来越多的bug,但是由于使用的是Spring Cloud,微服务之间调用,输出的日志没有固定上下文管理,定位具体问题存在诸多不便,因此相当有必要引入全链路日志traceId。

MDC(Mapped Diagnostic Context,映射调试上下文)是 log4j 、logback及log4j2 提供的一种方便在多线程条件下记录日志的功能。

借助MDC可以非常方便的将traceId自动输出到日志中,无需手动拼写traceId,代码简单,风格统一。

基本用法

这里使用spirngBoot进行测试

@RunWith(SpringRunner.class)
@Log4j2
public class Test {
    @Test
    public void testMDC() throws InterruptedException {
       MDC.put("traceId", UUID.randomUUID().toString());
        log.info("开始调用服务A,进行业务处理");
        log.info("业务处理完毕,可以释放空间了,避免内存泄露");
    	//MDC put 一定要对应一个 remove
        MDC.remove("traceId");
        log.info("traceId {}", MDC.get("traceId"));
    }
}

在了log4j.xml 或 logback.xml中进行配置,%X{traceId} 获取traceId信息。 以log4j2为例:

<?xml version="1.0" encoding="UTF-8"?>
<configuration status="DEBUG" monitorInterval="30">
    <Properties>
        <property name="log_pattern">[%t] [%X{traceId}] %m %n</property>
    </Properties>
    <appenders>
        <console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="${log_pattern}"/>
        </console>
    </appenders>
</configuration>
[main] [08cf04bd-795f-4fb1-83d5-a137dc222af8] 开始调用服务A,进行业务处理 
[main] [08cf04bd-795f-4fb1-83d5-a137dc222af8] 业务处理完毕,可以释放空间了,避免内存泄露 
[main] [] traceId  

可以看到在执行MDC put 后,日志中输出同样的traceId,MDC remove后日志中不在输出。

下面再来看下另外一个场景

@RunWith(SpringRunner.class)
@Log4j2
public class Test {
    @Test
    public void testMDC() throws InterruptedException {
        new RunTest("A").start();
        new RunTest("B").start();
        Thread.sleep(20000);
    }
	class RunTest extends Thread {
        private String code;
        public RunTest(String code) {
            this.code = code;
        }
        @Override
        public void run() {
            MDC.put("traceId", UUID.randomUUID().toString());
            log.info("开始调用服务{}", code);
            try {
                Thread.sleep(10000);
            } catch (InterruptedException e) {
                log.info(e.getMessage());
            }
            log.info("服务{}处理完毕", code);
            MDC.remove("traceId");
        }
}
[Thread-1] [bdbe4aa7-60d4-4528-bc45-928aa94e3660] 开始调用服务A 
[Thread-2] [d3d0fa7d-6204-4f35-b6ed-01a9a8e2a42f] 开始调用服务B 
[Thread-2] [d3d0fa7d-6204-4f35-b6ed-01a9a8e2a42f] 服务B处理完毕 
[Thread-1] [bdbe4aa7-60d4-4528-bc45-928aa94e3660] 服务A处理完毕 

从上面结果可以看出,MDC是和线程绑定在一起的。

MDC的特性
优点:
  • 代码简洁,日志风格统一,不需要在log打印中手动拼写traceId,即LOGGER.info("traceId:{} ", traceId)
MDC 存在的问题
  • 子线程中打印日志丢失
  • 跨服务调用打印日志丢失
使用时需要注意点
  • MDC put 操作一定要对应有一个 remove操作
MDC 的实现原理

​ 要想知道 MDC为什么会有上面的表现,有必要了解下MDC的实现原理。

​ MDC 来源于slf4j包,SLF4J,全称是 Simple Logging Facade for Java,翻译过来就是「一套简单的日志门面」。是为了让研发人员在项目中切换日志组件的方便,特意抽象出的一层。其中 Logger 就来自于 SLF4J 的规范包,项目中一旦这样定义 Logger,在底层就可以无缝切换 logback、log4j等日志组件。

​ 查看源码可以看到, MDC 主要是通过 MDCAdapter 来完成 put、get、remove 等操作。

public class MDC {
    static final String NULL_MDCA_URL = "http://www.slf4j.org/codes.html#null_MDCA";
    static final String NO_STATIC_MDC_BINDER_URL = "http://www.slf4j.org/codes.html#no_static_mdc_binder";
    static MDCAdapter mdcAdapter;
    public static void put(String key, String val) throws IllegalArgumentException {
        if (key == null) {
            throw new IllegalArgumentException("key parameter cannot be null");
        }
        if (mdcAdapter == null) {
            throw new IllegalStateException("MDCAdapter cannot be null. See also " + NULL_MDCA_URL);
        }
        mdcAdapter.put(key, val);
    }
    public static String get(String key) throws IllegalArgumentException {
        if (key == null) {
            throw new IllegalArgumentException("key parameter cannot be null");
        }
        if (mdcAdapter == null) {
            throw new IllegalStateException("MDCAdapter cannot be null. See also " + NULL_MDCA_URL);
        }
        return mdcAdapter.get(key);
    }
    private static MDCAdapter bwCompatibleGetMDCAdapterFromBinder() throws NoClassDefFoundError {
        try {
            return StaticMDCBinder.getSingleton().getMDCA();
        } catch (NoSuchMethodError nsme) {
            // binding is probably a version of SLF4J older than 1.7.14
            return StaticMDCBinder.SINGLETON.getMDCA();
        }
    }

    static {
        try {
            mdcAdapter = bwCompatibleGetMDCAdapterFromBinder();
        } catch (NoClassDefFoundError ncde) {
            mdcAdapter = new NOPMDCAdapter();
            String msg = ncde.getMessage();
            if (msg != null && msg.contains("StaticMDCBinder")) {
                Util.report("Failed to load class \"org.slf4j.impl.StaticMDCBinder\".");
                Util.report("Defaulting to no-operation MDCAdapter implementation.");
                Util.report("See " + NO_STATIC_MDC_BINDER_URL + " for further details.");
            } else {
                throw ncde;
            }
        } catch (Exception e) {
            // we should never get here
            Util.report("MDC binding unsuccessful.", e);
        }
    }
 	// ....
}

在这里插入图片描述

​ MDCAdapter 使用到的实现类是 Log4jMDCAdapter

Log4jMDCAdapter.class

public class Log4jMDCAdapter implements MDCAdapter {

    @Override
    public void put(final String key, final String val) {
        ThreadContext.put(key, val);
    }

    @Override
    public String get(final String key) {
        return ThreadContext.get(key);
    }
   //...
}

可以看到 Log4jMDCAdapter所有的操作都是借助ThreadContext来实现的。

org.apache.logging.log4j.spi.DefaultThreadContextMap#createThreadLocalMap

ThreadContext默认情况下,使用纯ThreadLocal。

根据ThreadLocal的特性我们不难理解MDC的特性

如果你使用logback MDCAdapter 的实现是 LogbackMDCAdapter ,也是使用 TreadLocal 实现

Spring Cloud 引入MDC

​ Spring Cloud 使用 spring gateway + fegin + log4j2

​ 首先需要知道的几点

  • 子线程中打印日志丢失

  • 跨服务调用打印日志丢失

  • MDC put 操作一定要对应有一个 remove操作

我们大致梳理下那些地方需要传递 traceId。一般的微服务大致以如下架构(这个仅考虑需要传递 traceId的地方)
在这里插入图片描述

  1. http调入 ,调出

  2. 微服务调用

  3. 线程调用
    根据如上几点,我们需要一一进行配置、

1.http调入调出

http 调入情况

http调用traceId一般都是配置再header信息上。

gateway配置,增加 GlobalFilter,代码如下

@Log4j2
@Component
public class LogGlobalFilter implements GlobalFilter, Ordered {

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        ServerHttpRequest request = exchange.getRequest();
        Object traceId = request.getHeaders().get(AppConstant.TRACE_ID);
        if (traceId == null) {
            traceId = IdUtil.objectId();
            request = exchange.getRequest().mutate()
                        .header(AppConstant.TRACE_ID, traceId.toString())
                        .build();
        }
        ServerHttpResponse response = exchange.getResponse();
        response.getHeaders().add(AppConstant.TRACE_ID,traceId.toString());
        exchange = exchange.mutate().request(request).response(response).build();
        return chain.filter(exchange);
    }

    @Override
    public int getOrder() { return 0; }
}

微服务 spring boot 配置

需要添加拦截器拿到request header中的traceId并添加到MDC中

public class LogInterceptor implements HandlerInterceptor {
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
       //如果有上层调用就用上层的ID
        String traceId = request.getHeader(AppConstant.TRACE_ID);
        if (traceId == null) {
            traceId = IdUtil.objectId();
        }
        MDC.put(AppConstant.TRACE_ID, traceId);
        return true;
    }
    @Override
    public void postHandle(HttpServletRequest request, HttpServletResponse response, Object handler, ModelAndView modelAndView)
            throws Exception {
    }
    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex)
            throws Exception {
        MDC.remove(AppConstant.TRACE_ID);
    }
}

对于http调出的情况

HttpClient: HttpClient拦截器

public class HttpClientTraceIdInterceptor implements HttpRequestInterceptor {
    @Override
    public void process(HttpRequest httpRequest, HttpContext httpContext) throws HttpException, IOException {
        String traceId = MDC.get(AppConstant.TRACE_ID);
        //当前线程调用中有traceId,则将该traceId进行透传
        if (traceId != null) {
            //添加请求体
            httpRequest.addHeader(AppConstant.TRACE_ID, traceId);
        }
    }
}

通过addInterceptorFirst方法为HttpClient添加拦截器

private static CloseableHttpClient httpClient = HttpClientBuilder.create()
              .addInterceptorFirst(new HttpClientTraceIdInterceptor())

OKHttp:实现OKHttp拦截器

public class OkHttpTraceIdInterceptor implements Interceptor {
      @Override
      public Response intercept(Chain chain) throws IOException {
          String traceId = MDC.get(AppConstant.TRACE_ID);
          Request request = null;
          if (StringUtil.isNotEmpty(traceId)) {
              //添加请求体
              request = chain.request().newBuilder().addHeader(AppConstant.TRACE_ID, traceId).build();
          }
          Response originResponse = chain.proceed(request);
          return originResponse;
      }
}

为OkHttp添加拦截器

private static OkHttpClient client = new OkHttpClient.Builder()
              .addNetworkInterceptor(new OkHttpTraceIdInterceptor())
              .build();

RestTemplate: 实现RestTemplate拦截器

  public class RestTemplateTraceIdInterceptor implements ClientHttpRequestInterceptor {
      @Override
      public ClientHttpResponse intercept(HttpRequest httpRequest, byte[] bytes, ClientHttpRequestExecution clientHttpRequestExecution) throws IOException {
          String traceId = MDC.get(AppConstant.TRACE_ID);
          if (traceId != null) {
              httpRequest.getHeaders().add(AppConstant.TRACE_ID, traceId);
          }
  
          return clientHttpRequestExecution.execute(httpRequest, bytes);
      }
}

为RestTemplate添加拦截器

  restTemplate.setInterceptors(Arrays.asList(new RestTemplateTraceIdInterceptor()));
2.微服务之间调用

由于是 fegin 进行微服务调用,可以通过Feign的拦截器RequestInterceptor实现,Feign调用实际上是requestTemplate http调用。

@Slf4j
@Component
public class FeignLogInterceptor implements RequestInterceptor {
    @Override
    public void apply(RequestTemplate template) {
 		String traceId = MDC.get(AppConstant.TRACE_ID);
        if(StringUtil.isNotEmpty(traceId)){
            requestTemplate.header(AppConstant.TRACE_ID, traceId);
        }
    }
}

其他微服务调用思路类似

grpc 可以通过 interceptor 实现

dubbo 和 motan 可以通过 filter 实现

3.服务内部线程间调用

服务间调用就是存在子线程的时候调用需要传递traceId。线程调用我们一般都是通过线程池调用,可以通过重写线程池传递traceId。

线程池封装:ThreadPoolExecutorMdcWrapper.java

public class ThreadPoolExecutorMdcWrapper extends ThreadPoolExecutor {
      public ThreadPoolExecutorMdcWrapper(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
                                          BlockingQueue<Runnable> workQueue) {
          super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
      }
  
      public ThreadPoolExecutorMdcWrapper(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
                                          BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory) {
          super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, threadFactory);
      }
  
      public ThreadPoolExecutorMdcWrapper(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
                                          BlockingQueue<Runnable> workQueue, RejectedExecutionHandler handler) {
          super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, handler);
      }
  
      public ThreadPoolExecutorMdcWrapper(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
                                          BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory,
                                          RejectedExecutionHandler handler) {
          super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, threadFactory, handler);
      }
      @Override
      public void execute(Runnable task) {
          super.execute(wrap(task, MDC.getCopyOfContextMap()));
      }
      @Override
      public <T> Future<T> submit(Runnable task, T result) {
          return super.submit(wrap(task, MDC.getCopyOfContextMap()), result);
      }
      @Override
      public <T> Future<T> submit(Callable<T> task) {
          return super.submit(wrap(task, MDC.getCopyOfContextMap()));
      }
      @Override
      public Future<?> submit(Runnable task) {
          return super.submit(wrap(task, MDC.getCopyOfContextMap()));
      }
      public  <T> Callable<T> wrap(final Callable<T> callable, final Map<String, String> context) {
          return () -> {
              if (context == null) {
                  MDC.clear();
              } else {
                  MDC.setContextMap(context);
              }
              setTraceIdIfAbsent();
              try {
                  return callable.call();
              } finally {
                  MDC.clear();
              }
          };
      }
  
      public  Runnable wrap(final Runnable runnable, final Map<String, String> context) {
          return () -> {
              if (context == null) {
                  MDC.clear();
              } else {
                  MDC.setContextMap(context);
              }
              setTraceIdIfAbsent();
              try {
                  runnable.run();
              } finally {
                  MDC.clear();
              }
          };
      }
      public  void setTraceIdIfAbsent() {
          if (MDC.get(AppConstant.TRACE_ID) == null) {
              MDC.put(AppConstant.TRACE_ID, IdUtil.getTraceId());
          }
      }
}

线程运行前put TRACE_ID ,线程运行完成后 MDC.clear()

4.其他方式调用

​ 有些框架调用可能不在上面的场景上,这种可以通过 AOP实现。

​ 如 xxl-job 调用,这种可以通过AOP实现。

/**
 * 日志拦截器
 */
@Component
@Aspect
public class LogAspect {
    /**
     * 拦截入口
     */
    @Pointcut("@annotation(com.xxl.job.core.handler.annotation.XxlJob)")
    public void pointCut() {
    }
    /**
     * 拦截处理
     * @param point point 信息
     * @return result
     * @throws Throwable if any
     */
    @Around("pointCut()")
    public Object around(ProceedingJoinPoint point) throws Throwable{
        try {
             //添加 TRACE_ID
        	MDC.put(AppConstant.TRACE_ID, IdUtil.objectId());
            return point.proceed();
        }finally {
            //移除 TRACE_ID
            MDC.remove(AppConstant.TRACE_ID);
        }
    }
}

参考:https://cloud.tencent.com/developer/article/1951233

​ https://cloud.tencent.com/developer/article/1617911

Logo

权威|前沿|技术|干货|国内首个API全生命周期开发者社区

更多推荐