[Source patch] Hadoop 3.3.1: failed with status code 401, Response message: Authentication required

 
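In this article:
We first look at how Hadoop authenticates web requests as a whole, then walk through the source code together with actual execution logs to understand the authentication logic, and finally patch the source in a targeted way, based on the HTTP requests Hadoop uses for internal communication, so that simple authentication works on Hadoop 3.3.1.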

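I. Problem description

For details, see: [Operations and fixes] Hadoop 3.3.1 bug fix: failed with status code 401 Response message: Authentication required
That article describes the problem: after security authentication was enabled during the Hadoop 3.3.1 installation, authentication of Hadoop's internal communication failed and the cluster could not start.

This article focuses on fixing authentication for Hadoop's internal communication.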

 

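II. Problem analysis

How Hadoop simple authentication works

1. Reading the official documentation

The goal of reading the official documentation is to understand the overall logic of simple authentication and which daemon roles issue HTTP requests. That gives us the basis for a minimal, targeted source change.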

1.1. Authentication for Hadoop HTTP web-consoles

Official documentation (1): Authentication for Hadoop HTTP web-consoles
 
By default Hadoop HTTP web-consoles (ResourceManager, NameNode, NodeManagers and DataNodes) allow access without any form of authentication.
Hadoop HTTP web-consoles support the equivalent of Hadoop’s Pseudo/Simple authentication. If this option is enabled, the user name must be specified in the first browser interaction using the user.name query string parameter. e.g. http://localhost:8088/cluster?user.name=babu.

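What we learn from this:

  1. By default, HTTP requests to the ResourceManager, NameNode, NodeManagers and DataNodes require no authentication at all; requests are only validated once Hadoop's simple authentication is configured.
  2. How to configure simple authentication: see [Configuration/Authentication] Authentication for Hadoop (3.3.1) HTTP web-consoles: Hadoop simple authentication is not a silver bullet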

 

1.2. Hadoop Auth, Java HTTP SPNEGO

Official documentation: Hadoop Auth, Java HTTP SPNEGO

For Hadoop Auth, we only look at what is relevant to simple authentication.

Hadoop Auth also supports additional authentication mechanisms on the client and the server side via 2 simple interfaces.
 
How Does Auth Works?
Hadoop Auth enforces authentication on protected resources, once authentiation has been established it sets a signed HTTP Cookie that contains an authentication token with the user name, user principal, authentication type and expiration time.
 
Subsequent HTTP client requests presenting the signed HTTP Cookie have access to the protected resources until the HTTP Cookie expires.
 
The secret used to sign the HTTP Cookie has multiple implementations that provide different behaviors, including a hardcoded secret string, a rolling randomly generated secret, and a rolling randomly generated secret synchronized between multiple servers using ZooKeeper.

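Useful points from the official documentation:

  1. Hadoop Auth lets both the client side and the server side plug in additional authentication mechanisms through two simple interfaces.
  2. How Hadoop Auth authenticates:
    Once authentication succeeds, Hadoop Auth sets a signed HTTP cookie. The cookie carries an authentication token containing the user name, user principal, authentication type and expiration time.
    A client that presents this cookie on later requests can access the protected resources until the cookie expires.
    The secret that signs the cookie can be a hardcoded string, a rolling randomly generated secret, or a rolling secret synchronized across servers through ZooKeeper.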
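The signed-cookie idea can be illustrated with a small sketch. This is not Hadoop's own signer class: the class name `SignedTokenSketch`, the `&s=` separator, the token layout and the demo secret are all assumptions made for illustration. It only shows why a cookie signed with a server-side secret cannot be altered by the client without detection.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: signs a token string with HMAC-SHA256 and verifies it again. */
class SignedTokenSketch {
    // Hypothetical separator between the token text and its signature.
    private static final String SEPARATOR = "&s=";

    /** Returns the token followed by the separator and its Base64 signature. */
    static String sign(String token, byte[] secret) {
        return token + SEPARATOR + hmac(token, secret);
    }

    /** Returns the original token if the signature matches, otherwise null. */
    static String verifyAndExtract(String signed, byte[] secret) {
        int idx = signed.lastIndexOf(SEPARATOR);
        if (idx < 0) {
            return null; // no signature present
        }
        String token = signed.substring(0, idx);
        byte[] presented = signed.substring(idx + SEPARATOR.length())
                .getBytes(StandardCharsets.UTF_8);
        byte[] expected = hmac(token, secret).getBytes(StandardCharsets.UTF_8);
        // MessageDigest.isEqual avoids leaking where the first mismatch occurs.
        return MessageDigest.isEqual(expected, presented) ? token : null;
    }

    private static String hmac(String data, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] raw = mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        byte[] secret = "demo-secret".getBytes(StandardCharsets.UTF_8);
        // Hypothetical token layout: user, principal, type, expiry in milliseconds.
        String token = "u=foo&p=foo&t=simple&e=" + (System.currentTimeMillis() + 36_000_000L);
        String cookieValue = sign(token, secret);
        System.out.println("valid:    " + (verifyAndExtract(cookieValue, secret) != null));
        System.out.println("tampered: " + (verifyAndExtract(cookieValue + "x", secret) != null));
    }
}
```

Rolling the secret, as the documentation describes, would mean that `verifyAndExtract` has to try the current and the previous secret; that part is left out here.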

 
Next, let's look at the examples of how Hadoop Auth requests are built on the client and handled on the server.

Hadoop Auth, Java HTTP SPNEGO - Examples

There are two ways to issue a Hadoop Auth request:

  1. Use the AuthenticatedURL class to obtain an authenticated HTTP connection:
...
URL url = new URL("http://localhost:8080/hadoop-auth/kerberos/who");
AuthenticatedURL.Token token = new AuthenticatedURL.Token();
...
HttpURLConnection conn = new AuthenticatedURL().openConnection(url, token);
...
conn = new AuthenticatedURL().openConnection(url, token);
...
  2. Accessing the server using curl
# anonymous requests need no authentication
$ curl http://localhost:8080/hadoop-auth-examples/anonymous/who
# a regular simple-auth request must add user.name=xxx
$ curl "http://localhost:8080/hadoop-auth-examples/simple/who?user.name=foo"

 
 

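2. A brief look at the source

From the official documentation and the related logs, we know that the Hadoop Auth logic lives mainly in the hadoop-auth module.

hadoop-auth implements both the client-side and the server-side logic:

  1. The client side builds and wraps authenticated requests (cookie with auth).
  2. The server side validates each incoming request.

Next we go through the main logic of the client and the server in detail: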
 

1. How the client wraps an authenticated (cookie with auth) request

Following on from 1.2. Hadoop Auth, Java HTTP SPNEGO, this is how a request is built in code:

 * // establishing an initial connection
 *
 * URL url = new URL("http://foo:8080/bar");
 * AuthenticatedURL.Token token = new AuthenticatedURL.Token();
 * AuthenticatedURL aUrl = new AuthenticatedURL();
 * HttpURLConnection conn = new AuthenticatedURL().openConnection(url, token);
 * ....
 * // use the 'conn' instance
 * ....
 *
 * // establishing a follow up connection using a token from the previous connection
 *
 * HttpURLConnection conn = new AuthenticatedURL().openConnection(url, token);
 * ....
 * // use the 'conn' instance

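AuthenticatedURL carries the main request-building logic: 1. create a token object; 2. openConnection wraps the request (for example with simple authentication) and sends it.

Here is the logic of AuthenticatedURL.openConnection(URL url, Token token):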

  /**
   * Returns an authenticated {@link HttpURLConnection}.
   *
   * @param url the URL to connect to. Only HTTP/S URLs are supported.
   * @param token the authentication token being used for the user.
   */
  public HttpURLConnection openConnection(URL url, Token token) throws IOException, AuthenticationException {
    // argument validation
    if (url == null) {
      throw new IllegalArgumentException("url cannot be NULL");
    }
    if (!url.getProtocol().equalsIgnoreCase("http") && !url.getProtocol().equalsIgnoreCase("https")) {
      throw new IllegalArgumentException("url must be for a HTTP or HTTPS resource");
    }
    if (token == null) {
      throw new IllegalArgumentException("token cannot be NULL");
    }
    authenticator.authenticate(url, token);

    // allow the token to create the connection with a cookie handler for managing session cookies,
    // which ties the connection and the cookie handler together through the token instance.
    return token.openConnection(url, connConfigurator);
  }

The key line:

 authenticator.authenticate(url, token);

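authenticator: the implementation for a given authentication type; common ones are kerberos and simple, or a custom one.
authenticate: the method that actually wraps the request.

Let's look at the simple implementation: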

org.apache.hadoop.security.authentication.client.PseudoAuthenticator
  /**
   * Adds simple authentication (user.name=xxx) to the given URL and parses the response.
   */
  public void authenticate(URL url, AuthenticatedURL.Token token) throws IOException, AuthenticationException {
    String strUrl = url.toString();
    String paramSeparator = (strUrl.contains("?")) ? "&" : "?";
    // user.name=xxx is appended here
    strUrl += paramSeparator + USER_NAME_EQ + getUserName();
    url = new URL(strUrl);
    HttpURLConnection conn = token.openConnection(url, connConfigurator);
    conn.setRequestMethod("OPTIONS");
    // open the connection and send the request
    conn.connect();
    // parse the response
    AuthenticatedURL.extractToken(conn, token);
  }

 

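To summarize how the client builds a request under simple authentication: depending on whether the URL already has a query string, it appends the credential (user.name=xxx) with "?" or "&", then sends the request and parses the response. It also attaches a cookie handler to the request so that cookies can be managed.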
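The URL rewriting step on its own can be sketched as follows. The class and method names (`PseudoAuthUrlSketch`, `withUserName`) are made up for illustration; the separator rule mirrors the `paramSeparator` line in the code above, and like that code the sketch does not URL-encode the user name.

```java
/** Illustrative only: appends user.name=<user> the way the code above does. */
class PseudoAuthUrlSketch {
    private static final String USER_NAME_EQ = "user.name=";

    /** Uses "&" when the URL already has a query string, "?" otherwise. */
    static String withUserName(String url, String userName) {
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + USER_NAME_EQ + userName;
    }

    public static void main(String[] args) {
        System.out.println(withUserName("http://foo:8080/bar", "alice"));
        System.out.println(withUserName("http://foo:8080/imagetransfer?getimage=1&txid=0", "hdfs"));
    }
}
```

A request that never passes through this step reaches the server without user.name and without a cookie, which is exactly the situation the server-side filter has to decide on.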

 
 

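2. How the server intercepts requests

On the server side, AuthenticationFilter implements the javax.servlet.Filter interface; it intercepts each request and runs the authentication check.

First, a quick look at how a Filter works
Related resource: How Servlet Filters are implemented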

public abstract interface Filter{
    // called once by the container after the filter is instantiated, before it handles any request
    public abstract void init(FilterConfig paramFilterConfig) throws ServletException;
    // intercepts each request; the filter logic runs here and decides whether to continue the chain
    public abstract void doFilter(ServletRequest paramServletRequest, ServletResponse paramServletResponse, FilterChain paramFilterChain) throws IOException, ServletException;
    // called when the filter is taken out of service
    public abstract void destroy();
}
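To make the interception idea concrete without pulling in the servlet API, here is a toy model in plain Java. `MiniFilter`, `dispatch` and the `AUTH` filter are hypothetical stand-ins, not the real `javax.servlet` types: a filter either calls `next` to let the request continue, or stops the chain and answers by itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a servlet filter chain; not the javax.servlet API. */
class FilterChainSketch {
    /** Hypothetical stand-in for Filter.doFilter, reduced to strings. */
    interface MiniFilter {
        void doFilter(String request, List<String> trace, Runnable next);
    }

    /** Runs the filters in order; "target" is reached only if every filter calls next. */
    static List<String> dispatch(String request, List<MiniFilter> filters) {
        List<String> trace = new ArrayList<>();
        run(request, filters, 0, trace);
        return trace;
    }

    private static void run(String request, List<MiniFilter> filters, int index, List<String> trace) {
        if (index == filters.size()) {
            trace.add("target");
            return;
        }
        filters.get(index).doFilter(request, trace, () -> run(request, filters, index + 1, trace));
    }

    /** Lets a request through only if it carries a user.name parameter. */
    static final MiniFilter AUTH = (request, trace, next) -> {
        if (request.contains("user.name=")) {
            trace.add("auth:ok");
            next.run();
        } else {
            trace.add("auth:401"); // chain stops here, the target is never invoked
        }
    };

    public static void main(String[] args) {
        List<MiniFilter> filters = Arrays.asList(AUTH);
        System.out.println(dispatch("/cluster?user.name=babu", filters));
        System.out.println(dispatch("/cluster", filters));
    }
}
```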

 

2.1. init方法

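init logic: it mainly reads from core-site.xml the authentication handler class (PseudoAuthenticationHandler for simple) and the rules (whether anonymous requests are allowed, the cookie domain, and so on), and then initializes authHandler.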

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        String configPrefix = filterConfig.getInitParameter(CONFIG_PREFIX);
        configPrefix = (configPrefix != null) ? configPrefix + "." : "";
        config = getConfiguration(configPrefix, filterConfig);

        // read the auth handler type; in our case it is simple
        String authHandlerName = config.getProperty(AUTH_TYPE, null);
        String authHandlerClassName;
        if (authHandlerName == null) {
            throw new ServletException("Authentication type must be specified: " +
                    PseudoAuthenticationHandler.TYPE + "|" +
                    KerberosAuthenticationHandler.TYPE + "|<class>");
        }
        // for simple this resolves to PseudoAuthenticationHandler
        authHandlerClassName =
                AuthenticationHandlerUtil
                        .getAuthenticationHandlerClassName(authHandlerName);
        maxInactiveInterval = Long.parseLong(config.getProperty(
                AUTH_TOKEN_MAX_INACTIVE_INTERVAL, "-1")); // By default, disable.
        if (maxInactiveInterval > 0) {
            maxInactiveInterval *= 1000;
        }
        // read the token validity from the core-site.xml configuration (in seconds, default 36000)
        validity = Long.parseLong(config.getProperty(AUTH_TOKEN_VALIDITY, "36000"))
                * 1000; //10 hours
        initializeSecretProvider(filterConfig);

        // instantiate PseudoAuthenticationHandler (via reflection) and read from core-site.xml whether anonymous requests are allowed
        initializeAuthHandler(authHandlerClassName, filterConfig);

        // cookie domain, path and persistence settings
        cookieDomain = config.getProperty(COOKIE_DOMAIN, null);
        cookiePath = config.getProperty(COOKIE_PATH, null);
        isCookiePersistent = Boolean.parseBoolean(
                config.getProperty(COOKIE_PERSISTENT, "false"));

    }
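The numeric handling in init is easy to misread, so here is a reduced sketch of just that part. The class `FilterConfigSketch` is hypothetical, and the property names `token.validity`, `token.max-inactive-interval` and `cookie.persistent` are what I assume the constants above resolve to after the prefix is stripped; check them against your Hadoop version. The point is the unit conversion: the configured values are in seconds and the filter keeps milliseconds.

```java
import java.util.Properties;

/** Illustrative only: mirrors the defaults and unit conversion used in init above. */
class FilterConfigSketch {
    final long validityMillis;
    final long maxInactiveMillis; // stays -1 when the feature is disabled
    final boolean cookiePersistent;

    FilterConfigSketch(Properties config) {
        // Assumed key names; the real code reads them through constants.
        long maxInactiveSeconds =
                Long.parseLong(config.getProperty("token.max-inactive-interval", "-1"));
        this.maxInactiveMillis =
                maxInactiveSeconds > 0 ? maxInactiveSeconds * 1000 : maxInactiveSeconds;
        // Default validity is 36000 seconds, i.e. 10 hours.
        this.validityMillis =
                Long.parseLong(config.getProperty("token.validity", "36000")) * 1000;
        this.cookiePersistent =
                Boolean.parseBoolean(config.getProperty("cookie.persistent", "false"));
    }

    public static void main(String[] args) {
        FilterConfigSketch defaults = new FilterConfigSketch(new Properties());
        System.out.println("validity ms:     " + defaults.validityMillis);
        System.out.println("max inactive ms: " + defaults.maxInactiveMillis);
        System.out.println("persistent:      " + defaults.cookiePersistent);
    }
}
```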

Here is how Hadoop initializes the AuthHandler:

    protected void initializeAuthHandler(String authHandlerClassName, FilterConfig filterConfig)
            throws ServletException {
        try {
          // load the handler class via reflection; the configuration (e.g. the security settings from core-site.xml) is applied to it below
            Class<?> klass = Thread.currentThread().getContextClassLoader().loadClass(authHandlerClassName);
            authHandler = (AuthenticationHandler) klass.newInstance();
            // init applies the configuration, including whether anonymous requests are allowed
            authHandler.init(config);
        } catch (ClassNotFoundException | InstantiationException |
                IllegalAccessException ex) {
            throw new ServletException(ex);
        }
    }
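The same load-by-name pattern can be tried in isolation. In this sketch `Handler` and `DemoHandler` are hypothetical stand-ins for AuthenticationHandler and PseudoAuthenticationHandler, and `simple.anonymous.allowed` is the key I assume the handler reads after the prefix is stripped. One deliberate difference: it calls `getDeclaredConstructor().newInstance()` because `Class.newInstance()` has been deprecated since Java 9.

```java
import java.util.Properties;

/** Illustrative only: loads a handler class by name and initializes it from config. */
class HandlerLoaderSketch {
    /** Hypothetical stand-in for Hadoop's AuthenticationHandler. */
    interface Handler {
        void init(Properties config);
        boolean acceptsAnonymous();
    }

    /** Hypothetical handler used only for this demo. */
    public static class DemoHandler implements Handler {
        private boolean acceptAnonymous;

        @Override
        public void init(Properties config) {
            // Assumed key name for the anonymous switch.
            acceptAnonymous = Boolean.parseBoolean(
                    config.getProperty("simple.anonymous.allowed", "false"));
        }

        @Override
        public boolean acceptsAnonymous() {
            return acceptAnonymous;
        }
    }

    /** Resolves the class through the context class loader, instantiates it and calls init. */
    static Handler load(String className, Properties config) {
        try {
            Class<?> klass = Thread.currentThread().getContextClassLoader().loadClass(className);
            Handler handler = (Handler) klass.getDeclaredConstructor().newInstance();
            handler.init(config);
            return handler;
        } catch (ReflectiveOperationException e) {
            // The real filter wraps these failures in a ServletException.
            throw new IllegalStateException("cannot load handler " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("simple.anonymous.allowed", "true");
        Handler handler = load(DemoHandler.class.getName(), config);
        System.out.println(handler.getClass().getSimpleName()
                + " acceptsAnonymous=" + handler.acceptsAnonymous());
    }
}
```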

 

2.2. doFilter
    /**
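     * If a request carries a valid authentication token it is allowed through to the target resource; otherwise it triggers the authentication sequence of the configured AuthenticationHandler.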
     */
    @Override
    public void doFilter(ServletRequest request,
                         ServletResponse response,
                         FilterChain filterChain)
            throws IOException, ServletException {
        boolean unauthorizedResponse = true;
        int errCode = HttpServletResponse.SC_UNAUTHORIZED;
        AuthenticationException authenticationEx = null;
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        HttpServletResponse httpResponse = (HttpServletResponse) response;
        boolean isHttps = "https".equals(httpRequest.getScheme());
        try {
            // extract the token from the cookie of the intercepted request
            // internal (daemon-to-daemon) requests carry no cookie
            boolean newToken = false;
            AuthenticationToken token;
            try {
                token = getToken(httpRequest);
                if (LOG.isDebugEnabled()) {
                    LOG.info("Got token {} from httpRequest {}", token,
                            getRequestURL(httpRequest));
                }
            } catch (AuthenticationException ex) {
                LOG.warn("AuthenticationToken ignored: " + ex.getMessage());
                // will be sent back in a 401 unless filter authenticates
                authenticationEx = ex;
                token = null;
            }
            // for internal requests the token is null at this point; with simple authentication this branch is taken
            if (authHandler.managementOperation(token, httpRequest, httpResponse)) {
                // no token yet: run the authentication
                if (token == null) {
                    if (LOG.isDebugEnabled()) {
                        LOG.info("Request [{}] triggering authentication. handler: {}",
                                getRequestURL(httpRequest), authHandler.getClass());
                    }
                    // authenticate the request:
                    token = authHandler.authenticate(httpRequest, httpResponse);
                    // for a real (non-anonymous) token, set the inactivity and expiration times
                    if (token != null && token != AuthenticationToken.ANONYMOUS) {
                        if (token.getMaxInactives() > 0) {
                            token.setMaxInactives(System.currentTimeMillis()
                                    + getMaxInactiveInterval() * 1000);
                        }
                        if (token.getExpires() != 0) {
                            token.setExpires(System.currentTimeMillis()
                                    + getValidity() * 1000);
                        }
                    }
                    newToken = true;
                }
                // with a token in hand, wrap the request and set the auth cookie when needed
                if (token != null) {
                    unauthorizedResponse = false;
                    if (LOG.isDebugEnabled()) {
                        LOG.info("Request [{}] user [{}] authenticated",
                                getRequestURL(httpRequest), token.getUserName());
                    }
                    final AuthenticationToken authToken = token;
                    httpRequest = new HttpServletRequestWrapper(httpRequest) {

                        @Override
                        public String getAuthType() {
                            return authToken.getType();
                        }

                        @Override
                        public String getRemoteUser() {
                            return authToken.getUserName();
                        }

                        @Override
                        public Principal getUserPrincipal() {
                            return (authToken != AuthenticationToken.ANONYMOUS) ?
                                    authToken : null;
                        }
                    };

                    // If cookie persistence is configured to false,
                    // it means the cookie will be a session cookie.
                    // If the token is an old one, renew the its maxInactiveInterval.
                    if (!newToken && !isCookiePersistent()
                            && getMaxInactiveInterval() > 0) {
                        token.setMaxInactives(System.currentTimeMillis()
                                + getMaxInactiveInterval() * 1000);
                        token.setExpires(token.getExpires());
                        newToken = true;
                    }
                    if (newToken && !token.isExpired()
                            && token != AuthenticationToken.ANONYMOUS) {
                        String signedToken = signer.sign(token.toString());
                        createAuthCookie(httpResponse, signedToken, getCookieDomain(),
                                getCookiePath(), token.getExpires(),
                                isCookiePersistent(), isHttps);
                    }
                    doFilter(filterChain, httpRequest, httpResponse);
                }
            } else {
                if (LOG.isDebugEnabled()) {
                    LOG.info("managementOperation returned false for request {}."
                            + " token: {}", getRequestURL(httpRequest), token);
                }
                unauthorizedResponse = false;
            }
        } catch (AuthenticationException ex) {
            // exception from the filter itself is fatal
            errCode = HttpServletResponse.SC_FORBIDDEN;
            authenticationEx = ex;
            if (LOG.isDebugEnabled()) {
                LOG.info("Authentication exception: " + ex.getMessage(), ex);
            } else {
                LOG.warn("Authentication exception: " + ex.getMessage());
            }
        }
        // unauthorized: clear the auth cookie on the response and reply with 401 or 403
        if (unauthorizedResponse) {
            if (!httpResponse.isCommitted()) {
                createAuthCookie(httpResponse, "", getCookieDomain(),
                        getCookiePath(), 0, isCookiePersistent(), isHttps);
                // If response code is 401. Then WWW-Authenticate Header should be
                // present.. reset to 403 if not found..
                if ((errCode == HttpServletResponse.SC_UNAUTHORIZED)
                        && (!httpResponse.containsHeader(
                        KerberosAuthenticator.WWW_AUTHENTICATE))) {
                    errCode = HttpServletResponse.SC_FORBIDDEN;
                }
                // After Jetty 9.4.21, sendError() no longer allows a custom message.
                // use setStatus() to set a custom message.
                String reason;
                if (authenticationEx == null) {
                    reason = "Authentication required";
                } else {
                    reason = authenticationEx.getMessage();
                }

                httpResponse.setStatus(errCode, reason);
                httpResponse.sendError(errCode, reason);
            }
        }
    }

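The overall logic of doFilter: it intercepts every Hadoop HTTP request before it reaches the target and checks whether the cookie in the httpRequest satisfies authentication.

A first request (one that carries no token yet) goes through the authentication check in this line: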

token = authHandler.authenticate(httpRequest, httpResponse);

Each authentication type has its own implementation of this method.

We focus on the simple authentication logic

org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler
    @Override
    public AuthenticationToken authenticate(HttpServletRequest request, HttpServletResponse response)
            throws IOException, AuthenticationException {
        AuthenticationToken token;
        String userName = getUserName(request);
        int serverPort = request.getServerPort();
        LOG.info("[authenticate] get request serverPort {}", serverPort);
        if (userName == null) {
            if (getAcceptAnonymous() ) {
                token = AuthenticationToken.ANONYMOUS;
            } else {
                response.setStatus(HttpServletResponse.SC_FORBIDDEN);
                response.setHeader(WWW_AUTHENTICATE, PSEUDO_AUTH);
                token = null;
            }
        } else {
            token = new AuthenticationToken(userName, userName, getType());
        }
        return token;
    }

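The simple authentication logic is straightforward. If the request carries user.name=xxx, a token is created for that user. Otherwise, if anonymous requests are allowed, the anonymous token is returned. If neither holds, the handler sets the WWW-Authenticate header and returns null; because that header is present, doFilter keeps its default 401 and answers with the message "Authentication required" (without the header it would fall back to 403).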
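The decision table behind this can be written down as a small pure function. The names here (`SimpleAuthDecisionSketch`, `Outcome`, `rejectionStatus`) are hypothetical; the rules are taken from the authenticate and doFilter code shown above.

```java
/** Illustrative only: the decisions made by the simple handler and the filter. */
class SimpleAuthDecisionSketch {
    enum Outcome { USER_TOKEN, ANONYMOUS_TOKEN, REJECTED }

    /** Mirrors the branches of authenticate(): user name first, then the anonymous switch. */
    static Outcome authenticate(String userName, boolean acceptAnonymous) {
        if (userName != null) {
            return Outcome.USER_TOKEN;
        }
        return acceptAnonymous ? Outcome.ANONYMOUS_TOKEN : Outcome.REJECTED;
    }

    /** Mirrors doFilter(): 401 is kept only if a WWW-Authenticate header was set. */
    static int rejectionStatus(boolean hasWwwAuthenticateHeader) {
        return hasWwwAuthenticateHeader ? 401 : 403;
    }

    public static void main(String[] args) {
        System.out.println(authenticate("foo", false));  // request with user.name
        System.out.println(authenticate(null, true));    // anonymous allowed
        System.out.println(authenticate(null, false));   // internal request without user.name
        System.out.println(rejectionStatus(true));       // simple handler sets the header
    }
}
```

The third case is the one that breaks the cluster: an internal request without user.name, with anonymous access disabled, is rejected with 401.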

 

2.3 destroy

Releases resources

    @Override
    public void destroy() {
        if (authHandler != null) {
            authHandler.destroy();
            authHandler = null;
        }
        if (secretProvider != null && destroySecretProvider) {
            secretProvider.destroy();
            secretProvider = null;
        }
    }

 
 

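III. [Implementation] A minimal change

Principle for changing the source: a patch must never break the existing logic, so keep the change as small as possible and make sure the modified logic stays under control.

Combining the analysis above with the startup failure seen after enabling simple authentication, we arrive at the following reasoning:

The hadoop-auth package implements the core of Hadoop's HTTP authentication: the client package wraps authenticated requests, and the server package intercepts and validates requests.

We can guess that Hadoop's internal HTTP requests are meant to be wrapped by the client side before being sent, but the error shows that our request was sent without user.name=xxx.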

org.apache.hadoop.hdfs.server.common.HttpGetFailedException: Image transfer servlet at http://XXXXX/imagetransfer?getimage=1&txid=0&storageInfo=-65:271209174:1614287921618:CID-f21dbb8a-8660-4ef6-8045-f80daf067c38&bootstrapstandby=true failed with status code 401
Response message:
Authentication required

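As for why it was not added, we will not dig into it here: we cannot say with confidence how many internal HTTP calls Hadoop makes, and fixing the offending requests one by one would touch a large amount of code and, being hardcoded, would be hard to maintain later.

Let's take a different approach:

Since every request is intercepted by doFilter on the server side, we can simply exempt from authentication those internal requests that were never wrapped with credentials.

Concretely:

  1. First, allow anonymous requests under simple authentication, so that all HTTP requests pass authentication to begin with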
    <property>
        <name>hadoop.http.authentication.simple.anonymous.allowed</name>
        <value>true</value>
    </property>
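  2. Second, enforce authentication only for web UI requests
    Web UI requests, meaning the HDFS and Yarn web pages, arrive on fixed web ports, so it is enough to enforce authentication on requests to those ports. We adapt the simple handler's authenticate accordingly: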
PseudoAuthenticationHandler.authenticate
    @Override
    public AuthenticationToken authenticate(HttpServletRequest request, HttpServletResponse response)
            throws IOException, AuthenticationException {
        AuthenticationToken token;

        String userName = getUserName(request);
        int serverPort = request.getServerPort();
        LOG.info("[authenticate] get request serverPort {}", serverPort);
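        // handling for requests that arrive without user.name
        // (this snippet needs java.util.Arrays and java.util.List imported)
        // placeholders: replace with the actual HDFS and Yarn web UI port numbers, as strings
        String[] ports = {"HDFS web port", "Yarn web port"};
        String requestURI = request.getRequestURI();
        LOG.info("[authenticate]  authenticate get request requestURI {}", requestURI);
        List<String> portList = Arrays.asList(ports);
        // exemption for the SNN metadata copy: true only when the request targets a web UI port
        // and is not /imagetransfer, i.e. when user.name must be present
        boolean snnCheck = portList.contains(String.valueOf(serverPort)) && !requestURI.contains("/imagetransfer");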
        if (userName == null) {
            if (getAcceptAnonymous() && !snnCheck) {
                LOG.info("[cloudwise  authenticate] pass authenticate ");
                token = AuthenticationToken.ANONYMOUS;
            } else {
                response.setStatus(HttpServletResponse.SC_FORBIDDEN);
                response.setHeader(WWW_AUTHENTICATE, PSEUDO_AUTH);
                token = null;
            }
        } else {
            token = new AuthenticationToken(userName, userName, getType());
        }
        return token;
    }
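The port check is the heart of the patch, so it helps to isolate it as a pure function that can be tested without a running cluster. This is a sketch under assumptions: `WebPortCheckSketch` and its method names are hypothetical, 9870 and 8088 are used as example web UI ports (the Hadoop 3 defaults for the NameNode and ResourceManager web UIs, which your cluster may override), and 8480 stands for some other internal HTTP port.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: the anonymous-pass rule of the patched authenticate(). */
class WebPortCheckSketch {
    // Assumed example ports; substitute the web UI ports configured in your cluster.
    static final Set<Integer> WEB_PORTS = new HashSet<>(Arrays.asList(9870, 8088));

    /** True when the request targets a web UI port and is not the image transfer servlet. */
    static boolean requiresUserName(int serverPort, String requestUri) {
        return WEB_PORTS.contains(serverPort) && !requestUri.contains("/imagetransfer");
    }

    /** Same condition as "getAcceptAnonymous() && !snnCheck" in the patch above. */
    static boolean allowAnonymous(boolean acceptAnonymous, int serverPort, String requestUri) {
        return acceptAnonymous && !requiresUserName(serverPort, requestUri);
    }

    public static void main(String[] args) {
        System.out.println(allowAnonymous(true, 9870, "/imagetransfer"));  // internal fsimage copy
        System.out.println(allowAnonymous(true, 9870, "/dfshealth.html")); // browser without user.name
        System.out.println(allowAnonymous(true, 8480, "/getJournal"));     // other internal port
    }
}
```

Comparing integers avoids the String.valueOf conversion used in the patch, and keeping the rule in one place makes it easier to extend if further internal endpoints on the web ports turn out to need the same exemption.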

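Rebuild the hadoop-auth jar and replace the one under

${HADOOP_HOME}/share/hadoop/common/lib

After reinstalling Hadoop, the cluster starts, and opening the web page directly (without user.name) returns 401. The change works.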


 

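IV. Why the simple authentication settings did not take effect (in progress)

With the source background above, we can analyze why simple authentication did not take effect; that is left for a follow-up article.
Related article: [Configuration/Authentication] Authentication for Hadoop (3.3.1) HTTP web-consoles: Hadoop simple authentication is not a silver bullet

One line of investigation for now: check how the source loads the secret file, and then how the secret is read and applied.

A few custom authentication approaches from open-source projects:

oozie's Creating Custom Authentication

eBay's approach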

We can follow the steps below to plug in custom authentication mechanism.

Implement interface AuthenticationHandler, which is under the  org.apache.hadoop.security.authentication.server package.
Specify the implementation class in the configuration. Make sure that the implementation class is available in the classpath of the Hadoop server.



Composite AuthenticationHandler
At eBay, we like to provide multiple authentication mechanisms in addition to the Kerberos and anonymous authentication. The operators prefer to turn off any authentication mechanism by modifying the configuration rather than rolling out new code. For this reason, we implemented a CompositeAuthenticationHandler.

The CompositeAuthenticationHandler accepts a list of authentication mechanisms via the property hadoop.http.authentication.composite.handlers. This property contains a list of classes that are implementations for AuthenticationHandler corresponding to different authentication mechanisms.
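As a rough illustration of the composite idea, the sketch below chains several handlers and returns the first successful result. Everything in it is hypothetical: the `Handler` interface is not Hadoop's AuthenticationHandler, the two sample handlers and the `api.key` parameter are invented, and "first match wins" is my assumption about how such a composite could behave, not a description of eBay's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: tries a list of handlers in order and keeps the first success. */
class CompositeHandlerSketch {
    /** Hypothetical handler: returns the authenticated user, or empty if it cannot decide. */
    interface Handler {
        Optional<String> authenticate(Map<String, String> params);
    }

    /** Accepts whoever is named in user.name, like simple authentication. */
    static final Handler USER_PARAM = params -> Optional.ofNullable(params.get("user.name"));

    /** Invented second mechanism: a fixed key maps to a service account. */
    static final Handler API_KEY = params ->
            "secret-key".equals(params.get("api.key")) ? Optional.of("service") : Optional.empty();

    /** Builds a handler that delegates to each configured handler in turn. */
    static Handler composite(List<Handler> delegates) {
        return params -> {
            for (Handler delegate : delegates) {
                Optional<String> user = delegate.authenticate(params);
                if (user.isPresent()) {
                    return user;
                }
            }
            return Optional.empty();
        };
    }

    public static void main(String[] args) {
        Handler handler = composite(Arrays.asList(USER_PARAM, API_KEY));
        System.out.println(handler.authenticate(Collections.singletonMap("user.name", "foo")));
        System.out.println(handler.authenticate(Collections.singletonMap("api.key", "secret-key")));
        System.out.println(handler.authenticate(Collections.<String, String>emptyMap()));
    }
}
```

Turning a mechanism off then only means removing it from the configured list, which matches the operators' preference quoted above for changing configuration instead of rolling out new code.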