UTF-8 encoded j_security_check username is incorrectly decoded as Latin-1 in a Tomcat realm

一尘不染


tomcat

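I am investigating a problem with usernames that contain Latin-1 characters being entered in a login form. The username contains the character á. I traced it through the part of the server I work on: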

    public class MyRealm extends RealmBase implements Realm {
        public Principal authenticate(String username, String password) {
            // ... actual authentication implemented here
        }
    }

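If I print out the bytes with username.getBytes(), I see that the character á is C3 83 C2 A1. Normally, the character á encoded in UTF-8 should be C3 A1. If those bytes are encoded as UTF-8 a second time, the result is C3 83 C2 A1, which is what my software prints.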

I checked a network capture and found that the username is sent correctly as C3 A1. The source of the login page form is:

        <form name="loginForm" action="j_security_check" method="post" enctype="application/x-www-form-urlencoded">
        <table>
            <tr>
                <td colspan="2" align="right">Secure connection:
                    <input type="checkbox" name="checkbox" class="style5" onclick="javascript:httpHttps();"></td>
            </tr>
            <tr>
                <td class="style5">Login:</td>
                <td><input type="text" name="j_username" autocomplete="off" style="width:150px" /></td>
            </tr>

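So I believe the client is not at fault (it is not converting to UTF-8 twice). If I decode from UTF-8 twice in the authenticate() function, authentication works, but I am afraid I cannot apply this workaround as the solution to my problem.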
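For reference, the "decode twice" workaround mentioned above might look like the following minimal sketch. The class name UsernameRepair and the method repair are hypothetical, chosen only for illustration; the sketch assumes the string reaching authenticate() was produced by decoding UTF-8 bytes as Latin-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat: undoes a Latin-1 misdecoding of UTF-8 bytes.
class UsernameRepair {
    static String repair(String misdecoded) {
        // Latin-1 maps each char 0x00-0xFF to one byte, so this recovers the original wire bytes.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those bytes with the charset the browser actually used.
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String received = "\u00C3\u00A1"; // the two characters U+00C3 U+00A1 seen in authenticate()
        String fixed = repair(received);
        System.out.println(fixed.equals("\u00E1")); // true when the input was misdecoded UTF-8
    }
}
```

This only helps while the container keeps misdecoding the request. Once the username arrives correctly decoded, the same call would corrupt it, which is one reason to fix the encoding at the source instead.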

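Where should I look to find out how the username passed to the Realm's authenticate(String username, String password) function gets its encoding? The server side runs on Linux (RedHat) with httpd-2.2.15 and tomcat6-6.0.24.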


2020-06-16

1 answer

一尘不染

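In your example, the form sends the UTF-8 bytes of "á" to Tomcat using percent-encoding (so it is %C3%A1). However, Tomcat interprets them as Latin-1, which is the default encoding for POST data.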

As a result, Tomcat stores C3 A1 internally as the two characters "Ã¡", because in Latin-1 the byte C3 is "Ã" and the byte A1 is "¡".

When you call username.getBytes(), it creates a UTF-8 encoded byte array, so it encodes the two characters "Ã¡" in UTF-8, which gives C3 83 C2 A1.
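The byte sequence described above can be reproduced outside Tomcat. The following is a small sketch that only simulates the misdecoding with plain JDK calls; the class name MojibakeDemo and its methods are made up for illustration and are not Tomcat APIs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: simulates UTF-8 bytes being read as Latin-1.
class MojibakeDemo {
    // Format bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode as UTF-8 (what the browser sends), then decode as Latin-1 (the wrong charset).
    static String misdecode(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00E1"; // the character U+00E1
        System.out.println(hex(original.getBytes(StandardCharsets.UTF_8))); // C3 A1
        String seenByRealm = misdecode(original); // two characters: U+00C3 U+00A1
        System.out.println(hex(seenByRealm.getBytes(StandardCharsets.UTF_8))); // C3 83 C2 A1
    }
}
```

Note that the no-argument getBytes() uses the platform default charset, so the four-byte output in the question implies that the default on that server is UTF-8.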

The FAQ describes this problem in detail and suggests solutions: http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q3

In server.xml, change the Valve for the FormAuthenticator to specify characterEncoding="UTF-8":

    <Context path="/YourSercureApp">
        <Valve
            className="org.apache.catalina.authenticator.FormAuthenticator"
            disableProxyCaching="false"
            characterEncoding="UTF-8" />
    </Context>

2020-06-16