
    [Original] Java: URL-encoding the same Chinese characters gives different results on the server and on a local machine

    Posted by testcs_dn on 2017-01-02 21:48:53

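    I came across this question on CSDN Q&A. Problems of this kind are usually caused by the character encoding of the string.

    The code is as follows: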

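            String oldStr = "中文字符";  // my environment's default charset is UTF-8
            // Deprecated one-argument form: percent-encodes using the platform default charset
            System.out.println(URLEncoder.encode(oldStr));
            try {
                // Decodes the default-charset (here UTF-8) bytes as GB2312, which garbles the text
                String newStr = new String(oldStr.getBytes(), "gb2312");
                System.out.println(URLEncoder.encode(newStr));
            } catch (UnsupportedEncodingException e) {
                // Thrown only if the named charset is not supported by this JVM
                e.printStackTrace();
            }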
    Output:
    %E4%B8%AD%E6%96%87%E5%AD%97%E7%AC%A6
    %E6%B6%93%EF%BF%BD%EF%BF%BD%EF%BF%BD%E7%80%9B%EF%BF%BD%E7%BB%97%EF%BF%BD

    Only the UTF-8 result (the first line) is correct.
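
    Why the second line is so much longer can be reproduced in isolation. The sketch below is a minimal illustration and not part of the original question: the class name MojibakeDemo and the method decodeUtf8BytesAsGb2312 are made up for this example, the UTF-8 bytes are requested explicitly instead of relying on the default charset, and the sample string 中文字符 is written with Unicode escapes so that the encoding of the source file itself cannot interfere. It assumes the JDK in use ships the GB2312 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // The article's four-character sample string, written as Unicode escapes.
    static final String SAMPLE = "\u4e2d\u6587\u5b57\u7b26";

    // Hypothetical helper: mimics new String(s.getBytes(), "gb2312")
    // on a machine whose default charset is UTF-8.
    static String decodeUtf8BytesAsGb2312(String s) {
        byte[] utf8Bytes = s.getBytes(StandardCharsets.UTF_8); // 12 bytes for SAMPLE
        return new String(utf8Bytes, Charset.forName("GB2312"));
    }

    public static void main(String[] args) {
        String garbled = decodeUtf8BytesAsGb2312(SAMPLE);
        // The decoded text is no longer the original string.
        System.out.println(garbled.equals(SAMPLE));         // false
        // Byte sequences that are not valid GB2312 become U+FFFD.
        System.out.println(garbled.indexOf('\uFFFD') >= 0); // true
    }
}
```

    The replacement character U+FFFD is EF BF BD in UTF-8, which is where the repeated %EF%BF%BD in the long output comes from.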

    URLEncoder.encode(String s) is deprecated; use the two-argument overload instead:

            String oldStr = "中文字符";
            try {
                // The second argument names the charset whose bytes get percent-encoded
                System.out.println(URLEncoder.encode(oldStr, "utf-8"));
                System.out.println(URLEncoder.encode(oldStr, "gb2312"));
            } catch (UnsupportedEncodingException e) {
                // Thrown only if the named charset is not supported by this JVM
                e.printStackTrace();
            }
    Output:

    %E4%B8%AD%E6%96%87%E5%AD%97%E7%AC%A6
    %D6%D0%CE%C4%D7%D6%B7%FB
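
    Both lines describe the same text, which can be confirmed by decoding each one with the charset that produced it. This is a small sketch under assumptions: RoundTripDemo, encode and decode are illustrative names, and the checked UnsupportedEncodingException is wrapped in an unchecked exception only to keep the example short.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class RoundTripDemo {
    // The article's sample string, written as Unicode escapes.
    static final String SAMPLE = "\u4e2d\u6587\u5b57\u7b26";

    // Percent-encodes s using the bytes of the named charset.
    static String encode(String s, String charsetName) {
        try {
            return URLEncoder.encode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported charset: " + charsetName, e);
        }
    }

    // Reverses encode(); the charset must match the one used for encoding.
    static String decode(String s, String charsetName) {
        try {
            return URLDecoder.decode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String utf8 = encode(SAMPLE, "UTF-8");
        String gb = encode(SAMPLE, "GB2312");
        System.out.println(utf8); // %E4%B8%AD%E6%96%87%E5%AD%97%E7%AC%A6
        System.out.println(gb);   // %D6%D0%CE%C4%D7%D6%B7%FB
        System.out.println(decode(utf8, "UTF-8").equals(SAMPLE)); // true
        System.out.println(decode(gb, "GB2312").equals(SAMPLE));  // true
        System.out.println(decode(gb, "UTF-8").equals(SAMPLE));   // false: wrong charset
    }
}
```

    The last line is the practical point: whoever decodes the value has to use the same charset as whoever encoded it.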

    To check the default charset:

    System.out.println(Charset.defaultCharset()); // print the default charset
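
    The default matters because every conversion that does not name a charset, such as String.getBytes() with no argument, silently uses it. A minimal sketch of that relationship follows; the class name DefaultCharsetDemo and the helper usesDefaultCharset are illustrative only.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class DefaultCharsetDemo {
    // The article's sample string, written as Unicode escapes.
    static final String SAMPLE = "\u4e2d\u6587\u5b57\u7b26";

    // True when the no-argument getBytes() yields the same bytes as an
    // explicit request for the default charset.
    static boolean usesDefaultCharset(String s) {
        return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        // Typically differs between a Chinese-locale Windows desktop and a Linux server.
        System.out.println(Charset.defaultCharset());
        System.out.println(usesDefaultCharset(SAMPLE));
    }
}
```

    On JDK 18 and later the default charset is UTF-8 unless configured otherwise (JEP 400), so a mismatch between server and local machine is mostly seen on older JDKs.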

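    Now a new puzzle appears: the two outputs that involve GB2312 are different! One is long, the other is short.

    Let's look at the implementation source: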

    package java.net;
    
    import java.io.CharArrayWriter;
    import java.io.UnsupportedEncodingException;
    import java.nio.charset.Charset;
    import java.nio.charset.IllegalCharsetNameException;
    import java.nio.charset.UnsupportedCharsetException;
    import java.security.AccessController;
    import java.util.BitSet;
    import sun.security.action.GetPropertyAction;
    
    public class URLEncoder
    {
      static BitSet dontNeedEncoding;
      static final int caseDiff = 32;
      static String dfltEncName = null;
    
      @Deprecated
      public static String encode(String paramString)
      {
        String str = null;
        try
        {
          str = encode(paramString, dfltEncName);
        }
        catch (UnsupportedEncodingException localUnsupportedEncodingException)
        {
        }
        return str;
      }
    
      public static String encode(String paramString1, String paramString2)
        throws UnsupportedEncodingException
      {
        int i = 0;
        StringBuffer localStringBuffer = new StringBuffer(paramString1.length());
        CharArrayWriter localCharArrayWriter = new CharArrayWriter();
        if (paramString2 == null)
          throw new NullPointerException("charsetName");
        Charset localCharset;
        try
        {
          localCharset = Charset.forName(paramString2);
        }
        catch (IllegalCharsetNameException localIllegalCharsetNameException)
        {
          throw new UnsupportedEncodingException(paramString2);
        }
        catch (UnsupportedCharsetException localUnsupportedCharsetException)
        {
          throw new UnsupportedEncodingException(paramString2);
        }
        int j = 0;
        while (j < paramString1.length())
        {
          int k = paramString1.charAt(j);
          if (dontNeedEncoding.get(k))
          {
            if (k == 32)
            {
              k = 43;
              i = 1;
            }
            localStringBuffer.append((char)k);
            ++j;
          }
          else
          {
            do
            {
              localCharArrayWriter.write(k);
              if ((k < 55296) || (k > 56319) || (j + 1 >= paramString1.length()))
                continue;
              int l = paramString1.charAt(j + 1);
              if ((l < 56320) || (l > 57343))
                continue;
              localCharArrayWriter.write(l);
              ++j;
            }
            while ((++j < paramString1.length()) && (!dontNeedEncoding.get(k = paramString1.charAt(j))));
            localCharArrayWriter.flush();
            String str = new String(localCharArrayWriter.toCharArray());
            byte[] arrayOfByte = str.getBytes(localCharset);
            for (int i1 = 0; i1 < arrayOfByte.length; ++i1)
            {
              localStringBuffer.append('%');
              char c = Character.forDigit(arrayOfByte[i1] >> 4 & 0xF, 16);
              if (Character.isLetter(c))
                c = (char)(c - ' ');
              localStringBuffer.append(c);
              c = Character.forDigit(arrayOfByte[i1] & 0xF, 16);
              if (Character.isLetter(c))
                c = (char)(c - ' ');
              localStringBuffer.append(c);
            }
            localCharArrayWriter.reset();
            i = 1;
          }
        }
        return (i != 0) ? localStringBuffer.toString() : paramString1;
      }
    
      static
      {
        dontNeedEncoding = new BitSet(256);
        int i;                        // declared once so that all three loops compile
        for (i = 97; i <= 122; ++i)   // 'a'..'z'
          dontNeedEncoding.set(i);
        for (i = 65; i <= 90; ++i)    // 'A'..'Z'
          dontNeedEncoding.set(i);
        for (i = 48; i <= 57; ++i)    // '0'..'9'
          dontNeedEncoding.set(i);
        dontNeedEncoding.set(32);     // ' ' (written out as '+')
        dontNeedEncoding.set(45);     // '-'
        dontNeedEncoding.set(95);     // '_'
        dontNeedEncoding.set(46);     // '.'
        dontNeedEncoding.set(42);     // '*'
        dfltEncName = (String)AccessController.doPrivileged(new GetPropertyAction("file.encoding"));
      }
    }
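    From the source we can see that the second parameter is the charset used for the output, that is, the charset whose bytes get percent-encoded (str.getBytes(localCharset)); it does not mean "convert the first argument into this encoding". This also explains the puzzle: the long result is the default-charset (UTF-8) encoding of a string that new String(oldStr.getBytes(), "gb2312") had already garbled, while the short result is the GB2312 bytes of the intact string;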
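
    The role of the second parameter can also be checked by hand. The sketch below is an illustration only, with made-up names (OutputCharsetDemo, percentEncodeBytes): it converts the string to bytes in the given charset and writes every byte as %XX. For a string that consists only of characters that need encoding, such as the sample, this should match what URLEncoder.encode produces; it deliberately ignores the pass-through characters and the space-to-plus rule that the real method handles.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class OutputCharsetDemo {
    // The article's sample string, written as Unicode escapes.
    static final String SAMPLE = "\u4e2d\u6587\u5b57\u7b26";
    private static final String HEX = "0123456789ABCDEF";

    // Hypothetical helper: percent-encodes EVERY byte of s in the given charset.
    static String percentEncodeBytes(String s, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(charset)) {
            out.append('%')
               .append(HEX.charAt((b >> 4) & 0xF)) // high nibble
               .append(HEX.charAt(b & 0xF));       // low nibble
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same string, different output charset, different bytes.
        System.out.println(percentEncodeBytes(SAMPLE, StandardCharsets.UTF_8));
        // %E4%B8%AD%E6%96%87%E5%AD%97%E7%AC%A6
        System.out.println(percentEncodeBytes(SAMPLE, Charset.forName("GB2312")));
        // %D6%D0%CE%C4%D7%D6%B7%FB
    }
}
```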

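    When no charset is given (the deprecated one-argument overload), it uses the JVM's file.encoding system property, as the static initializer shows. This is the default charset of whichever JVM runs the code, including the JVM of a container such as Tomcat;

    We can check it by running the following code: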

            // Same lookup as the static initializer of URLEncoder
            String dfltEncName = (String)AccessController.doPrivileged(new GetPropertyAction("file.encoding"));

            System.out.println(dfltEncName);
    This requires adding a class to the project: GetPropertyAction.java
    package sun.security.action;
    
    import java.security.PrivilegedAction;
    
    public class GetPropertyAction
      implements PrivilegedAction<String>
    {
      private String theProp;
      private String defaultVal;
    
      public GetPropertyAction(String paramString)
      {
        this.theProp = paramString;
      }
    
      public GetPropertyAction(String paramString1, String paramString2)
      {
        this.theProp = paramString1;
        this.defaultVal = paramString2;
      }
    
      public String run()
      {
        String str = System.getProperty(this.theProp);
        return (str == null) ? this.defaultVal : str;
      }
    }
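
    The same value can be read without copying a JDK-internal class into the project, because GetPropertyAction.run() simply calls System.getProperty, as the listing shows. A minimal sketch follows; the class and method names are illustrative, and the fallback value "unknown" is an assumption made for this example.

```java
import java.nio.charset.Charset;

class FileEncodingDemo {
    // Reads the property that URLEncoder's static initializer looks up.
    static String fileEncoding() {
        return System.getProperty("file.encoding", "unknown");
    }

    public static void main(String[] args) {
        System.out.println("file.encoding   = " + fileEncoding());
        System.out.println("default charset = " + Charset.defaultCharset());
    }
}
```

    The property is normally fixed at JVM startup, for example with -Dfile.encoding=UTF-8 on the command line, so comparing this output on the server and on the local machine is a quick way to spot the mismatch.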
    
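    So the right approach is to first check whether your default charset is "UTF-8", using: Charset.defaultCharset();

    If it is not, do not rely on the default. Note that String newStr = new String(oldStr.getBytes(), "utf-8"); does not convert anything: with a UTF-8 default it returns an equal string, and with any other default it decodes non-UTF-8 bytes as UTF-8 and garbles the text;

    Instead, encode the original string with an explicit charset: URLEncoder.encode(oldStr, "utf-8");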
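
    A note of caution on the conversion step, as a sketch under stated assumptions: the default charset cannot be switched at run time, so the example stands in for a non-UTF-8 machine by requesting GB2312 bytes explicitly. ConversionDemo, convertViaUtf8 and encodeUtf8 are illustrative names. It suggests that re-decoding the bytes as UTF-8 changes the text on such a machine, while encoding the original string with an explicit "UTF-8" already gives the expected result.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConversionDemo {
    // The article's sample string, written as Unicode escapes.
    static final String SAMPLE = "\u4e2d\u6587\u5b57\u7b26";

    // Stand-in for new String(s.getBytes(), "utf-8") on a machine
    // whose default charset is GB2312 (an assumption of this sketch).
    static String convertViaUtf8(String s) {
        byte[] gbBytes = s.getBytes(Charset.forName("GB2312"));
        return new String(gbBytes, StandardCharsets.UTF_8);
    }

    // Encodes with an explicit charset, independent of the default.
    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convertViaUtf8(SAMPLE).equals(SAMPLE)); // false
        System.out.println(encodeUtf8(SAMPLE)); // %E4%B8%AD%E6%96%87%E5%AD%97%E7%AC%A6
    }
}
```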

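    ======================Document information===========================

    Copyright notice: free to repost for non-commercial use, keep the attribution, cite the source

    Attribution (BY): testcs_dn(微wx笑)

    Source: [无知人生,记录点滴](http://blog.csdn.net/testcs_dn)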


