Java and UTF-8 encoding

While the J2SE platform has come a long way in internationalization, handling non-ASCII text in the J2EE world isn’t nearly as easy.

To get UTF-8 working end to end, you have to make some changes both in your code and in your web server settings.

First, make sure the response carries the right Content-Type header, so that the browser decodes the page as UTF-8 instead of guessing the encoding. To do that, place the following directive at the beginning of the JSP:

<%@ page contentType="text/html; charset=utf-8" pageEncoding="UTF-8" %>

Next, you have to create a filter that implements the javax.servlet.Filter interface, so that the request parameters are decoded as UTF-8:

package com.samaxes.filters;

import java.io.IOException;
import javax.servlet.*;

/**
 * Filter called before every action.
 *
 * @author : samaxes
 */
public class UTF8Filter implements Filter {

    public void init(FilterConfig filterConfig) {
        // Nothing to initialize.
    }

    public void destroy() {
        // Nothing to release.
    }

    public void doFilter(ServletRequest servletRequest,
                         ServletResponse servletResponse,
                         FilterChain filterChain)
            throws IOException, ServletException {
        // Must run before any parameter is read, otherwise it has no effect.
        servletRequest.setCharacterEncoding("UTF-8");
        filterChain.doFilter(servletRequest, servletResponse);
    }
}
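The filter only runs once the container knows about it. A minimal sketch of the declaration in the web application's deployment descriptor (web.xml) could look like the following; the filter name and the /* URL pattern are assumptions chosen for illustration, so adjust them to your application:

```xml
<!-- Sketch only: filter-name and url-pattern are illustrative assumptions. -->
<filter>
    <filter-name>UTF8Filter</filter-name>
    <filter-class>com.samaxes.filters.UTF8Filter</filter-class>
</filter>
<filter-mapping>
    <filter-name>UTF8Filter</filter-name>
    <!-- Apply the filter to every request of the web application. -->
    <url-pattern>/*</url-pattern>
</filter-mapping>
```

Mapping the filter to every URL keeps it ahead of any code that reads request parameters.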

Now your server reads POST parameters correctly…

But there is still an issue with GET requests.

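The trouble is that the browser sends no charset information back to the web server with a GET or POST request. The filter does not help with GET, because the encoding it sets applies only to the request body, not to the query string in the URL. The server has no way of knowing how to interpret the URL-encoded GET parameters, so it assumes ISO-8859-1.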
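To see what that wrong assumption does to a parameter, here is a small standalone sketch that uses only the JDK. The class and method names are made up for this illustration, and the decode call merely stands in for what the server does with the query string:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the filter shown in this post.
class QueryDecodingDemo {

    // Percent-encodes a value as UTF-8, as a browser does for a UTF-8 page.
    static String encodeAsUtf8(String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, "UTF-8");
    }

    // Decodes a percent-encoded value with the given charset.
    static String decode(String encoded, String charset) throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // "caf\u00e9" is "café", written with an escape so the source file encoding does not matter.
        String encoded = encodeAsUtf8("caf\u00e9");
        System.out.println(encoded); // caf%C3%A9

        // Wrong charset: the two UTF-8 bytes become two separate characters.
        System.out.println(decode(encoded, "ISO-8859-1").length()); // 5

        // Right charset: the original four-character string comes back.
        System.out.println(decode(encoded, "UTF-8").length()); // 4
    }
}
```

The same bytes arrive on the wire in both cases; only the charset used to decode them differs.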

Fortunately, the solution is pretty simple: just specify URIEncoding="UTF-8" in your Tomcat connector settings in the server.xml file.
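As a rough sketch, a connector entry with that attribute might look like this. Only URIEncoding comes from this post; the port, protocol and timeout values are assumptions and should stay whatever your own server.xml already uses:

```xml
<!-- Sketch only: keep your existing attributes and just add URIEncoding. -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />
```

Tomcat reads server.xml at startup, so restart the server after changing it.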

Your application should now handle UTF-8 just fine.

Published by

Samuel Santos

Java developer, Open Source hacker, Web technologist, JUG Leader.

  • Pandian

    Good one.
    But is there any way to get UTF characters in the catalina log (taking Tomcat as an example)? If so, what kind of modifications do we need to make?

  • Samuel Santos

    It may be related to the encoding of the machine where you are running Tomcat.
    Are you opening the file as UTF-8?

  • Pandian

    Dear Sam,
    Thanks for the reply. I have changed the encoding to UTF-8 in server.xml, but my System.out.println calls still don't give me Unicode characters; they are printed in ASCII only. Is there any other setting we need to change to get Unicode characters in the System.out stream?

  • Samuel Santos

    Try adding the JVM option -Dfile.encoding=UTF-8 to your server startup script, then restart your server.

    In a DOS console you won’t see any Unicode character; you should use an editor to open your server log in UTF-8 encoding.

  • baba

    As for the POST solution using your filter, you still need to edit web.xml in Tomcat to make it handle the filter, right?

    • Samuel Santos

      Correct, you must declare it in your web application deployment descriptor (web.xml).
      Alternatively you can use the @WebFilter annotation (only if your container supports the Servlet 3.0 spec).