NSilverBullet

Complex solutions for simple problems.


BizTalk 2006 MQSC Adapter Encoding Solution


Monday 03 November, 2008 (.Net | BizTalk | Fixes | Java)

Working with WebSphere MQ and the MQSC adapter is quite new for me and I have been struggling with a couple of problems which have been very difficult to find solutions to. One of our main issues has been getting the correct character encoding of outgoing messages to MQ. BizTalk would output the message with the correct encoding (verified using a file send port) and the messages on the WebSphere MQ queue looked correctly encoded when we checked them using rfhutilc. In our case we encoded our Unicode messages as UTF-8, but when the destination Java system read the messages, any characters outside the standard US (ASCII) range came out as rubbish. They were correctly encoded as two bytes in the message, but the destination system was reading them using codepage 850. Meanwhile, all Unicode messages that the Java system sent to us over MQ encoded as UTF-8 worked correctly in BizTalk.

What I have found out, through much trial and error and the sparse clues I have been able to find on the Internet, is that the WebSphere client that BizTalk uses under the covers will tag any message that you write with the system's default codepage, which for Windows is 850 according to IBM. This setting does not take into account the actual encoding that BizTalk 2006 has applied to the message. If you force your message to be encoded as 850 in BizTalk then it will most likely work, as long as you don't use characters that are outside the codepage. There is a context property in MQSeries.dll called MQMD_CodedCharSetId which allows you to set the codepage value (CCSID in the MQ world) for your outgoing message.
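To illustrate the symptom, here is a minimal Java sketch of what a reader sees when UTF-8 bytes are decoded with codepage 850. The class and method names are made up for this illustration, and it assumes the JVM ships the `IBM850` charset (full JDK installations normally do). It is not the code of the actual destination system.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {

    // Encodes the text as UTF-8 and then decodes the bytes with another charset,
    // which is what happens when a message is tagged with the wrong CCSID.
    static String misread(String text, String wrongCharsetName) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, Charset.forName(wrongCharsetName));
    }

    public static void main(String[] args) {
        // Three non-ASCII letters, written as escapes so the source file encoding does not matter
        String original = "\u00e5\u00e4\u00f6";
        String garbled = misread(original, "IBM850");

        System.out.println("characters sent: " + original.length());
        System.out.println("characters read: " + garbled.length());
        System.out.println("identical:       " + original.equals(garbled));
    }
}
```

Each of the three letters takes two bytes in UTF-8, and a single-byte codepage such as 850 turns every byte into its own character, which is why the text comes out as rubbish. Plain ASCII text survives unchanged, which is why the problem only shows up for other characters.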

I have found three ways to remedy the encoding problem:

  1. Set the context property in an orchestration to whatever value you require. The drawback with this is that if you need to change your encoding you have to recompile and redeploy everything. Also, changing the context property does not actually change the encoding, only how MQ should interpret the message.
  2. Set an MQ environment variable named MQCCSID to the codepage that you send your messages as. This will override the system default value and force messages to be tagged with your required codepage. Obviously you can only have one system-wide default, so if you need to send messages out with different encodings this isn't going to help. Again, the actual encoding of your messages is not going to be changed.
  3. Create an encoding pipeline component that allows you to set the MQ codepage for an outgoing message. This is the option that we have settled on for now. The encoding of your message is still not actually changed, unless you implement that yourself, but now you can have different encodings for different destinations and you can change the encoding value using MMC and binding files.

One important point to remember is that UTF-8 is codepage 65001 in Windows but 1208 in IBM's numbering, so wherever you set the codepage you need to keep this in mind. As far as I know all other codepages have the same IDs.
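The translation between the two numbering schemes can be sketched as a small helper. The class and method names are hypothetical, and the pass-through for every other value rests only on the unverified observation above that the remaining IDs match.

```java
class CcsidMapper {

    static final int WINDOWS_UTF8_CODE_PAGE = 65001;
    static final int IBM_UTF8_CCSID = 1208;

    // Translates a Windows code page number into the CCSID to put in the MQMD.
    // Only UTF-8 is special-cased; all other values are passed through unchanged,
    // which is an assumption and not a verified mapping.
    static int toCcsid(int windowsCodePage) {
        if (windowsCodePage == WINDOWS_UTF8_CODE_PAGE) {
            return IBM_UTF8_CCSID;
        }
        return windowsCodePage;
    }

    public static void main(String[] args) {
        System.out.println("65001 -> " + toCcsid(65001));
        System.out.println("850   -> " + toCcsid(850));
    }
}
```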

Using rfhutilc you can verify the codepage of a message in an MQ queue by browsing to the message and then checking what codepage has been set under the MQMD tab.

I have not verified this, but the solutions that I have outlined above should work for EBCDIC and any other odd encodings that you may require, as long as you can actually create a message with that encoding.



Java Web Service Client problems


Thursday 15 May, 2008 (.Net | Bugs | Java | Web)

I have been having some trouble lately trying to debug a .Net web service that I have constructed for a proof of concept. The scenario is that we have a Java application that calls a web service and the web service passes the call on to BizTalk for processing. Not a terribly complex scenario, so I wasn't expecting much trouble setting up the POC. Unfortunately we have experienced some really strange problems. Calling the web service from a .Net client works without problems, but when the exact same SoapEnvelope is used in the Java client through a proxy we get an "HTTP/1.1 400 Bad Request" error!

How is this possible? Well, after much trial and error, searching the net and generally scratching my head, I finally found a clue to the root cause of the error. One of the scenarios in which "400 Bad Request" is returned is when the value of the content-length header is smaller than the actual length of the message body. I could not find a simple way to get IIS 6 to log the actual HTTP headers which were received, and my favourite HTTP debugger Fiddler couldn't be used when the calls came from the Java client, so it was very difficult to verify. The WSE input tracing showed a correct request and the application log contained the following unhelpful message:

Process information: 
    Process ID: 5856 
    Process name: w3wp.exe 
    Account name: AD\serviceaccount 
 
Exception information: 
    Exception type: HttpException 
    Exception message: Server cannot clear headers after HTTP headers have been sent. 

I managed to get a nice raw HTTP dump sent to the application log by adding the following to my Global.asax:

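// Requires the System.IO and System.Diagnostics namespaces.
void Application_BeginRequest(object sender, EventArgs e)
{
	HttpRequest request = HttpContext.Current.Request;
	// Save the stream position so it can be restored after reading
	long position = request.InputStream.Position;
	// The reader is deliberately not disposed: that would close the request stream
	string requestBody = new StreamReader(request.InputStream).ReadToEnd();
	// Reset the position so the web service can still read the request
	request.InputStream.Position = position;
	// The headers come out URL-encoded and '&'-separated here; decode them and put one per line
	string headers = Server.UrlDecode(request.Headers.ToString()).Replace("&", Environment.NewLine);
	EventLog.WriteEntry("Service HTTP Dump", headers + Environment.NewLine + requestBody);
}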

Using this I could verify that it was in fact an incorrect content-length in the HTTP header which was causing the problem. Using Notepad++ and Fiddler I could see that the actual content-length was 634 using CRLF and 631 using only CR. The Java client was specifying 631 but the .Net web service was actually receiving 634. With the help of Telnet I could send in a call with 631 characters and CR as line-end and verify that the .Net web service handled the request properly. For the sake of clarity this is a sample Java HTTP header:

POST /path/service.asmx HTTP/1.1
Host: servername:80
Cache-Control: no-cache
Connection: close
Pragma: no-cache
Content-Type: text/xml; charset=utf-8
Accept: application/soap+xml, application/dime, multipart/related, text/*
User-Agent: Axis/1.1
SOAPAction: "http://schemas.example.com/2008/04/22/Service:ValidateRequest"
Content-Length: 631

The same request from a .Net client:

POST /path/service.asmx HTTP/1.1
Host: servername:80
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; MS Web Services Client Protocol 2.0.50727.1433)
Content-Type: text/xml; charset=utf-8
SOAPAction: "http://schemas.example.com/2008/04/22/Service:ValidateRequest"
Content-Length: 634
Expect: 100-continue

In both cases the body that the web service actually received was 634 bytes long.

More information on the exact nature and calculation of the HTTP headers can be found in the HTTP/1.1 RFC (RFC 2616), and section 3.7.1 of the RFC defines the valid line endings for the message body. Basically it says that CR, LF or CRLF should all be valid as long as they are used consistently, but CRLF should still be counted as two characters for content-length (the spec doesn't explicitly say that but I presume that it is required). So the .Net web service behaves correctly. The specification also says that content-length may be omitted in HTTP 1.1, although it should be honoured if it is specified. Since it is required in 1.0 and is usually very easy to calculate, omitting it isn't something that you normally have to think about, and it is hidden in the web calling infrastructure of Java and .Net.

I am not sure where our error is introduced. It could be when the original message is constructed in Java: whatever library is used to calculate the content-length could be doing so incorrectly (not likely). A more probable source is that the message is constructed with correct line-ends in the body (CR or LF) but that the intermediary proxy recodes the call with new line-ends (CRLF) without updating the content-length.
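The suspected proxy behaviour can be sketched in Java as follows. This is only an illustration of the arithmetic: the class and method names are hypothetical, the body is a stand-in for the real SOAP envelope, and nothing here is taken from Axis or from the actual proxy.

```java
import java.nio.charset.StandardCharsets;

class ContentLengthDemo {

    // Content-Length counts the bytes of the body, here encoded as UTF-8.
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    // Rewrites every line break (CRLF, bare CR or bare LF) as CRLF,
    // as an intermediary might do. CRLF is listed first so it is not doubled.
    static String toCrlf(String body) {
        return body.replaceAll("\r\n|\r|\n", "\r\n");
    }

    public static void main(String[] args) {
        // Stand-in body with three bare CR line breaks
        String sentBody = "<a>\r<b/>\r</a>\r";
        int declaredLength = contentLength(sentBody);

        String receivedBody = toCrlf(sentBody);
        int actualLength = contentLength(receivedBody);

        System.out.println("declared Content-Length: " + declaredLength);
        System.out.println("bytes actually received: " + actualLength);
        System.out.println("difference:              " + (actualLength - declaredLength));
    }
}
```

Every rewritten line break adds one byte, so a body with three line breaks grows by three bytes, which matches the gap between 631 and 634 seen above.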

How can we fix it? I believe that there are three possible solutions for the problem (we haven't solved it yet).

  1. Remove the proxy. This should be the easiest, but due to the network configuration it isn't possible at the moment.
  2. Change the way that the Java client creates its content so that it uses CRLF by default, thereby forcing a correct, larger content-length.
  3. Use the Application_BeginRequest method in Global.asax or an HTTPHandler to somehow rewrite the input stream or recalculate the content-length. At the moment I am not sure that either is possible.

Update

We managed to remove the proxy (which was actually part of a VPN connection) and our troubles disappeared.


