Working with Websphere MQ and the MQSC adapter is quite new for me and I have been struggling with a couple of problems which have been very difficult to find solutions to. One of our main issues has been getting the correct character encoding of outgoing messages to MQ. BizTalk would output the message with the correct encoding (using a file send port to verify) and it looked like the messages on the WebSphere MQ queue were correctly encoded when we checked them using rhutilc. In our case we encoded our unicode messages as UTF-8 but when the Java system which is the destination read the messages any non-standard US characters came out as rubbish. They were correctly encoded as two bytes in the message but the destination system was reading them using codepage 850. Also all unicode messages that the Java system sent to us encoded with UTF-8 over MQ worked correctly in BizTalk.
What I have found out through much trial and error and the sparse clues I have been able to find on the Internet is that the WebSphere client that BizTalk uses under the covers will set the character encoding of any messages that you write to the systems default codepage, which for windows is 850 according to IBM. This setting does not take into account the actual encoding that BizTalk 2006 has done to the message. If you force your message to be encoded as 850 in BizTalk then it will most likely work as long as you don’t use characters that are outside the codepage. There is a context property in MQSeries.dll called MQMD_CodedCharSetId which will allow you to set the codepage value (CCSID in the MQ world) for your outgoing message.
I have found three ways to remedy the encoding problem:
- Set the context property in an orchestration to whatever value you require. The drawback with this is that if you need to change your encoding you have to recompile and redeploy everything. Also changing your context property does not actually change the encoding just how MQ should interpret the message.
- Set an MQ environment variable named MQCCSID to the codepage that you send your messages as. This will override the system default value and force messages to be tagged with your required codepage. Obviously you can only have one system wide default so if you need to send messages out with different encodings this isn’t going to help. Again the actual encoding of your messages is not going to be changed.
- Create an encoding pipeline component that allows you to set the MQ codepage for an outgoing message. This is the option that we have settled on for now. The encoding of your message is still not actually changed, unless you implement it, but now you can have different encodings for different destinations and you can change the encoding value using MMC and binding files.
One important point to remember is that UTF-8 is codepage 65001 in Windows but 1208 in IBM, so wherever you set the codepage you need to have this in mind – as far as I know all other codepages have the same ids.
Using rfhutilc you can verify the codepage of a message in an MQ queue by browsing to the message and then checking what codepage has been set under the MQMD tab.
I have not verified this but the solutions that I have outlined above should work for EBCDIC and any other odd encodings that you may require as long as you can actually create a message with that encoding.