Commit 1bc3ea45 authored by Andre Freyssinet

Description of encoding evolutions.

parent 392f1fc4
Since Joram 5.19, UTF-8 is used by default to encode and decode String objects.
This affects the communication between the client and the broker, as well as the way
String objects are encoded in the broker's persistence.
Previously the JVM charset was used to encode and decode String objects. This could generate
interpretation errors when the charset of the client receiving a message was different from
that of the client sending this message (*1).
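The mismatch described above can be illustrated with a minimal sketch. This is not Joram code: the class `CharsetMismatchDemo` and the helper `transfer` are hypothetical names, and the two charsets are chosen only for the demonstration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    /** Encodes with the sender's charset, then decodes with the receiver's. */
    static String transfer(String value, Charset sender, Charset receiver) {
        byte[] wire = value.getBytes(sender);
        return new String(wire, receiver);
    }

    public static void main(String[] args) {
        // "café", written with a Unicode escape so the source file encoding does not matter.
        String property = "caf\u00e9";

        // Same charset on both sides: the value survives the transfer.
        String same = transfer(property, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(property));

        // Sender uses UTF-8, receiver uses ISO-8859-1: the two UTF-8 bytes of "é"
        // are decoded as two separate characters, so the value is corrupted.
        String mixed = transfer(property, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mixed.equals(property));
    }
}
```

The same effect occurs in the other direction: bytes produced with ISO-8859-1 are generally not valid UTF-8, and the decoder substitutes a replacement character.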
Joram offers several properties to influence the new encoding behavior.
For the communication between the client and the broker:
- fr.dyade.aaa.common.stream.useJVMcharset: Forces the use of the JVM charset as in previous
versions of Joram.
- fr.dyade.aaa.common.stream.charset: Allows the use of a specific charset instead of the UTF-8
charset used by default.
For the encoding of String objects in the broker:
- fr.dyade.aaa.common.encoding.useJVMcharset: Forces the use of the JVM charset as in previous
versions of Joram.
- fr.dyade.aaa.common.encoding.charset: Allows the use of a specific charset instead of the UTF-8
charset used by default.
/!\ Caution: do not use these properties unless you need to maintain strict backward compatibility
with previously deployed versions.
(*1) This issue only affected the properties of the message and not the message body itself.
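As a rough illustration of how the two communication properties could interact, here is a hypothetical resolver. It is a sketch, not Joram's implementation. Only the property names come from the text above. The class `StreamCharsetResolver`, the boolean value `true`, the lookup through a `Properties` object, and the precedence of `useJVMcharset` over `charset` are all assumptions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class StreamCharsetResolver {
    // Property names as documented above.
    static final String USE_JVM_CHARSET = "fr.dyade.aaa.common.stream.useJVMcharset";
    static final String CHARSET = "fr.dyade.aaa.common.stream.charset";

    /**
     * Hypothetical resolution order (an assumption, not confirmed by the source):
     * 1. useJVMcharset=true selects the JVM default charset,
     * 2. otherwise an explicit charset name is honoured,
     * 3. otherwise UTF-8 is used.
     */
    static Charset resolve(Properties props) {
        if (Boolean.parseBoolean(props.getProperty(USE_JVM_CHARSET, "false"))) {
            return Charset.defaultCharset();
        }
        String name = props.getProperty(CHARSET);
        if (name != null) {
            // Throws an unchecked exception if the name is illegal or unsupported.
            return Charset.forName(name);
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(resolve(props));   // no property set: UTF-8

        props.setProperty(CHARSET, "ISO-8859-1");
        System.out.println(resolve(props));   // explicit charset
    }
}
```

The `fr.dyade.aaa.common.encoding.*` properties for the broker persistence could be resolved the same way under the same assumptions.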
----
We have given a lot of thought to this problem of charset incompatibility, and we have implemented several mechanisms in Joram 5.19. I present them here to get your feedback, and we will have the opportunity to discuss them later if you wish. First, the potential problem:
Until now, Joram used the JVM charset to encode and decode String objects. This encoding was used:
- When transferring a message from the client to the broker, or from the broker to the client.
  - In this case, String objects may be misinterpreted if the charsets of the client and the broker are different, or if the charsets of the sending and receiving clients are different. In practice, we only observed a problem in the second case. In all cases, the body of the message is not affected.
- When writing or reading a message in the persistence database.
  - In this case, a problem can arise if the charset of the broker is changed during a reboot. We were unable to demonstrate this problem in practice.
To prevent future difficulties when JVMs default to UTF-8, we have decided to change Joram's default policy:
- From Joram 5.19, the UTF-8 charset is systematically used for all encoding / decoding operations, regardless of the JVM charset. This choice protects us from any incompatibility and prepares us for the evolution of Java.
- In order to preserve already deployed configurations, we provide a set of configuration properties:
  - To configure the encoding used during communication, for the client and the broker:
    - A property to preserve the previous behavior, i.e. to use the charset of the JVM.
    - A property to specify the charset to use.
  - To configure the encoding used during storage, for the broker:
    - A property to preserve the previous behavior, i.e. to use the charset of the JVM.
    - A property to specify the charset to use.
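The effect of the new default policy can be sketched as follows: when the charset is named explicitly on both the write and the read side, the result no longer depends on the JVM default, whether the bytes travel over a connection or sit in a persistence store across a reboot. The class `Utf8StringCodec` and its length-prefixed layout are hypothetical and are not Joram's actual wire or storage format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8StringCodec {

    /** Writes the string as a 4-byte length followed by its UTF-8 bytes. */
    static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    /** Reads a string written by writeString, always decoding as UTF-8. */
    static String readString(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Encodes then decodes through an in-memory buffer. */
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeString(new DataOutputStream(buffer), s);
            return readString(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Non-ASCII sample, written with Unicode escapes.
        String value = "caf\u00e9 \u65e5\u672c";
        // The outcome is the same whatever -Dfile.encoding the JVM was started with.
        System.out.println(roundTrip(value).equals(value));
    }
}
```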
At the same time, we are considering a mechanism to seamlessly evolve from the current mode to the exclusive use of UTF-8.