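Problem description
The event-tracking system receives user-behavior tracking data sent from clients, H5 pages, and other systems. After unified receiving and parsing, the data is published to Kafka for downstream business teams to consume. While testing a change, we found that data that was originally an integer had become a double after conversion. To keep the description simple, the problem is reduced to the example below. The source code discussed in this article is gson-2.8.0.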
{
"rate": 1.0,
"extend": {
"number": 30,
"amount": 120.3
}
}
After processing it becomes:
{
"rate":1.0,
"extend":{
"number":30.0,
"amount":120.3
}
}
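The value of number in the extend field changed from the integer 30 to the double 30.0. Some downstream consumers convert numeric values to strings before applying their logic, for example when joining two tables. Because the type changed, the join no longer matches: the strings "30" and "30.0" are not equal.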
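To make the downstream failure concrete, here is a minimal sketch that uses only the JDK. It assumes the consumer builds its join key with String.valueOf; the names JoinKeyDemo and joinKey are hypothetical and not part of the tracking system.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: a join keyed on the string form of a number.
class JoinKeyDemo {

    // Assumed downstream behavior: numbers are turned into strings before comparing.
    static String joinKey(Object value) {
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        // Right-hand table, keyed by the string form of the original integer.
        Map<String, String> rightTable = new HashMap<>();
        rightTable.put(joinKey(30L), "matched row");

        // The same logical value before and after the unwanted conversion.
        Object original = 30L;
        Object converted = 30.0d;

        System.out.println(joinKey(original));                  // 30
        System.out.println(joinKey(converted));                 // 30.0
        System.out.println(rightTable.get(joinKey(original)));  // matched row
        System.out.println(rightTable.get(joinKey(converted))); // null
    }
}
```

The lookup with the converted value misses because the key text differs, even though the two numbers are numerically equal.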
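Code analysis
At this point some readers may ask why number is not simply declared as an integer to avoid the problem. The reason is that extend is an extension field: which fields it contains is not known in advance and depends on the business side that reports the data.
Definition of the Data class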
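public class Data {
    private Double rate;
    private Object extend;

    public Double getRate() {
        return rate;
    }

    public void setRate(Double rate) {
        this.rate = rate;
    }

    // Accessors for extend; DataTypeAdaptor below relies on them.
    public Object getExtend() {
        return extend;
    }

    public void setExtend(Object extend) {
        this.extend = extend;
    }

    @Override
    public String toString() {
        return "Data{" +
                "rate=" + rate +
                ", extend=" + extend +
                '}';
    }
}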
Now the test code:
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

public class GsonTest {
    public static void main(String[] args) {
        String dataJson = "{\"rate\" : 1.0, \"extend\" : {\"number\" : 30, \"amount\" : 120.3}}";
        Gson gson = buildGson();
        Data data = gson.fromJson(dataJson, Data.class);
        System.out.println(data.toString());
    }

    // Default Gson configuration, no custom adapters registered.
    private static Gson buildGson() {
        GsonBuilder gsonBuilder = new GsonBuilder();
        return gsonBuilder.create();
    }
}
Output:
Data{rate=1.0, extend={number=30.0, amount=120.3}}
Root cause analysis
Next, a brief description of the deserialization process. Gson locates a concrete TypeAdapter<T> based on the type being parsed. Its main methods are:
public abstract class TypeAdapter<T> {
  /**
   * Writes one JSON value (an array, object, string, number, boolean or null)
   * for {@code value}.
   *
   * @param value the Java object to write. May be null.
   */
  public abstract void write(JsonWriter out, T value) throws IOException;

  /**
   * Reads one JSON value (an array, object, string, number, boolean or null)
   * and converts it to a Java object. Returns the converted object.
   *
   * @return the converted Java object. May be null.
   */
  public abstract T read(JsonReader in) throws IOException;
}
The read method reads data from the JsonReader and assembles the final object. Because the extend field in Data is declared as Object, Gson ends up using the built-in ObjectTypeAdapter class. Let's walk through its logic.
/**
 * Adapts types whose static type is only 'Object'. Uses getClass() on
 * serialization and a primitive/Map/List on deserialization.
 */
public final class ObjectTypeAdapter extends TypeAdapter<Object> {
  public static final TypeAdapterFactory FACTORY = new TypeAdapterFactory() {
    @SuppressWarnings("unchecked")
    @Override public <T> TypeAdapter<T> create(Gson gson, TypeToken<T> type) {
      if (type.getRawType() == Object.class) {
        return (TypeAdapter<T>) new ObjectTypeAdapter(gson);
      }
      return null;
    }
  };

  private final Gson gson;

  ObjectTypeAdapter(Gson gson) {
    this.gson = gson;
  }

  @Override public Object read(JsonReader in) throws IOException {
    JsonToken token = in.peek();
    switch (token) {
    case BEGIN_ARRAY:
      List<Object> list = new ArrayList<Object>();
      in.beginArray();
      while (in.hasNext()) {
        list.add(read(in));
      }
      in.endArray();
      return list;

    case BEGIN_OBJECT:
      Map<String, Object> map = new LinkedTreeMap<String, Object>();
      in.beginObject();
      while (in.hasNext()) {
        map.put(in.nextName(), read(in));
      }
      in.endObject();
      return map;

    case STRING:
      return in.nextString();

    // Every numeric value is converted to Double here.
    case NUMBER:
      return in.nextDouble();

    case BOOLEAN:
      return in.nextBoolean();

    case NULL:
      in.nextNull();
      return null;

    default:
      throw new IllegalStateException();
    }
  }

  @SuppressWarnings("unchecked")
  @Override public void write(JsonWriter out, Object value) throws IOException {
    if (value == null) {
      out.nullValue();
      return;
    }

    TypeAdapter<Object> typeAdapter = (TypeAdapter<Object>) gson.getAdapter(value.getClass());
    if (typeAdapter instanceof ObjectTypeAdapter) {
      out.beginObject();
      out.endObject();
      return;
    }

    typeAdapter.write(out, value);
  }
}
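The NUMBER branch above hands every numeric token to nextDouble(). The sketch below reproduces only the effect of that choice with plain JDK parsing; it does not call Gson, and the names AlwaysDoubleDemo and readNumberAsDouble are hypothetical. Besides the change from 30 to 30.0, it shows a second consequence that follows from IEEE 754 doubles: integers above 2^53 may no longer be represented exactly.

```java
// Hypothetical stand-in for "treat every JSON number as a double".
class AlwaysDoubleDemo {

    // Mirrors the effect of the NUMBER branch: the raw literal always becomes a Double.
    static Object readNumberAsDouble(String rawLiteral) {
        return Double.parseDouble(rawLiteral);
    }

    public static void main(String[] args) {
        // A small integer picks up a fractional part when printed.
        System.out.println(readNumberAsDouble("30"));    // 30.0

        // 2^53 + 1 has no exact double representation, so the last digit changes.
        Object big = readNumberAsDouble("9007199254740993");
        System.out.println(((Double) big).longValue());  // 9007199254740992
    }
}
```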
From this logic we can see that a JSON object is parsed into a Map<String, Object>. The runtime type of each value depends on the JSON value itself; for example, a double-quoted value is read as STRING. All numeric values (NUMBER) are converted to Double, which explains our problem: integer data is turned into Double, so 30 becomes 30.0. At this point you may be thinking that NUMBER should be subdivided so that integers and floating-point values are handled separately. Let's check whether JsonToken has finer-grained types.
public enum JsonToken {

  /**
   * The opening of a JSON array. Written using {@link JsonWriter#beginArray}
   * and read using {@link JsonReader#beginArray}.
   */
  BEGIN_ARRAY,

  /**
   * The closing of a JSON array. Written using {@link JsonWriter#endArray}
   * and read using {@link JsonReader#endArray}.
   */
  END_ARRAY,

  /**
   * The opening of a JSON object. Written using {@link JsonWriter#beginObject}
   * and read using {@link JsonReader#beginObject}.
   */
  BEGIN_OBJECT,

  /**
   * The closing of a JSON object. Written using {@link JsonWriter#endObject}
   * and read using {@link JsonReader#endObject}.
   */
  END_OBJECT,

  /**
   * A JSON property name. Within objects, tokens alternate between names and
   * their values. Written using {@link JsonWriter#name} and read using {@link
   * JsonReader#nextName}
   */
  NAME,

  /**
   * A JSON string.
   */
  STRING,

  /**
   * A JSON number represented in this API by a Java {@code double}, {@code
   * long}, or {@code int}.
   */
  NUMBER,

  /**
   * A JSON {@code true} or {@code false}.
   */
  BOOLEAN,

  /**
   * A JSON {@code null}.
   */
  NULL,

  /**
   * The end of the JSON stream. This sentinel value is returned by {@link
   * JsonReader#peek()} to signal that the JSON-encoded value has no more
   * tokens.
   */
  END_DOCUMENT
}
There is no finer-grained type. What now? Let's look at the JsonReader.peek method:
/**
 * Returns the type of the next token without consuming it.
 */
public JsonToken peek() throws IOException {
  int p = peeked;
  if (p == PEEKED_NONE) {
    p = doPeek();
  }

  switch (p) {
  case PEEKED_BEGIN_OBJECT:
    return JsonToken.BEGIN_OBJECT;
  case PEEKED_END_OBJECT:
    return JsonToken.END_OBJECT;
  case PEEKED_BEGIN_ARRAY:
    return JsonToken.BEGIN_ARRAY;
  case PEEKED_END_ARRAY:
    return JsonToken.END_ARRAY;
  case PEEKED_SINGLE_QUOTED_NAME:
  case PEEKED_DOUBLE_QUOTED_NAME:
  case PEEKED_UNQUOTED_NAME:
    return JsonToken.NAME;
  case PEEKED_TRUE:
  case PEEKED_FALSE:
    return JsonToken.BOOLEAN;
  case PEEKED_NULL:
    return JsonToken.NULL;
  case PEEKED_SINGLE_QUOTED:
  case PEEKED_DOUBLE_QUOTED:
  case PEEKED_UNQUOTED:
  case PEEKED_BUFFERED:
    return JsonToken.STRING;
  case PEEKED_LONG:
  case PEEKED_NUMBER:
    return JsonToken.NUMBER;
  case PEEKED_EOF:
    return JsonToken.END_DOCUMENT;
  default:
    throw new AssertionError();
  }
}
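So JsonReader does distinguish integers (PEEKED_LONG) from other numbers (PEEKED_NUMBER) internally, but the distinction is lost once peek maps its internal state to the public JsonToken. One fix is to modify the Gson source directly: add an integer token, LONG, to JsonToken, have the reader return it, and branch on it when reading. The modified ObjectTypeAdapter.read would look roughly like this: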
@Override public Object read(JsonReader in) throws IOException {
  JsonToken token = in.peek();
  switch (token) {
  ..........................
  case LONG:
    return in.nextLong();
  case NUMBER:
    return in.nextDouble();
  ..........................
  }
}
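Modifying the library source, however, is far too invasive. Recall that Gson chooses a TypeAdapter based on the type being parsed, and that we do not want to change the implementation of JsonToken or other Gson classes. So instead we define an adapter for Data, named DataTypeAdaptor, implemented as follows: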
import com.google.gson.Gson;
import com.google.gson.TypeAdapter;
import com.google.gson.TypeAdapterFactory;
import com.google.gson.internal.LinkedTreeMap;
import com.google.gson.reflect.TypeToken;
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonToken;
import com.google.gson.stream.JsonWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DataTypeAdaptor extends TypeAdapter<Data> {
    public static final TypeAdapterFactory FACTORY = new TypeAdapterFactory() {
        @SuppressWarnings("unchecked")
        @Override
        public <T> TypeAdapter<T> create(Gson gson, TypeToken<T> type) {
            if (type.getRawType() == Data.class) {
                return (TypeAdapter<T>) new DataTypeAdaptor(gson);
            }
            return null;
        }
    };

    private final Gson gson;

    DataTypeAdaptor(Gson gson) {
        this.gson = gson;
    }

    @Override
    public void write(JsonWriter out, Data value) throws IOException {
        if (value == null) {
            out.nullValue();
            return;
        }
        out.beginObject();
        out.name("rate");
        gson.getAdapter(Double.class).write(out, value.getRate());
        out.name("extend");
        gson.getAdapter(Object.class).write(out, value.getExtend());
        out.endObject();
    }

    @SuppressWarnings("unchecked")
    @Override
    public Data read(JsonReader in) throws IOException {
        // A top-level JSON null maps to a null Data instead of failing on the map lookup below.
        if (in.peek() == JsonToken.NULL) {
            in.nextNull();
            return null;
        }
        Data data = new Data();
        Map<String, Object> dataMap = (Map<String, Object>) readInternal(in);
        // rate may arrive as Long (e.g. 1) or Double (e.g. 1.0), so convert via Number.
        Number rate = (Number) dataMap.get("rate");
        data.setRate(rate == null ? null : rate.doubleValue());
        data.setExtend(dataMap.get("extend"));
        return data;
    }

    private Object readInternal(JsonReader in) throws IOException {
        JsonToken token = in.peek();
        switch (token) {
            case BEGIN_ARRAY:
                List<Object> list = new ArrayList<Object>();
                in.beginArray();
                while (in.hasNext()) {
                    list.add(readInternal(in));
                }
                in.endArray();
                return list;
            case BEGIN_OBJECT:
                Map<String, Object> map = new LinkedTreeMap<String, Object>();
                in.beginObject();
                while (in.hasNext()) {
                    map.put(in.nextName(), readInternal(in));
                }
                in.endObject();
                return map;
            case STRING:
                return in.nextString();
            case NUMBER:
                // Read the number as its raw string form.
                String numberStr = in.nextString();
                // The returned numberStr is never null.
                if (numberStr.contains(".") || numberStr.contains("e")
                        || numberStr.contains("E")) {
                    return Double.parseDouble(numberStr);
                }
                // Note: Long.parseLong throws NumberFormatException for integers outside the long range.
                return Long.parseLong(numberStr);
            case BOOLEAN:
                return in.nextBoolean();
            case NULL:
                in.nextNull();
                return null;
            default:
                throw new IllegalStateException();
        }
    }
}
The change is that numeric values are read as strings: if the raw text contains a decimal point or uses scientific notation, it is treated as floating point; otherwise it is treated as an integer. Now back to the original example:
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

public class GsonTest {
    public static void main(String[] args) {
        String dataJson = "{\"rate\" : 1.0, \"extend\" : {\"number\" : 30, \"amount\" : 120.3}}";
        Gson gson = buildGson();
        Data data = gson.fromJson(dataJson, Data.class);
        System.out.println(data.toString());
        System.out.println(gson.toJson(data, Data.class));
    }

    // Registers the custom factory so that Data is handled by DataTypeAdaptor.
    private static Gson buildGson() {
        GsonBuilder gsonBuilder = new GsonBuilder();
        gsonBuilder.registerTypeAdapterFactory(DataTypeAdaptor.FACTORY);
        return gsonBuilder.create();
    }
}
Output:
Data{rate=1.0, extend={number=30, amount=120.3}}
{"rate":1.0,"extend":{"number":30,"amount":120.3}}
Process finished with exit code 0
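The NUMBER branch of readInternal can also be exercised without Gson. The following is a sketch of the same text-based rule as a standalone helper, with one assumption added that is not in the adapter above: integers too large for long fall back to BigInteger instead of letting Long.parseLong throw NumberFormatException. The names NumberClassifier and parseNumber are hypothetical, and the helper assumes its input is a valid JSON number literal.

```java
import java.math.BigInteger;

// Hypothetical helper mirroring the NUMBER branch of readInternal.
class NumberClassifier {

    // Returns a Double for literals with a fraction or exponent, otherwise a Long.
    // Assumption added here: integers that do not fit in a long become a BigInteger.
    static Number parseNumber(String raw) {
        if (raw.contains(".") || raw.contains("e") || raw.contains("E")) {
            return Double.parseDouble(raw);
        }
        try {
            return Long.parseLong(raw);
        } catch (NumberFormatException overflow) {
            return new BigInteger(raw);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseNumber("30"));     // 30
        System.out.println(parseNumber("120.3"));  // 120.3
        System.out.println(parseNumber("1e3"));    // 1000.0
        // One digit longer than Long.MAX_VALUE, so the fallback applies.
        System.out.println(parseNumber("92233720368547758070").getClass().getSimpleName()); // BigInteger
    }
}
```

Whether a BigInteger fallback is appropriate depends on what downstream consumers expect; rejecting such values may be the better choice in some pipelines.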
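The result is correct: integers stay integers and floating-point values stay floating point, so the problem is solved. The real fix for the underlying issue is to push the business side to conform to the schema types, but since this article focuses on Gson, other solutions are not covered here. Personally, I also think Gson itself should distinguish integers from floating-point values. Judging from the code, this was considered but never exposed to users. I do not yet know why, and I plan to ask the community about it.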