Www_2020_review

cDear Bofang,

Thank you for your submission to TheWebConf 2020. The TheWebConf 2020 rebuttal period will be between Dec 11 and Dec 16 (AOE time).

During this time, you will have access to the current state of your reviews and have the opportunity to submit a response of up to 500 words. Please keep in mind the following during this process: 1) The response must focus on any factual errors in the reviews and any questions posed by the reviewers. It must not provide new research results or reformulate the presentation. Try to be as concise and to the point as possible.

2) The rebuttal period is an opportunity to react to the reviews, but not a requirement to do so. Thus, if you feel the reviews are accurate and the reviewers have not asked any questions, then you do not have to respond.

3) The reviews are as submitted by the PC members. Thus, there may be inconsistencies. Furthermore, these are not the final versions of the reviews. The reviews can later be updated to take into account the discussions at the program committee meeting, and we may find it necessary to solicit other outside reviews after the rebuttal period.

4) The program committee will read your responses carefully and take this information into account during the discussions. On the other hand, the program committee will not directly respond to your responses, either before the program committee meeting or in the final versions of the reviews.

5) Your response will be seen by all PC members who have access to the discussion of your paper. We require you to adopt a non-offensive and constructive tone.

The reviews on your paper are attached to this letter. There might be one review which looks different – it summarizes additional questions raised during the discussions of the PC members. To submit your response to all these reviews, please log on the EasyChair Web page for TheWebConf 2020 and select your submission on the menu.

Best wishes, Tie-Yan Liu and Maarten van Steen PC Chairs, TheWebConf 2020

———————– REVIEW 1 ——————— SUBMISSION: 2054 TITLE: Neighbor BERT: Enhanced Query and Item Representations for Semantic Matching with Neighbor Similarity in E-Commerce AUTHORS: Zixuan Ke, Bofang Li, Haochuan Sun, Hongbo Deng, Rong Xiao, Ji Jianhui, Jiawei Wu, Kun Zhao and Baole Ai

———– Strengths ———–

The paper studies an important problem.
The proposed model aims to improve both query to item matching and query rewriting tasks at the same time.
The paper is well written. ———– Weaknesses ———–
The technical contribution is limited.
Important baselines are missing from the evaluation.
The review of related works is very brief. ———– Summary and review comments ———– This paper presents a BERT based model for query to item matching in the context of e-commerce search. The model is jointly trained on two tasks, query to item similarity and query to query similarity. Different ways of computing the loss function and sampling positive/negatives are discussed. The proposed model is trained and evaluated with a large dataset from a real search log offline, and is also evaluated with an online A/B test.

The technical contribution is limited. The model is a rather straightforward application of the BERT model for semantic matching tasks. Although different ways of coming up with positive/negative samples are explored, using clicks and non-clicks are standard practice in the industry for such problems, getting negatives from queries in the same category is variant of the basic idea. Using attention on the co-clicked queries to reconstruct the original query is also not very novel. Similar ideas have been proposed for query suggestion tasks in the literature.

The offline evaluation compares the proposed model with 5 baselines. However, 4 of them are variants of the same model proposed in [43]. The only other model compared is DeepWalk [24], but this work was not initially proposed for product search related tasks. In the literature, there are a number of existing deep semantic matching approaches for product search that should have been included in the evaluation, such as LSE, ARC-II, MatchPyramid, etc. There is also no comparison with standard query rewriting approaches.

The review of related works is very brief. Only one paper is cited for product search. A number of missing references should be discussed and compared in the experiments given this is an active research area in recent years.

To summarize, this paper studies an important problem and proposes a practical solution that leverages BERT to improve query to item matching and query rewriting simultaneously. Although it is a well written paper and shows promising results in online A/B test, given it is limited technical contribution and biased experimental evaluation, I do not think it meets the bar of the Web conference. With comparison with more baselines, this paper may be a good fit for the industrial track of a top conference. ———– Overall score ———– SCORE: -1 (weak reject)

———————– REVIEW 2 ——————— SUBMISSION: 2054 TITLE: Neighbor BERT: Enhanced Query and Item Representations for Semantic Matching with Neighbor Similarity in E-Commerce AUTHORS: Zixuan Ke, Bofang Li, Haochuan Sun, Hongbo Deng, Rong Xiao, Ji Jianhui, Jiawei Wu, Kun Zhao and Baole Ai

———– Strengths ———– -1) The paper tried to exploit BERT and neighborhood information in query-item click graph to improve the effectiveness of the semantic matching phase in e-commerce search. -2) The paper utilized large-scale datasets and online A/B testing for evaluation. -3) The authors compared multiple baselines in the paper. ———– Weaknesses ———– -1) The idea of utilizing neighborhood information from query-item click graph is not a novel idea. -2) It is not so clear why authors do not choose combine GraphNN and BERT. A straightforward idea would be combining these ideas versus introducing ad-hoc neighborhood modeling ideas from BERT. -3) It is interesting to see that the authors did not use click or purchase in any offline evaluation while using them as metrics in online testing. Why not training models from click and/or purchase as labels? ———– Summary and review comments ———– The paper tried to exploit BERT and neighborhood information in query-item click graph to improve the effectiveness of the semantic matching phase in e-commerce search. The paper utilized large-scale datasets and online A/B testing for evaluation. The authors compared multiple baselines in the paper.

The idea of utilizing neighborhood information from query-item click graph is not a novel idea. It is not so clear why authors do not choose combine GraphNN and BERT. A straightforward idea would be combining these ideas versus introducing ad-hoc neighborhood modeling ideas from BERT. It is interesting to see that the authors did not use click or purchase in any offline evaluation while using them as metrics in online testing. Why not training models from click and/or purchase as labels? ———– Overall score ———– SCORE: -2 (reject)

———————– REVIEW 3 ——————— SUBMISSION: 2054 TITLE: Neighbor BERT: Enhanced Query and Item Representations for Semantic Matching with Neighbor Similarity in E-Commerce AUTHORS: Zixuan Ke, Bofang Li, Haochuan Sun, Hongbo Deng, Rong Xiao, Ji Jianhui, Jiawei Wu, Kun Zhao and Baole Ai

———– Strengths ———–

Proposes a neural network architecture using BERT to find semantically similar queries.
Presents experiments, comparing the proposed approach against several existing approaches.
Presents offline and online results and analysis of the proposed approaches. ———– Weaknesses ———–
Semantically similar queries could mean queries with exactly the same meaning (Ex: “white tshirt”, “t-shirt white”) or where one is a bit more narrow than the other (Ex: “white tshirt full sleeves”, “t-shirt white”) or where one query could have multiple interpretations (Ex: “tshirt lilac” could mean lilac colored or with lilac prints). This illustrates the complexity of this problem. The paper does not discuss how the nuances of the finding semantically similar queries is handled.
Although queries are represented as a graph, only the immediate neighborhood of the queries are used in the solutions. The full graph is not used in the solution, and thus a graph representation of queries and query-items is really required for the proposed solutions.
The writing could be clearer and a thorough proof-reading could help fix grammatical errors. For instance, a clear problem definition of semantic matching and query re-writing is essential. ———– Summary and review comments ———–
AUC is a good metric to evaluate classification tasks. However, to evaluate semantic matching, i.e. whether a given item is relevant to a given query, a more fine-grained label and evaluation metric would be more helpful. For example, item relevance for a query is often rated on a scale of 1-5, and one could measure the improvement in fraction of highly-relevant items for a given query. Similarly, as mentioned in a separate comment, queries can be semantically similar in many ways. Being more precise about the kind of semantic similarity and the impact of an approach on it would help highlight the strengths and weaknesses of the proposed approach.
When a bipartite graph of co-clicks is converted to a unipartite graph of queries it is likely to have many cliques of queries. Cliques, especially large ones can often have an undesirable impact on walk-based graph algorithms. A discussion on that might be useful to the reader.
There is existing work on finding similar queries based on click-graphs. It might be worthwhile to compare the proposed approach against an existing approach that is not based on neural networks, but also better than inverted index. ———– Overall score ———– SCORE: -1 (weak reject)
目标
- 倒排这个需要找。召回 + 进海选排序
索引
- 是否需要调整下，目前是根据gmv分来的
- 海选分计算
方案
- 各个环节的评估。
- 倒排排序的分数是否需要更新。
把第二路融合到第一路里面
- 第二路的i2i倒排并没有考虑query信息
怎么评估
seek之后也会判断下是否

评估：可以先找个桶在新的召回覆盖query下关闭动态截断，同时调整rank_size，保证引擎机器水位不要涨太多。看下有没有空间

搜全域
- 精排要做好
S2目标
- 分行业来推动知识
3月，9月，亮点
个性化相关性。个性化类目
倒排检索 —> 图检索
两层事情：做的更准，做的更全
目标
- 需求供给 —》query增长用户这个习惯生态培养比较周期会比较长。
- 那短期内怎么平均搜全域的结果？
- 前期算法能够起到的作用，
- 和琉璃石有什么差别？思路都是一样的。当时也是供给不行，目前也就直播可以
- 提高活跃用户
- 考虑用户换query的频率(是否做的足够好)
- 需要做的更加精准，提高效率。
- 口语化的query：目标怎么表达, 教育用户：引导用户。比较合适的产品？不能是show case，需要一定量的
- query并不一定准确描述了用户意图，有没有办法(产品)来fix这种，在这种语境下
- 从问题出发，我们是要解决哪类问题，而不是先做出什么，然后找怎么用
和其它团队
- 和直播，内容本身团队的问题，是他们做还是咱们做比较好？
- 物料的丰富性供给
- 咱们只做意图识别和导流，其它内容团队来搞？
商品理解
- 两边是怎么合作：query tagging + 品类打标
- 标题堆砌，属性错乱，类目错放
- 图谱从数据的完备性来考虑，公达：业务的角度考虑
1：用户知道可以这么搜
- 底纹，下拉引导
- “锦囊”推荐
2：用户搜了可以有好的承接
- 意图识别
- 物料系统构建
TW
- 怎么合作，合作目标
和个性化排序的overlap是多少
少于3页的进行补充
session行为信息利用起来

喜阕：

term weighting方向 - 九谦
产品创新：前置导航 - 开锋
智能交互：锦囊，交互式推宽泛

九谦:

口语化的query

棋落：

Groot项目继续做 - 相关性1.5%（品类体系，同义词，上下位）
常识性、口语型的query理解 +2 @公达 @棋落

公达：

query tagging （95%，空间不大） v.s concept tagging
商品理解：商品打标 @公达 @华仔
常识性、口语型 @公达 @棋落
语音搜索
跨属性、跨品类的query

铁衣：

搜索更精准

仲宁：

认知在搜索上的应用要更明显
用户无法清晰表达，下拉承接，锦囊中能够更加明确
锦囊，交互的形式，有交互的感觉
长尾query，在下拉、产品形式上引导
用户意图收敛，提升效率