News

[FY2023] [FY2022] [FY2021] [FY2020] [FY2019] [FY2018] [FY2017] [FY2016] [FY2015] [FY2014] [FY2013] [FY2012] [FY2011] [FY2010]

FY2023

2023.08.16:
Our paper on polishing the analytic scheme for translation differences has been accepted at HumEval 2023. It uses our MultiEnJa dataset.
2023.07.29:
Our paper on automatic paraphrasing + machine translation + reranking has been accepted at WAT 2023.
2023.07.13:
Delighted to be recognized as an Outstanding Area Chair for ACL 2023! My sincere gratitude goes to the Senior Area Chairs of the Sentence-level Semantics track for their careful management of the review process and for nominating me for this award.

FY2022

2023.02.24:
MultiEnJa is now publicly available. It contains English-to-Japanese document-level translations produced by two different processes: professional translation and machine translation followed by post-editing. We hope this dataset encourages translation studies, translation education, and domain adaptation of machine translation and its evaluation.
2022.07.06:
"Metalanguages for Dissecting Translation Processes: Theoretical Development and Practical Applications" is out. I have authored chapters 8, 11, and 16.

FY2021

2021.12.06:
Our contributions in the KAKENHI project will be published as a book entitled "Metalanguages for Dissecting Translation Processes: Theoretical Development and Practical Applications" in May 2022. I have authored chapters 8, 11, and 16.
2021.08.18:
A dataset created for analyzing neural machine translation is now publicly available on GitHub. The corresponding paper is here.
2021.07.06:
Our paper for ACL-IJCNLP 2021 has been selected as one of the outstanding papers!
2021.06.30:
We have published a pre-print of our forthcoming ACL-IJCNLP paper "Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers" at arXiv. Annotations for this study are also released at GitHub.
2021.06.21:
We have published a pre-print that extends our past AAAI work at arXiv.
2021.06.16:
Two papers have been accepted at MT Summit 2021.
2021.05.06:
Our paper on meta-evaluation of machine translation papers has been accepted at ACL-IJCNLP 2021.

FY2020

2021.02.10:
Our paper entitled "Extremely Low-Resource Neural Machine Translation for Asian Languages" has been published.
2021.02.08:
We have published a pre-print of our forthcoming EACL paper "Understanding Pre-Editing for Black-Box Neural Machine Translation" at arXiv.
2021.02.02:
We have published a manuscript entitled "Synthesizing Monolingual Data for Neural Machine Translation" at arXiv.
2021.01.12:
Our paper entitled "Understanding Pre-Editing for Black-Box Neural Machine Translation" has been accepted at EACL 2021.
2020.11.15:
Our TACL article entitled "Synthesizing Parallel Data of User-Generated Texts with Zero-Shot Neural Machine Translation" has been published.
2020.10.01:
Our paper entitled "Combining Sequence Distillation and Transfer Learning for Efficient Low-Resource Neural Machine Translation Models" has been accepted at WMT 2020.
2020.07.28:
Our article on domain adaptation of NMT without any in-domain parallel data has been accepted for Transactions of the Association for Computational Linguistics (TACL).
2020.06.02:
Our TALLIP article entitled "Iterative Training of Unsupervised Neural and Statistical Machine Translation Systems" has been published. Check it out!
2020.05.13:
Our paper entitled "Balancing Cost and Benefit with Tied-Multi Transformers" has been accepted at WNGT 2020.
2020.04.04:
Our paper entitled "Tagged Back-translation Revisited: Why Does It Really Work?" has been accepted at ACL 2020.

FY2019

2020.03.22:
Our article on iterative training of unsupervised SMT and NMT has been accepted for ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).
2019.08.28:
We have published a manuscript entitled "Multi-Layer Softmaxing during Training Neural Machine Translation for Flexible Decoding with Fewer Layers" at arXiv.
2019.08.14:
Our paper entitled "Exploiting Multilingualism through Multistage Fine-Tuning for Low-Resource Neural Machine Translation" has been accepted at EMNLP-IJCNLP 2019.
2019.05.20:
Our paper entitled "Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation" has been accepted at Machine Translation Summit XVII. We also make our benchmark data set, Japanese-Russian-English News Commentary Parallel Data, public at GitHub.
2019.05.14:
Our paper entitled "Unsupervised Joint Training of Bilingual Word Embeddings" has been accepted at ACL 2019.

FY2018

2019.02.23:
Our paper entitled "Unsupervised Extraction of Partial Translations for Neural Machine Translation" has been accepted at NAACL-HLT 2019.
2019.02.06:
Our paper and poster presented at AAAI 2019 are now available. [Dabre & Fujita, 2019]
2018.11.08:
Our paper entitled "Recurrent Stacking of Layers for Compact Neural Machine Translation Models" has been accepted at AAAI 2019. It is an extended version of our manuscript at arXiv.
2018.10.31:
We have published a manuscript entitled "Unsupervised Neural Machine Translation Initialized by Unsupervised Statistical Machine Translation" at arXiv.

FY2017

2018.02.16:
Our TALLIP article entitled "Phrase Table Induction Using Monolingual Data for Low-Resource Statistical Machine Translation" has been published. Check it out!
2018.01.31:
Our TALLIP article entitled "Expanding Paraphrase Lexicons by Exploiting Generalities" has been published. Check it out!
2018.01.16:
Our paper on the combination of SMT and NMT has been accepted by AMTA. See you in Boston!
2017.11.27:
Our TACL article entitled "Phrase Table Induction Using In-Domain Monolingual Data for Domain Adaptation in Statistical Machine Translation" has been published. Check it out!
2017.11.21:
Our article on extending phrase-based SMT models for low-resource language pairs has been accepted for ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).
2017.10.27:
We are happy to announce the release of our dataset for translation quality estimation and automatic post-editing, NICT QE/APE Dataset. We will present it at the upcoming WAT 2017 in Taipei.
2017.10.21:
Our article on expanding paraphrase lexicons, which is an extended version of our NAACL paper, has been accepted for ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).
2017.09.26:
We will present two papers at IJCNLP (Taipei): one at WAT and the other in the main conference.
2017.06.21:
Our article on domain adaptation of machine translation systems has been accepted for Transactions of the Association for Computational Linguistics (TACL).
2017.06.16:
Our article on translation quality estimation has been accepted for IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP). This is my first publication in a journal with an Impact Factor. [Preprint]
2017.05.22:
Our two papers on learning materials for academic writing have been accepted for ICCS 2017.
2017.05.16:
We'll organize a workshop "Language Technologies and Human Translators" in conjunction with Machine Translation Summit XVI.

FY2016

2016.07.27:
We are happy to announce the release of our paraphrase database Lexpanded-PPDB version 1.0, which comprises 692 million English paraphrase pairs, as well as 942 million, 628 million, and 532 million pairs for French, Spanish, and German, respectively. We do not provide features for each pair due to the scale of the database; we instead recommend that users first extract a subset for their target texts and then compute features for each pair within it.

FY2015

2015.08.11:
A co-authored paper has been accepted at this year's MT Summit [Paper].
2015.06.02:
Paper and poster presented at NAACL-HLT 2015 are now available [Expansion of paraphrase lexicons].

FY2014

2015.02.21:
Our paper entitled "Expanding Paraphrase Lexicons by Exploiting Lexical Variants" has been accepted at NAACL-HLT 2015.
2014.09.01:
I have been concurrently appointed as a Senior Researcher at a new research center at NICT, called ASTREC.
2014.04.01:
I have started working at the National Institute of Information and Communications Technology as a Senior Researcher.

FY2013

2014.03.23:
From 2014.04.01, I'll be working at the Multilingual Translation Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, located in Kyoto, Japan. My research themes there will be paraphrasing, machine translation, and computer-assisted human translation.
2014.03.09:
This site is now hosted by a new Web server.
2013.06.23:
The first update after my return to Japan.

FY2012

2013.03.14:
I'll be giving a talk at the TAMALE seminar, University of Ottawa about FUN-NRC’s Paraphrase-augmented Phrase-based SMT for NTCIR-10 Patent MT.
2013.03.04:
Office relocation is postponed. I'll work at the same place until the end of my project period!
2013.02.14:
My office at NRC will be relocated to another campus in early March.
2013.01.24:
Mail services of our university will be unavailable during the following timeframes due to hardware replacement. Please avoid these timeframes (JST, +0900) when sending me an email.
  • 2013.02.04 18:00 - 2013.02.05 04:00 (10 hrs)
  • 2013.02.07 20:00 - 2013.02.08 06:00 (10 hrs)
  • 2013.02.08 20:00 - 24:00 (4 hrs)
  • 2013.02.13 20:00 - 24:00 (4 hrs)

FY2011

2011.09.06:
Call for participation! The technical program of the 6th NLP Symposium for Young Researchers is now available. Please feel free to come to the symposium and enjoy chatting with each other.
2011.07.11:
Call for submissions of free-and-easy poster presentations! We are delighted to announce that the 6th NLP Symposium for Young Researchers will be held under the sponsorship of the Association for Natural Language Processing in Japan. We welcome submissions that have potential for future innovation.
2011.05.12:
I'll be giving a talk at the TAMALE seminar, University of Ottawa about Typology of Paraphrases and Recent Advances in Automatic Paraphrasing.

FY2010

2011.03.11:
An incredibly huge earthquake (M8.8, later revised to M9.0) and the consequent tsunami hit the northeastern part of Japan today. Fortunately, my relatives and I are all unharmed. Many thanks go to those who worried about me. As frequent aftershocks still continue, secondary and tertiary damage is expected. I am praying for the safety of the people in the disaster area, including my ex-supervisor Prof. Kentaro Inui and his students at Tohoku Univ.
2011.03.01:
I will stay at the Institute for Information Technology, National Research Council of Canada, as a visiting researcher from April 2011 to March 2013. This stay is funded by the Postdoctoral Fellowship for Research Abroad program of the Japan Society for the Promotion of Science.
2010.08.19:
All the servers, including the host for paraphrasing.org, will be stopped from the evening of August 20th to the morning of August 22nd for electrical maintenance of the building. However, I will still be able to receive e-mails at my university account.
2010.04.01:
I have moved to a new department following the reorganization of departments in our university. See aim and detail if you are interested.