Homage to all buddhas and bodhisattvas! May all sentient beings attain peace and happiness.
Today is Vesak Day in 2025. The Dharma Torch Project officially launches today.
Dharma Torch is a long-term project run by me. My goal is to translate all Chinese Buddhist scriptures into English with the help of LLMs (large language models).
Specifically, I aim to translate all the Sūtra portions of the Taishō Tripiṭaka. If a certain sutra has already been translated by others, I’ll temporarily skip it. For texts with multiple versions or parallel translations, I will choose one version to translate. Also, I plan to temporarily exclude the Prajñāpāramitā section and Yogācāra texts from translation. At present, my priority is to translate as many Chinese-written sutras as quickly as possible, rather than to study or interpret deeply esoteric doctrines.
After filtering based on the above criteria, the total workload amounts to about 50 MB of text, or roughly 16 million Chinese characters. Given my current level of time and energy, I estimate it will take around 20 years to complete. Although that’s a long time, it is still within the span of a lifetime. As of now, the project is 1.7% complete. Among the remaining texts, some have already been translated, so the final completion time might be compressed to under 15 years. Of course, others may be unwilling to share their translations in an open-source format—if that’s the case, I’ll retranslate those myself.
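As a rough sanity check on these figures, the arithmetic can be sketched in a few lines of Python. This is only a back-of-envelope calculation: the constants are the approximate numbers quoted above, the function name workload_summary is hypothetical, and treating 1 MB as 10**6 bytes and a year as 365 days are my own assumptions.

```python
# Back-of-envelope check of the workload figures quoted in the text.
TOTAL_CHARS = 16_000_000   # "roughly 16 million Chinese characters"
TOTAL_BYTES = 50_000_000   # "about 50 MB" (assumption: 1 MB = 10**6 bytes)
YEARS_PLANNED = 20         # estimated duration of the project
PERCENT_DONE = 1.7         # progress reported at launch


def workload_summary(total_chars, total_bytes, years, percent_done):
    """Derive per-character size, progress, and required pace (hypothetical helper)."""
    chars_done = round(total_chars * percent_done / 100)
    return {
        "bytes_per_char": total_bytes / total_chars,
        "chars_done": chars_done,
        "chars_left": total_chars - chars_done,
        "chars_per_year": total_chars / years,
        "chars_per_day": total_chars / years / 365,  # assumption: 365-day year
    }


summary = workload_summary(TOTAL_CHARS, TOTAL_BYTES, YEARS_PLANNED, PERCENT_DONE)
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions the text averages a little over 3 bytes per character, which is what one would expect if the files are stored as UTF-8, where each Chinese character takes 3 bytes. The 20-year plan works out to about 800,000 characters per year, or roughly 2,200 characters per day, and 1.7% complete corresponds to about 272,000 characters already translated.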
I want to do this because I’ve noticed that the English-speaking Buddhist world places a stronger emphasis on the Pali and Tibetan versions of the scriptures, while the Chinese canon is relatively overlooked. I believe it’s unfortunate that researchers and practitioners alike often lack access to the 16 million characters of the Chinese Buddhist canon. The advent of LLMs has made full-scale translation feasible.
Current models do not handle Classical Chinese well when translating directly into English. Classical Chinese can be highly ambiguous, and the language used in the Buddhist scriptures is not even standard Classical Chinese, just as Buddhist Sanskrit is not standard Sanskrit but rather Buddhist Hybrid Sanskrit (BHS). This increases the difficulty of both translation and proofreading. Between the Chinese source text and the final English output, there are many intermediate steps that I must handle manually. I suspect that even future AI models may not improve much in handling Classical Chinese, as it is a very niche need.
Since I need to compare every AI-generated sentence against the original, this limits the speed of translation. I use more than one model, and even outputs from the same-named model change over time, so the quality and phrasing of translations in the Dharma Torch project may not be consistent. I can only guarantee that the meaning in English and Classical Chinese corresponds, but as for literary elegance, poetic form, rhyme, subtle differences in philosophical terminology, and consistency in word choice, I can’t pay too much attention to all that for now.
Of course, this way of working and this mode of translation do not conform to the aesthetic expectations many people have regarding faith. Traditionally, the translation of religious scriptures was carried out by virtuous practitioners and scholars who, in a pure state of mind, poured out their hearts and exhausted their strength for years before a scripture could come forth. Modern tools have lowered the threshold for undertaking translation, and while they accelerate the process, they also accelerate the generation of errors within the translation. These errors can be reduced as much as possible, but never entirely avoided. Although devotees believe, as a matter of faith, that the translations of the ancient masters were flawless, from the perspective of modern philology this is not really the case.
The English produced by AI is readable but hardly elegant, especially in the verses, which neither read as poetry nor rhyme. I do not intend to polish it; at times I even deliberately preserve that sense of awkwardness, because some of the original Chinese Buddhist scriptures themselves are indeed awkward and difficult. To employ ornate and flowery diction was never the Buddha’s teaching in the first place. Both in the Pāli and in the Chinese scriptures it is recorded:
At this time there were two monks called Yameḷa and Kekuṭa, brothers born into a brahmin family, who were well-spoken and had good voices. They went to the Buddha, bowed, sat down, and said, “Sir, the monks now have a variety of names and come from a variety of families, castes, and households. They corrupt the word of the Buddha by using their own expressions. Might we give metrical form to the word of the Buddha?”
The Buddha rebuked them, “Foolish men, how can you suggest such a thing? This will affect people’s confidence …” After rebuking them … the Buddha gave a teaching and addressed the monks:
“You shouldn’t give metrical form to the word of the Buddha. If you do, you commit an offense of wrong conduct. You should learn the word of the Buddha using its own expressions.”
The Chinese version explains ‘using its own expressions’ in more detail:
“In my Dharma, it is not flowery language that is held as noble; so long as the meaning and principle are not lost, that is my intent. Whichever words living beings accept, one should conform accordingly to speak to them. This is called ‘adapting to the country in which one dwells.’” (T1463)
I try to ensure that both transliteration and translation correspond to the original. For example, if the original says “阿耨多羅三藐三菩提,” the English translation will use the Sanskrit term anuttarā-samyak-saṃbodhi, while “unsurpassed perfect enlightenment” is used for the free translation “無上正等正覺”. That said, I also personally prefer translating “如来” as tathāgata rather than “Thus-Come One.”
I do my best to base all translations on the Chinese texts, but sometimes the original is extremely difficult to understand or is obviously missing parts. In those cases, I may consult English translations of the corresponding Pali or Tibetan texts for reference.
I only work on this in my spare time, and I don’t have the high level of spiritual cultivation that the great translators of the past had. Whether English-speaking Buddhist practitioners wish to use these translations for chanting or as guides to practice is entirely up to them. I, of course, take sole responsibility for any flaws in the English translations.
All translations from the Dharma Torch project will be made freely available under the CC BY-NC-SA 4.0 license. This means that anyone may distribute, remix, adapt, and build upon the material in any medium or format, for noncommercial purposes only, as long as credit is given to the creator. If others modify or adapt the material, they must license their modified material under the same terms.
