j_bar_exam's Introduction

Evaluating GPT in Japanese Bar Examination: Insights and Limitations

Abstract

Large language models such as ChatGPT have been reported to exceed the accuracy of human experts on a wide range of tasks. Recent research reports that ChatGPT passed the Japanese National Medical Examination, which confirms its strong performance in Japanese. We evaluated the accuracy of GPT-3, GPT-4, and ChatGPT on the multiple-choice section of the Japanese Bar Examination, covering Constitutional Law, Civil Law, and Criminal Law over the past five years. The results show that the current correct answer rate on these exams is only 30-40% (compared to the average pass rate of 70%), which is very low. Beyond the overall correct answer rate, this study breaks down the knowledge and reasoning that each question requires and examines the performance of large language models from each perspective. The findings show that 1) large language models possess knowledge of many statutes, 2) they achieve a high correct answer rate on questions that require an understanding of legal theories but no knowledge of specific statutes or case law, and 3) they achieve a low correct answer rate on questions that require knowledge of case law. The primary reason for their lower performance compared with the American Bar Examination is thought to be a lack of knowledge of Japanese law, especially case law.

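Abstract (translated from Japanese)

Large language models such as ChatGPT have been reported to surpass the accuracy of human experts across a wide variety of tasks. In particular, a recent study reporting that ChatGPT passed the Japanese National Medical Examination confirms their high performance in Japanese. In this study, we evaluated the accuracy of GPT-3, GPT-4, and ChatGPT on the Japanese Bar Examination (multiple-choice section), using the past five years of questions in each of Constitutional Law, Civil Law, and Criminal Law. The results show that, at present, the correct answer rate on the Japanese Bar Examination is very low, at 30-40%. Going beyond the raw correct answer rate, we broke down the knowledge and abilities needed to answer each question and examined the performance of large language models from each perspective. The results show that 1) large language models have knowledge of many statutes, 2) their correct answer rate is high on questions that require an understanding of legal theories but not knowledge of specific statutes or case law, and 3) their correct answer rate is low on questions that require knowledge of case law. Most of the performance gap relative to the American Bar Examination is thought to stem from limited knowledge of Japanese law, especially case law.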
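The abstract describes two views of the results: the correct answer rate per subject, and the rate broken down by the kind of knowledge a question requires. The following is a minimal sketch of how such a breakdown could be computed. The record fields (`subject`, `requires`, `gold`, `predicted`), the category labels, and the function `accuracy_by` are assumptions made for illustration; they are not taken from this repository, and the toy records are not results from the paper.

```python
from collections import defaultdict


def accuracy_by(records, key):
    """Group records by `key` and return {group: fraction answered correctly}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["predicted"] == record["gold"]:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Hypothetical toy records for illustration only; NOT data from the paper.
records = [
    {"subject": "Constitutional Law", "requires": "case_law", "gold": "2", "predicted": "1"},
    {"subject": "Civil Law", "requires": "statute", "gold": "3", "predicted": "3"},
    {"subject": "Criminal Law", "requires": "legal_theory", "gold": "1", "predicted": "1"},
    {"subject": "Civil Law", "requires": "case_law", "gold": "4", "predicted": "2"},
]

# Correct answer rate per subject, then per required-knowledge category.
print(accuracy_by(records, "subject"))
print(accuracy_by(records, "requires"))
```

Grouping by an arbitrary key keeps the per-subject and per-category views in one function, so adding another perspective only requires adding a field to each record.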

bib (en)

@techreport{choi_et_al_2023_j_bar_exam_en,
  author = {Choi, Jungmin and Kasai, Jungo and Sakaguchi, Keisuke},
  title = {{Evaluating GPT in Japanese Bar Examination: Insights and Limitations}},
  url = {https://github.com/keisks/j_bar_exam},
  year = {2023},
  month = {12},
}

bib (jp)

@techreport{choi_et_al_2023_j_bar_exam_jp,
  author = {チェ,ジョンミン and 笠井,淳吾 and 坂口,慶祐},
  title = {{日本の司法試験を題材としたGPTモデルの評価}},
  url = {https://github.com/keisks/j_bar_exam},
  year = {2023},
  month = {12},
}

Paper PDF

https://jxiv.jst.go.jp/index.php/jxiv/preprint/view/559
