
Anonymize or mask personal information before sending prompts to a chat AI (such as OpenAI's ChatGPT).

License: MIT License




Prompt Anonymizer

Prompt Anonymizer is a set of demo scripts that anonymize text before sending it to the OpenAI API. It uses Presidio Analyzer and Presidio Anonymizer to identify and replace PII entities in the text.

  • Prompt Anonymizer anonymizes the original text in a format that lets the AI recognize repeated mentions of the same entity as the same thing.
  • Prompt Anonymizer is available in Japanese and English.
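
The idea of keeping "the same thing the same" can be illustrated with a minimal, standard-library-only sketch. It is not the project's implementation: the regex detectors are hypothetical stand-ins for Presidio's recognizers, and the numbered placeholder format (`<EMAIL_ADDRESS_1>`) is an assumption chosen for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical, simplified detectors standing in for Presidio's recognizers.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE_NUMBER": re.compile(r"\(\d{3}\)\s?\d{3}-\d{4}"),
}


def find_entities(text: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, entity_type) spans sorted by start offset."""
    spans = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), entity_type))
    return sorted(spans)


def anonymize(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each detected value with a numbered placeholder.

    The same original value always maps to the same placeholder, so the
    reader of the anonymized text can still tell which mentions match.
    """
    mapping: Dict[str, str] = {}   # original value -> placeholder
    counters: Dict[str, int] = {}  # entity type -> placeholders issued so far
    parts = []
    cursor = 0
    for start, end, entity_type in find_entities(text):
        if start < cursor:
            continue  # skip spans that overlap one already replaced
        value = text[start:end]
        if value not in mapping:
            counters[entity_type] = counters.get(entity_type, 0) + 1
            mapping[value] = f"<{entity_type}_{counters[entity_type]}>"
        parts.append(text[cursor:start])
        parts.append(mapping[value])
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts), mapping
```

The returned mapping could also be inverted to restore the original values in the AI's reply, although whether the project does this is not stated here.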

Setup

  1. git clone https://github.com/akazah/prompt-anonymizer.git
  2. cd prompt-anonymizer
  3. poetry install
  4. Set your OpenAI API key: export OPENAI_API_KEY=YOUR_API_KEY. Alternatively, use an environment variable management tool such as direnv or dotenv. (I'm using direnv.)
  5. Download the spaCy models: poetry run python -m spacy download en_core_web_lg and poetry run python -m spacy download ja_core_news_lg
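
A script can fail early with a clear message when the key from step 4 is missing. The helper below is a minimal sketch; `get_openai_api_key` is a hypothetical name, not a function from this repository.

```python
import os
from typing import Mapping, Optional


def get_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key stored in OPENAI_API_KEY, or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it or configure direnv/dotenv."
        )
    return key
```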

Usage

English

poetry run python main.py --text "John will have a birthday next month. What kind of gift would be appropriate? John loves nice cuisine. John lives in New York. His email is [email protected]. His mobile is (333)333-3333." --language en

Japanese

poetry run python main.py --text "山田太郎は、来月、誕生日を迎えます。どんなプレゼントが適しているでしょうか。山田太郎は、おいしいものが大好きです。山田太郎は、東京都**区に在住しています。彼のメールアドレスは [email protected] です。彼の電話番号は 090-0000-0000 です。" --language ja
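
The two flags used above could be parsed along these lines. This is a sketch of one plausible command-line interface, not the actual contents of main.py; the default language and the help texts are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the --text and --language flags shown above."""
    parser = argparse.ArgumentParser(
        description="Anonymize a prompt before sending it to a chat AI."
    )
    parser.add_argument("--text", required=True, help="Text to anonymize.")
    parser.add_argument(
        "--language",
        choices=["en", "ja"],
        default="en",  # assumption: the real default is not documented
        help="Language of the input text.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"language={args.language} text={args.text!r}")
```

Restricting `--language` to `en` and `ja` makes argparse reject unsupported languages before any model is loaded.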

Demo

[Screenshots: demo_en, demo_ja]

TODO

  • Improve the anonymization process. The current version often fails to anonymize.
  • Find efficient ways for humans to spot anonymization failures.
  • Separate the demo portion from the library portion so that they can be used independently.
  • Refactor the code. The first version has a lot of code that is not DRY.
  • Follow Python best practices.
  • Add validation and exception handling.
  • Add tests.
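
One possible starting point for the failure-detection and testing items above is a simple leak check: given values known to be sensitive, report any that still appear in the anonymized text. `find_leaks` is a hypothetical helper, not part of the project, and it only catches exact (case-insensitive) matches.

```python
from typing import Iterable, List


def find_leaks(anonymized_text: str, known_values: Iterable[str]) -> List[str]:
    """Return the known sensitive values that still appear in the text."""
    lowered = anonymized_text.casefold()
    # Empty strings are skipped because they would match any text.
    return [v for v in known_values if v and v.casefold() in lowered]
```

For example, `find_leaks("<PERSON_1> lives in New York.", ["John", "New York"])` flags `"New York"` as a value that was not anonymized.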

Contributing

Pull requests are welcome. This is my first Python project, so I'm not sure whether I'm following best practices. If you have any suggestions, please let me know.

License

Licensed under the MIT License.
