asgeirrr / pgantomizer
Anonymize data in your PostgreSQL database with ease
License: BSD 3-Clause "New" or "Revised" License
Hey! I would like help understanding the file format and where to put the table name. Can you help me? Thanks.
I have two tables that have a foreign key, and I want to truncate both of them. When I do it, though, I get the following error.
psycopg2.errors.FeatureNotSupported: cannot truncate a table referenced in a foreign key constraint
DETAIL: Table "tutor_bot_messages" references "tutor_bot_conversations".
HINT: Truncate table "tutor_bot_messages" at the same time, or use TRUNCATE ... CASCADE.
I asked ChatGPT for an answer:
It would be great if that were actually true
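The HINT in the error suggests two options: list both tables in a single TRUNCATE statement, or add CASCADE. A minimal sketch of building such a statement follows. The helper names `quote_ident` and `build_truncate` are hypothetical and are not part of pgantomizer; the sketch only produces the SQL string and assumes you execute it yourself through your own connection.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_truncate(tables, schema="public", cascade=False):
    """Build one TRUNCATE statement that covers every table in `tables`.

    Hypothetical helper for illustration only, not pgantomizer's API.
    Listing all tables in one statement satisfies the foreign key check
    when the tables only reference each other.
    """
    if not tables:
        raise ValueError("at least one table is required")
    targets = ", ".join(
        "{}.{}".format(quote_ident(schema), quote_ident(table))
        for table in tables
    )
    statement = "TRUNCATE TABLE {}".format(targets)
    if cascade:
        # CASCADE also truncates any other table that references these.
        statement += " CASCADE"
    return statement


if __name__ == "__main__":
    print(build_truncate(["tutor_bot_messages", "tutor_bot_conversations"]))
```

In real code, composing the statement with the `psycopg2.sql` module (available since psycopg2 2.7) is probably safer than quoting identifiers by hand.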
Thank you for your work on this project. I noticed the following when trying this out:
Even when I manually drop (with cascade) or rename the customer and customer_address tables before running the anonymizer, the rows look the same. Looking at anonymize.py, I see a hard-coded reference to the public schema instead of a parameter value. It should not be assumed that all databases always use the public schema, or indeed have one at all.
In anonymize.py I see code that anonymizes with hard-coded time, date and inet values. It might be better to generate random time periods for the time and date values and to randomise the inet values. If the tables are being randomised before being used for application testing, fixed values are unhelpful for testing correct processing.
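As a rough illustration of the suggestion, random replacement values can be generated with the standard library alone. This is a sketch under stated assumptions, not code from anonymize.py: the function names, the date range and the 10.0.0.0/8 network are all arbitrary choices made for the example.

```python
import ipaddress
import random
from datetime import datetime, timedelta


def random_timestamp(rng, start, end):
    """Return a datetime between start and end (inclusive), to the second."""
    span_seconds = int((end - start).total_seconds())
    return start + timedelta(seconds=rng.randint(0, span_seconds))


def random_inet(rng, network="10.0.0.0/8"):
    """Return a random IPv4 address inside `network`, as a string."""
    net = ipaddress.ip_network(network)
    # Adding an int offset to the network address stays inside the network.
    return str(net.network_address + rng.randrange(net.num_addresses))


if __name__ == "__main__":
    rng = random.Random()  # pass a seed here for reproducible test data
    print(random_timestamp(rng, datetime(2000, 1, 1), datetime(2020, 1, 1)))
    print(random_inet(rng))
```

Passing an explicit `random.Random` instance keeps the generator seedable, which may help when a test run has to be reproduced.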
The default YAML definition only excludes columns specified as raw. With hundreds of tables, most of which do not need to be anonymised, this means all the tables and columns need to be specified as raw. This is why I moved the tables to another schema for anonymising purposes, in the expectation of moving them back after processing.
Starting with Psycopg 2.7.4, psycopg2 has two pip packages: psycopg2, built from source, and psycopg2-binary. Right now, pgantomizer uses the binary wheel of psycopg2, but that will be removed in 2.8 [1].
I have version 2.7.6.1, and each time I import it I get:
/myenv/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
I don't mind opening a PR with an updated setup.py. What do you think?
[1] http://initd.org/psycopg/articles/2018/02/08/psycopg-274-released/
It'd be great to have built-in support for using faker to generate realistic content.
Hello,
I am testing your software, but I can't get it to run.
When I launch:
pgantomizer_dump --schema dolibarr.yml --dbname dolibarr --user labo
I get the following error:
Traceback (most recent call last):
  File "/usr/local/bin/pgantomizer_dump", line 9, in <module>
    load_entry_point('pgantomizer==0.1.1', 'console_scripts', 'pgantomizer_dump')()
  File "/usr/local/lib/python3.4/dist-packages/pgantomizer/dump.py", line 53, in main
    if args.dbname and args.user else [])
  File "/usr/local/lib/python3.4/dist-packages/pgantomizer/dump.py", line 23, in dump_db
    subprocess.run(cmd, shell=True)
AttributeError: 'module' object has no attribute 'run'
I'm running it on Debian 8 jessie with Python 3.4.2
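subprocess.run was added in Python 3.5, which would explain the AttributeError on Python 3.4.2. A possible workaround, sketched below, falls back to subprocess.call on older interpreters. The wrapper name `run_shell` is hypothetical and is not taken from dump.py.

```python
import subprocess


def run_shell(cmd):
    """Run a shell command string and return its exit code.

    Hypothetical compatibility wrapper: uses subprocess.run where it
    exists (Python 3.5+) and subprocess.call otherwise.
    """
    if hasattr(subprocess, "run"):
        return subprocess.run(cmd, shell=True).returncode
    # Python 3.4 and earlier: call() also returns the exit code.
    return subprocess.call(cmd, shell=True)


if __name__ == "__main__":
    print(run_shell("exit 0"))
```

Since subprocess.call behaves the same way on every Python 3 version, using it unconditionally would be an even simpler option.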