cantino / my_obfuscate
Standalone Ruby code for the selective re-writing of SQL dumps in order to protect user privacy.
License: MIT License
I want to generate ids of type integer from 1..100000, but for some reason I'm getting 2 or 3 duplicate values, which makes the dump of that one table fail. Is there any way to generate unique numbers throughout?
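One way to avoid collisions, sketched here as plain Ruby rather than as a built-in my_obfuscate option, is to draw ids from a pre-shuffled pool so that each value can be handed out only once. The helper name `unique_id_source` is hypothetical, and the wiring into a `:fixed` column through `:string` is an assumption based on the `proc` usage shown elsewhere on this page.

```ruby
# Hypothetical helper (not part of my_obfuscate): returns a lambda that
# hands out every integer in `range` at most once, in random order.
def unique_id_source(range)
  pool = range.to_a.shuffle
  lambda do |_row = nil|
    raise "id pool exhausted" if pool.empty?
    pool.pop # call .to_s on this if the column definition expects a string
  end
end

next_id = unique_id_source(1..100_000)

# Assumed wiring, untested against the gem:
#   :some_table => { :some_id => { :type => :fixed, :string => next_id } }
ids = Array.new(5) { next_id.call }
puts ids.inspect
```

The pool must be at least as large as the number of rows, otherwise the lambda raises once it runs dry.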
Hello,
I looked through the code but could not find an answer. Is it possible to access already-obfuscated data inside a proc?
For example:
:users => {
:username => :email,
:username_canonical => { :type => :fixed, :string => proc { |row| row[:username] }},
:email => { :type => :fixed, :string => proc { |row| row[:username] }},
:email_canonical => { :type => :fixed, :string => proc { |row| row[:username] }},
:first_name => :first_name,
},
I would like to set the same obfuscated value for the first through fourth fields. In the example above, the second through fourth fields are not obfuscated.
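A possible workaround, assuming the `row` hash passed to the proc holds the original (pre-obfuscation) values, as the behavior described above suggests: derive the replacement from the original username through a small cache, and use the same proc for all four columns. The names `fake_by_original` and `shared_fake_email` are illustrative, not my_obfuscate API.

```ruby
require 'securerandom'

# Cache: original username -> one random replacement, created on first use.
fake_by_original = Hash.new do |cache, original|
  cache[original] = "user-#{SecureRandom.hex(5)}@example.com"
end

# Every column that uses this proc gets the same fake value within a row,
# because the lookup key is the original username from that row.
shared_fake_email = proc { |row| fake_by_original[row[:username]] }

config = {
  :users => {
    :username           => { :type => :fixed, :string => shared_fake_email },
    :username_canonical => { :type => :fixed, :string => shared_fake_email },
    :email              => { :type => :fixed, :string => shared_fake_email },
    :email_canonical    => { :type => :fixed, :string => shared_fake_email },
    :first_name         => :first_name,
  },
}
```

The cache also keeps the mapping stable across tables, at the cost of holding one entry per distinct username in memory.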
The script doesn't seem to work with Postgres table names quoted with "":
I get:
Deprecated: "user" was not specified in the config. A future release will cause this to be an error. Please specify the table definition or set it to :keep.
with the following specified in obfuscate.rb
:"user" => {...}
Perhaps due to CVE-2018-1058, pg_dump now prefixes table names with the schema, which by default is public. The README documentation is now out of date with regard to Postgres 10.3+. I was able to update an obfuscator config like this to make it work:
obfuscator = MyObfuscate.new({
:'public.users' => :keep
})
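If an existing config covers many tables, the keys could be rewritten in one pass instead of by hand. The helper below is a hypothetical convenience (`qualify_tables` is not part of my_obfuscate) and assumes every table lives in a single schema, `public` by default.

```ruby
# Hypothetical helper: prefix unqualified table names with a schema so they
# match the schema-qualified names that newer pg_dump versions emit.
def qualify_tables(config, schema = "public")
  config.each_with_object({}) do |(table, definition), qualified|
    name = table.to_s
    name = "#{schema}.#{name}" unless name.include?(".")
    qualified[name.to_sym] = definition
  end
end

config = qualify_tables({
  :users           => :keep,
  :"public.orders" => :keep, # already qualified, left as-is
})
p config.keys
```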
my_obfuscate-0.5.3/lib/my_obfuscate/copy_statement_parser.rb:20:in `block in parse': Cannot obfuscate Postgres dumps containing INSERT statements. Please use COPY statments. (RuntimeError)
copy_statement_parser.rb throws an error when it finds INSERTs inside PL/pgSQL stored procedure definitions, even though the rest of the SQL uses COPY statements:
CREATE FUNCTION actualizar_existencia() RETURNS void
LANGUAGE plpgsql
AS $$
declare
...
if not found then
-- if it does not exist, create it
insert into lote_deposito_temp(cantidad, lote_id, deposito_id)
...
It would be good if my_obfuscate just ignored the INSERTs inside stored procedures.
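Until the parser handles this, one conceivable workaround is to pre-scan the dump and mark which lines sit inside a dollar-quoted function body, so that `insert` statements there can be passed through untouched. This is a standalone sketch, not the logic in `copy_statement_parser.rb`. `mark_dollar_quoted` is a made-up name, and the delimiter matching is deliberately simple: it only tracks `$$` and `$tag$` markers line by line.

```ruby
# Returns [[line, inside_body], ...]. inside_body is true for every line that
# belongs to a dollar-quoted block, including its opening and closing lines.
def mark_dollar_quoted(lines)
  dollar_tag = /\$(?:[A-Za-z_]\w*)?\$/ # matches $$ or a named tag like $body$
  open_tag = nil
  lines.map do |line|
    inside = !open_tag.nil?
    line.scan(dollar_tag) do |tag|
      if open_tag.nil?
        open_tag = tag   # block starts on this line
        inside = true
      elsif tag == open_tag
        open_tag = nil   # block ends on this line
      end
    end
    [line, inside]
  end
end

dump = [
  "CREATE FUNCTION actualizar_existencia() RETURNS void",
  "AS $$",
  "insert into lote_deposito_temp(cantidad, lote_id, deposito_id)",
  "$$;",
  "COPY users (id, email) FROM stdin;",
]
mark_dollar_quoted(dump).each do |line, inside_body|
  puts "#{inside_body ? 'skip ' : 'parse'} | #{line}"
end
```

Lines flagged as inside a body would be written out unchanged, and only the remaining lines handed to the obfuscator.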
Is there a way to selectively truncate certain rows?
e.g.
Obfuscating this:
INSERT INTO table (id, value) VALUES (1, "a")
INSERT INTO table (id, value) VALUES (1, "a")
INSERT INTO table (id, value) VALUES (1, "b")
INSERT INTO table (id, value) VALUES (1, "b")
Results in this:
INSERT INTO table (id, value) VALUES (1, "a")
INSERT INTO table (id, value) VALUES (1, "a")
Sorry for what may be an easy question...
I have made a couple of improvements and want to contribute them back. However, before doing that I naturally want to include some RSpec tests.
For some reason I can't get the rspec suite to run successfully (even before my changes).
For 29 of the 90 tests it is throwing the error: uninitialized constant MyObfuscate::ConfigApplicator::Faker.
For example:
1) MyObfuscate::ConfigApplicator.apply_table_config should work on email addresses
Failure/Error: new_row = MyObfuscate::ConfigApplicator.apply_table_config(["blah", "something_else"], {:a => {:type => :email}}, [:a, :b])
NameError:
uninitialized constant MyObfuscate::ConfigApplicator::Faker
# ./lib/my_obfuscate/config_applicator.rb:33:in `block in apply_table_config'
# ./lib/my_obfuscate/config_applicator.rb:8:in `each'
# ./lib/my_obfuscate/config_applicator.rb:8:in `apply_table_config'
# ./spec/my_obfuscate/config_applicator_spec.rb:8:in `block (4 levels) in <top (required)>'
# ./spec/my_obfuscate/config_applicator_spec.rb:7:in `times'
# ./spec/my_obfuscate/config_applicator_spec.rb:7:in `block (3 levels) in <top (required)>'
Any clues?
Some companies will only use gems with a certain license.
The canonical and easy way to check is via the gemspec, e.g.
spec.license = 'MIT'
# or
spec.licenses = ['MIT', 'GPL-2']
There is even a License Finder to help companies ensure all gems they use
meet their licensing needs. This tool depends on license information being available in the gemspec.
Including a license in your gemspec is a good practice, in any case.
If you need help choosing a license, GitHub has created a license picker tool.
How did I find you?
I'm using a script to collect stats on gems, originally looking for download data, but decided to collect licenses too,
and make issues for missing ones as a public service :)
https://gist.github.com/bf4/5952053#file-license_issue-rb-L13 So far it's going pretty well.
I've written a blog post about it
Hello,
I'm hitting this error with the utf8mb4_general_ci collation:
/var/lib/gems/2.3.0/gems/my_obfuscate-0.3.7/lib/my_obfuscate/mysql.rb:6:in `match': invalid byte sequence in UTF-8 (ArgumentError)
from /var/lib/gems/2.3.0/gems/my_obfuscate-0.3.7/lib/my_obfuscate/mysql.rb:6:in `parse_insert_statement'
from /var/lib/gems/2.3.0/gems/my_obfuscate-0.3.7/lib/my_obfuscate.rb:41:in `block in obfuscate'
Any idea how to fix it?
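The error suggests that the dump contains bytes that are not valid UTF-8. A possible workaround, not a confirmed fix, is to sanitize each line before it reaches the regular expression. `String#scrub` and `String#valid_encoding?` are core Ruby methods (Ruby 2.1 and later); `sanitize_line` is a hypothetical helper, and where to call it in the pipeline is left open.

```ruby
# Hypothetical helper: make a line safe for UTF-8 regexp matching by
# replacing invalid byte sequences with "?".
def sanitize_line(line)
  utf8 = line.dup.force_encoding(Encoding::UTF_8)
  utf8.valid_encoding? ? utf8 : utf8.scrub("?")
end

# A line with one valid multi-byte character and one stray invalid byte.
raw = "INSERT INTO `t` VALUES ('caf\xC3\xA9','\xFF');".b
clean = sanitize_line(raw)
puts clean.valid_encoding?
puts clean =~ /INSERT INTO/ ? "matched" : "no match"
```

Scrubbing discards the invalid bytes, so it is only appropriate when those bytes do not need to survive into the obfuscated dump.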
Hey guys,
I'm trying to generate a scaffold, but it throws an error:
mysqldump -c --hex-blob -uroot -p db | ruby scaffolder.rb > obfuscator_scaffold.rb_snippet
scaffolder.rb:6:in `<main>': undefined method `scaffold' for #<MyObfuscate:0x007fc91597ef38 @config={}> (NoMethodError)
When I try to see the methods in MyObfuscate, I'm not able to see scaffold - any idea?
irb(main):011:0> ob = MyObfuscate.new
=> #<MyObfuscate:0x007f88c222b490 @config={}>
irb(main):012:0> ob.methods.sort
=> [:!, :!=, :!~, :<=>, :==, :===, :=~, :__id__, :__send__, :check_for_defined_columns_not_in_table, :check_for_table_columns_not_in_definition, :class, :clone, :config, :config=, :database_helper, :database_type, :database_type=, :define_singleton_method, :display, :dup, :enum_for, :eql?, :equal?, :extend, :fail_on_unspecified_columns, :fail_on_unspecified_columns=, :fail_on_unspecified_columns?, :freeze, :frozen?, :globally_kept_columns, :globally_kept_columns=, :hash, :inspect, :instance_eval, :instance_exec, :instance_of?, :instance_variable_defined?, :instance_variable_get, :instance_variable_set, :instance_variables, :is_a?, :kind_of?, :method, :methods, :missing_column_list, :nil?, :obfuscate, :obfuscate_bulk_insert_line, :object_id, :private_methods, :protected_methods, :public_method, :public_methods, :public_send, :reassembling_each_insert, :remove_instance_variable, :respond_to?, :send, :singleton_class, :singleton_method, :singleton_methods, :taint, :tainted?, :tap, :to_enum, :to_s, :trust, :untaint, :untrust, :untrusted?]
Amazon Linux 2015.03 gem for my_obfuscate 0.5.3 has inconsistent my_obfuscate.rb and my_obfuscate/config_applicator.rb. The former requires ffaker, the latter references Faker, resulting in uninitialized constant MyObfuscate::ConfigApplicator::Faker (NameError).
Is there a way to truncate the data to the last (or first) x records?
Production databases often contain more data than is needed in development. Say I have an invoices table with a few thousand records. I obviously need them in production, but to keep my development database small and fast, I probably only want to keep 10 or so in development.
Thanks.
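No built-in row limit is described on this page, but the idea can be approximated outside the obfuscator by filtering the dump first. The sketch below assumes a MySQL dump written with one row per `INSERT` statement (mysqldump's `--skip-extended-insert` option) and keeps only the first N rows of each listed table. `limit_rows` and its arguments are hypothetical names.

```ruby
# Hypothetical filter: keep at most limits[table] single-row INSERT lines per
# table and pass every other line through unchanged.
def limit_rows(lines, limits)
  seen = Hash.new(0)
  lines.select do |line|
    table = line[/\AINSERT INTO `?([\w.]+)`? /i, 1]
    next true unless table && limits.key?(table)
    (seen[table] += 1) <= limits[table]
  end
end

dump = [
  "INSERT INTO `invoices` VALUES (1,'a');",
  "INSERT INTO `invoices` VALUES (2,'b');",
  "INSERT INTO `invoices` VALUES (3,'c');",
  "INSERT INTO `users` VALUES (1,'x');",
]
puts limit_rows(dump, { "invoices" => 2 })
```

Dropping rows can leave dangling foreign-key references in other tables, so this suits tables that nothing else points at.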
"MyObfuscate.apply_table_config should work on email addresses" is failing occasionally. Here is the output:
.....................................F................................................
Failures:
1) MyObfuscate MyObfuscate.apply_table_config should work on email addresses
Failure/Error: new_row.first.should =~ /^[\w\.]+\@\w+\.\w+\.[a-f0-9]{5}\.example\.com$/
expected: /^[\w\.]+\@\w+\.\w+\.[a-f0-9]{5}\.example\.com$/
got: "[email protected]" (using =~)
Diff:
@@ -1,2 +1,2 @@
-/^[\w\.]+\@\w+\.\w+\.[a-f0-9]{5}\.example\.com$/
+"[email protected]"
# ./spec/my_obfuscate_spec.rb:30:in `block (3 levels) in <top (required)>'
Finished in 0.0774 seconds
86 examples, 1 failure
Failed examples:
rspec ./spec/my_obfuscate_spec.rb:27 # MyObfuscate MyObfuscate.apply_table_config should work on email addresses