Tools for finding and manipulating differences between files
This project is available under the terms of either the Apache 2.0 license or the MIT license.
License: Apache License 2.0
This crate seems to fail if either the patch text or the text being patched contains Windows line endings. For example, consider the following test cases:
let text_lf = "Allomancy\nFeruchemy\n";
let text_crlf = "Allomancy\r\nFeruchemy\r\n";
let patch_lf =
"\
--- original\n\
+++ modified\n\
@@ -1,2 +1,3 @@\n Allomancy\n Feruchemy\n+Hemalurgy\n";
let patch_crlf =
"\
--- original\r\n\
+++ modified\r\n\
@@ -1,2 +1,3 @@\r\n Allomancy\r\n Feruchemy\r\n+Hemalurgy\r\n";
let p_lf = Patch::from_str(&patch_lf).unwrap();
let p_crlf = Patch::from_str(&patch_crlf);
assert!(diffy::apply(text_lf, &p_lf).is_ok()); // OK
assert!(p_crlf.is_ok()); // ParsePatchError("invalid char in unquoted filename")
assert!(diffy::apply(text_crlf, &p_lf).is_ok()); // ApplyError(1)
The latter two asserts fail. Would you be open to supporting Windows line endings in this crate? I'd be happy to PR if one is welcome.
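Until the crate handles CRLF itself, a caller could normalize both inputs before parsing and applying. The following is a minimal sketch of that workaround, not part of diffy: `normalize_line_endings` is a hypothetical helper, and it discards the original line endings, so the result would have to be converted back if CRLF output is required.

```rust
use std::borrow::Cow;

/// Convert Windows (CRLF) line endings to Unix (LF) line endings.
/// Hypothetical helper: borrows the input unchanged when no CRLF is present.
fn normalize_line_endings(text: &str) -> Cow<'_, str> {
    if text.contains("\r\n") {
        // Allocate only when something actually has to change.
        Cow::Owned(text.replace("\r\n", "\n"))
    } else {
        Cow::Borrowed(text)
    }
}
```

Both `text_crlf` and `patch_crlf` from the test case would be passed through this helper before being handed to `Patch::from_str` and `diffy::apply`.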
The patch program tries to apply every hunk even if some of them fail, then outputs the applied data together with a list of the failed hunks.
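Consider the following merge scenario (the three inputs are the ancestor, ours, and theirs, in that order, matching the conflict markers further down):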
{
int a = 2;
}
{
}
{
int a = 2;
int b = 3;
}
Successful merge without any conflict:
{
}
Conflicting merge:
{
<<<<<<< ours
||||||| ancestor
int a = 2;
=======
int a = 2;
int b = 3;
>>>>>>> theirs
}
One can check that git merge-file -p /tmp/ours /tmp/ancestor /tmp/theirs indeed produces the output above.
Right now users are unable to set their own filenames, which appear when displaying a patch or when a merge has conflicts. There should be some interface that makes it easy for users to override the current defaults. A few options include:
- DiffOptions and MergeOptions
- File, a tuple (filename, contents) that is passed into the create_patch and merge methods instead of just the contents.
At the moment, in order to create a patch from two files, both files must be read and copied into memory at the same time, and then references to their contents must be passed to the create_patch method.
This approach is costly not only in memory (both files must exist in memory together) but also in time (each file is iterated twice: once to load it into memory and once to create the patch).
Would it therefore be possible to modify the functions to accept a reader like Read or ReadBuf instead of a slice?
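A reader-based interface cannot avoid buffering entirely, because a diff algorithm needs random access to both sequences. One way to reduce the memory cost is to stream each file once and keep only a fixed-size fingerprint per line. The sketch below illustrates that idea with the standard library only; `line_hashes` is a hypothetical helper and not diffy's API, and a real implementation would have to compare the actual lines whenever two hashes match, since hashes can collide.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::io::{self, BufRead};

/// Read `reader` line by line and keep only a 64-bit hash per line.
/// Hypothetical helper: the line contents themselves are not retained.
fn line_hashes<R: BufRead>(mut reader: R) -> io::Result<Vec<u64>> {
    let mut hashes = Vec::new();
    let mut line = String::new();
    loop {
        line.clear();
        // `read_line` keeps the trailing newline, so "a" and "a\n" hash differently.
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        let mut hasher = DefaultHasher::new();
        line.hash(&mut hasher);
        hashes.push(hasher.finish());
    }
    Ok(hashes)
}
```

Any `BufRead` works as input, for example `std::io::BufReader::new(File::open(path)?)` or a byte slice in tests.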
I've tried to integrate diffy into diesel_cli as suggested on reddit by @sgrif. While this works conceptually, it fails because diffy expects patches to include a file name header. As far as I understand the code, this is caused by those lines.
In detail, the following patch file from our test suite fails to parse:
@@ -1,12 +1,13 @@
diesel::table! {
users1 (id) {
- id -> Nullable<Integer>,
+ id -> Integer,
}
}
diesel::table! {
- users2 (id) {
- id -> Nullable<Integer>,
+ users2 (myid) {
+ #[sql_name = "id"]
+ myid -> Integer,
}
}
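As a caller-side workaround, a header could be prepended to patches that start directly with a hunk. This is only a sketch: `ensure_file_header` is a hypothetical helper, and the names `original` and `modified` are borrowed from the defaults that appear elsewhere in these reports.

```rust
use std::borrow::Cow;

/// Prepend a default file name header when the patch starts with a hunk.
/// Hypothetical helper: patches that already carry a header are returned as-is.
fn ensure_file_header(patch: &str) -> Cow<'_, str> {
    if patch.starts_with("@@") {
        Cow::Owned(format!("--- original\n+++ modified\n{}", patch))
    } else {
        Cow::Borrowed(patch)
    }
}
```

The result can then be passed to the parser in place of the raw patch text.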
Thanks for this great crate!
Although several crates on crates.io implement Myers' algorithm, diffy is the only one (if I understood correctly) that easily performs 3-way merges.
I need to perform 3-way merges on non-textual types (lists of integers), and it would be awesome if this crate made that possible.
I would like a --fuzzy option in patch.
#4 identified that the use of the matches! macro pushed the MSRV to 1.42. Once that PR has landed, determine what the new MSRV is (or should be) and add a check in CI.
Usually while comparing Strings, this warning comes up. This is fine when we compare files, still there should be a way to suppress this warning.
Here is a similar issue in Ruby's diffy package
Ref: samg/diffy#88
let patch = "\
@@ -10,6 +1000000,8 @@
First:
Life before death,
strength before weakness,
journey before destination.
Second:
- I will put the law before all else.
+ I swear to seek justice,
+ to let it guide me,
+ until I find a more perfect Ideal.
";
let original = "\
First:
Life before death,
strength before weakness,
journey before destination.
Second:
I will put the law before all else.
";
let patch = Patch::from_str(patch).unwrap();
let result = apply(original, &patch);
This code runs very slowly because of the incorrect value 1000000 in the patch's hunk header.
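One way to guard against such values is to parse the hunk header and clamp the claimed start line to the length of the text before searching for a match. The sketch below shows that idea with the standard library only. All three functions are hypothetical and do not reflect diffy's implementation, and the report does not establish which of the two ranges drives the slow search.

```rust
/// Parse one side of a hunk header such as "10,6" or "3" into (start, len).
fn parse_range(s: &str) -> Option<(usize, usize)> {
    let mut parts = s.splitn(2, ',');
    let start = parts.next()?.parse().ok()?;
    let len = match parts.next() {
        Some(l) => l.parse().ok()?,
        None => 1, // unified diff omits the length when it is 1
    };
    Some((start, len))
}

/// Parse "@@ -a,b +c,d @@" into ((a, b), (c, d)).
fn parse_hunk_header(line: &str) -> Option<((usize, usize), (usize, usize))> {
    let rest = line.strip_prefix("@@ -")?;
    let (ranges, _) = rest.split_once(" @@")?;
    let (old, new) = ranges.split_once(" +")?;
    Some((parse_range(old)?, parse_range(new)?))
}

/// Turn a 1-based start line into a 0-based search position that never
/// lies beyond the end of a text with `line_count` lines.
fn clamped_search_start(start: usize, line_count: usize) -> usize {
    start.saturating_sub(1).min(line_count)
}
```

For the header above and the six-line original, the claimed start of 1000000 would be clamped to 6, so a search from that position stays within the text.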
I'm creating a patch by comparing strings from two files, e.g., let patch = create_patch(&original, &modified); where original and modified are the contents of two separate files.
I'd like to use the file paths to those files in the patch output, e.g.:
--- a/<path-to-original>
+++ b/<path-to-modified>
But it looks like the filenames are being hard-coded as original and modified. Am I missing another way of specifying the paths to the files?
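If the library offers no way to set these names, the header lines can be rewritten in the rendered patch text. The sketch below does that with the standard library only. `with_filenames` is a hypothetical helper, not a diffy function, and it assumes the patch contains at most one file header before the first hunk.

```rust
/// Replace the "---"/"+++" header lines of a unified diff with the given paths.
/// Hypothetical helper: only lines before the first "@@" hunk are rewritten.
fn with_filenames(patch: &str, original: &str, modified: &str) -> String {
    let mut out = String::with_capacity(patch.len());
    let mut in_header = true;
    for line in patch.split_inclusive('\n') {
        if line.starts_with("@@") {
            // Past this point "--- " may be a removed line, not a header.
            in_header = false;
        }
        if in_header && line.starts_with("--- ") {
            out.push_str(&format!("--- {}\n", original));
        } else if in_header && line.starts_with("+++ ") {
            out.push_str(&format!("+++ {}\n", modified));
        } else {
            out.push_str(line);
        }
    }
    out
}
```

Calling `with_filenames(&patch.to_string(), "a/<path-to-original>", "b/<path-to-modified>")` would yield the header format requested above, assuming the patch type can be rendered to a string.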
Test Case:
fn main() {
let base = r#"
class GithubCall(db.Model):
`url`: URL of request Example.`https://api.github.com`"#;
let theirs = r#"
class GithubCall(db.Model):
`repo`: String field. Github repository fields. Example: `amitu/python`"#;
let ours = r#"
class Call(models.Model):
`body`: String field. The payload of the webhook call from the github.
`repo`: String field. Github repository fields. Example: `amitu/python`"#;
println!("Diffy merge test");
match diffy::merge(base, ours, theirs) {
Ok(s) => {
println!("{}", s);
}
Err(s) => {
println!("{}", s);
}
}
}
Findings:
- clean_conficts function is commented.
- ConflictStyle is changed from ConflictStyle::Diff3 to ConflictStyle::Merge
Within the documentation, most parameters are described as file. But it seems like the parameters are actually strings (at least in many cases).
I think it makes sense to update the documentation accordingly.
Hello there!
I saw in your reddit announcement post that "For right now the library only operates on utf8 strings but it shouldn't be too much work to also handle non-utf8 input (which would be needed if this was to actually be used in a VCS)", which I definitely agree with, so I thought I would open this issue to encourage the bytes API as a fully-supported interface.
Thanks for building this diffing tool.
Bug report from Georg Hopp:
Everything worked quite well until I created a change with two changes in adjacent hunks. The stored patch looked like this:
let dummy = "\ --- original +++ modified @@ -110,7 +110,7 @@ --- Wie Links lassen sich auch Bilder wie mein -![Fiona und Ian](/api/v0/images/5?size=small) +![Gravatar](https://www.gravatar.com/avatar/fd016c954ec4ed3a4315eeed6c8b97b8) in den Text ein. Im Fließtext sieht das allerdings ein bisschen dumm aus es sei denn man hat @@ -117,7 +117,7 @@ entsprechend angepasste styles. Besser scheint mir daher Bilder nur zwischen Paragraphen zu platzieren. -![Mia und Ian](/api/v0/images/6?size=small) +![Gravatar](https://www.gravatar.com/avatar/fd016c954ec4ed3a4315eeed6c8b97b8) Etwas so wie hier. ";
When I tried to parse this I got the following error:
"Hunks not in order or overlap". Counting the lines I do not see the
overlap. The hunks are exactly adjacent.
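The arithmetic behind that observation: a hunk starting at line 110 with length 7 covers lines 110 through 116, so the next hunk may start at 117 without overlapping. The sketch below spells out the comparison. `Range` and `overlaps` are hypothetical names for illustration, and the report does not establish how the crate's own check is written; a check using `<=` instead of `<` would reject exactly this adjacent case.

```rust
/// A hunk range: 1-based first line and number of lines covered.
#[derive(Clone, Copy)]
struct Range {
    start: usize,
    len: usize,
}

impl Range {
    /// One past the last line covered by this range.
    fn end(self) -> usize {
        self.start + self.len
    }
}

/// True when `next` begins before `prev` has ended.
/// Adjacent ranges (next.start == prev.end()) do not overlap.
fn overlaps(prev: Range, next: Range) -> bool {
    next.start < prev.end()
}
```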