Comments (13)
Fixing this today!
from identity-matching.
@carlosms Can you please send me cache-raw.csv and cache-external.csv?
Actually, I have identified the problem, so the files are not required.
Please confirm and I will close the issue.
Same problem, same error trace.
The contents of cache-raw.csv:
repo,name,email
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/test-repo,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/test-repo,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/test-repo,Carlos Martín,[email protected]
github.com/carlosms-test-org/test-repo,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
github.com/carlosms-test-org/lookout-test,Carlos Martín,[email protected]
cache-external.csv:
email,user,name,match
Are you sure you've updated? The error trace cannot be the same, I have changed the panic text heavily.
Yes, using this commit:
commit b7c6e8d34e7f791cea33d4683c5120b529f66f08
Merge: bbaf008 75c1674
Author: Vadim Markovtsev <[email protected]>
Date: Thu Oct 3 13:40:02 2019 +0200
Merge pull request #70 from vmarkovtsev/master
Handle empty results
But looking at the changes, it looks like these conditions can never be true:
f != f
mean != mean
Oh wait, the Docker image was built in multiple stages, and the binary itself was not built from the last commit. After forcing a rebuild, this is the new output:
identity-matching_1 | time="2019-10-03T12:21:45Z" level=info msg="Using caching for external matching" cachePath=cache-external.csv
identity-matching_1 | time="2019-10-03T12:21:45Z" level=info msg="Dumping CachedMatcher cache."
identity-matching_1 | time="2019-10-03T12:21:45Z" level=info msg="looking for people in commits"
identity-matching_1 | time="2019-10-03T12:21:45Z" level=info msg="not cached in cache-raw.csv, loading from the database"
identity-matching_1 | time="2019-10-03T12:21:45Z" level=info msg="caching the result to cache-raw.csv"
identity-matching_1 | time="2019-10-03T12:21:45Z" level=info msg="found people" elapsed=16.707213ms people=1
identity-matching_1 | time="2019-10-03T12:21:45Z" level=info msg="reducing people"
identity-matching_1 | time="2019-10-03T12:21:46Z" level=warning msg="unable to find users for email: [email protected]"
identity-matching_1 | time="2019-10-03T12:21:46Z" level=warning msg="no matches for person :carlos martin||[email protected]"
identity-matching_1 | time="2019-10-03T12:21:46Z" level=panic msg="Commit(\"connected component size std\", NaN)"
identity-matching_1 | panic: (*logrus.Entry) (0x85f5a0,0xc00006e540)
identity-matching_1 |
identity-matching_1 | goroutine 1 [running]:
identity-matching_1 | github.com/sirupsen/logrus.Entry.log(0xc0000aa120, 0xc00074a150, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
identity-matching_1 | /go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:227 +0x2ce
identity-matching_1 | github.com/sirupsen/logrus.(*Entry).Log(0xc00006e3c0, 0x0, 0xc000144dc0, 0x1, 0x1)
identity-matching_1 | /go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:256 +0xe4
identity-matching_1 | github.com/sirupsen/logrus.(*Entry).Logf(0xc00006e3c0, 0xc000000000, 0x86e9ea, 0x10, 0xc000144e80, 0x2, 0x2)
identity-matching_1 | /go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:301 +0xc5
identity-matching_1 | github.com/sirupsen/logrus.(*Logger).Logf(0xc0000aa120, 0xc000000000, 0x86e9ea, 0x10, 0xc000144e80, 0x2, 0x2)
identity-matching_1 | /go/pkg/mod/github.com/sirupsen/[email protected]/logger.go:137 +0x96
identity-matching_1 | github.com/sirupsen/logrus.(*Logger).Panicf(...)
identity-matching_1 | /go/pkg/mod/github.com/sirupsen/[email protected]/logger.go:178
identity-matching_1 | github.com/sirupsen/logrus.Panicf(...)
identity-matching_1 | /go/pkg/mod/github.com/sirupsen/[email protected]/exported.go:168
identity-matching_1 | github.com/src-d/identity-matching/reporter.Commit(0x874dd4, 0x1c, 0x7c67a0, 0xc00047ef50)
identity-matching_1 | /go/src/identity-matching/reporter/reporter.go:19 +0x1c6
identity-matching_1 | github.com/src-d/identity-matching.ReducePeople(0xc00017a3f0, 0x8f73c0, 0xc0000241c0, 0xc00017a2a0, 0xc00017a480, 0xc00017a720, 0xc00017b620, 0xc000334150, 0xc000334c90, 0x14, ...)
identity-matching_1 | /go/src/identity-matching/matching.go:215 +0x1b16
identity-matching_1 | main.main()
identity-matching_1 | /go/src/identity-matching/cmd/match-identities/main.go:81 +0x886
> it looks like these conditions can never be true:
This is how you check for NaN values: NaN is the only floating-point value that does not compare equal to itself.
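As a minimal sketch of the self-comparison trick mentioned above (the helper name `isNaN` is made up for illustration and is not taken from the project), the check can be tried in isolation:

```go
package main

import (
	"fmt"
	"math"
)

// isNaN is a hypothetical helper, not the project's code.
// It relies on the IEEE 754 rule that NaN compares unequal to
// every value, including itself, so f != f holds only for NaN.
func isNaN(f float64) bool {
	return f != f
}

func main() {
	zero := 0.0
	// Dividing two float64 variables that hold zero yields NaN at run time.
	// (A constant expression 0.0/0.0 would be rejected by the compiler.)
	nan := zero / zero

	fmt.Println(isNaN(nan))      // true
	fmt.Println(isNaN(1.5))      // false
	fmt.Println(math.IsNaN(nan)) // true: the standard library gives the same answer
}
```

In new code `math.IsNaN` is probably the more readable choice, since `f != f` looks like a mistake to anyone who does not know the trick.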
So I checked the mean, but it is fine; the standard deviation was the buggy one. Fixing.
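The thread does not show how the project computes the standard deviation, so the following is only a sketch of one plausible way a standard deviation can turn into NaN while the mean stays valid. It assumes a sample standard deviation with an n-1 denominator and a single observation, which would be consistent with `people=1` in the log but is not confirmed by it. The names `meanStd` and `naiveStd` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// naiveStd is a hypothetical, unguarded sample standard deviation.
// With exactly one value the sum of squares is 0 and the denominator
// (n-1) is 0, so the result is sqrt(0/0) = NaN.
func naiveStd(xs []float64, mean float64) float64 {
	var ss float64
	for _, x := range xs {
		d := x - mean
		ss += d * d
	}
	return math.Sqrt(ss / float64(len(xs)-1))
}

// meanStd is a hypothetical guarded variant: it returns 0 for the
// statistics that are undefined (empty input, or std of one value)
// instead of letting NaN propagate to the caller.
func meanStd(xs []float64) (mean, std float64) {
	n := float64(len(xs))
	if n == 0 {
		return 0, 0 // avoid 0/0 in the mean
	}
	for _, x := range xs {
		mean += x
	}
	mean /= n
	if n < 2 {
		return mean, 0 // avoid 0/0 in the sample variance
	}
	var ss float64
	for _, x := range xs {
		d := x - mean
		ss += d * d
	}
	return mean, math.Sqrt(ss / (n - 1))
}

func main() {
	sizes := []float64{3} // a single connected component, as an assumed input

	mean, std := meanStd(sizes)
	naive := naiveStd(sizes, mean)

	// naive != naive is the NaN self-comparison check.
	fmt.Println(mean, std, naive != naive) // prints: 3 0 true
}
```

Whether returning 0 is the right convention for an undefined standard deviation depends on how the reported metric is consumed; skipping the metric entirely would be another reasonable choice.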
> it looks like these conditions can never be true:
> This is how you check for NaN values
Didn't know this trick, thanks.
I hope this is finally fixed.
Yes, now it's running fine, thank you for the quick fix.