Comments (12)
That would be the place to do the normalization. I'll try to find time to find out which browsers add weird nbsp characters and if there's a way to detect them with js.
from recogito-js.
The 5 sec timeout makes recogito set the annotation correctly - I will try to investigate how to get to a 'networkLoadingFinished' event and share my results here
from recogito-js.
Have you tried toggling the mode: 'pre' init setting? Without that, the offset corresponds to the char offset in the markup, including newlines and extra whitespace that the browser doesn't render.
With pre enabled, Recogito normalizes the markup, so that it's a correct representation of the rendered markup.
My gut feeling is that browsers may act differently without pre. But frankly, I haven't had any reports yet. (Probably because not much cross-browser testing has happened.)
Just to confirm: your site doesn't add any surrounding elements that differ between browsers?
Is there any pattern or regular behavior you can observe with the offsets?
from recogito-js.
PS: BTW (may or may not be related...) I had an intriguing case once, where there were offset shifts. The shifts increased the further down they were in the document. The reason was that the original markup file contained a mix of line break encodings: \r\n vs \n (or "CRLF" vs "LF").
All browsers seemed fine when the markup files had line breaks encoded only as line feeds, or only as carriage return + line feed. But documents with a mix of both encodings would cause some browsers to count CRLF as two characters, thus shifting the offsets by one extra char for every CRLF. (I think Chrome was fine, but not the others. Or maybe vice versa.)
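To illustrate the CRLF effect described above, here is a minimal sketch in plain JavaScript. The helper names normalizeLineBreaks and crlfShiftBefore are made up for this example and are not part of RecogitoJS. The idea is to rewrite every line break to a single "\n" before offsets are computed or stored, so each line break counts as exactly one character.

```javascript
// Hypothetical helper (not a RecogitoJS API): convert CRLF and lone CR
// line breaks to LF, so every line break is exactly one character long.
function normalizeLineBreaks(text) {
  return text.replace(/\r\n?/g, '\n');
}

// Hypothetical helper: count the CRLF pairs that occur before an offset.
// This is the number of characters by which that offset would shift
// if CRLF is counted as two characters instead of one.
function crlfShiftBefore(text, offset) {
  const matches = text.slice(0, offset).match(/\r\n/g);
  return matches ? matches.length : 0;
}

// A document with mixed line break encodings.
const mixed = 'line one\r\nline two\nline three\r\nline four';
const normalized = normalizeLineBreaks(mixed);

console.log(mixed.length - normalized.length);
console.log(crlfShiftBefore(mixed, mixed.indexOf('line four')));
```

Because the shift grows by one for every CRLF before the annotation, this would match the observation that offsets drift further the lower they are in the document.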
from recogito-js.
Thanks @rsimon for the last comment - It seems that different browsers handle not only line breaks but also non-breaking spaces differently. But the actual difference is really, really hard to find...
I tried changing the mode but that didn't help, as we're working with html text.
from recogito-js.
Interesting. Is it possible on your end to normalize the output for the affected characters?
Alternatively, I think, the normalization could also happen inside RecogitoJS, in the HTML normalization code here:
https://github.com/recogito/recogito-js/blob/main/src/utils/index.js
It might be just a list of replace directives. (But, obviously, the hard work would be to figure out which characters need replacing...)
Regarding pre: sorry, I got that the wrong way around. The HTML normalization happens in normal mode, while in pre mode the original markup character offsets are retained. I.e. in your case, not using pre is the right way.
from recogito-js.
I never get the same position in the text, neither with mode: 'pre' nor with mode: 'html'.
To reproduce:
I select the text and save the annotation, then copy the console.log'ed annotation object and try to re-create it with r.addAnnotation(copiedAnnotation).
I create this manually (on screen)
This is what Recogito does on the r.addAnnotation() call for the same annotation data
I want to be able to use recogito on a broad range of websites, so a per-example adjustment is not working for me. How can I achieve that recogito reliably displays the same selection?
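One way to narrow this down is to compare the quoted text stored in the annotation with the text actually found at the stored offsets. The sketch below assumes the annotation follows the W3C Web Annotation shape, with a TextQuoteSelector (exact) and a TextPositionSelector (start, end) under target.selector. The helper names are made up for this example, and the text to pass in (for instance the textContent of the annotated element) is also an assumption.

```javascript
// Hypothetical helper: find a selector of the given type in an annotation.
// target.selector may be a single object or an array, so normalize to an array.
function findSelector(annotation, type) {
  const selectors = [].concat(annotation.target.selector);
  return selectors.find(s => s.type === type);
}

// Hypothetical diagnostic: compare the stored quote with the text found
// at the stored offsets, and report how far off the offsets appear to be.
function checkOffsets(annotation, text) {
  const quote = findSelector(annotation, 'TextQuoteSelector');
  const position = findSelector(annotation, 'TextPositionSelector');
  const actual = text.slice(position.start, position.end);
  // First occurrence of the quote. For repeated phrases this may not be
  // the occurrence that was originally selected.
  const firstIndex = text.indexOf(quote.exact);
  return {
    expected: quote.exact,
    actual,
    matches: actual === quote.exact,
    shift: firstIndex === -1 ? null : firstIndex - position.start
  };
}
```

A small, steadily growing shift would point to whitespace or line break differences, while a shift of whole paragraphs would point to content that was not yet present when the annotation was rendered.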
from recogito-js.
That definitely looks like a lot more than the small offsets that would be caused by browser differences. My guess is that your Web page is dynamically populated with text snippets loading asynchronously, and there's a timing issue where Recogito is already rendering annotations at the stored offsets, while some text before the annotation hasn't actually loaded yet. (Thus pushing the annotation down by a paragraph or so when it's loading.)
Do you have control over the loading sequence of the text, and can reliably wait with initializing Recogito until everything has loaded?
Another alternative would be to init multiple Recogito instances, one per paragraph. (But then you'd of course be responsible yourself for associating the annotations with the right paragraph during load.)
At least that's what I can guess. Not much else I can think of without knowing more about what's going on in the host web page. But dynamic loading would inevitably cause trouble.
from recogito-js.
So what I'm doing is not initializing recogito until the DOMContentLoaded event, which might not take into account network connections being open/waiting for replies. Thanks for the explanation of how you perceive my issue. I will try to find more reliable solutions than just waiting for the DOM; I'll try a 5s timeout and see if my issue disappears.
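As a possible alternative to a fixed 5s timeout, one could wait until the page has stopped changing for a short while. The sketch below is only an illustration under assumptions: waitForQuiet and observeDom are made-up helper names, the 500 ms and 10 s values are arbitrary, and it will not help if content arrives after a longer pause. DOMContentLoaded fires once the HTML is parsed, so content fetched later by scripts is not covered by it.

```javascript
// Hypothetical helper: resolve once no change has been reported for
// quietMs milliseconds, or give up after maxWaitMs. Resolves with true
// if things went quiet, false if the maximum wait was reached.
// subscribe(onChange) must call onChange on every change and return
// a function that stops the subscription.
function waitForQuiet(subscribe, quietMs, maxWaitMs) {
  return new Promise(resolve => {
    let quietTimer;
    const finish = settled => {
      clearTimeout(quietTimer);
      clearTimeout(maxTimer);
      unsubscribe();
      resolve(settled);
    };
    // Every change restarts the quiet-period countdown.
    const restart = () => {
      clearTimeout(quietTimer);
      quietTimer = setTimeout(() => finish(true), quietMs);
    };
    const maxTimer = setTimeout(() => finish(false), maxWaitMs);
    const unsubscribe = subscribe(restart);
    restart();
  });
}

// Browser-only wiring: report DOM changes below root via MutationObserver.
function observeDom(root) {
  return onChange => {
    const observer = new MutationObserver(onChange);
    observer.observe(root, { childList: true, subtree: true, characterData: true });
    return () => observer.disconnect();
  };
}

// Possible usage in the browser (assumes Recogito is loaded on the page
// and that document.body is the element to annotate):
// waitForQuiet(observeDom(document.body), 500, 10000)
//   .then(() => Recogito.init({ content: document.body }));
```

The subscription is passed in as a function so the waiting logic stays independent of the DOM and can be tried out with any event source.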
EDIT: I'm using as the element, since I'm working with random external websites, so I cannot find a div ID for each website.
from recogito-js.
Great, yes - the 5s wait will at least confirm (or not) whether async loading is indeed the issue. Do you happen to have a public URL for one of the websites you are trying this on? Then I can take a look and see if I can find anything.
from recogito-js.
We finally found the culprit - we use a TipTap wysiwyg editor where users can input their answers, and if you copy-paste stuff from Google Docs, it includes weird whitespace symbols, especially one that turns into "\u00a0". Some browsers (haven't been able to confirm which) handle it differently: some render it as "regular" whitespace, some as whitespace symbols, which ultimately caused the issue. After stripping out the nasty spaces, all is well.
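For anyone running into the same thing, a cleanup step along these lines could be applied to the text before it is saved or rendered. This is only a sketch with made-up helper names, not the code we actually use and not a RecogitoJS API. It replaces each non-breaking space (U+00A0) with a regular space, which keeps the text length unchanged.

```javascript
// Hypothetical helper: replace every non-breaking space (U+00A0) with a
// regular space. One character is swapped for one character, so the
// length of the string and all character offsets stay the same.
function normalizeSpaces(text) {
  return text.replace(/\u00a0/g, ' ');
}

// Hypothetical helper: list the offsets of all non-breaking spaces,
// useful for checking whether pasted content contains any.
function findNbspOffsets(text) {
  const offsets = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === '\u00a0') offsets.push(i);
  }
  return offsets;
}

const pasted = 'Copied\u00a0from a\u00a0document';
console.log(findNbspOffsets(pasted));
console.log(normalizeSpaces(pasted));
```

Other invisible characters from pasted content may need the same treatment; which ones is something we have not investigated.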
from recogito-js.
Yikes... in fact now that you mention it: I think I had a similar case once with people copy-and-pasting text from MS Word. I wonder if it's worth filtering in RecogitoJS, or whether that's something that should be considered out of scope. If so, it should probably be an operation during "deflating" the HTML here.
from recogito-js.