Comments (2)
I have the same issue, which I noticed with ordinal indicators. It looks like this is a problem with the hunspell parser:
```r
hunspell::hunspell_parse(c("1st", "RNA-seq", "EIF4G1"))
#> [[1]]
#> [1] "st"
#>
#> [[2]]
#> [1] "RNA" "seq"
#>
#> [[3]]
#> [1] "EIF" "G"
```
Created on 2021-02-06 by the reprex package (v0.3.0)
Implementing a pre-filter right before the parse here could work:
Lines 118 to 123 in a2b5f29
It feels like more of a quick fix, because it splits the text with `strsplit()` and then `paste()`s it back together before it is sent to the actual parsing function.
```r
ignore_words <- c("1st", "RNA-seq", "EIF4G1")
lines <- c(
  "This is the 1st line. It has first written in it.",
  "The second has RNA-seq inside. But does not use RNAseq -- without the '-'",
  "EIF4G1 but not EIF4G1fdsadf is used",
  "This line's words are fine!"
)
pre_filter_plain <- function(lines, ignore = character()) {
  word_list <- strsplit(lines, "([^-[:alnum:][:punct:]])")
  vapply(
    word_list,
    function(i) {
      paste(i[!i %in% ignore], collapse = " ")
    },
    character(1)
  )
}
pre_filter_plain(lines, ignore_words)
#> [1] "This is the line. It has first written in it."
#> [2] "The second has inside. But does not use RNAseq -- without the '-'"
#> [3] "but not EIF4G1fdsadf is used"
#> [4] "This line's words are fine!"
```
Created on 2021-02-06 by the reprex package (v0.3.0)
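One limitation of the exact-match filter above is that a token with punctuation attached, such as "1st," or "RNA-seq.", would not match the ignore list and would still reach the parser. A minimal sketch of a variant that trims punctuation from the token edges before comparing might look like the following. The name `pre_filter_trimmed()` is hypothetical and not part of the spelling package, and the trade-off is that the attached punctuation is dropped along with the ignored word.

```r
# Hypothetical variant of pre_filter_plain(); not part of the spelling package.
pre_filter_trimmed <- function(lines, ignore = character()) {
  # Split on whitespace only, so tokens such as "RNA-seq" stay intact
  word_list <- strsplit(lines, "[[:space:]]+")
  vapply(
    word_list,
    function(words) {
      # Trim punctuation at the token edges, for the comparison only
      bare <- gsub("^[[:punct:]]+|[[:punct:]]+$", "", words)
      # Drop tokens whose trimmed form is in the ignore list
      paste(words[!bare %in% ignore], collapse = " ")
    },
    character(1)
  )
}

pre_filter_trimmed(
  c("Ranked 1st, then sequenced with RNA-seq.", "EIF4G1fdsadf is kept"),
  ignore = c("1st", "RNA-seq")
)
#> [1] "Ranked then sequenced with" "EIF4G1fdsadf is kept"
```

Like the original, this still rebuilds each line with `paste()`, so it remains a quick fix rather than a change to the parser itself.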