Comments (4)
Something similar to this can be achieved with add_shadow_shift, except for the miss_label, which needs a separate mutate() call:
library(narnia)
library(tidyverse)
airquality %>%
  add_shadow_shift(vars = c("Ozone", "Solar.R")) %>%
  mutate(miss_label = label_missing_2d(Ozone, Solar.R)) %>%
  head()
#> # A tibble: 6 x 9
#> Ozone Solar.R Wind Temp Month Day Ozone_shift Solar.R_shift
#> <int> <int> <dbl> <int> <int> <int> <dbl> <dbl>
#> 1 41 190 7.4 67 5 1 41.00000 190.00000
#> 2 36 118 8.0 72 5 2 36.00000 118.00000
#> 3 12 149 12.6 74 5 3 12.00000 149.00000
#> 4 18 313 11.5 62 5 4 18.00000 313.00000
#> 5 NA NA 14.3 56 5 5 -14.79436 -25.09977
#> 6 28 NA 14.9 66 5 6 28.00000 -19.64529
#> # ... with 1 more variables: miss_label <chr>
It would be interesting to add a shadow column for each variable that is shifted, and then a catch-all label.
Perhaps there is a more compact way of writing the code below:
airquality %>%
  select(Ozone, Solar.R) %>%
  add_shadow_shift(vars = c("Ozone", "Solar.R")) %>%
  cast_shadow(vars = c("Ozone", "Solar.R")) %>%
  select(-Ozone, -Solar.R) %>%
  mutate(any_missing = label_missings(airquality))
#> # A tibble: 153 x 5
#> Ozone_shift Solar.R_shift Ozone_NA Solar.R_NA any_missing
#> <dbl> <dbl> <fctr> <fctr> <chr>
#> 1 41.00000 190.00000 !NA !NA Not Missing
#> 2 36.00000 118.00000 !NA !NA Not Missing
#> 3 12.00000 149.00000 !NA !NA Not Missing
#> 4 18.00000 313.00000 !NA !NA Not Missing
#> 5 -14.94837 -28.02921 NA NA Missing
#> 6 28.00000 -32.93632 !NA NA Missing
#> 7 23.00000 299.00000 !NA !NA Not Missing
#> 8 19.00000 99.00000 !NA !NA Not Missing
#> 9 8.00000 19.00000 !NA !NA Not Missing
#> 10 -16.26483 194.00000 NA !NA Missing
#> # ... with 143 more rows
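To make the target structure concrete, here is a minimal base R sketch of the same three pieces: shifted values, shadow columns, and a catch-all label. The helper `shift_missing()` and its `offset` and `jitter` defaults are hypothetical. They were chosen because they are consistent with the shifted values printed above, and they are not taken from the package source.

```r
# Hypothetical helper (not part of the package): move missing values into a
# jittered band below the observed minimum so they can still be plotted.
shift_missing <- function(x, offset = 0.1, jitter = 0.05) {
  x_min <- min(x, na.rm = TRUE)
  x_range <- max(x, na.rm = TRUE) - x_min
  # One uniform draw per element, centred on zero (assumed jitter width).
  noise <- (stats::runif(length(x)) - 0.5) * x_range * jitter
  ifelse(is.na(x), x_min - x_range * offset + noise, x)
}

set.seed(2017)  # the jitter is random, so fix the seed for reproducibility
vars <- c("Ozone", "Solar.R")
aq <- airquality[vars]

# Shifted copies of each variable, named <var>_shift.
shifted <- lapply(aq, shift_missing)
names(shifted) <- paste0(vars, "_shift")

# Shadow columns recording whether each value was missing, named <var>_NA.
shadow <- lapply(aq, function(x) {
  factor(ifelse(is.na(x), "NA", "!NA"), levels = c("!NA", "NA"))
})
names(shadow) <- paste0(vars, "_NA")

aq_sketch <- cbind(aq, as.data.frame(shadow), as.data.frame(shifted))

# Catch-all label: a row is "Missing" if any selected variable is NA.
aq_sketch$any_missing <- ifelse(complete.cases(aq), "Not Missing", "Missing")

head(aq_sketch)
```

Because the jitter is random, the shifted values for missing rows will differ from run to run unless a seed is set, which is also why they differ between the outputs in this thread.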
I think that the best I can do now is this:
library(narnia)
library(tidyverse)
#> Loading tidyverse: ggplot2
#> Loading tidyverse: tibble
#> Loading tidyverse: tidyr
#> Loading tidyverse: readr
#> Loading tidyverse: purrr
#> Loading tidyverse: dplyr
#> Conflicts with tidy packages ----------------------------------------------
#> filter(): dplyr, stats
#> lag(): dplyr, stats
aq_shift <- airquality %>%
  cast_shadow_shift(vars = c("Ozone", "Solar.R")) %>%
  add_label_missings()
aq_shift
#> # A tibble: 153 x 7
#> Ozone Solar.R Ozone_NA Solar.R_NA Ozone_shift Solar.R_shift any_missing
#> <int> <int> <fctr> <fctr> <dbl> <dbl> <chr>
#> 1 41 190 !NA !NA 41.00000 190.00000 Not Missing
#> 2 36 118 !NA !NA 36.00000 118.00000 Not Missing
#> 3 12 149 !NA !NA 12.00000 149.00000 Not Missing
#> 4 18 313 !NA !NA 18.00000 313.00000 Not Missing
#> 5 NA NA NA NA -17.38178 -29.47859 Missing
#> 6 28 NA !NA NA 28.00000 -33.72173 Missing
#> 7 23 299 !NA !NA 23.00000 299.00000 Not Missing
#> 8 19 99 !NA !NA 19.00000 99.00000 Not Missing
#> 9 8 19 !NA !NA 8.00000 19.00000 Not Missing
#> 10 NA 194 NA !NA -11.54721 194.00000 Missing
#> # ... with 143 more rows
ggplot(aq_shift,
       aes(x = Ozone_shift,
           y = Solar.R_shift,
           colour = any_missing)) +
  geom_point()
This gives the user the same data structure that powers geom_missing_point():
library(narnia)
library(tidyverse)
#> Loading tidyverse: ggplot2
#> Loading tidyverse: tibble
#> Loading tidyverse: tidyr
#> Loading tidyverse: readr
#> Loading tidyverse: purrr
#> Loading tidyverse: dplyr
#> Conflicts with tidy packages ----------------------------------------------
#> filter(): dplyr, stats
#> lag(): dplyr, stats
ggplot(airquality,
       aes(x = Ozone,
           y = Solar.R)) +
  geom_missing_point()
I have made this code a little more concise with the (slightly verbose) function cast_shadow_shift_label:
library(tidyverse)
#> Loading tidyverse: ggplot2
#> Loading tidyverse: tibble
#> Loading tidyverse: tidyr
#> Loading tidyverse: readr
#> Loading tidyverse: purrr
#> Loading tidyverse: dplyr
#> Conflicts with tidy packages ----------------------------------------------
#> filter(): dplyr, stats
#> lag(): dplyr, stats
library(narnia)
# Using cast is like transmute: it casts shadow variables only for the
# variables of interest, which facilitates plotting and other summaries.
aq_shift <- airquality %>%
  cast_shadow_shift_label(c("Ozone", "Solar.R"))
aq_shift
#> # A tibble: 153 x 7
#> Ozone Solar.R Ozone_NA Solar.R_NA Ozone_shift Solar.R_shift any_missing
#> <int> <int> <fctr> <fctr> <dbl> <dbl> <chr>
#> 1 41 190 !NA !NA 41.00000 190.00000 Not Missing
#> 2 36 118 !NA !NA 36.00000 118.00000 Not Missing
#> 3 12 149 !NA !NA 12.00000 149.00000 Not Missing
#> 4 18 313 !NA !NA 18.00000 313.00000 Not Missing
#> 5 NA NA NA NA -17.13731 -17.57498 Missing
#> 6 28 NA !NA NA 28.00000 -32.76235 Missing
#> 7 23 299 !NA !NA 23.00000 299.00000 Not Missing
#> 8 19 99 !NA !NA 19.00000 99.00000 Not Missing
#> 9 8 19 !NA !NA 8.00000 19.00000 Not Missing
#> 10 NA 194 NA !NA -14.16446 194.00000 Missing
#> # ... with 143 more rows
ggplot(aq_shift,
       aes(x = Ozone_shift,
           y = Solar.R_shift,
           colour = any_missing)) +
  geom_point()
Done!